About

An HTTP filter plugin that parses Anthropic Messages API requests and responses, and populates Envoy's dynamic filter metadata with structured information extracted from both bodies.

This allows downstream filters and route configurations to make decisions based on the contents of the LLM request and response without re-parsing the bodies themselves.

Attribute Naming

Metadata keys follow the OpenInference Semantic Conventions. List attributes are flattened using indexed dot notation as described in the LLM Spans attribute flattening spec (e.g. llm.input_messages.0.message.role).
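The flattening scheme can be illustrated with a short sketch. This is illustrative only, not the filter's actual implementation; the flatten helper below is hypothetical:

```python
# Illustrative sketch of OpenInference-style attribute flattening
# (hypothetical helper, not the filter's actual code).

def flatten(prefix, value, out):
    """Flatten nested dicts/lists into indexed dot-notation keys."""
    if isinstance(value, dict):
        for k, v in value.items():
            flatten(f"{prefix}.{k}" if prefix else k, v, out)
    elif isinstance(value, list):
        for i, v in enumerate(value):
            flatten(f"{prefix}.{i}", v, out)
    else:
        out[prefix] = value

out = {}
flatten("llm.input_messages", [
    {"message": {"role": "system", "content": "You are a helpful assistant."}},
    {"message": {"role": "user", "content": "What is the weather today?"}},
], out)
# out["llm.input_messages.0.message.role"] == "system"
# out["llm.input_messages.1.message.content"] == "What is the weather today?"
```

Each list element gets its index spliced into the key path, so consumers can address any message or tool by position.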

Configuration Reference

Field               Type    Required  Default                    Description
metadata_namespace  string  no        io.builtonenvoy.anthropic  The filter metadata namespace for the decoded fields

Usage Examples

Basic usage (default namespace)

Decode incoming Anthropic Messages API requests and expose metadata under the default io.builtonenvoy.anthropic namespace using OpenInference semantic conventions. Downstream filters can access io.builtonenvoy.anthropic.llm.model_name, io.builtonenvoy.anthropic.llm.input_messages.0.message.role, etc.

boe run --extension anthropic-decoder \
  --test-upstream-host api.anthropic.com

# Send a messages request
curl -X POST http://localhost:10000/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "What is the weather today?"}
    ]
  }'

# Envoy filter metadata will contain (namespace: "io.builtonenvoy.anthropic"):
# llm.model_name                         = "claude-sonnet-4-20250514"
# llm.system                             = "anthropic"
# llm.input_messages.count               = 2
# llm.input_messages.0.message.role      = "system"
# llm.input_messages.0.message.content   = "You are a helpful assistant."
# llm.input_messages.1.message.role      = "user"
# llm.input_messages.1.message.content   = "What is the weather today?"
# llm.tools.count                        = 0

Custom metadata namespace

Use a custom namespace to avoid conflicts with other filters that also write to filter metadata.

boe run --extension anthropic-decoder \
  --config '{"metadata_namespace": "llm-request"}' \
  --test-upstream-host api.anthropic.com

# Metadata will now be under the "llm-request" namespace:
# llm.model_name                       = "claude-sonnet-4-20250514"
# llm.system                           = "anthropic"
# llm.input_messages.count             = 1
# llm.input_messages.0.message.role    = "user"
# llm.input_messages.0.message.content = "..."
# llm.tools.count                      = 0

Request with tools

When the request includes tool definitions, each tool is stored under llm.tools.N.tool.json_schema as a JSON string, following the OpenInference spec.
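A quick sketch of that encoding (illustrative only; the filter's own serialization may differ in whitespace or key order):

```python
import json

# A tool definition as it appears in the request body.
tool = {
    "name": "book_flight",
    "description": "Book a flight",
    "input_schema": {"type": "object"},
}

# The whole tool definition is serialized into a single JSON string,
# which becomes the value of llm.tools.N.tool.json_schema.
value = json.dumps(tool, separators=(",", ":"))
# value == '{"name":"book_flight","description":"Book a flight","input_schema":{"type":"object"}}'
```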

boe run --extension anthropic-decoder \
  --test-upstream-host api.anthropic.com

curl -X POST http://localhost:10000/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Book a flight to NYC"}],
    "tools": [
      {"name": "book_flight", "description": "Book a flight", "input_schema": {"type": "object"}},
      {"name": "cancel_flight", "description": "Cancel a flight", "input_schema": {"type": "object"}}
    ]
  }'

# llm.tools.count              = 2
# llm.tools.0.tool.json_schema = '{"name":"book_flight","description":"Book a flight","input_schema":{"type":"object"}}'
# llm.tools.1.tool.json_schema = '{"name":"cancel_flight","description":"Cancel a flight","input_schema":{"type":"object"}}'

Tool use in conversation with response metadata

When a multi-turn conversation includes an assistant message containing a tool_use block, the filter captures the tool call details from the request. The response metadata includes the assistant's reply and token usage.

boe run --extension anthropic-decoder \
  --test-upstream-host api.anthropic.com

curl -X POST http://localhost:10000/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather in NYC?"},
      {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_abc", "name": "get_weather", "input": {"location": "NYC"}}
      ]},
      {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_abc", "content": "Sunny, 72F"}
      ]}
    ]
  }'

# Request metadata (namespace: "io.builtonenvoy.anthropic"):
# llm.model_name                                                         = "claude-sonnet-4-20250514"
# llm.system                                                             = "anthropic"
# llm.input_messages.count                                               = 3
# llm.input_messages.0.message.role                                      = "user"
# llm.input_messages.0.message.content                                   = "What is the weather in NYC?"
# llm.input_messages.1.message.role                                      = "assistant"
# llm.input_messages.1.message.tool_calls.count                          = 1
# llm.input_messages.1.message.tool_calls.0.tool_call.id                 = "toolu_abc"
# llm.input_messages.1.message.tool_calls.0.tool_call.function.name      = "get_weather"
# llm.input_messages.1.message.tool_calls.0.tool_call.function.arguments = '{"location":"NYC"}'
# llm.input_messages.2.message.role                                      = "user"
# llm.tools.count                                                        = 0

# Response metadata (when the model sends its final reply):
# llm.output_messages.count                                              = 1
# llm.output_messages.0.message.role                                     = "assistant"
# llm.output_messages.0.message.content                                  = "The weather in NYC is sunny and 72F."
# llm.token_count.prompt                                                 = 85
# llm.token_count.completion                                             = 14
# llm.token_count.total                                                  = 99
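The token counts follow simple arithmetic: llm.token_count.total is the sum of the prompt and completion counts (85 + 14 = 99 above). A downstream consumer reading the flattened metadata (shown here as a plain dict, purely for illustration) could sanity-check it like this:

```python
# Hypothetical flattened response metadata, as a plain dict.
metadata = {
    "llm.output_messages.0.message.content": "The weather in NYC is sunny and 72F.",
    "llm.token_count.prompt": 85,
    "llm.token_count.completion": 14,
    "llm.token_count.total": 99,
}

# total should equal prompt + completion.
assert metadata["llm.token_count.total"] == (
    metadata["llm.token_count.prompt"] + metadata["llm.token_count.completion"]
)

reply = metadata["llm.output_messages.0.message.content"]
```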