About
This extension routes HTTP requests across a peered Envoy mesh using a path-vector protocol inspired by BGP. Each Envoy advertises its locally reachable clusters to its peers and learns transitively reachable clusters in return. A request targeting a cluster is routed to the next hop on the shortest path to that cluster.
How it works
The cluster-router plugin reads the x-target-cluster header, looks up the next-hop
in the routing table, and writes x-next-hop so Envoy's cluster_header route
resolves to that cluster. The plugin strips any incoming x-next-hop at filter
entry, so clients and prior filters cannot pre-set the next hop.
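The per-request flow described above can be sketched as follows. This is a minimal illustration rather than the plugin's actual code: the `routeTable` type and `resolveNextHop` function are hypothetical names, headers are modeled as a plain map, and leaving x-next-hop unset when no route is known is an assumption.

```go
package main

// Header names taken from the description above.
const (
	targetHeader  = "x-target-cluster"
	nextHopHeader = "x-next-hop"
)

// routeTable maps a target cluster name to the local Envoy cluster used as
// the next hop. Hypothetical type, for illustration only.
type routeTable map[string]string

// resolveNextHop strips any incoming x-next-hop, then writes a fresh value
// when the target cluster has a known route. It reports whether a next hop
// was written.
func resolveNextHop(headers map[string]string, table routeTable) bool {
	// Strip first, so a client or an earlier filter cannot pre-set the hop.
	delete(headers, nextHopHeader)

	target, ok := headers[targetHeader]
	if !ok || target == "" {
		return false
	}
	hop, ok := table[target]
	if !ok {
		// Assumption: with no route, the header stays absent.
		return false
	}
	headers[nextHopHeader] = hop
	return true
}
```

Stripping before the lookup means a spoofed x-next-hop is removed even when the request carries no x-target-cluster header at all.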
A background daemon pulls /advertisements from each configured peer every
poll_interval. After each tick it recomputes the routes and publishes them
via an atomic.Pointer snapshot so request handlers observe a consistent
table without locking.
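The snapshot publication pattern can be sketched with Go's `sync/atomic.Pointer` (Go 1.19 or later). The `table` and `router` types and their methods are hypothetical names chosen for this example; the real plugin's table layout is not shown here.

```go
package main

import "sync/atomic"

// table is a routing snapshot: target cluster -> next-hop cluster.
// It is never mutated after being published.
type table map[string]string

// router holds the current snapshot behind an atomic pointer.
type router struct {
	current atomic.Pointer[table]
}

// publish copies the freshly computed routes and swaps them in atomically.
// The copy ensures later changes to the caller's map cannot leak into a
// snapshot that readers may already hold.
func (r *router) publish(routes map[string]string) {
	snap := make(table, len(routes))
	for k, v := range routes {
		snap[k] = v
	}
	r.current.Store(&snap)
}

// lookup reads from whichever snapshot is current, without taking a lock.
func (r *router) lookup(target string) (string, bool) {
	snap := r.current.Load()
	if snap == nil {
		// Nothing has been published yet.
		return "", false
	}
	hop, ok := (*snap)[target]
	return hop, ok
}
```

Because a published map is never written again, a request handler that loaded the old pointer keeps a consistent view even while the daemon stores a new one.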
Peers and locally-fronted terminals are declared explicitly in the plugin
config (peers[], terminals[]). The plugin does not poll Envoy admin or
read cluster filter_metadata; the operator (or control plane) is the
authoritative source. With xDS, push a new cluster via CDS and a matching
filter-config update via LDS together so the new cluster name appears in
terminals at the moment the cluster goes live.
Wiring with Envoy
The Envoy route must use cluster_header: x-next-hop (matching the plugin's
next_hop_header). Add request_headers_to_remove: [x-next-hop] to the route
if you do not want the header forwarded upstream:
routes:
- match: { prefix: "/" }
  route:
    cluster_header: x-next-hop
  request_headers_to_remove: [x-next-hop]
The plugin strips x-next-hop only at its own filter entry. If you insert
additional filters between cluster-router and the router that may write
x-next-hop, place cluster-router after them so its decision wins.
Connecting two Envoys
Each edge in the mesh involves two artifacts on the originating Envoy: the
plugin config lists the peer (id, endpoint, local_cluster) and the Envoy
cluster named local_cluster points at the peer's data-plane listener.
The receiving Envoy needs nothing about the sender — it just exposes its
advertise_listen and the terminal clusters it fronts (declared in
terminals[]). Concretely, for envoy1 -> envoy2 -> backend:
envoy1 (upstream side):
static_resources:
  listeners:
  - name: ingress
    address: { socket_address: { address: 0.0.0.0, port_value: 10001 } }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: envoy1
          route_config:
            virtual_hosts:
            - name: vh
              domains: ["*"]
              routes:
              - match: { prefix: "/", headers: [{ name: x-target-cluster, present_match: true }] }
                route:
                  cluster_header: x-next-hop
          http_filters:
          - name: cluster-router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.dynamic_modules.v3.DynamicModuleFilter
              dynamic_module_config: { name: composer, do_not_close: true }
              filter_name: cluster-router
              filter_config:
                "@type": type.googleapis.com/google.protobuf.StringValue
                value: '{"envoy_id":"envoy1","advertise_listen":"0.0.0.0:7001","peers":[{"id":"envoy2","endpoint":"http://envoy2.internal:7002","local_cluster":"peer_envoy2"}]}'
          - name: envoy.filters.http.router
            typed_config: { "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router }
  clusters:
  - name: peer_envoy2
    connect_timeout: 1s
    type: STRICT_DNS
    load_assignment:
      cluster_name: peer_envoy2
      endpoints:
      - lb_endpoints:
        - endpoint: { address: { socket_address: { address: envoy2.internal, port_value: 10002 } } }
envoy2 (downstream side): identical filter-chain shape, with these differences:
- listener on 10002, advertise on 7002
- "peers": [] and "terminals": ["remote_svc"] in the plugin config
- no peer_envoy2 cluster
- one Envoy cluster named remote_svc pointing at the real backend
A request sent with curl -H 'x-target-cluster: remote_svc' envoy1:10001/
then flows: client → envoy1 → peer_envoy2 cluster → envoy2 → remote_svc cluster
→ backend. Each Envoy makes its own next-hop decision from its own table.
Distance and mesh depth
Each route carries a cumulative distance. Every link currently counts as one
hop, so a route's distance is simply its hop count. Best-path selection prefers
the lowest distance, then the shortest AS-PATH, then the lexicographically
smallest next-hop peer id.
A route is dropped once its distance exceeds 32.
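The selection rules above can be sketched as a comparison function plus a filter. The `route` struct, its field names, and `selectBest` are assumptions made for this example; only the ordering (distance, then path length, then next-hop peer id) and the limit of 32 come from the text.

```go
package main

// maxDistance is the limit stated above: routes beyond it are dropped.
const maxDistance = 32

// route is a hypothetical candidate route to a target cluster.
type route struct {
	Target   string   // cluster being reached
	NextHop  string   // id of the peer the route was learned from
	Distance int      // cumulative distance (currently the hop count)
	Path     []string // Envoy ids traversed, analogous to a BGP AS-PATH
}

// better reports whether a should be preferred over b.
func better(a, b route) bool {
	if a.Distance != b.Distance {
		return a.Distance < b.Distance
	}
	if len(a.Path) != len(b.Path) {
		return len(a.Path) < len(b.Path)
	}
	// Final tie-break: lexicographically smallest next-hop peer id.
	return a.NextHop < b.NextHop
}

// selectBest keeps one best route per target and drops over-long routes.
func selectBest(candidates []route) map[string]route {
	best := make(map[string]route)
	for _, c := range candidates {
		if c.Distance > maxDistance {
			continue // distance exceeds 32
		}
		cur, ok := best[c.Target]
		if !ok || better(c, cur) {
			best[c.Target] = c
		}
	}
	return best
}
```

The final tie-break on peer id makes the result deterministic, so the chosen route does not depend on the order in which advertisements were collected.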
Usage Examples
envoy1 — upstream side of a two-Envoy mesh
Client-facing Envoy. Peers with envoy2 and routes traffic toward it.
The Envoy cluster peer_envoy2 points at envoy2's listener.
boe run --extension cluster-router --config '
{
  "envoy_id": "envoy1",
  "advertise_listen": "0.0.0.0:7001",
  "peers": [
    { "id": "envoy2", "endpoint": "http://envoy2.internal:7002", "local_cluster": "peer_envoy2" }
  ],
  "poll_interval": "10s",
  "stale_after": "60s"
}'
envoy2 — downstream side of a two-Envoy mesh
Terminal-side Envoy. Has no outbound peers; serves /advertisements
to envoy1 and fronts the remote_svc terminal cluster.
boe run --extension cluster-router --config '
{
  "envoy_id": "envoy2",
  "advertise_listen": "0.0.0.0:7002",
  "peers": [],
  "terminals": ["remote_svc"],
  "poll_interval": "10s",
  "stale_after": "60s"
}'
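For readers wiring this config into their own tooling, the JSON shape used in these examples can be decoded with a sketch like the one below. The struct and function names are hypothetical, the field set is limited to the keys shown in the examples (next_hop_header is omitted), and parsing the interval strings with `time.ParseDuration` is an assumption about how values such as "10s" are interpreted.

```go
package main

import (
	"encoding/json"
	"time"
)

// peer mirrors one entry of peers[] as shown in the examples.
type peer struct {
	ID           string `json:"id"`
	Endpoint     string `json:"endpoint"`
	LocalCluster string `json:"local_cluster"`
}

// config mirrors the plugin config keys used in the examples.
type config struct {
	EnvoyID         string   `json:"envoy_id"`
	AdvertiseListen string   `json:"advertise_listen"`
	Peers           []peer   `json:"peers"`
	Terminals       []string `json:"terminals"`
	PollInterval    string   `json:"poll_interval"`
	StaleAfter      string   `json:"stale_after"`
}

// parseConfig decodes the JSON and converts both interval strings.
// It returns the config, the poll interval, and the stale-after interval.
func parseConfig(raw []byte) (config, time.Duration, time.Duration, error) {
	var c config
	if err := json.Unmarshal(raw, &c); err != nil {
		return c, 0, 0, err
	}
	poll, err := time.ParseDuration(c.PollInterval)
	if err != nil {
		return c, 0, 0, err
	}
	stale, err := time.ParseDuration(c.StaleAfter)
	if err != nil {
		return c, 0, 0, err
	}
	return c, poll, stale, nil
}
```

In this sketch a config that omits poll_interval or stale_after fails to parse; whether the plugin itself applies defaults in that case is not stated in this document.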