About

This extension routes HTTP requests across a peered Envoy mesh using a path-vector protocol inspired by BGP. Each Envoy advertises its locally reachable clusters to its peers and learns transitively reachable clusters back. A request targeting a cluster is forwarded to the next hop on the shortest known path to that cluster.

How it works

The cluster-router plugin reads the x-target-cluster header, looks up the next-hop in the routing table, and writes x-next-hop so Envoy's cluster_header route resolves to that cluster. The plugin strips any incoming x-next-hop at filter entry, so clients and prior filters cannot pre-set the next hop.

A background daemon pulls /advertisements from each configured peer every poll_interval. After each tick it recomputes the routes and publishes them via an atomic.Pointer snapshot so request handlers observe a consistent table without locking.
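A minimal sketch of the snapshot pattern, assuming Go 1.19+ for the generic atomic.Pointer. The names routeTable, publish, and lookup are hypothetical and the table is reduced to a plain map; the plugin's actual types may differ.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// routeTable is treated as immutable once published:
// target cluster -> next-hop cluster.
type routeTable map[string]string

// current holds the most recently published snapshot.
var current atomic.Pointer[routeTable]

// publish swaps in a freshly built table. The caller must not
// mutate t after this call, since readers may still hold it.
func publish(t routeTable) {
	current.Store(&t)
}

// lookup is what a request handler would call. It takes no lock and
// always sees one complete table, never a half-updated one.
func lookup(target string) (string, bool) {
	t := current.Load()
	if t == nil {
		return "", false // nothing published yet
	}
	hop, ok := (*t)[target]
	return hop, ok
}

func main() {
	// The daemon would do this after each poll tick.
	publish(routeTable{"remote_svc": "peer_envoy2"})

	hop, ok := lookup("remote_svc")
	fmt.Println(hop, ok) // prints: peer_envoy2 true
}
```

The daemon builds a new map on every tick instead of editing the old one in place, which is what lets readers skip locking.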

Peers and locally-fronted terminals are declared explicitly in the plugin config (peers[], terminals[]). The plugin does not poll Envoy admin or read cluster filter_metadata; the operator (or control plane) is the authoritative source. With xDS, push a new cluster via CDS and a matching filter-config update via LDS together so the new cluster name appears in terminals at the moment the cluster goes live.

Wiring with Envoy

The Envoy route must use cluster_header: x-next-hop (matching the plugin's next_hop_header). Add request_headers_to_remove: [x-next-hop] to the route if you do not want the header forwarded upstream:

routes:
  - match: { prefix: "/" }
    route:
      cluster_header: x-next-hop
    # request_headers_to_remove is a field of the route entry, not of the route action
    request_headers_to_remove: [x-next-hop]

The plugin strips x-next-hop only at its own filter entry. If other HTTP filters in the chain may write x-next-hop, place cluster-router after them, immediately before the router filter, so its decision wins.

Connecting two Envoys

Each edge in the mesh requires two artifacts on the originating Envoy: a peers[] entry in the plugin config (id, endpoint, local_cluster) and an Envoy cluster, named by local_cluster, that points at the peer's data-plane listener. The receiving Envoy needs no configuration for the sender: it only exposes its advertise_listen endpoint and the terminal clusters it fronts (declared in terminals[]). Concretely, for envoy1 -> envoy2 -> backend:

envoy1 (upstream side):

static_resources:
  listeners:
    - name: ingress
      address: { socket_address: { address: 0.0.0.0, port_value: 10001 } }
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: envoy1
                route_config:
                  virtual_hosts:
                    - name: vh
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/", headers: [{ name: x-target-cluster, present_match: true }] }
                          route:
                            cluster_header: x-next-hop
                http_filters:
                  - name: cluster-router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.dynamic_modules.v3.DynamicModuleFilter
                      dynamic_module_config: { name: composer, do_not_close: true }
                      filter_name: cluster-router
                      filter_config:
                        "@type": type.googleapis.com/google.protobuf.StringValue
                        value: '{"envoy_id":"envoy1","advertise_listen":"0.0.0.0:7001","peers":[{"id":"envoy2","endpoint":"http://envoy2.internal:7002","local_cluster":"peer_envoy2"}]}'
                  - name: envoy.filters.http.router
                    typed_config: { "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router }
  clusters:
    - name: peer_envoy2
      connect_timeout: 1s
      type: STRICT_DNS
      load_assignment:
        cluster_name: peer_envoy2
        endpoints:
          - lb_endpoints:
              - endpoint: { address: { socket_address: { address: envoy2.internal, port_value: 10002 } } }

envoy2 (downstream side): identical filter-chain shape, with these differences:

  • listener on 10002, advertise on 7002
  • "peers": [] and "terminals": ["remote_svc"] in the plugin config
  • no peer_envoy2 cluster
  • one Envoy cluster named remote_svc pointing at the real backend

A request such as curl -H 'x-target-cluster: remote_svc' envoy1:10001/ then flows: client → envoy1 → peer_envoy2 cluster → envoy2 → remote_svc cluster → backend. Each Envoy makes its own next-hop decision from its own table.
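To make the hop-by-hop behaviour concrete, here is a small Go simulation of that flow. It is a sketch only: trace, tables, and peerOf are hypothetical names, and the tables are hand-written to match what each Envoy would hold in this two-node example.

```go
package main

import (
	"fmt"
	"strings"
)

// trace follows a request from one Envoy to a terminal cluster.
// tables: envoy id -> (target cluster -> next-hop Envoy cluster).
// peerOf: peer cluster name -> the Envoy that cluster points at.
// A next hop absent from peerOf is treated as a terminal cluster.
func trace(start, target string, tables map[string]map[string]string, peerOf map[string]string) ([]string, bool) {
	path := []string{start}
	node := start
	for hops := 0; hops < 32; hops++ { // guard against loops in bad tables
		hop, ok := tables[node][target]
		if !ok {
			return path, false // this Envoy has no route
		}
		path = append(path, hop)
		next, isPeer := peerOf[hop]
		if !isPeer {
			return path, true // reached the terminal cluster
		}
		node = next
		path = append(path, node)
	}
	return path, false
}

func main() {
	tables := map[string]map[string]string{
		"envoy1": {"remote_svc": "peer_envoy2"}, // learned from envoy2
		"envoy2": {"remote_svc": "remote_svc"},  // local terminal
	}
	peerOf := map[string]string{"peer_envoy2": "envoy2"}

	path, ok := trace("envoy1", "remote_svc", tables, peerOf)
	fmt.Println(strings.Join(path, " -> "), ok)
	// prints: envoy1 -> peer_envoy2 -> envoy2 -> remote_svc true
}
```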

Distance and mesh depth

Each route carries a cumulative distance. Every link currently counts as one hop, so a route's distance is simply its hop count. Best-path selection prefers the lowest distance, then the shortest AS-PATH, then the lexicographically smallest next-hop peer id.
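The tie-break order above can be written as a comparator. The sketch below assumes a hypothetical route struct; the field names and the better and best helpers are illustrative, not the plugin's actual types.

```go
package main

import "fmt"

// route is a hypothetical shape for one candidate path to a target cluster.
type route struct {
	Distance int      // cumulative distance (currently the hop count)
	Path     []string // Envoy ids traversed, the AS-PATH analogue
	NextHop  string   // id of the peer that advertised this route
}

// better reports whether a should be preferred over b:
// lowest distance, then shortest path, then smallest next-hop peer id.
func better(a, b route) bool {
	if a.Distance != b.Distance {
		return a.Distance < b.Distance
	}
	if len(a.Path) != len(b.Path) {
		return len(a.Path) < len(b.Path)
	}
	return a.NextHop < b.NextHop // byte-wise lexicographic comparison
}

// best returns the preferred candidate; ok is false when there are none.
func best(candidates []route) (winner route, ok bool) {
	if len(candidates) == 0 {
		return route{}, false
	}
	winner = candidates[0]
	for _, c := range candidates[1:] {
		if better(c, winner) {
			winner = c
		}
	}
	return winner, true
}

func main() {
	r, _ := best([]route{
		{Distance: 2, Path: []string{"envoy3", "envoy4"}, NextHop: "envoy3"},
		{Distance: 2, Path: []string{"envoy2", "envoy4"}, NextHop: "envoy2"},
		{Distance: 3, Path: []string{"envoy5", "envoy6", "envoy4"}, NextHop: "envoy5"},
	})
	fmt.Println(r.NextHop) // prints: envoy2
}
```

The final comparison on the peer id makes the choice deterministic, so two Envoys holding the same candidates pick the same route.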

A route is dropped once its distance exceeds 32.
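As an illustration of the cap, the sketch below applies it while importing a peer's advertisements. It assumes the receiver adds one hop for the link to the advertising peer before checking the limit; the source does not say where the increment happens, and the names advertised and importRoutes are hypothetical.

```go
package main

import "fmt"

// maxDistance is the cap stated above: routes beyond it are dropped.
const maxDistance = 32

// advertised is a hypothetical, pared-down advertisement entry.
type advertised struct {
	Target   string
	Distance int
}

// importRoutes adds one hop for the link to the advertising peer
// (an assumption about where the increment happens) and drops any
// route whose resulting distance exceeds maxDistance.
func importRoutes(ads []advertised) []advertised {
	kept := make([]advertised, 0, len(ads))
	for _, a := range ads {
		a.Distance++ // every link currently counts as one hop
		if a.Distance > maxDistance {
			continue // too deep in the mesh; drop
		}
		kept = append(kept, a)
	}
	return kept
}

func main() {
	kept := importRoutes([]advertised{
		{Target: "near_svc", Distance: 0},  // becomes 1, kept
		{Target: "edge_svc", Distance: 31}, // becomes 32, kept
		{Target: "far_svc", Distance: 32},  // becomes 33, dropped
	})
	for _, r := range kept {
		fmt.Println(r.Target, r.Distance)
	}
}
```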

Usage Examples

envoy1 — upstream side of a two-Envoy mesh

Client-facing Envoy. Peers with envoy2 and routes traffic toward it. The Envoy cluster peer_envoy2 points at envoy2's listener.

boe run --extension cluster-router --config '
  {
    "envoy_id": "envoy1",
    "advertise_listen": "0.0.0.0:7001",
    "peers": [
      { "id": "envoy2", "endpoint": "http://envoy2.internal:7002", "local_cluster": "peer_envoy2" }
    ],
    "poll_interval": "10s",
    "stale_after": "60s"
  }'

envoy2 — downstream side of a two-Envoy mesh

Terminal-side Envoy. Has no outbound peers; serves /advertisements to envoy1 and fronts the remote_svc terminal cluster.

boe run --extension cluster-router --config '
  {
    "envoy_id": "envoy2",
    "advertise_listen": "0.0.0.0:7002",
    "peers": [],
    "terminals": ["remote_svc"],
    "poll_interval": "10s",
    "stale_after": "60s"
  }'