SDWAN Network Setup Runbook

End-to-end operator guide for the system extension's SDWAN layer: WireGuard-based overlay networks with first-class VIPs, port mappings, firewall rules, route policies, iBGP/FRR routing, and cross-account federation. Covers slices 1–9 (per memory project_sdwan_routing_state — slice 9 a–f fully complete) plus slice 11 federation acceptance (also live; system_sdwan_propose_federation_peer, system_sdwan_accept_federation_peer, and system_sdwan_revoke_federation_peer are all registered MCP actions).

Audience: external operators, internal SREs, network engineers configuring multi-region overlays.

Concept reference

| Concept | What it is | Backing model |
|---|---|---|
| Network | An IPv6 overlay (/64 prefix); the topology container | Sdwan::Network |
| Peer | An endpoint on the network: typically a NodeInstance, but also user devices and federation peers | Sdwan::Peer |
| PeerKey | Active WireGuard keypair for a peer; rotates on sdwan_peer_remediate | Sdwan::PeerKey |
| Virtual IP (VIP) | First-class /128 address with primary + failover holders (slice 3) | Sdwan::VirtualIp, Sdwan::VirtualIpAssignment |
| PortMapping | Maps an external port → internal /128:port for inbound traffic | Sdwan::PortMapping |
| FirewallRule | nft rule applied per-peer with selectors (peer, tag, cidr) | Sdwan::FirewallRule |
| RoutePolicy | JSONB statement compiled to FRR route-map + aux objects (slice 9) | Sdwan::RoutePolicy |
| AccessGrant | Token granting a user the right to add a device to a network | Sdwan::AccessGrant |
| UserDevice | A WireGuard endpoint authenticated via AccessGrant | Sdwan::UserDevice |
| AccountBgp | Per-account AS number + BGP global config | Sdwan::AccountBgp |
| BgpSession | iBGP session between two peers; state machine | Sdwan::BgpSession |
| FederationPeer | Cross-account peer (slice 11 acceptance flow) | System::FederationPeer (note: System:: namespace; there is no Sdwan::FederationPeer class) |
| SubnetAdvertisement | LAN subnet a peer announces over iBGP | Sdwan::SubnetAdvertisement |

Phase 1 — Create an overlay network ✅

platform.system_sdwan_create_network({
  name: "tokyo-edge",
  description: "CDN edge fabric across us-east-1 + ap-tokyo-1",
  prefix: "fd00:abcd:1::/64",          // optional; auto-allocated if omitted
  routing_mode: "static",              // "static" | "ibgp"
  pod_subnet_prefix: "10.42.0.0/16"    // optional; enables k3s flannel-over-SDWAN routing
})
// → { network: { id, name, prefix, status: "active", ... } }

What to watch:

  • The prefix must be unique per account. If omitted, the platform allocates a /64 from the account's pool.
  • routing_mode: "ibgp" enables iBGP/FRR (slice 9c). All peers on the network get bgpd in their userspace and announce subnets to each other. Use this for multi-region fleets or when you have LAN segments behind peers. Single-FRR-per-host model: only the first iBGP network's BgpConf is active per host.
  • Don't enable iBGP on a network with <3 peers — overhead outweighs benefit.
  • pod_subnet_prefix (shipped 2026-05-19): when set, k3s clusters bootstrapped on this network with cni_plugin: "flannel" route pod-to-pod traffic over the SDWAN WireGuard overlay (via flannel host-gw bound to wg-sdwan-<handle>). Validation rejects values that overlap the SDWAN /64, any peer lan_subnets, any VIP /128, or any other network's pod_subnet_prefix in the account. The value is immutable once any cluster references the network, because the k3s pod CIDR cannot change after bootstrap. Setting pod_subnet_prefix on a network with existing flannel clusters is an operator-driven opt-in: it triggers a brief (~30–60 s) pod-network outage per node while k3s reinstalls with the new flags. Existing NodeInstances must run a Go agent binary that knows the flannel_iface / flannel_backend / cluster_cidr BootstrapConfig fields (added 2026-05-19); older binaries silently ignore them, so rebuild and republish the agent / initramfs before relying on this feature in an existing fleet. See CONTAINER_RUNTIMES.md §"CNI selection — Routing pod traffic over SDWAN" for the full walkthrough and tutorials/04-k3s-cluster.md §"Pod traffic over SDWAN" for the operator workflow.
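
The overlap rule for pod_subnet_prefix can be approximated client-side before calling the action. The sketch below is a hypothetical pre-flight helper (parseCidr4 and cidrsOverlap are illustrative names, not platform APIs) and only handles IPv4 CIDRs such as peer LAN subnets; the platform's own validation remains authoritative.

```javascript
// Hypothetical pre-flight check; not part of the platform API.
// Parses an IPv4 CIDR like "10.42.0.0/16" into numeric start/end addresses.
function parseCidr4(cidr) {
  const [addr, lenStr] = cidr.split("/");
  const len = Number(lenStr);
  const octets = addr.split(".").map(Number);
  if (octets.length !== 4 || octets.some((o) => !Number.isInteger(o) || o < 0 || o > 255)) {
    throw new Error(`invalid IPv4 address: ${addr}`);
  }
  if (!Number.isInteger(len) || len < 0 || len > 32) {
    throw new Error(`invalid prefix length: ${lenStr}`);
  }
  const base = octets.reduce((acc, o) => acc * 256 + o, 0);
  const size = 2 ** (32 - len);
  const start = Math.floor(base / size) * size; // align to the network boundary
  return { start, end: start + size - 1 };
}

// Two CIDR blocks overlap when their address ranges intersect.
function cidrsOverlap(a, b) {
  const x = parseCidr4(a);
  const y = parseCidr4(b);
  return x.start <= y.end && y.start <= x.end;
}
```

For example, cidrsOverlap("10.42.0.0/16", "10.42.8.0/24") is true, so a peer LAN subnet of 10.42.8.0/24 would conflict with the pod prefix used in the call above.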

Phase 2 — Attach NodeInstance peers ✅

platform.system_sdwan_attach_peer({
  network_id: "<network-id>",
  node_instance_id: "<instance-id>",
  // Optional:
  publicly_reachable: true,            // hub vs spoke; auto-detected by default
  endpoint_address: "203.0.113.5:51820"  // override agent's auto-detect
})
// → { peer: { id, public_key, host_address: "fd00:abcd:1::42", ... } }

The platform allocates a /128 from the network's /64, generates a server-side WireGuard keypair (Sdwan::PeerKey), and waits for the agent's next reconcile to apply the config locally.

Verify:

platform.system_sdwan_list_peers({ network_id: "<network-id>" })
// → { peers: [{ id, status: "handshake_pending"|"connected", last_handshake_at, ... }] }

Status transitions:

  • handshake_pending → agent hasn't picked up config yet (next reconcile, ~30 s)
  • handshaking → wg interface up, awaiting handshake
  • connected → handshake completed; last_handshake_at set
  • silent → no handshake in 5 min; sdwan_reachability_sensor fires
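
The status transitions above lend themselves to a small polling loop after an attach. This is a minimal sketch, assuming `platform` is whatever client object exposes the MCP actions as async methods; waitForPeerConnected, intervalMs, and timeoutMs are hypothetical names introduced for illustration.

```javascript
// Sketch: poll system_sdwan_list_peers until a peer reports "connected".
// `platform` is assumed to expose MCP actions as async methods (an assumption).
async function waitForPeerConnected(platform, networkId, peerId, opts = {}) {
  const { intervalMs = 5000, timeoutMs = 120000 } = opts;
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const { peers } = await platform.system_sdwan_list_peers({ network_id: networkId });
    const peer = peers.find((p) => p.id === peerId);
    if (!peer) throw new Error(`peer ${peerId} not found on network ${networkId}`);
    if (peer.status === "connected") return peer;
    if (Date.now() >= deadline) {
      throw new Error(`peer ${peerId} still ${peer.status} after ${timeoutMs} ms`);
    }
    // Wait before the next poll; the agent reconciles roughly every 30 s.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A timeout comfortably above the ~30 s reconcile interval avoids false alarms while the peer is still in handshake_pending.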

Common failures:

  • EndpointUnreachableError: the peer's NAT punching failed. Set publicly_reachable: true on at least one hub peer (the rest connect outbound through it).
  • KeysOutOfSync: the agent applied a stale config. Run the sdwan_peer_remediate skill to rotate keys and force a re-handshake.

Phase 3 — Allocate Virtual IPs (slice 3) ✅

VIPs provide stable addresses that survive peer failover:

platform.system_sdwan_create_virtual_ip({
  network_id: "<network-id>",
  name: "k3s-api",
  primary_holder_peer_id: "<peer-id-of-k3s-server-bootstrap-node>",
  failover_holder_peer_ids: ["<peer-id-of-k3s-server-2>", "<peer-id-of-k3s-server-3>"],
  anycast: false                       // false = single-holder; true = anycast
})
// → { virtual_ip: { id, address: "fd00:abcd:1::100", primary_holder_peer_id, ... } }

When the primary holder goes silent (sdwan_vip_reachability_sensor fires sdwan.vip_holder_silent), the sdwan_vip_failover skill (require_approval policy) promotes the next failover candidate. The address doesn't change, so kubectl and the workers' K3S_URL keep working through the transition.

Anycast VIPs (anycast: true) skip failover: multiple holders all serve the address simultaneously, and routing converges on the closest holder.

Verify failover:

platform.system_sdwan_failover_virtual_ip({
  virtual_ip_id: "<vip-id>",
  // Optional: explicit target peer; otherwise picks highest-scored candidate
  target_peer_id: "<peer-id>",
  dry_run: true                        // preview the failover without committing
})
// → { resolved: false, previous_holder: ..., new_holder: ..., dry_run: true }

Anti-pattern: single-server K3s clusters cannot use VIP failover, because slice 3 requires ≥2 servers. The VIP exists, but failover is a no-op when only one candidate remains.
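
The anycast and single-candidate cases can be caught before a VIP is created. The helper below is a hypothetical sketch (vipFailoverReadiness is not a platform action); it only inspects the fields shown in the create call above and makes no claim about how the platform scores candidates.

```javascript
// Hypothetical helper: classifies a VIP spec before creation.
//   "anycast" - failover is skipped; all holders serve the address
//   "no-op"   - no failover candidate other than the primary holder
//   "ready"   - at least one distinct failover candidate exists
function vipFailoverReadiness(spec) {
  if (spec.anycast) return "anycast";
  const candidates = (spec.failover_holder_peer_ids || []).filter(
    (id) => id !== spec.primary_holder_peer_id
  );
  return candidates.length === 0 ? "no-op" : "ready";
}
```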

Phase 4 — Port mappings (inbound traffic) ✅

For traffic entering the overlay from outside:

platform.system_sdwan_create_port_mapping({
  network_id: "<network-id>",
  external_address: "203.0.113.5",      // public IP of a hub peer
  external_port: 443,
  internal_peer_id: "<target-peer-id>",
  internal_port: 8443,
  protocol: "tcp"
})
// → { port_mapping: { id, ... } }

The hub peer's nftables ruleset is updated; traffic to 203.0.113.5:443 rewrites to [<target-peer-/128>]:8443 over the encrypted overlay.

What to watch:

  • The hub peer must be publicly_reachable: true and have a routable external IP.
  • For Kubernetes Ingress, prefer a VIP over a port mapping — VIPs survive peer failover; port mappings don't.

Phase 5 — Firewall rules ✅

platform.system_sdwan_create_firewall_rule({
  network_id: "<network-id>",
  direction: "ingress",                 // "ingress" | "egress" | "both"
  action: "accept",                     // "accept" | "drop" | "reject"
  selector: {
    kind: "peer",                       // "peer" | "tag" | "cidr" | "all"
    peer_id: "<source-peer-id>"         // when kind=peer
  },
  protocol: "tcp",                      // "any" | "tcp" | "udp" | "icmp6"
  port_range: "8443"                    // optional; "8443-8480" for ranges
})
// → { firewall_rule: { id, ... } }

Compiled to nft on the holding peer. Rule order: more specific selectors win (peer > tag > cidr > all).

Examples:

// Allow tenant-A pods to reach tenant-A's database VIP only
platform.system_sdwan_create_firewall_rule({
  network_id: net,
  direction: "ingress",
  action: "accept",
  selector: { kind: "tag", tag: "tenant-A" },
  protocol: "tcp",
  port_range: "5432"
})

// Default-deny everything else to that VIP
platform.system_sdwan_create_firewall_rule({
  network_id: net,
  direction: "ingress",
  action: "drop",
  selector: { kind: "all" },
  protocol: "tcp",
  port_range: "5432"
})
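
To see why the two rules above do not cancel each other out, the selector ordering (peer > tag > cidr > all) can be modelled as a sort. This is an illustrative sketch only; SELECTOR_RANK and orderBySpecificity are hypothetical names, and the platform's nft compiler may order rules differently in detail.

```javascript
// Specificity order taken from the runbook: peer > tag > cidr > all.
const SELECTOR_RANK = { peer: 0, tag: 1, cidr: 2, all: 3 };

// Returns a new array ordered from most to least specific selector.
// Array.prototype.sort is stable, so rules with equal rank keep their input order.
function orderBySpecificity(rules) {
  return [...rules].sort(
    (a, b) => SELECTOR_RANK[a.selector.kind] - SELECTOR_RANK[b.selector.kind]
  );
}

const rules = [
  { action: "drop", selector: { kind: "all" }, protocol: "tcp", port_range: "5432" },
  { action: "accept", selector: { kind: "tag", tag: "tenant-A" }, protocol: "tcp", port_range: "5432" },
];
const ordered = orderBySpecificity(rules);
console.log(ordered.map((r) => `${r.selector.kind}:${r.action}`).join(" > "));
// prints "tag:accept > all:drop"
```

The tag-scoped accept is evaluated before the catch-all drop regardless of creation order, which is what makes the default-deny pattern safe.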

Phase 6 — Route policies (slice 9) ✅

Route policies shape iBGP advertisements when routing_mode: "ibgp". Statements compile to FRR route-map + auxiliary prefix-list / as-path-list / community-list:

platform.system_sdwan_create_route_policy({
  network_id: "<network-id>",
  name: "prefer-tokyo-via-aggregator",
  direction: "import",                  // "import" | "export"
  statements: [
    {
      seq: 10,
      match: { prefix: "fd00:abcd:1:cafe::/96", peer: "<tokyo-aggregator-peer-id>" },
      set: { local_pref: 200 },
      action: "permit"
    },
    {
      seq: 20,
      match: { prefix: "fd00:abcd:1:cafe::/96" },
      action: "permit",
      set: { local_pref: 100 }            // lower preference for non-aggregator paths
    }
  ]
})

The compiler emits FRR config to each peer's /etc/frr/frr.conf on next reconcile. To audit existing policies:

platform.system_sdwan_list_route_policies({ network_id: "<network-id>" })

The system.sdwan_route_policy_audit autonomy action (auto_approve policy) periodically surfaces inconsistent or shadowed statements.
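
As a rough illustration of what a shadowed statement looks like, the sketch below flags statements that can never match. It assumes first-match-wins evaluation in ascending seq order (the default behaviour of FRR route-maps) and compares match values as exact strings, so it does not evaluate prefix containment. findShadowedStatements is a hypothetical helper; the audit action may apply different rules.

```javascript
// Hypothetical lint; the platform's audit action may use different rules.
// A statement is reported as shadowed when an earlier statement (lower seq)
// has a match that is a subset of its own, so the earlier one always matches first.
// Values are compared as exact strings; prefix containment is not evaluated.
function findShadowedStatements(statements) {
  const sorted = [...statements].sort((a, b) => a.seq - b.seq);
  const shadowed = [];
  sorted.forEach((later, i) => {
    const earlier = sorted.slice(0, i).find((s) =>
      Object.entries(s.match || {}).every(([k, v]) => (later.match || {})[k] === v)
    );
    if (earlier) shadowed.push({ seq: later.seq, shadowed_by: earlier.seq });
  });
  return shadowed;
}
```

With the prefer-tokyo-via-aggregator policy above, the result is empty: seq 10 matches on prefix and peer, so it is narrower than seq 20. Swapping the two match blocks would report seq 20 as shadowed by seq 10.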

Phase 7 — User devices (WireGuard VPN) ✅

For human operators connecting from laptops/phones:

// Step 1: create an access grant (single-use bootstrap URL)
platform.system_sdwan_create_access_grant({
  network_id: "<network-id>",
  device_name_hint: "alice-laptop",
  expires_in_seconds: 900                 // 15 min default
})
// → { access_grant: { id, bootstrap_url: "https://platform/.../bootstrap?token=...", expires_at } }

// Step 2: user opens bootstrap URL → returns WireGuard config
//   (operator UI generates QR code for mobile)

// Step 3: issue device after user setup
platform.system_sdwan_issue_user_device({
  network_id: "<network-id>",
  access_grant_id: "<grant-id>",
  public_key: "AbCd...="                  // user's WireGuard public key
})
// → { user_device: { id, host_address: "fd00:abcd:1::200", status: "active" } }
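
Because the bootstrap URL is short-lived, it can help to check the grant before step 3. The helper below is a hypothetical sketch; it assumes expires_at is an ISO 8601 timestamp string, which the response shape above does not confirm.

```javascript
// Hypothetical helper: true when the grant's expires_at is at or before `now`.
// Assumes expires_at is an ISO 8601 timestamp string (an assumption).
function isGrantExpired(accessGrant, now = new Date()) {
  const expiresAt = new Date(accessGrant.expires_at);
  if (Number.isNaN(expiresAt.getTime())) {
    throw new Error(`unparseable expires_at: ${accessGrant.expires_at}`);
  }
  return expiresAt.getTime() <= now.getTime();
}
```

If this returns true, create a fresh access grant rather than calling system_sdwan_issue_user_device with the stale one.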

Revoke:

platform.system_sdwan_revoke_user_device({ user_device_id: "<id>" })
// → device removed from network's wg config on next reconcile

system.sdwan_user_device_revoke is require_approval (cuts off a user) — the autonomy executor never runs this without operator approval.

Phase 8 — iBGP / FRR routing (slice 9c) ✅

When routing_mode: "ibgp" is set on a network, peers exchange routes via iBGP. Configure the per-account ASN once:

platform.system_sdwan_update_account_as_number({
  as_number: 65000                         // private-range ASN; 64512–65534
})
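
The range in the comment above is the 16-bit private-use ASN block (RFC 6996). A small guard like the following can reject a public ASN before the update call; isPrivateAsn16 is a hypothetical name, and whether the platform also accepts 32-bit private ASNs is not stated here.

```javascript
// 16-bit private-use ASN range, matching the comment in the call above (RFC 6996).
const PRIVATE_ASN_MIN = 64512;
const PRIVATE_ASN_MAX = 65534;

// True only for integer ASNs inside the 16-bit private-use range.
function isPrivateAsn16(asn) {
  return Number.isInteger(asn) && asn >= PRIVATE_ASN_MIN && asn <= PRIVATE_ASN_MAX;
}
```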

Each peer announces its assigned subnets:

platform.system_sdwan_update_peer_lan_subnets({
  peer_id: "<peer-id>",
  subnets: [
    { prefix: "fd00:abcd:1:1::/64", description: "Tokyo office LAN" }
  ]
})

The platform creates Sdwan::SubnetAdvertisement rows; the peer's FRR config gets a matching network statement; routes propagate via iBGP to all other peers on the network.

Verify session health:

platform.system_sdwan_get_bgp_sessions({ network_id: "<network-id>" })
// → { sessions: [{ peer_id, neighbor_id, state: "Established"|"Idle"|..., uptime, ... }] }

States: Idle → Connect → Active → OpenSent → OpenConfirm → Established.
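
Any session that is not Established is a candidate for triage. The sketch below filters a sessions array shaped like the response above and annotates each entry with how far it got through the state sequence; BGP_FSM and unhealthySessions are hypothetical names.

```javascript
// BGP finite-state-machine order, as listed in the runbook.
const BGP_FSM = ["Idle", "Connect", "Active", "OpenSent", "OpenConfirm", "Established"];

// Returns the sessions that are not Established, with `stage` set to the
// state's position in BGP_FSM (-1 if the state string is not recognised).
function unhealthySessions(sessions) {
  return sessions
    .filter((s) => s.state !== "Established")
    .map((s) => ({
      peer_id: s.peer_id,
      neighbor_id: s.neighbor_id,
      state: s.state,
      stage: BGP_FSM.indexOf(s.state),
    }));
}
```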

Troubleshooting unhealthy sessions:

// Run the planning-only triage skill
platform.execute_skill({
  skill: "system-sdwan-bgp-session-remediate",
  inputs: { bgp_session_id: "<session-id>", dry_run: true }
})
// → { state: "idle", likely_cause: "...", recommended_action: "vtysh -c 'clear ip bgp <neighbor>'" }

The skill is intentionally planning-only in v1 — operators run the recommended vtysh command after review.

Phase 9 — Federation peers (slice 11 — live) ✅

Cross-account peering. Account A proposes; Account B accepts. The full propose / accept / revoke flow is live as of slice 11; all three actions route through Sdwan::Executors::{ProposeFederationPeer, AcceptFederationPeer, RevokeFederationPeer} (the model is System::FederationPeer, not Sdwan::*).

// Account A proposes
platform.system_sdwan_propose_federation_peer({
  attributes: {
    network_id:        "<account-a-network-id>",
    remote_account_id: "<account-b-id>",
    remote_network_id: "<account-b-network-id>",
    peer_kind:         "sdwan_only"   // or "platform" for full platform-peer (uses children spawn flow instead)
  }
})
// → { federation_peer: { id, status: "proposed", ... } }

// Account B reviews via UI → accepts via MCP:
platform.system_sdwan_accept_federation_peer({ id: "<fed-peer-id>" })
// → { federation_peer: { id, status: "active", ... } }

// Either side can revoke:
platform.system_sdwan_revoke_federation_peer({ id: "<fed-peer-id>" })

// List the cross-account peers:
platform.system_sdwan_list_federation_peers({ network_id: "<network-id>" })

Notes on peer_kind:

  • sdwan_only peers are pure SDWAN cross-account bridges (Account A's overlay reaches Account B's overlay, nothing else).
  • platform peers form a federation between two Powernode platforms (e.g. parent ↔ child spawn). For platform peers, the propose endpoint is POST /api/v1/system/federation/children/spawn (via System::SpawnPlatformService), not the SDWAN propose action — see federation-setup.md for that flow.

Slice 11 status: the SDWAN acceptance path is implemented + UI wired. Earlier doc revisions framed acceptance as "operator-driven via SQL"; that's no longer current.

Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| Peer stuck handshake_pending | Agent didn't pick up config (reconcile not yet fired) | Wait 30 s; or force systemctl restart powernode-agent on node |
| Peer stuck handshaking | NAT / firewall blocks WireGuard UDP | Make at least one peer publicly_reachable: true with port 51820/udp open; others connect outbound through it |
| Peer goes silent after working | Connection lost / node rebooted | sdwan_reachability_sensor fires sdwan.hub_unreachable; sdwan_failover skill emits hub-promotion plan |
| BGP session stuck Idle | Wrong AS number or unreachable neighbor | Run sdwan_bgp_session_remediate skill (planning) → operator runs vtysh per recommendation |
| BGP session stuck Active | Neighbor doesn't respond to Open message | Verify the neighbor is up + has the route to this peer's /128; sdwan_peer_remediate if mTLS is the issue |
| VIP failover doesn't promote | sdwan_vip_failover blocked by require_approval policy | Check approval queue UI; operator approves → executor runs |
| VIP failover marks anycast: true | Anycast VIP; failover is informational only | This is expected; routing handles failover for anycast |
| Firewall rule shadows another | Selector specificity: more-specific selectors match first | Use system.sdwan_route_policy_audit (auto_approve) to surface shadowed rules |
| User device can't connect after issue | Bootstrap URL expired (>15 min) | Re-issue via create_access_grant → issue_user_device |

How the System Concierge should use this

When an operator chats "set up a VPN" / "add a Tokyo edge to our SDWAN" / "kubectl can't reach the cluster":

  1. Identify the phase (network creation, peer attach, VIP, firewall, BGP, federation, user device)
  2. For each phase, surface the relevant MCP action + required inputs
  3. For destructive actions (revoke, failover), use request_confirmation before invoking
  4. After invoking, watch last_handshake_at and BGP state transitions; report changes
  5. If a sensor fires while the operator is waiting (e.g., sdwan.hub_unreachable), surface the relevant skill (sdwan_failover / sdwan_peer_remediate) for operator approval

Related docs