Skip to content

feat: contract-fix + multi-server fan-out#41

Closed
bkrabach wants to merge 2 commits into
mainfrom
feat/multi-server-routing
Closed

feat: contract-fix + multi-server fan-out#41
bkrabach wants to merge 2 commits into
mainfrom
feat/multi-server-routing

Conversation

@bkrabach

@bkrabach bkrabach commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

Summary

Enable context-intelligence to fan out each session's events to multiple servers — sending to all destinations whose include/exclude patterns match the session's working directory (A only / both / neither) — while fixing a load-bearing contract violation in the hook's config resolution.

Two changes, folded into one PR per the user's decision: (1) contract fix — the hook becomes a pure mount-config consumer; (2) multi-server fan-out — sessions fan out to matching servers via per-destination include/exclude patterns. There is no target/selector config value — destination selection is purely per-destination pattern matching, and a session can match (and be sent to) more than one.


The Contract Fix (load-bearing)

The context-intelligence hook was the only hook of 8 in the ecosystem that read ~/.amplifier/settings.yaml and os.environ directly inside config_resolver.py — violating MOUNT_PLAN_SPECIFICATION.md / HOOK_CONTRACT.md: "The app layer reads various config sources… Hooks receive configuration via Mount Plan."

  • Reference hook (hooks-logging): pure mount-config consumer — reads only the config dict passed to mount() + one coordinator capability (session.working_dir). Zero os.environ, zero settings.yaml.
  • Fix: removed import os, SETTINGS_PATH/_parse_settings_yaml, _ENV_PREFIX, _env(), and the settings.yaml/env fallbacks from config_resolver.py. All config now arrives via the mount config dict (with ${VAR} already expanded by the app) + coordinator capabilities. Grep confirms config_resolver.py has zero ambient reads.
  • Impact: large NET DELETION — the refactor removes more redundant code than the feature adds.

The app layer (amplifier-app-cli) already resolves config: user→project settings deep-merge, ${VAR} expansion, and per-module overrides via overrides.hook-context-intelligence.config.*. The module consumes that; it no longer re-solves it.


The Feature: Multi-Server Fan-Out

Config shape (mount plan, populated by app-cli's overrides.hook-context-intelligence.config):

destinations:
  personal:
    url: "http://192.168.1.6:8000"
    api_key: "${PERSONAL_CI_KEY}"
    include: ["**"]              # match everything…
    exclude: ["**/secret/**"]    # …except secret paths
  team:
    url: "http://192.168.1.6:8001"
    api_key: "${TEAM_CI_KEY}"
    include: ["**/work/**"]
  • Fan-out: each session's events are dispatched to every destination whose patterns match — A only / both / neither. No first-match, no target selector.
  • Working-directory matching via pathspec gitwildmatch (gitignore semantics), against the session.working_dir capability (fail-loud if absent — no slug fallback). Exclude wins, per-destination.
  • Per-destination failure isolation: each destination gets its own httpx client, queue, and circuit breaker — a dead server never silences the others.
  • Local JSONL always written, regardless of fan-out (so "neither" still records locally).
  • Project-scope override: because destinations is a dict keyed by name, app-cli's user→project settings deep-merge lets a project .amplifier/settings.yaml override one destination's include/exclude (e.g. destinations.team.include) without restating the others.
  • Fail-fast validation: missing/empty url or api_key after expansion → error at mount (not a silent per-event drop). No destinations configured → local-only (valid).

Back-compat: legacy scalar context_intelligence_server_url/api_key is synthesized into a single default destination (include: ["**"]). Existing single-server configs load and route unchanged. The served context-intelligence-graph-query skill binds to the first active destination.


Validation

Unit tests — 355 pass

  • Fold-discipline gate (test_aa_contract_fix.py, sorts first): hook reads only mount config (no ambient files/env); existing single-server config still works — verified before any fan-out test runs.
  • test_destinations_schema.py / test_destinations_validation.py (dict parsing, defaults, legacy synthesis, fail-fast), test_fanout_select.py (pathspec, exclude-wins, A/both/neither), test_dispatcher.py (per-destination isolation), test_on_session_ready_routing.py (working_dir capability, migration warning, fail-loud on absent), test_logging_handler_fanout.py (JSONL always written, fan-out to all dispatchers).

End-to-end in a Digital Twin — real amplifier sessions, two live servers

Validated via amplifier-tester in an isolated DTU running real Amplifier with this branch installed, configured only through ~/.amplifier/settings.yaml overrides.hook-context-intelligence.config.destinations (the patterns above). Three real amplifier sessions:

working dir Server A Server B local JSONL result
plain-proj A only
work/app both
secret/app neither (local-only)

This proves the full real path: settings.yaml override → app-cli resolution → mount plan → hook fan-out by working_dir → correct servers — with the hook reading zero ambient config (the contract fix). The fact that fan-out happens at all from a settings.yaml override proves app-cli resolved destinations into the mount plan and the hook consumed it.


Out of Scope

  • Issue #283 (separately filed, @colombod): the hook force-loads the context-intelligence-graph-query skill every session start even when unused — overhead in pipelines and single-command-series workflows.

Files Changed

Added: amplifier_module_hook_context_intelligence/fanout.py (pure selection logic) + 8 test files.
Modified: config_resolver.py (contract fix + Destination/destinations), handlers/logging_handler.py (per-destination dispatchers), __init__.py (mount integration + fan-out), behaviors/context-intelligence.yaml (config examples + deprecation notes), pyproject.toml (+pathspec>=0.12).

Fix a load-bearing config-contract violation and add multi-server, per-session
FAN-OUT in the context-intelligence hook. Folded at the user's direction: the
contract fix is the foundation; fan-out is built on it.

CONTRACT FIX
The hook was the only hook of 8 that read ~/.amplifier/settings.yaml (via
SETTINGS_PATH/_parse_settings_yaml) and os.environ directly in config_resolver.py,
violating MOUNT_PLAN_SPECIFICATION.md / HOOK_CONTRACT.md (the app resolves config
sources and passes a resolved dict to mount(); the module consumes it). The
reference hook hooks-logging is a pure mount-config consumer. config_resolver.py
now has zero ambient reads; all config arrives via the mount config dict (with
${VAR} pre-expanded by the app) plus coordinator capabilities. Net code deletion.

MULTI-SERVER FAN-OUT
Dict-keyed `destinations`, each {url, api_key, include, exclude}. Each session's
events fan out to ALL destinations whose pathspec include/exclude patterns match
the session.working_dir capability -- A only / both / neither. There is NO
target/selector value. Per-destination failure isolation (own client, queue,
circuit breaker); local JSONL always written; fail-fast validation at mount;
legacy scalar config -> one synthesized `default` destination. Selection logic
lives in fanout.py.

VALIDATION
355 unit tests pass (fold-discipline gate runs first). Proven end-to-end in a
Digital Twin with real `amplifier` sessions against two live servers, configured
only via settings.yaml overrides.hook-context-intelligence.config.destinations:
plain -> A only, work -> both, secret -> neither (local-only).

Out of scope: #283 (skill force-load tax).

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
@bkrabach bkrabach force-pushed the feat/multi-server-routing branch from 3cee885 to 167ae6e Compare June 21, 2026 05:05
@bkrabach bkrabach changed the title feat: contract-fix + multi-server routing feat: contract-fix + multi-server fan-out Jun 21, 2026
@bkrabach

bkrabach commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator Author

Configuring multiple servers (reviewer/user note)

Multi-server fan-out is driven entirely by ~/.amplifier/settings.yaml (or a project-level ./.amplifier/settings.yaml, which deep-merges over it). Under overrides.hook-context-intelligence.config.destinations, list each server by name with its url, api_key (use ${VAR} — the app expands it before mount), and include/exclude globs (gitignore/pathspec semantics, matched against the session's working directory). Each session's events fan out to every destination whose patterns match — one server, both, or neither (neither ⇒ local events.jsonl only). exclude wins over include, per destination. There is no global selector/target — selection is purely per-destination patterns.

overrides:
  hook-context-intelligence:
    config:
      destinations:
        personal:
          url: "http://192.168.1.6:8000"
          api_key: "${PERSONAL_CI_KEY}"
          include: ["**"]                          # everything…
          exclude: ["**/client-*/**"]              # …except client work
        team:
          url: "https://team-node.tailnet.ts.net"
          api_key: "${TEAM_CI_KEY}"
          include: ["**/client-x", "**/client-x/**"]   # the client-x dir AND everything under it

Pattern gotcha (corrected): a **/name/** glob matches only inside the directory, not the directory itself — so a session started with cd client-x && amplifier (working_dir = .../client-x) would NOT match **/client-x/**. To match a project whether the session starts at its root or in a subdir, include both **/name and **/name/**.

A project's ./.amplifier/settings.yaml can override just one destination (e.g. destinations.team.include) without restating the others — that's why destinations are a dict keyed by name. Existing single-server setups keep working unchanged: the legacy context_intelligence_server_url/api_key scalars synthesize a single default destination (include: ["**"]). A legacy url set without an api_key runs local-only with a WARNING (it does not error).

@colombod

colombod commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

Edited: trimmed my earlier comment down to the items that are genuinely actionable — a couple were overstated, and one (HTTPS enforcement) was me imposing transport policy on your own config, which isn't the hook's call. Removed.

Nice work on this — the contract fix reads clean, the test_aa_contract_fix.py source-level gate is a sharp regression guard, the per-destination isolation (client + queue + breaker) holds, and the JSONL-always-written invariant is preserved and tested.

Two things I think are worth a decision before merge, one genuine question, and a few minor notes. I verified the first two against the code (traced the diff; ran pathspec==1.1.1).

1. Legacy url set + api_key unset now fails to mount — intended?

# config_resolver.py — destinations (~L429)
legacy_url = self.context_intelligence_server_url
if legacy_url:
    legacy_key = self.context_intelligence_api_key or ""   # "" when key absent
    result = {"default": Destination(url=legacy_url, api_key=legacy_key, include=("**",), exclude=())}

# validate_destinations (~L492): empty api_key -> problems -> raise ValueError
# __init__.py mount() (~L107): resolver.validate_destinations() propagates that raise

Pre-PR, a url-set / key-unset config ran local-only with dispatch disabled. Now it raises at mount, so an existing user in that state goes from "works, no dispatch" to "hook won't mount."

Worth noting test_silent_by_default was changed to assert handler._dispatchers == [], which is trivially true for every LoggingHandler (the ctor always sets self._dispatchers = []) regardless of resolver state — so this path isn't actually covered either way.

If "url without key = local-only" is the intended contract, a guard (synthesize default only when both are present) restores it — a WARNING log would make the degraded state discoverable. If "fail fast" is intended instead, the error could name the missing key. Either is fine; just flagging it's a behavior change from before.

2. The documented destinations example misroutes at the project root

Running the shipped example patterns through pathspec==1.1.1:

exclude ["**/client-*/**"]  vs  /home/user/client-x       -> False   # personal NOT excluded -> stays active
include ["**/client-x/**"]  vs  /home/user/client-x       -> False   # team NOT matched -> inactive
exclude ["**/client-*/**"]  vs  /home/user/client-x/app   -> True    # only matches inside a subdir

normalize_match_key() resolves the working dir to a bare path with no trailing slash, and gitignore-style **/x/** only matches inside x, not x itself. So a session started the common way — cd client-x && amplifier — leaves personal active and team inactive, i.e. the example as written sends client-x sessions to personal rather than team.

The explicit idiom ["**/client-x", "**/client-x/**"] matches both the directory and its contents. So either fix the example + document the idiom, or decide whether a /** pattern should implicitly match its own directory (friendlier, but more magic). I lean toward fixing the example and keeping pattern semantics literal — but it's your design call.

Question: does the coordinator wrap exceptions from on_session_ready?

on_session_ready raises RuntimeError when session.working_dir is absent and destinations are configured. If the coordinator wraps lifecycle-callback exceptions, this is exactly the fail-loud behavior you want. If it doesn't, it'd surface as a session-init crash, which HOOK_CONTRACT's "don't crash the kernel" line cautions against. You'd know the lifecycle contract better than I can infer from the diff — just want to make sure it's the former.

Minor (non-blocking)

  • The YAML comment "When env vars are unset, they expand to empty strings → no destination → local-only" is accurate for the legacy scalars, but for a destinations entry an empty url/api_key hits the mount error in chore(deps): bump pygments from 2.19.2 to 2.20.0 in /modules/tool-blob-read #1 rather than going local-only. Might be worth a clarifying word so the two paths aren't conflated.
  • set_dispatchers() reassigns self._dispatchers without closing the previous list — only matters if on_session_ready can fire more than once per handler. If it can, the old workers stay parked on queue.get(). If it can't, ignore this.
  • Queue-full / circuit-breaker trips permanently disable a destination for the session but log at DEBUG — a WARNING would make a silently-dropped destination visible in normal logs.
  • pathspec>=0.12 has no upper bound and it's load-bearing for routing; a ceiling would guard against a future semantics change.

Happy to re-review once you've had a look.

Addresses @colombod's review of PR #41:

- Correct destinations example patterns. A gitignore-style `**/name/**`
  pattern matches only INSIDE the directory, so a session started with
  `cd name && amplifier` (working_dir = .../name) did not match. Examples
  now use the `["**/name", "**/name/**"]` idiom (directory + contents),
  with a comment documenting the gotcha. Added tests proving the old
  pattern misses the directory and the idiom covers both.

- Legacy `url`-without-`api_key` degrades gracefully instead of failing to
  mount. The synthesized `default` destination is created only when both
  url and api_key are present; url-only now logs a WARNING and runs
  local-JSONL-only (restores pre-PR behavior). Explicit `destinations.<name>`
  entries with a missing url/api_key still fail fast (genuine misconfig).

- Promote dropped-destination state transitions (queue-full, breaker-open)
  from DEBUG to WARNING so a configured destination going dark is visible.

- Bound the pathspec dependency: >=0.12,<1.0.

360 tests pass (+5 new).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
colombod added a commit that referenced this pull request Jun 22, 2026
…mented config; demote legacy scalars

The README still presented the legacy single-server scalars
(context_intelligence_server_url/api_key) as the primary configuration story
across the intro, the "What it does" table, Quick Start, the app-cli override
pattern, and the configuration reference — none of which documented the
`destinations` multi-server fan-out shape introduced in PR #41.

Restructured so the current `destinations` shape is canonical:
- Intro + "What it does" describe per-session fan-out by working directory.
- Quick Start step 2 configures a `destinations` block (not legacy scalars).
- The app-cli override section leads with `destinations`; legacy is a labeled
  back-compat shortcut.
- Configuration reference documents the `destinations` sub-keys, routing/validation
  semantics, and a Source column; legacy scalars moved to a "Deprecated" subsection
  noting the url-without-key local-only degradation.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
bkrabach added a commit that referenced this pull request Jun 22, 2026
…e => no-match

Follow-on hardening on top of this PR's multi-server fan-out, folding in the
remaining item from the now-superseded #41 plus two review decisions:

- Promote dropped-destination state transitions (queue-full, circuit-breaker
  open) from DEBUG to WARNING -- a configured destination going dark is now
  visible at the default log level.
- set_dispatchers() now closes the previously-set dispatchers before swapping
  in the new list, removing a latent worker/httpx-client leak if it is ever
  called more than once per handler.
- Flip the destination default: a destination with NO `include` now matches
  NOTHING (previously catch-all `**`). Prevents accidentally fanning every
  session out to a newly-added server. The legacy single-server synthesis
  still sets include=("**",) explicitly, so existing single-server users are
  unaffected.

367 tests pass.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
@bkrabach

Copy link
Copy Markdown
Collaborator Author

Superseded by #42, which carries the original fan-out commit (167ae6e) plus all the hardening — gitignore matching, mount-regression graceful-degrade, working_dir-absent degrade-to-local, pathspec bound — and now the follow-on fixes (WARNING on dropped destinations, set_dispatchers leak, no-include⇒no-match default). Everything from this PR is represented in #42. Closing in favor of #42.

@bkrabach bkrabach closed this Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants