chore(conformance): restore wrapper-engine cross-validation (post-#24, #27, #29, #31) #32

Merged: manojp99 merged 3 commits into main from chore/restore-conformance-suite on Jun 3, 2026
manojp99 (Collaborator) commented Jun 3, 2026

Restores the conformance suite to fully green and adds fresh fixtures for the post-cleanup argv surface. Closes the structural gap that let wrapper-engine drift accumulate silently through PRs #27, #29, and #31.

Why

The conformance suite (wrappers/conformance/) cross-validates that wrapper-produced wire frames are accepted by the engine end-to-end, which is exactly the class of bug PR #31 just fixed. The TS and Python conformance suites were each reported as "1 pre-existing failure" in PRs #27, #29, and #31.

Each per-PR note was true. But the cumulative effect was that the suite designed to catch wrapper-engine surface drift was muted during the entire window in which #27, #29, and #31 introduced surface changes. PR #31 found and fixed the wrapper-engine argv mismatch (3 flags the wrappers kept emitting after #27 removed them); a green conformance suite would have caught it at #27 time.

This PR catches the conformance suite up.

What's in this PR

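1. Fixture-name fix (closes the long-running baseline failure):

  • PR #24 renamed initialize-with-mcpservers.yaml to initialize-with-mcp-config-path.yaml but left three hardcoded references to the old name in wrappers/conformance/test/runner-ts.test.ts, wrappers/conformance/tests/test_runner_py.py, and tests/test_phase_2_1_exit_gate.py. All three now point at the surviving fixture name.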

2. New baseline fixture (locks in the post-cleanup wire shape):

  • initialize-baseline.yaml — canonical minimum initialize + turn/submit + result/final flow with only the protocol-required params (no mcpConfigPath, no allowProtocolSkew, no hostCapabilities, no envAllowlist, no envExtra). This is the invariant the engine + wrappers must agree on after all four cleanup PRs.

3. New protocol-skew override fixture (locks in design §8 D6):

  • initialize-with-protocol-skew-override.yaml: companion to version_skew.yaml's strict-refuse branch. Exercises allowProtocolSkew: true from the host-config surface (the path that subsumed the --allow-protocol-skew argv flag dropped in #27, "feat: host config layer + drop hostCapabilities surface").

4. Remediation-text alignment:

  • version_skew.yaml referenced the old --allow-protocol-skew argv flag in its scripted remediation text. Updated to direct users at allowProtocolSkew: true in the host config file (--config).
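
The invariant that initialize-baseline.yaml locks in (item 2 above) can be pictured as a simple absence check on the initialize params. The sketch below is illustrative only: the helper name, the constant name, and the `sessionId` key are hypothetical, and the real fixture schema is not shown in this PR. Only the five omitted field names and `protocolVersion` come from the text above.

```python
# Fields the baseline fixture deliberately omits (names taken from the PR text).
BASELINE_OMITTED_FIELDS = frozenset(
    {"mcpConfigPath", "allowProtocolSkew", "hostCapabilities", "envAllowlist", "envExtra"}
)


def unexpected_baseline_fields(params: dict) -> list[str]:
    """Return, sorted, any omitted-by-baseline field names present in params.

    Hypothetical helper: an empty list means the params match the
    minimal shape the baseline fixture describes.
    """
    return sorted(BASELINE_OMITTED_FIELDS.intersection(params))


# "sessionId" is an assumed key used only to make the example non-trivial.
baseline_params = {"protocolVersion": "0.2.0", "sessionId": "s1"}
drifted_params = {**baseline_params, "envExtra": {"FOO": "bar"}}

print(unexpected_baseline_fields(baseline_params))
print(unexpected_baseline_fields(drifted_params))
```

A check of this shape fails as soon as one side starts emitting one of the removed fields again, which is the kind of drift the baseline fixture is meant to make visible.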

Test results

| Suite | Before this PR | After this PR |
| --- | --- | --- |
| Conformance TS (npm test) | 4 passed, 1 failed | 7 passed, 0 failed |
| Conformance Py (uv run pytest) | 4 passed, 1 failed | 7 passed, 0 failed |
| Engine (uv run pytest --ignore=wrappers) | 518 passed, 4 failed | 519 passed, 3 failed (1 closed as side effect of fixture-name fix) |
| TS wrapper | 63/63 ✓ | 63/63 ✓ (unchanged) |
| Python wrapper | 48/48 ✓ | 48/48 ✓ (unchanged) |

Manual smoke against all 5 removed flags: each → click UsageError, exit 2. ✓

Scope note: subprocess-level argv fixtures NOT added

The original suggestion included loud-failure argv fixtures (e.g., initialize-loud-failure-host-capabilities.yaml asserting that the engine rejects the removed flag). Intentionally not in this PR because:

  1. The conformance runners (runner_ts.ts, runner_py.py) use ScriptedTransport — JSON-RPC wire-frame replay — not engine subprocess invocation.
  2. The fixture loader at src/amplifier_agent_lib/protocol/conformance/loader.py requires a non-empty script of wire frames. Adding a "subprocess invocation" fixture mode would require loader machinery outside this PR's scope.
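  3. The argv-level loud-failure invariants are already locked in at the engine layer by the click-CliRunner tests at tests/cli/test_drop_host_capabilities.py and tests/cli/test_drop_mcp_config_path_flag.py.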

If we want subprocess-level argv coverage in the conformance suite, that's a separate "extend conformance loader to support subprocess fixture mode" PR. Tracked as a follow-up — not blocking the version-bump release.
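
To make the distinction in the scope note concrete, here is a minimal sketch of what scripted wire-frame replay looks like. It is not the project's ScriptedTransport; the class body, method names, and the `sessionState` value are assumptions chosen for illustration. The point is that nothing here ever builds an argv or spawns a process, so a removed command-line flag cannot be observed at this layer.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ScriptedTransport:
    """Hypothetical stand-in that replays canned server frames in order."""

    script: list[dict]
    sent: list[dict] = field(default_factory=list)
    _cursor: int = 0

    def send(self, frame: dict) -> None:
        # Round-trip through JSON so only wire-serializable frames are recorded.
        self.sent.append(json.loads(json.dumps(frame)))

    def receive(self) -> dict:
        if self._cursor >= len(self.script):
            raise EOFError("script exhausted")
        frame = self.script[self._cursor]
        self._cursor += 1
        return frame


# One scripted server reply; the result payload is a placeholder.
transport = ScriptedTransport(
    script=[{"jsonrpc": "2.0", "id": 1, "result": {"sessionState": "ready"}}]
)
transport.send(
    {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "0.2.0"}}
)
reply = transport.receive()
print(reply["result"])
```

Under this model a fixture is just the `script` list plus assertions on `sent`, which is why an empty script (as a subprocess-only fixture would have) does not fit the current loader.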

Commits

638eff2 test(conformance): add baseline + protocol-skew-override wire-shape fixtures
ed88a8a chore(conformance): align version_skew remediation text with post-#27 host-config surface
36ba0ed fix(conformance): rename runner refs to initialize-with-mcp-config-path (post-#24)

What this means for the release story

With this PR merged, the conformance suite is genuinely green and the runners actually validate wrapper-engine surface. PRs #27, #29, #31, and #30 (skills block) are all cross-side-tested at the wire-protocol layer. Safe to proceed with coordinated version bumps and the release tags.

Followup

  • Extend conformance loader to support subprocess invocation mode (would let us add argv-level loud-failure fixtures inside the conformance suite rather than in the engine-side tests).
  • The 3 remaining pre-existing engine failures in tests/cli/test_mode_a_v2_envelope.py use the removed --mcp-servers flag. This is an independent cleanup with the same upstream cause as the conformance failure this PR closed (the incomplete rename in PR #24, "fix(wire): align all sites to protocol 0.2.0 (--mcp-config-path, schema rename, fixture bump)"), but at the engine layer rather than the conformance layer. Worth a small fix-up PR.

Manoj Prabhakar Paidiparthy added 3 commits June 3, 2026 10:39
…th (post-#24)

PR #24 renamed the wire-shape fixture from
``initialize-with-mcpservers.yaml`` to
``initialize-with-mcp-config-path.yaml`` but left three hardcoded
references to the old name behind:

  * wrappers/conformance/test/runner-ts.test.ts (TS conformance suite)
  * wrappers/conformance/tests/test_runner_py.py (Py conformance suite)
  * tests/test_phase_2_1_exit_gate.py (engine-side fixture-set guard)

All three have been failing since PR #24. The conformance suites
showed up as "1 pre-existing failure each" in PRs #27, #29, and
#31, so wrapper/engine drift could land unseen during the entire
window in which we did most of the v0.3.0 argv cleanup -- the
exact class of bug the conformance suite is supposed to catch.

This commit restores the baseline to green by pointing each
reference at the surviving wire-protocol fixture name. No fixture
data changes, no engine logic changes.
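
A rough sketch of how a stale hardcoded fixture name produces this kind of failure, and how a name-set guard reports it. The function names and the temporary directory layout are hypothetical; only the two fixture file names come from the commit message above, and the real guard tests are not shown here.

```python
import tempfile
from pathlib import Path


def fixture_names(fixture_dir: Path) -> list[str]:
    """Sorted names of the YAML fixtures present on disk."""
    return sorted(p.name for p in fixture_dir.glob("*.yaml"))


def missing_fixtures(fixture_dir: Path, referenced: list[str]) -> list[str]:
    """Names a runner refers to that do not exist in fixture_dir."""
    return sorted(set(referenced) - set(fixture_names(fixture_dir)))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Placeholder contents; real fixtures carry a non-empty script of wire frames.
    (root / "initialize-with-mcp-config-path.yaml").write_text("# placeholder\n")
    (root / "version_skew.yaml").write_text("# placeholder\n")

    stale = missing_fixtures(root, ["initialize-with-mcpservers.yaml", "version_skew.yaml"])
    fixed = missing_fixtures(root, ["initialize-with-mcp-config-path.yaml", "version_skew.yaml"])

print(stale)
print(fixed)
```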

Verification (this commit, host machine):
  * wrappers/conformance/ (TS): 5 passed (was 4 passed, 1 failed)
  * wrappers/conformance/ (Py): 5 passed (was 4 passed, 1 failed)
  * tests/test_phase_2_1_exit_gate.py: passes (was failing)

Out of scope (separate cleanup, noted for the PR body):
  * tests/cli/test_mode_a_v2_envelope.py still has 3 pre-existing
    failures referencing the removed ``--mcp-servers`` flag.
    Not touched here per the conformance-restore PR scope.
…host-config surface

PR #27 dropped the --allow-protocol-skew argv flag and moved protocol-skew opt-in into the host config file (allowProtocolSkew: true, surfaced through --config). The engine's real protocol_version_mismatch remediation string was updated in lockstep (src/amplifier_agent_lib/engine.py:160-163).

The version_skew wire-shape fixture's scripted server response still carried the old 'pass --allow-protocol-skew' remediation text, drifting it from current engine behavior even though no assertion in the fixture exercises the field directly. Update the scripted response to match the engine's current text so the fixture remains an honest exemplar of the wire-protocol surface post-#27/#29/#31.

Companion to PR #31 fixture sweep. No engine logic changes, no assertions changed, no test outcome changes.
…ixtures

Extend the conformance suite with two wire-protocol fixtures that lock in the post-cleanup surface and the previously-uncovered override branch of the version-skew contract:

* initialize-baseline.yaml -- canonical minimum init + turn/submit + result/final flow with only protocol-required params (no mcpConfigPath, no allowProtocolSkew, no hostCapabilities, no envAllowlist, no envExtra). Locks in the wire shape that survives PR #27 (host config + drop hostCapabilities), PR #29 (drop --mcp-config-path argv), and PR #31 (drop env-allowlist / env-extra / allow-protocol-skew from wrappers). Any future re-introduction of those legacy fields on the wire must be an intentional, reviewed addition rather than silent drift.

* initialize-with-protocol-skew-override.yaml -- the 'permits the handshake' branch of design §8 D6, companion to version_skew.yaml which covers only the strict-refuse branch. Client sends allowProtocolSkew=true (now sourced from the host config file, not from a --allow-protocol-skew argv flag) and a mismatched protocolVersion; server returns a successful sessionState instead of protocol_version_mismatch.
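
The two branches of the version-skew contract described above can be sketched as a single decision function. This is an assumption-laden illustration, not the engine's code: the function name, the `clientVersion` key, and the `sessionState` value are hypothetical, while `protocol_version_mismatch`, `serverVersion`, and `allowProtocolSkew` are taken from the surrounding text.

```python
def negotiate(client_version: str, server_version: str, allow_protocol_skew: bool = False) -> dict:
    """Return a success result or a typed mismatch error for an initialize call."""
    if client_version == server_version or allow_protocol_skew:
        # Override branch (or matching versions): the handshake is permitted.
        return {"result": {"sessionState": "ready"}}
    # Strict-refuse branch: versions differ and no override was given.
    return {
        "error": {
            "code": "protocol_version_mismatch",
            "data": {"serverVersion": server_version, "clientVersion": client_version},
        }
    }


strict = negotiate("0.1.0", "0.2.0")
override = negotiate("0.1.0", "0.2.0", allow_protocol_skew=True)
print(strict["error"]["code"])
print("result" in override)
```

version_skew.yaml corresponds to the `strict` call and the new override fixture to the `override` call; together they cover both outcomes for a mismatched protocolVersion.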

Both new fixtures are wired into the TS and Python conformance runners and acknowledged in the two fixture-set guard tests (tests/test_protocol_conformance_fixtures.py and tests/test_phase_2_1_exit_gate.py).

Scope note: the conformance runners use scripted JSON-RPC wire replay (ScriptedTransport) -- they do NOT spawn the engine subprocess. Argv-level 'loud failure' invariants for the dropped flags (--host-capabilities, --env-allowlist, --env-extra, --allow-protocol-skew, --mcp-config-path) are already locked in by the engine-side click-CliRunner tests at tests/cli/test_drop_host_capabilities.py and tests/cli/test_drop_mcp_config_path_flag.py. Extending the conformance runner to also exercise argv-level invariants would require subprocess invocation and changes to the loader at src/amplifier_agent_lib/protocol/conformance/loader.py (engine-internal, off limits per the PR scope) and is intentionally left for a future PR.

Verification (host machine):

* wrappers/conformance/ (TS): 7 passed (was 5 after rename commit)

* wrappers/conformance/ (Py): 7 passed (was 5 after rename commit)

* tests/test_protocol_conformance_fixtures.py: 17 passed (was 16; the new fixtures load and validate)

* tests/test_phase_2_1_exit_gate.py: passes (now acknowledges the new fixtures in the sorted-name guard)

* Engine suite (uv run pytest -q --ignore=wrappers): 519 passed, 3 pre-existing failures in tests/cli/test_mode_a_v2_envelope.py (unchanged from PR #31's pre-existing baseline -- they reference the long-removed --mcp-servers flag and are tracked for a separate cleanup).

* Wrappers/python: 48 passed; wrappers/typescript: 63 passed; manual smoke confirms --host-capabilities / --env-allowlist / --env-extra / --allow-protocol-skew / --mcp-config-path all reject with click UsageError ('no such option').
manojp99 merged commit 5162339 into main on Jun 3, 2026
2 of 3 checks passed
manojp99 pushed a commit that referenced this pull request Jun 3, 2026
, #32)

Replace the [Unreleased] section with a full [0.4.0] - 2026-06-03 entry
that consolidates the host-config-layer release window:

  PR #27 - Host config layer + drop hostCapabilities surface
  PR #29 - Drop --mcp-config-path argv (subsumed by host config + env var)
  PR #30 - host-config skills: block + tool-skills bundle composition
  PR #31 - Drop env-allowlist, env-extra, allow-protocol-skew from wrappers
  PR #32 - Restore conformance suite

Highlights:
  - 4 argv flags + 1 env var removed (subsumed by host config layer)
  - hostCapabilities fully removed from envelope, initialize, and wrapper API
  - 5th host-config block 'skills:' (D11/D12/D13)
  - Wire protocol 0.2.0 -> 0.3.0 (BREAKING)
  - Engine 0.3.0 -> 0.4.0, Python wrapper 0.3.0 -> 0.4.0, TS wrapper 0.4.0 -> 0.5.0

Also documents the cross-repo follow-ups that downstream consumers
(notably amplifier-module-provider-nc) must catch up on but are NOT
part of this release.
manojp99 pushed a commit that referenced this pull request Jun 3, 2026
Closes the last 4 Python CI failures on this release branch.

(1) tests/cli/test_config_show.py::test_config_show_reports_default_when_env_absent

Click's CliRunner.invoke(env=...) MERGES the env dict with os.environ instead
of replacing it. GitHub Actions runners set XDG_CONFIG_HOME by default, which
leaked into the test and made source='env:XDG_CONFIG_HOME' instead of the
'default' the test was asserting. Now uses monkeypatch.delenv() to explicitly
remove XDG_CONFIG_HOME / XDG_CACHE_HOME / XDG_STATE_HOME before invoking.

(2) tests/cli/test_mode_a_v2_envelope.py

The three test_mcp_servers_* tests (inline_json_parsed, at_path_form,
malformed_json_yields_argv_envelope) target the --mcp-servers argv flag.
That flag was renamed to --mcp-config-path by PR #24 and then fully removed
by PR #29. The tests have been failing as 'pre-existing baseline' through
PRs #27, #29, #31, #32, and #33's earlier baselines. They test removed
surface and should never have been kept.

Replaced with an inline comment naming the removal context and pointing at
the removal guardrail at tests/cli/test_drop_mcp_config_path_flag.py.

Local: 532 passed, 3 skipped, 0 failed (full pytest tests/).
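
The environment leak in item (1) can be reproduced without click. The sketch below uses only the standard library and a hypothetical `config_source` resolver; `merged_env` mimics merge semantics (overrides layered on top of the inherited environment), and the explicit `pop` plays the role that `monkeypatch.delenv()` plays in the actual fix.

```python
import os
from unittest import mock

XDG_VARS = ("XDG_CONFIG_HOME", "XDG_CACHE_HOME", "XDG_STATE_HOME")


def config_source(environ: dict) -> str:
    """Hypothetical resolver: report where the config directory came from."""
    return "env:XDG_CONFIG_HOME" if environ.get("XDG_CONFIG_HOME") else "default"


def merged_env(overrides: dict) -> dict:
    # Merge semantics: anything not overridden is inherited from the process.
    return {**os.environ, **overrides}


# Simulate a CI runner that exports XDG_CONFIG_HOME.
with mock.patch.dict(os.environ, {"XDG_CONFIG_HOME": "/runner/.config"}):
    leaked = config_source(merged_env({"HOME": "/tmp/home"}))

    # Hermetic variant: remove the variables before building the environment.
    with mock.patch.dict(os.environ):
        for name in XDG_VARS:
            os.environ.pop(name, None)
        hermetic = config_source(merged_env({"HOME": "/tmp/home"}))

print(leaked)
print(hermetic)
```

Passing overrides alone cannot express "this variable must be absent" under merge semantics, which is why the test has to delete the variables rather than simply omit them.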
manojp99 added a commit that referenced this pull request Jun 3, 2026
…ce restored (#33)

* chore(protocol)!: bump wire PROTOCOL_VERSION 0.2.0 -> 0.3.0

Bump the wire protocol version to reflect accumulated backward-incompatible
changes shipped under the 0.4.0 release window:

  - metadata.hostCapabilities removed from response envelope (#27)
  - InitializeParams.host removed (#27)
  - InitializeParams.mcpServers renamed to mcpConfigPath (#24)
  - skills field added (host config skills: block, #30)

Old wrappers pinned to '0.2.0' should hard-fail handshake with a typed
protocol_version_mismatch error and exit 2 — verified manually:

    $ uv run amplifier-agent run --protocol-version 0.2.0 --session-id sm 'hi'
    {... "code": "protocol_version_mismatch" ...}
    EXIT=2

    $ uv run amplifier-agent run --protocol-version 0.3.0 --session-id sm 'hi'
    {... "reply": "..." ...}
    EXIT=0

Updated sites (audited via 'git grep "0.2.0" src/ tests/'):

  - src/amplifier_agent_lib/protocol/methods.py — PROTOCOL_VERSION constant
  - src/amplifier_agent_lib/protocol/spec.md — regenerated artifact
  - 9 conformance fixtures' protocolVersion: literals (setup + initialize params)
  - version_skew.yaml — serverVersion in expected error.data
  - 6 test files pinning protocolVersion in wrapper-side InitializeParams

Left as historical references (per release-issue guidance):

  - loader.py docstring example (illustrative fixture shape)
  - test_protocol_conformance_fixtures.py's self-contained _VALID_FIXTURE
    (loader smoke test, not engine-compat)
  - serverInfo.version: "0.2.0" in fixture script result blocks
    (not asserted by conformance harness — engine emits __version__ at
    runtime; fixture scripts are descriptive, only the explicit
    assertions: block is verified)

BREAKING CHANGE: Wire protocol 0.2.0 -> 0.3.0. Old wrappers pinned to 0.2.0
will hard-fail handshake with protocol_version_mismatch (exit 2). Reinstall
both engine and wrapper, or set allowProtocolSkew: true in host config.

* chore(engine)!: bump amplifier-agent version 0.3.0 -> 0.4.0

Engine version bump for the host-config-layer release window. Pairs with
the wire PROTOCOL_VERSION bump (0.2.0 -> 0.3.0) and consolidates argv/wire
surface removals shipped in PRs #27, #29, #30, #31.

Verified:
  $ uv run amplifier-agent --version
  amplifier-agent, version 0.4.0

BREAKING CHANGE: Engine version 0.3.0 -> 0.4.0. Wire protocol 0.2.0 -> 0.3.0
shipped in the same release. Old wrappers fail handshake. See CHANGELOG.md
for the full argv/wire/API removal list.

* chore(wrapper-py)!: bump Python wrapper version 0.3.0 -> 0.4.0

Python wrapper version bump to pair with the engine 0.4.0 release.
Same major.minor as engine — the two move together.

The wrapper's compiled PROTOCOL_VERSION (sourced from amplifier_agent_lib)
follows the engine bump 0.2.0 -> 0.3.0 transitively.

BREAKING CHANGE: SpawnAgentParams API removed envAllowlist, envExtra,
allowProtocolSkew, host, and mcpConfigPath fields across PRs #27, #29, #31.
Callers must migrate to host_config (JSON file passed via --config or
$AMPLIFIER_AGENT_CONFIG) or env var injection (AMPLIFIER_MCP_CONFIG).

* chore(wrapper-ts)!: bump amplifier-agent-ts version 0.4.0 -> 0.5.0

The TS wrapper jumps minor (0.4.0 -> 0.5.0) rather than tracking the
engine's major.minor (0.4.0) because of release-window history:

  - 0.4.0 was already published to npm (verified: 'npm view
    amplifier-agent-ts versions' lists 0.3.0, 0.3.1, 0.4.0)
  - 0.4.0 was published by PR #17 (path-based MCP config delivery,
    pre-#27/#29/#30/#31)
  - Since the 0.4.0 publish, PRs #27, #29, #30, #31 have all landed,
    removing breaking surface from the TS wrapper API:
      * SpawnAgentParams.host / HostCapabilities type (#27)
      * mcpConfigPath field + argv emission (#29)
      * envAllowlist / envExtra / allowProtocolSkew fields (#31)

We cannot republish 0.4.0 with different code, and the accumulated changes
are breaking (not a patch). Bumping to 0.5.0 puts the next npm release on
a fresh, unpublished version. If you have context I'm missing about a
prior decision to keep TS major.minor tied to engine major.minor, override
with a follow-up bump.
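
The version choice above amounts to "bump the minor until the result is not already published". A minimal sketch of that rule, using the published list quoted in this message; the function name is hypothetical and this is not a tool the repository is known to contain.

```python
def next_unpublished_minor(current: str, published: set[str]) -> str:
    """Bump the minor component until the candidate is not in published."""
    major, minor, _patch = (int(part) for part in current.split("."))
    while True:
        minor += 1
        candidate = f"{major}.{minor}.0"
        if candidate not in published:
            return candidate


# Versions listed by 'npm view amplifier-agent-ts versions' per the text above.
published = {"0.3.0", "0.3.1", "0.4.0"}
chosen = next_unpublished_minor("0.4.0", published)
print(chosen)
```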

BREAKING CHANGE: TS wrapper API removed SpawnAgentParams.host,
HostCapabilities type, InitializeHostParams type (#27); mcpConfigPath field
+ argv emission (#29); envAllowlist, envExtra, allowProtocolSkew fields
(#31). Callers must migrate to AMPLIFIER_MCP_CONFIG env var and a
host_config JSON file passed via --config.

* docs(changelog): consolidate 0.4.0 release notes (PRs #27, #29, #30, #31, #32)

Replace the [Unreleased] section with a full [0.4.0] - 2026-06-03 entry
that consolidates the host-config-layer release window:

  PR #27 - Host config layer + drop hostCapabilities surface
  PR #29 - Drop --mcp-config-path argv (subsumed by host config + env var)
  PR #30 - host-config skills: block + tool-skills bundle composition
  PR #31 - Drop env-allowlist, env-extra, allow-protocol-skew from wrappers
  PR #32 - Restore conformance suite

Highlights:
  - 4 argv flags + 1 env var removed (subsumed by host config layer)
  - hostCapabilities fully removed from envelope, initialize, and wrapper API
  - 5th host-config block 'skills:' (D11/D12/D13)
  - Wire protocol 0.2.0 -> 0.3.0 (BREAKING)
  - Engine 0.3.0 -> 0.4.0, Python wrapper 0.3.0 -> 0.4.0, TS wrapper 0.4.0 -> 0.5.0

Also documents the cross-repo follow-ups that downstream consumers
(notably amplifier-module-provider-nc) must catch up on but are NOT
part of this release.

* fix(ci): commit uv.lock for deterministic dependency resolution

The CI workflow uses astral-sh/setup-uv@v3 with enable-cache: true, which
defaults to globbing **/uv.lock to compute the cache key. With uv.lock
gitignored, every CI run failed at the install step with:

  No matches found for glob **/uv.lock

Committing uv.lock fixes CI and aligns with Astral's recommended practice
for reproducible builds. Catches reproducibility drift between contributors
and CI/DTU. The lock is ~150KB.

This was a pre-existing CI break on main (every recent CI run failing) that
this version-bump release is unblocking as part of release-readiness.

* fix(ci): install Node + pnpm for conformance parity tests

tests/test_conformance_parity.py::test_ts_and_py_runners_agree[*] shells
out to 'pnpm exec tsx runner_ts.ts' to cross-validate the TS runner against
the Python runner. The Python CI job had no Node.js or pnpm installed, so
all 10 parametrized cases failed with:

  FileNotFoundError: [Errno 2] No such file or directory: 'pnpm'

Adds setup-node@v4, pnpm/action-setup@v3, and a pnpm-install step in
wrappers/conformance/ before pytest runs. Pre-existing CI gap that this
release branch is closing as part of release-readiness.

* fix(ci): make XDG test hermetic + delete obsolete --mcp-servers tests

Closes the last 4 Python CI failures on this release branch.

(1) tests/cli/test_config_show.py::test_config_show_reports_default_when_env_absent

Click's CliRunner.invoke(env=...) MERGES the env dict with os.environ instead
of replacing it. GitHub Actions runners set XDG_CONFIG_HOME by default, which
leaked into the test and made source='env:XDG_CONFIG_HOME' instead of the
'default' the test was asserting. Now uses monkeypatch.delenv() to explicitly
remove XDG_CONFIG_HOME / XDG_CACHE_HOME / XDG_STATE_HOME before invoking.

(2) tests/cli/test_mode_a_v2_envelope.py

The three test_mcp_servers_* tests (inline_json_parsed, at_path_form,
malformed_json_yields_argv_envelope) target the --mcp-servers argv flag.
That flag was renamed to --mcp-config-path by PR #24 and then fully removed
by PR #29. The tests have been failing as 'pre-existing baseline' through
PRs #27, #29, #31, #32, and #33's earlier baselines. They test removed
surface and should never have been kept.

Replaced with an inline comment naming the removal context and pointing at
the removal guardrail at tests/cli/test_drop_mcp_config_path_flag.py.

Local: 532 passed, 3 skipped, 0 failed (full pytest tests/).

---------

Co-authored-by: Manoj Prabhakar Paidiparthy <mpaidiparthy@microsoft.com>