Skip to content

Adding Microsoft SECURITY.MD#1

Merged
manojp99 merged 1 commit into
mainfrom
users/GitHubPolicyService/156ad2cb-dd68-49e6-9777-30f7072a2a2c
May 18, 2026
Merged

Adding Microsoft SECURITY.MD#1
manojp99 merged 1 commit into
mainfrom
users/GitHubPolicyService/156ad2cb-dd68-49e6-9777-30f7072a2a2c

Conversation

@microsoft-github-policy-service

Copy link
Copy Markdown
Contributor

Please accept this contribution adding the standard Microsoft SECURITY.MD 🔒 file to help the community understand the security policy and how to safely report security issues. GitHub uses the presence of this file to light-up security reminders and a link to the file. This pull request commits the latest official SECURITY.MD file from https://github.com/microsoft/repo-templates/blob/main/shared/SECURITY.md.

Microsoft teams can learn more about this effort and share feedback within the open source guidance available internally.

@manojp99 manojp99 merged commit cd20420 into main May 18, 2026
1 check passed
manojp99 added a commit that referenced this pull request May 21, 2026
…+ cross-language conformance (#7)

* docs(phase-2-1): implementation plan for wire spec hardening

* feat(protocol): add _gen.py CLI skeleton

* feat(protocol): TypedDict to JSON Schema extractor

* feat(protocol): emit JSON Schemas for all wire TypedDicts

- Add _is_typed_dict(), _discover_typed_dicts(), _write_error_codes_schema()
  to protocol/_gen.py
- Update main() to iterate all TypedDicts across methods, notifications,
  and capabilities modules and write one *.schema.json per TypedDict
- Add error_codes.schema.json enumerating all ErrorCode StrEnum values
- Generate schemas/ directory with 30 TypedDict schemas + error_codes.schema.json
- Add schemas/__init__.py marker file
- Add two new tests: test_gen_emits_schema_for_every_typeddict and
  test_gen_error_codes_schema_is_string_enum

* feat(protocol): generate spec.md with linked schema references

* test(protocol): CI staleness gate for generated spec.md + schemas

* feat(protocol): YAML fixture loader and structural validator

* feat(protocol): L14 + capability-negotiation conformance fixtures

* feat(protocol): lineage, version-skew, resume-continuity fixtures + completeness gate

* build(protocol): include spec.md, schemas, and fixtures in wheel

* test(protocol): Phase 2.1 exit-gate integration test

* style(protocol): ruff format fixup on staleness test

* docs(phase-2-2): implementation plan for wrappers and conformance

* feat(wrappers/ts): package skeleton with vitest + pnpm

* feat(wrappers/py): package skeleton as workspace member

- Add wrappers/python/pyproject.toml: amplifier-agent-client v0.0.0,
  hatchling build backend, pytest-asyncio strict mode, ruff 120/py312
- Add wrappers/python/src/amplifier_agent_client/__init__.py:
  exports only PROTOCOL_VERSION_REQUIRED_BY_WRAPPER = '2026-05-aaa-v0'
- Add wrappers/python/tests/test_smoke.py: async smoke test asserting
  the protocol version constant is importable and correct
- Update root pyproject.toml workspace members to include wrappers/python

Co-authored-by: Amplifier <amplifier@microsoft.com>

* feat(wrappers): shared wire types via JSON-Schema codegen (ts) + re-export (py)

* feat(wrappers): NDJSON subprocess transport (ts + py)

Implement Transport class for both TypeScript and Python wrappers.
Each Transport spawns a child process, exchanges JSON frames as NDJSON
over its stdio, drains stderr to an optional sink, and terminates cleanly.

TypeScript (wrappers/typescript/src/transport.ts):
- spawn() with stdio ['pipe','pipe','pipe'], readline on stdout/stderr
- onFrame(cb): register callbacks for parsed JSON frames
- send(obj): writes JSON.stringify(obj) + '\n' to stdin
- terminate(): sends SIGTERM, awaits child 'close' (exitPromise)
- Non-JSON stdout lines logged to stderr sink and dropped silently
- ExitInfo {code, signal}, TransportOptions {command, args, env, cwd?, stderr?}

Python (wrappers/python/src/amplifier_agent_client/transport.py):
- start(): asyncio.create_subprocess_exec with PIPEs
- frames(): async generator via asyncio.Queue + 0.1s poll timeout;
  exits when _stdout_done is set and queue is empty
- send(obj): json.dumps(obj) + '\n' encoded then drain()
- terminate(): proc.terminate(), wait up to 5s, fallback proc.kill()
- _read_stdout: async for loop with try/except; drops non-JSON silently
- _drain_stderr: drains stderr to optional sink

Defensive requirement (MCP-style tolerance): non-JSON stdout lines are
logged to the stderr sink (or process.stderr/sys.stderr) and dropped
silently - never raised. Matches engine pattern at jsonrpc.py.

Tests:
- wrappers/typescript/test/transport.test.ts: 3 vitest cases
- wrappers/python/tests/test_transport.py: 3 pytest-asyncio cases

All 6 tests pass (3/3 TS, 3/3 Py). No regressions.

* feat(wrappers): JSON-RPC 2.0 client with per-id correlation

- TS: JsonRpcClient at wrappers/typescript/src/jsonrpc.ts
  - TransportLike interface (send/onFrame), Notification interface, RequestHandler type
  - call(): allocates request id, creates Promise, sends {jsonrpc:'2.0',id,method,params}
  - dispatch(): routes response→resolve pending, server-request→handler+response, notification→fanout
  - Unknown server methods return -32601 error
  - NC-L16 designed out: each call() has independent Promise row in pending Map
- Py: JsonRpcClient at wrappers/python/src/amplifier_agent_client/jsonrpc.py
  - _TransportLike Protocol (send/on_frame), RequestHandler type alias
  - call(): uses asyncio.get_running_loop().create_future() for per-id isolation
  - _dispatch(): routes by frame keys with task lifecycle management (RUF006)
  - _handle_request: awaits handler, sends result or error back
- Tests: 5 cases each (TS + Py): call resolves, concurrent calls no interference,
  notifications fanout, server request dispatched, unknown method -32601 error

* feat(wrappers): SessionHandle.submit() AsyncIterable over DisplayEvent

Implements Task 7: SessionHandle.submit(prompt) returns
AsyncIterable<DisplayEvent> (TS) / AsyncIterator[DisplayEvent] (Py).

Each submit() call:
- Sends turn/submit JSON-RPC request
- Yields every display/event notification received
- Terminates when result/final notification arrives OR when
  turn/submit JSON-RPC response arrives (whichever first)
- Throws AaaError (TS) / RuntimeError (Py) on second call (D10 one-shot)

TypeScript (wrappers/typescript/src/session.ts):
- DisplayEvent interface {type, sessionId, turnId, parentTurnId?,
  synthesized?, payload}
- AaaError class extends Error with code/remediation
- TERMINAL_NOTIFICATION = 'result/final'
- SessionDeps {sessionId, terminate}
- SessionHandle with submitted flag
- makeIterable async generator using push-queue + wakeUp pattern

Python (wrappers/python/src/amplifier_agent_client/session.py):
- AaaError exception with code/remediation
- DisplayEvent class with snake_case fields
- SessionHandle with _submitted flag
- _stream async generator with asyncio.Queue sentinel pattern
- submit_task background task with finally sentinel

Tests (both languages):
- (a) yields display events and ends on result/final: drives
  2 result/delta notifs + result/final, verifies collected event
  types == ['result/delta', 'result/delta', 'result/final']
- (b) second submit() raises typed one-shot error matching
  /one-shot|already submitted/i

All 17 TS tests pass. All 15 Python session+unit tests pass.

* feat(wrappers): L14 client-side result/final synthesis

Implements design §4.6 contract #1: if the engine emits a non-null
reply in its turn/submit response but no result/final notification was
observed first, the wrapper synthesizes a result/final-shaped
DisplayEvent with synthesized: true as the last yielded event.

Pure synthesis functions:
- TS: synthesizeFinalIfMissing({sawFinal, reply, sessionId, turnId}) → DisplayEvent | null
- Py: synthesize_final_if_missing(*, saw_final, reply, session_id, turn_id) → dict | None
Both return null/None when sawFinal=true or reply=null/None.

Session wiring:
- TS session.ts: tracks sawFinal boolean; changes .finally() to
  .then(onFulfilled, onRejected) to capture reply and call synthesis
- Py session.py: uses mutable saw_final_flag={seen: False} shared
  between on_notif and submit_task closures; synthesis in try block

Tests:
- Three-case pure function tests for both languages
- Integration test (Branch B) driving stub through the synthesis path:
  engine sends result/delta events and a turn/submit reply but never
  emits result/final; last event is type=result/final, synthesized=True

* feat(wrappers): in-band approval bridge with timeout and default deny

- Add wrappers/typescript/src/approval.ts: ApprovalRequest, ApprovalResponse,
  ApprovalAdapter types; makeApprovalHandler(adapter) returns (params)=>Promise.
  No adapter → {decision:'deny', reason:'no_adapter_configured'}. Adapter race
  against setTimeout(timeoutMs) → {decision:'timeout'} on expiry. .catch →
  {decision:'deny', reason:'adapter_error'}.

- Add wrappers/python/src/amplifier_agent_client/approval.py: symmetric
  make_approval_handler(*, on_request, timeout_ms). on_request=None →
  no_adapter_configured. asyncio.wait_for timeout → {decision:'timeout'}.
  Exception → {decision:'deny', reason:'adapter_error'}.

- Update wrappers/typescript/src/session.ts: extend RpcLike with optional
  onRequest?; SessionHandle constructor takes optional ApprovalAdapter; wires
  rpc.onRequest('approval/request', makeApprovalHandler(approval)) if provided.

- Update wrappers/python/src/amplifier_agent_client/session.py: SessionHandle
  constructor takes optional approval_on_request + approval_timeout_ms; wires
  rpc.on_request('approval/request', make_approval_handler(...)) if provided.

- Add wrappers/typescript/test/approval.test.ts: 3 cases (allow response,
  timeout at 50ms, no-adapter deny).
- Add wrappers/python/tests/test_approval.py: same 3 async cases.

All 24 TS + 25 Py tests pass.

* feat(wrappers): display.onEvent push path + subagent event filtering

- Add display.ts (TS) with SubagentMode, DisplayAdapter interface, and
  applyDisplayFilter() predicate factory
- Add display.py (Py) with apply_display_filter() predicate factory
- Wire display adapter into session.ts: apply filter before yielding to
  iterator AND before invoking display.onEvent push callback
- Wire display adapter into session.py: same dual-path filtering
- Add tests in display.test.ts covering:
  (a) subagentEvents='all' keeps all events including parentTurnId ones
  (b) subagentEvents='none' drops events with parentTurnId
  (c) default (unset) is 'all'
  (d) onEvent push callback receives same events as iterator
  (e) subagentEvents='none' suppresses parentTurnId events from both paths
- Add tests in test_display.py with same five cases

TS: 29/29 tests pass. Py: 30/30 tests pass.

* feat(wrappers): version skew check + binary discovery + env allowlist + 'version' CLI

* feat(wrappers): spawnAgent / spawn_agent public API + getEngineInfo

* feat(conformance): scripted-replay runners for ts and py wrappers

- Add wrappers/conformance/runner_py.py: Python conformance runner using
  amplifier_agent_lib.protocol.conformance.loader and JsonRpcClient.
  ScriptedTransport replays server_to_client frames synchronously in
  send(). L14 synthesis applied after turn/submit if engine omits
  result/final. Emits JSON report {fixture, language, passed, assertions}.

- Add wrappers/conformance/runner_ts.ts: TypeScript port using yaml npm
  package for fixture loading and an inline minimal JSON-RPC client.
  Same ScriptedTransport pattern and L14 logic. Reports language:typescript.

- Add wrappers/conformance/tests/test_runner_py.py: pytest tests verifying
  capability_negotiation.yaml and l14_synthesis.yaml pass (Python).

- Add wrappers/conformance/test/runner-ts.test.ts: vitest tests verifying
  capability_negotiation.yaml and l14_synthesis.yaml pass (TypeScript).

- Add pnpm-workspace.yaml at repo root with members wrappers/typescript
  and wrappers/conformance.

- Add wrappers/conformance/package.json (private package with yaml^2.4.0
  dep, amplifier-agent-client-ts workspace:* dep, vitest devDep).

- Add wrappers/conformance/tsconfig.json and vitest.config.ts.

Assertion kinds supported: notification_emitted, no_notification,
error_returned, response_matches (unknown kinds skipped with ok=True).
source:engine filter on no_notification distinguishes synthesized
events from engine-emitted notifications.

Tests: 2/2 Python, 2/2 TypeScript.

* test(conformance): cross-language parity lint

Adds tests/test_conformance_parity.py — parametrised @pytest.mark.integration
test that runs both the TypeScript and Python conformance runners on each of
the 5 YAML fixtures and asserts they produce identical (kind, passed) tuples.

This is the H6 mitigation from design §4.6: prevents the silent failure mode
'TS green / Py green but they are testing different things'.

Design:
  - _REPO_ROOT and _FIXTURE_DIR constants anchor paths relative to repo root
  - _run_py() invokes `uv run python wrappers/conformance/runner_py.py <f>`
  - _run_ts() invokes `pnpm exec tsx runner_ts.ts <f>` from wrappers/conformance
  - On divergence, prints per-assertion diff with Py vs TS column alignment

Fix bundled: wrappers/conformance/runner_ts.ts error_returned handler was using
String(err) which serialises plain objects as '[object Object]', so code checks
against data.code strings always failed. Now uses JSON.stringify(err) for
structured errors, matching Python's str(frame['error']) behaviour.

* test(phase-2-2): exit gate against real amplifier-agent subprocess

* fix(conformance): resolve pyright type errors and formatting in runner_py

- Guard against None when checking assertion_id in errors dict (line 202)
- Add explicit type annotation for assertion_id in response_matches case (line 216)
- Guard against None in responses.get(assertion_id) call (line 218)
- Run ruff format to fix formatting across both conformance files

This resolves the critical type safety issue where int | None was being
passed to dict operations expecting int keys. The fixes ensure that
fixtures with missing 'id' fields in assertions won't accidentally pass
vacuously as None.

---------

Co-authored-by: Manoj Prabhakar Paidiparthy <mpaidiparthy@microsoft.com>
Co-authored-by: Amplifier <amplifier@microsoft.com>
manojp99 added a commit that referenced this pull request May 26, 2026
…3.0 wrappers) (#8)

* docs(phase-2-1): implementation plan for wire spec hardening

* feat(protocol): add _gen.py CLI skeleton

* feat(protocol): TypedDict to JSON Schema extractor

* feat(protocol): emit JSON Schemas for all wire TypedDicts

- Add _is_typed_dict(), _discover_typed_dicts(), _write_error_codes_schema()
  to protocol/_gen.py
- Update main() to iterate all TypedDicts across methods, notifications,
  and capabilities modules and write one *.schema.json per TypedDict
- Add error_codes.schema.json enumerating all ErrorCode StrEnum values
- Generate schemas/ directory with 30 TypedDict schemas + error_codes.schema.json
- Add schemas/__init__.py marker file
- Add two new tests: test_gen_emits_schema_for_every_typeddict and
  test_gen_error_codes_schema_is_string_enum

* feat(protocol): generate spec.md with linked schema references

* test(protocol): CI staleness gate for generated spec.md + schemas

* feat(protocol): YAML fixture loader and structural validator

* feat(protocol): L14 + capability-negotiation conformance fixtures

* feat(protocol): lineage, version-skew, resume-continuity fixtures + completeness gate

* build(protocol): include spec.md, schemas, and fixtures in wheel

* test(protocol): Phase 2.1 exit-gate integration test

* style(protocol): ruff format fixup on staleness test

* docs(phase-2-2): implementation plan for wrappers and conformance

* feat(wrappers/ts): package skeleton with vitest + pnpm

* feat(wrappers/py): package skeleton as workspace member

- Add wrappers/python/pyproject.toml: amplifier-agent-client v0.0.0,
  hatchling build backend, pytest-asyncio strict mode, ruff 120/py312
- Add wrappers/python/src/amplifier_agent_client/__init__.py:
  exports only PROTOCOL_VERSION_REQUIRED_BY_WRAPPER = '2026-05-aaa-v0'
- Add wrappers/python/tests/test_smoke.py: async smoke test asserting
  the protocol version constant is importable and correct
- Update root pyproject.toml workspace members to include wrappers/python

Co-authored-by: Amplifier <amplifier@microsoft.com>

* feat(wrappers): shared wire types via JSON-Schema codegen (ts) + re-export (py)

* feat(wrappers): NDJSON subprocess transport (ts + py)

Implement Transport class for both TypeScript and Python wrappers.
Each Transport spawns a child process, exchanges JSON frames as NDJSON
over its stdio, drains stderr to an optional sink, and terminates cleanly.

TypeScript (wrappers/typescript/src/transport.ts):
- spawn() with stdio ['pipe','pipe','pipe'], readline on stdout/stderr
- onFrame(cb): register callbacks for parsed JSON frames
- send(obj): writes JSON.stringify(obj) + '\n' to stdin
- terminate(): sends SIGTERM, awaits child 'close' (exitPromise)
- Non-JSON stdout lines logged to stderr sink and dropped silently
- ExitInfo {code, signal}, TransportOptions {command, args, env, cwd?, stderr?}

Python (wrappers/python/src/amplifier_agent_client/transport.py):
- start(): asyncio.create_subprocess_exec with PIPEs
- frames(): async generator via asyncio.Queue + 0.1s poll timeout;
  exits when _stdout_done is set and queue is empty
- send(obj): json.dumps(obj) + '\n' encoded then drain()
- terminate(): proc.terminate(), wait up to 5s, fallback proc.kill()
- _read_stdout: async for loop with try/except; drops non-JSON silently
- _drain_stderr: drains stderr to optional sink

Defensive requirement (MCP-style tolerance): non-JSON stdout lines are
logged to the stderr sink (or process.stderr/sys.stderr) and dropped
silently - never raised. Matches engine pattern at jsonrpc.py.

Tests:
- wrappers/typescript/test/transport.test.ts: 3 vitest cases
- wrappers/python/tests/test_transport.py: 3 pytest-asyncio cases

All 6 tests pass (3/3 TS, 3/3 Py). No regressions.

* feat(wrappers): JSON-RPC 2.0 client with per-id correlation

- TS: JsonRpcClient at wrappers/typescript/src/jsonrpc.ts
  - TransportLike interface (send/onFrame), Notification interface, RequestHandler type
  - call(): allocates request id, creates Promise, sends {jsonrpc:'2.0',id,method,params}
  - dispatch(): routes response→resolve pending, server-request→handler+response, notification→fanout
  - Unknown server methods return -32601 error
  - NC-L16 designed out: each call() has independent Promise row in pending Map
- Py: JsonRpcClient at wrappers/python/src/amplifier_agent_client/jsonrpc.py
  - _TransportLike Protocol (send/on_frame), RequestHandler type alias
  - call(): uses asyncio.get_running_loop().create_future() for per-id isolation
  - _dispatch(): routes by frame keys with task lifecycle management (RUF006)
  - _handle_request: awaits handler, sends result or error back
- Tests: 5 cases each (TS + Py): call resolves, concurrent calls no interference,
  notifications fanout, server request dispatched, unknown method -32601 error

* feat(wrappers): SessionHandle.submit() AsyncIterable over DisplayEvent

Implements Task 7: SessionHandle.submit(prompt) returns
AsyncIterable<DisplayEvent> (TS) / AsyncIterator[DisplayEvent] (Py).

Each submit() call:
- Sends turn/submit JSON-RPC request
- Yields every display/event notification received
- Terminates when result/final notification arrives OR when
  turn/submit JSON-RPC response arrives (whichever first)
- Throws AaaError (TS) / RuntimeError (Py) on second call (D10 one-shot)

TypeScript (wrappers/typescript/src/session.ts):
- DisplayEvent interface {type, sessionId, turnId, parentTurnId?,
  synthesized?, payload}
- AaaError class extends Error with code/remediation
- TERMINAL_NOTIFICATION = 'result/final'
- SessionDeps {sessionId, terminate}
- SessionHandle with submitted flag
- makeIterable async generator using push-queue + wakeUp pattern

Python (wrappers/python/src/amplifier_agent_client/session.py):
- AaaError exception with code/remediation
- DisplayEvent class with snake_case fields
- SessionHandle with _submitted flag
- _stream async generator with asyncio.Queue sentinel pattern
- submit_task background task with finally sentinel

Tests (both languages):
- (a) yields display events and ends on result/final: drives
  2 result/delta notifs + result/final, verifies collected event
  types == ['result/delta', 'result/delta', 'result/final']
- (b) second submit() raises typed one-shot error matching
  /one-shot|already submitted/i

All 17 TS tests pass. All 15 Python session+unit tests pass.

* feat(wrappers): L14 client-side result/final synthesis

Implements design §4.6 contract #1: if the engine emits a non-null
reply in its turn/submit response but no result/final notification was
observed first, the wrapper synthesizes a result/final-shaped
DisplayEvent with synthesized: true as the last yielded event.

Pure synthesis functions:
- TS: synthesizeFinalIfMissing({sawFinal, reply, sessionId, turnId}) → DisplayEvent | null
- Py: synthesize_final_if_missing(*, saw_final, reply, session_id, turn_id) → dict | None
Both return null/None when sawFinal=true or reply=null/None.

Session wiring:
- TS session.ts: tracks sawFinal boolean; changes .finally() to
  .then(onFulfilled, onRejected) to capture reply and call synthesis
- Py session.py: uses mutable saw_final_flag={seen: False} shared
  between on_notif and submit_task closures; synthesis in try block

Tests:
- Three-case pure function tests for both languages
- Integration test (Branch B) driving stub through the synthesis path:
  engine sends result/delta events and a turn/submit reply but never
  emits result/final; last event is type=result/final, synthesized=True

* feat(wrappers): in-band approval bridge with timeout and default deny

- Add wrappers/typescript/src/approval.ts: ApprovalRequest, ApprovalResponse,
  ApprovalAdapter types; makeApprovalHandler(adapter) returns (params)=>Promise.
  No adapter → {decision:'deny', reason:'no_adapter_configured'}. Adapter race
  against setTimeout(timeoutMs) → {decision:'timeout'} on expiry. .catch →
  {decision:'deny', reason:'adapter_error'}.

- Add wrappers/python/src/amplifier_agent_client/approval.py: symmetric
  make_approval_handler(*, on_request, timeout_ms). on_request=None →
  no_adapter_configured. asyncio.wait_for timeout → {decision:'timeout'}.
  Exception → {decision:'deny', reason:'adapter_error'}.

- Update wrappers/typescript/src/session.ts: extend RpcLike with optional
  onRequest?; SessionHandle constructor takes optional ApprovalAdapter; wires
  rpc.onRequest('approval/request', makeApprovalHandler(approval)) if provided.

- Update wrappers/python/src/amplifier_agent_client/session.py: SessionHandle
  constructor takes optional approval_on_request + approval_timeout_ms; wires
  rpc.on_request('approval/request', make_approval_handler(...)) if provided.

- Add wrappers/typescript/test/approval.test.ts: 3 cases (allow response,
  timeout at 50ms, no-adapter deny).
- Add wrappers/python/tests/test_approval.py: same 3 async cases.

All 24 TS + 25 Py tests pass.

* feat(wrappers): display.onEvent push path + subagent event filtering

- Add display.ts (TS) with SubagentMode, DisplayAdapter interface, and
  applyDisplayFilter() predicate factory
- Add display.py (Py) with apply_display_filter() predicate factory
- Wire display adapter into session.ts: apply filter before yielding to
  iterator AND before invoking display.onEvent push callback
- Wire display adapter into session.py: same dual-path filtering
- Add tests in display.test.ts covering:
  (a) subagentEvents='all' keeps all events including parentTurnId ones
  (b) subagentEvents='none' drops events with parentTurnId
  (c) default (unset) is 'all'
  (d) onEvent push callback receives same events as iterator
  (e) subagentEvents='none' suppresses parentTurnId events from both paths
- Add tests in test_display.py with same five cases

TS: 29/29 tests pass. Py: 30/30 tests pass.

* feat(wrappers): version skew check + binary discovery + env allowlist + 'version' CLI

* feat(wrappers): spawnAgent / spawn_agent public API + getEngineInfo

* feat(conformance): scripted-replay runners for ts and py wrappers

- Add wrappers/conformance/runner_py.py: Python conformance runner using
  amplifier_agent_lib.protocol.conformance.loader and JsonRpcClient.
  ScriptedTransport replays server_to_client frames synchronously in
  send(). L14 synthesis applied after turn/submit if engine omits
  result/final. Emits JSON report {fixture, language, passed, assertions}.

- Add wrappers/conformance/runner_ts.ts: TypeScript port using yaml npm
  package for fixture loading and an inline minimal JSON-RPC client.
  Same ScriptedTransport pattern and L14 logic. Reports language:typescript.

- Add wrappers/conformance/tests/test_runner_py.py: pytest tests verifying
  capability_negotiation.yaml and l14_synthesis.yaml pass (Python).

- Add wrappers/conformance/test/runner-ts.test.ts: vitest tests verifying
  capability_negotiation.yaml and l14_synthesis.yaml pass (TypeScript).

- Add pnpm-workspace.yaml at repo root with members wrappers/typescript
  and wrappers/conformance.

- Add wrappers/conformance/package.json (private package with yaml^2.4.0
  dep, amplifier-agent-client-ts workspace:* dep, vitest devDep).

- Add wrappers/conformance/tsconfig.json and vitest.config.ts.

Assertion kinds supported: notification_emitted, no_notification,
error_returned, response_matches (unknown kinds skipped with ok=True).
source:engine filter on no_notification distinguishes synthesized
events from engine-emitted notifications.

Tests: 2/2 Python, 2/2 TypeScript.

* test(conformance): cross-language parity lint

Adds tests/test_conformance_parity.py — parametrised @pytest.mark.integration
test that runs both the TypeScript and Python conformance runners on each of
the 5 YAML fixtures and asserts they produce identical (kind, passed) tuples.

This is the H6 mitigation from design §4.6: prevents the silent failure mode
'TS green / Py green but they are testing different things'.

Design:
  - _REPO_ROOT and _FIXTURE_DIR constants anchor paths relative to repo root
  - _run_py() invokes `uv run python wrappers/conformance/runner_py.py <f>`
  - _run_ts() invokes `pnpm exec tsx runner_ts.ts <f>` from wrappers/conformance
  - On divergence, prints per-assertion diff with Py vs TS column alignment

Fix bundled: wrappers/conformance/runner_ts.ts error_returned handler was using
String(err) which serialises plain objects as '[object Object]', so code checks
against data.code strings always failed. Now uses JSON.stringify(err) for
structured errors, matching Python's str(frame['error']) behaviour.

* test(phase-2-2): exit gate against real amplifier-agent subprocess

* fix(conformance): resolve pyright type errors and formatting in runner_py

- Guard against None when checking assertion_id in errors dict (line 202)
- Add explicit type annotation for assertion_id in response_matches case (line 216)
- Guard against None in responses.get(assertion_id) call (line 218)
- Run ruff format to fix formatting across both conformance files

This resolves the critical type safety issue where int | None was being
passed to dict operations expecting int keys. The fixes ensure that
fixtures with missing 'id' fields in assertions won't accidentally pass
vacuously as None.

* feat(wire): bump PROTOCOL_VERSION to 0.1.0 (A1)

Per design §4.10.3, bump PROTOCOL_VERSION from '2026-05-aaa-v0' to '0.1.0'
across all three required locations:
- src/amplifier_agent_lib/protocol/methods.py
- wrappers/typescript/src/index.ts (PROTOCOL_VERSION_REQUIRED_BY_WRAPPER)
- wrappers/python/src/amplifier_agent_client/__init__.py (PROTOCOL_VERSION_REQUIRED_BY_WRAPPER)

Regenerated spec.md via amplifier_agent_lib.protocol._gen (schemas/ files
contain no version literal; types.ts regenerated for completeness — no diff).

Updated wrapper test fixtures that hardcoded the old version string so
pnpm test / pnpm typecheck and wrappers/python/tests/* continue to pass.

Updated tests/test_protocol_gen.py to assert '0.1.0' in spec.md.

Added tests/test_protocol_version_bump.py with test_protocol_version_is_0_1_0
asserting PROTOCOL_VERSION == '0.1.0' (RED → GREEN verified).

* feat(wire): add McpServerConfig, HostCapabilities, InitializeParams.mcpServers/.host (A1)

Add three new TypedDicts to the protocol surface (design §4.10.1):
- McpServerConfig: per-server MCP configuration with required transport field
- HostCapabilities: total=False host capability advertisement
- InitializeHostParams: total=False envelope wrapping host capabilities

Extend InitializeParams with two NotRequired fields:
- mcpServers: dict[str, McpServerConfig]
- host: InitializeHostParams

Regenerate spec.md and schemas/. Wire the new params through both wrappers:
- TypeScript: SpawnAgentParams gains mcpServers/host; passed to agent/initialize
- Python: spawn_agent() gains mcp_servers/host kwargs; merged into init payload

Update test_protocol_gen.py expected schema set to include the three new
schema files. Add tests/test_wire_types_v01.py with four tests verifying
the new TypedDicts and InitializeParams field extensions.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wire): AaaError.severity/classification/correlation_id + approval ErrorCodes (A1)

Adds optional severity / classification / correlation_id / stderr_tail
fields to engine-side AaaError per design §4.10.2. The wrapper-side
AaaError gains keyword-only classification and severity arguments while
preserving its positional (code, remediation) signature.

New ErrorCode values:
  - APPROVAL_TRANSLATION_FAILED
  - APPROVAL_PROTOCOL_VIOLATION
  - ENV_INJECTION_REJECTED

Regenerated spec.md, error_codes.schema.json, and wrappers/typescript
types.ts to reflect the new codes.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wire): extend TS AaaError with severity/classification/correlationId (A1)

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): SessionStore — JSONL transcript + JSON metadata (A2 CR-1)

Adds amplifier_agent_lib.session_store.SessionStore implementing the minimal
persistence contract from Design §4.6:

- session_dir(session_id) -> <root>/sessions/<id>
- save(session_id, transcript, metadata) writes transcript.jsonl + metadata.json
  via amplifier_foundation.write_with_backup (synchronous — no await)
- load(session_id) returns (transcript, metadata) | None

Pattern lifted near-verbatim from amplifier-app-cli and trimmed to the engine's
needs (no project-slug coupling, no recovery-from-backup logic — those live in
the host).

Tests cover save/load roundtrip, missing-session None, sessions/ subdirectory
layout, JSONL line-per-message format, empty-transcript roundtrip, and the
session_dir path contract. 6/6 pass; pyright clean.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): IncrementalSaveHook — tool:post transcript save (A2 CR-1)

Adds IncrementalSaveHook which persists the current transcript via
SessionStore on every kernel tool:post event. Implements the standard
hook callable contract (async __call__(event, data) -> HookResult) so it
can be registered directly with coordinator.hooks.register('tool:post', ...).

Design reference: §4.6.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): wire session persistence resume path into _runtime.py (A2 CR-1)

Thread SessionStore and IncrementalSaveHook into make_turn_handler:

- On resume (is_resumed=True with session_id), load persisted transcript
  from SessionStore(state_root()) and replay it via the
  'context.set_messages' coordinator capability.
- After session creation, register an IncrementalSaveHook on 'tool:post'
  (name='incremental_save') so transcripts are checkpointed after every
  tool invocation via 'context.get_messages'.
- Both wiring paths are guarded with 'is not None' on the resolved
  capability so kernels/contexts without these hooks skip rather than
  crash (Design §4.8).

Tests:
- test_runtime_loads_transcript_for_resumed_session: pre-populates the
  store on tmp_path, patches _runtime.state_root, asserts set_messages
  is awaited with the loaded transcript.
- test_runtime_registers_incremental_save_hook: captures all
  coordinator.hooks.register calls and asserts at least one tool:post
  registration with name containing 'incremental_save'.
- _FakeCoordinator gains a get_capability() returning the previously
  registered fn (or None) to match the real coordinator contract.

All 9 test_runtime.py tests pass.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): WireApprovalProvider — three explicit error codes (A3 CR-2)

Implements design §4.7 wire-bridging shim that forwards ApprovalRequest
objects over the wire to the host adapter. Exposes exactly three failure
modes via AaaError with classification='approval':

  • approval_translation_failed  — request cannot be serialized
  • approval_timeout             — host did not respond in time
  • approval_protocol_violation  — host response malformed

Constructor is keyword-only (approval_request_fn, timeout_seconds);
default APPROVAL_TIMEOUT_SECONDS = 30.0. Translation seams
(_translate_request / _translate_response) are overridable so future
host-specific shims (or tests) can inject custom serializers without
subclassing.

Tests (tests/test_wire_approval_provider.py, 4 passing):
  • test_approval_translation_failed_on_unserializable_request
  • test_approval_timeout (uses asyncio.sleep(9999) + 50ms deadline)
  • test_approval_protocol_violation_on_bad_response
  • test_successful_approval_returns_response

pyright: 0 errors, 0 warnings.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): register WireApprovalProvider in _runtime.py (A3 CR-2)

Replace raw ctx.approval.request capability registration with a
WireApprovalProvider-wrapped bound method so approval requests flow
through the wire-bridging shim with its three explicit error codes
(approval_translation_failed, approval_timeout, approval_protocol_violation).

Per design §4.8 (A3 — CR-2).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers): Python build_env BLOCKED_ENV_KEYS validation (A6 SC-3)

Reject env.extra entries whose key is a well-known dynamic-loader /
interpreter hook (PYTHONPATH, LD_PRELOAD, LD_LIBRARY_PATH, PYTHONSTARTUP,
PATH, PYTHONHOME, PYTHONNOUSERSITE, DYLD_INSERT_LIBRARIES,
DYLD_LIBRARY_PATH) per design §4.12.1.  Raises AaaError with
code='env_injection_rejected', classification='protocol',
severity='error'.

Adds three tests to test_spawn.py covering PYTHONPATH rejection,
LD_PRELOAD rejection, and the non-blocked-extra happy path.

* feat(wrappers): async probe_engine_version — Python wrapper (A6 SC-7)

Convert probe_engine_version() to async using asyncio.create_subprocess_exec
so the version probe does not block the event loop (design §4.12.2).

- spawn.py: async def probe_engine_version() — subprocess via asyncio
- __init__.py: await both probe call sites; type hint Callable[..., Awaitable[dict]]
- test_spawn.py: test_probe_engine_version_is_async (RED → GREEN)
- test_spawn_agent.py: _version_probe mock now async

42/42 wrappers/python tests pass.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers): TS buildEnv BLOCKED_ENV_KEYS validation (A6 SC-3)

Add BLOCKED_ENV_KEYS set to wrappers/typescript/src/spawn.ts and validate
env.extra keys in buildEnv(). Throws AaaError(env_injection_rejected) with
classification='protocol' and severity='error' when a blocked key
(PYTHONPATH, LD_PRELOAD, LD_LIBRARY_PATH, PYTHONSTARTUP, PATH, PYTHONHOME,
PYTHONNOUSERSITE, DYLD_INSERT_LIBRARIES, DYLD_LIBRARY_PATH) appears in
env.extra. Mirrors the Python wrapper behavior added in c12db09.

Tests cover both rejection paths and the safe-key passthrough.

Design reference: §4.12.1.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers): async probeEngineVersion — TypeScript wrapper (A6 SC-7)

Convert probeEngineVersion() from sync execFileSync() to async using
promisify(execFile). Updates spawnAgent() call site to await the probe
and updates _versionProbe injection-point type from sync to Promise.

Tests:
- New SC-7 test verifies probeEngineVersion returns a Promise
- spawn-agent.test.ts _versionProbe mock updated to async
- pnpm typecheck: clean
- pnpm test: 45/45 pass

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(bundle): CR-1/Q6/Q9/SC-2 — context-simple, add tool-mcp + hooks-approval, remove hooks-logging

- CR-1: context-persistent → context-simple (canonical pattern;
  context-persistent does not exist in foundation)
- Q9: add tool-mcp (verbose_servers=false, max_content_size=65536);
  servers omitted intentionally — _runtime.py merges params[mcpServers]
  at runtime via tool_overrides. Closes NanoClaw reply-channel blocker.
- Q6: add hooks-approval @v0.1.0 (default mode, NOT policy_driven_only);
  WireApprovalProvider registers at runtime. Pinned tag verified stable.
- SC-2: remove hooks-logging block (ephemeral in-container path; A2's
  IncrementalSaveHook is the canonical host-mounted writer).
- O-2: drop stale prose paragraph claiming session-transcript persistence
  is out of scope (Phase 1 A2 delivered it).
- Bump bundle.version 1.1.0 → 1.2.0.

Cache key (sha256 of bundle.md) changes from 3cf00a1e26ff1bd1 to
2e21030a0bdb48f2 — warm prepared-bundle cache self-invalidates on next
amplifier-agent prepare.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): A5 — thread mcpServers into tool-mcp tool_overrides; store host.capabilities

Adds handle_initialize(params) entry point in _runtime.py that:
- Loads the prepared bundle from cache
- Merges static tool-mcp config with wire-supplied params['mcpServers']
- Passes merged config to bundle.create_session via tool_overrides
- Stores params.host.capabilities on session.metadata['host_capabilities']

Verified against amplifier_module_tool_mcp/config.py:35-53,56-61 — the config
dict passed to mount() has highest priority and accepts this shape directly.
This enables wire-supplied MCP servers at runtime and future capability-flag
logic without protocol changes.

Adds tests/test_runtime_mcp_threading.py covering:
- mcpServers threaded into tool_overrides['tool-mcp']['config']['servers']
- Static config keys (verbose_servers, max_content_size) preserved alongside servers
- Empty mcpServers still produces tool_overrides with servers={}
- host.capabilities stored on session.metadata

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(cli): A7a — doctor --strict (CI gate) and --quick (minimal check) flags

Extends the doctor admin command with two new flags from Phase 2 design §4.9:

- --strict: Exits non-zero when prepared-bundle cache is missing (turning
  the cache [INFO] line into a [FAIL]). Intended for CI / image-build gating
  where a missing prepared cache should fail the build.

- --quick: Runs only essential checks (Python version, prepared-cache
  presence). Skips provider detection and the three XDG writability probes.
  Intended for fast health checks where full diagnostics are overkill.

Default behavior (no flags) is unchanged: missing cache stays [INFO], exit 0
when the five core checks pass.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(cli): A7b — doctor --emit-sha for supply-chain bundle source SHA baseline

Adds --emit-sha flag to the doctor admin command. Emits one line per
bundle module containing the sha256 prefix of its source URL, the
module name, and the URL. CI baseline-diffs this output daily to
detect supply-chain drift in bundle.md.

v1 stub: SHA is computed over the source URL string, not over the
installed module content. Full content-pinning is tracked as D-v1.x-02.
Still catches manifest-level drift — any change to a source URL or pin
in bundle.md changes the hash and trips the baseline diff.

Tests (4 added to tests/test_admin_doctor_phase2.py):
- --emit-sha flag is listed in --help
- output includes 'module=' lines
- output includes tool-mcp (A4 verification)
- output includes hooks-approval (A4 verification)

Refs: design §4.9, SC-4, §10.6

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(cli): A7c — doctor bundle module presence, approval shape, session_store roundtrip checks

Add three additional checks to the full (non-quick) doctor path per Design §4.9:

1. _check_bundle_modules: static parse of bundle.md frontmatter — verifies
   context-simple (CR-1), tool-mcp + hooks-approval present (A4), and
   hooks-logging absent (SC-2). Catches A4 regressions.

2. _check_approval_provider_shape: imports WireApprovalProvider, verifies
   amplifier_core.ApprovalProvider is in its MRO (Protocol-safe equivalent
   of issubclass — ApprovalProvider is a non-@runtime_checkable Protocol),
   and confirms all three approval error codes appear in source. Detects
   Phase 1 A3 (CR-2) regressions.

3. _check_session_store_roundtrip: creates a SessionStore in a tempdir,
   saves a probe transcript + metadata, loads them back, and asserts
   lossless round-trip. Detects Phase 1 A2 regressions.

Checks run only in the full path (not --quick) and contribute to the
hard_failures verdict.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(conformance): A8 — fixture for initialize with mcpServers wire field

Adds initialize-with-mcpservers.yaml conformance fixture exercising the
client wire plumbing from A1: client wrapper sends agent/initialize with
mcpServers field, parses sessionState response, and routes a result/final
notification to the consumer.

Both Python and TypeScript harness tests pass.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(conformance): A8 — fixture for initialize with host.capabilities wire field

Adds initialize-with-host-capabilities.yaml fixture exercising the
HostCapabilities wire shape (design §4.10.1, B+C hybrid). Verifies the
client wrapper can send agent/initialize with host.capabilities
{supports_structured_errors: true, supports_steering: false} and that
the session is established successfully from the server sessionState
response.

Both Python and TypeScript conformance runners load and pass the new
fixture (assertions: response_matches sessionState; notification_emitted
result/final).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(conformance): A8 — fixture for WireApprovalProvider three typed error codes

Adds approval-shim-three-error-codes.yaml exercising three typed error codes
from WireApprovalProvider (Phase 1 A3, CR-2):
  - approval_translation_failed
  - approval_timeout
  - approval_protocol_violation

Each code is exercised via a separate initialize + turn/submit sequence
with scripted server-side JSON-RPC error responses. Error codes appear in
both message and data.code for parity between Python runner (str(exc)) and
TypeScript runner (JSON.stringify) surface paths.

Wires fixture into Python and TypeScript harness test suites.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(conformance): A8 — fixture for resume with session_store

Adds resume-with-session-store.yaml verifying wire shape of resume
protocol with session_store context (CR-1, Phase 1 A2). Scripts two
spawns of the same sessionId: first with resume=false, then with
resume=true. Verifies sessionState.resumed=true on second spawn and
that the second turn references first-turn context (text='42').

Uses v0.1.0 protocolVersion and clientCapabilities display.events of
[result/final, tool/started, tool/completed]. Engine-level transcript
continuity remains covered by tests/test_resume_continuity.py; this
fixture tests the wire-level contract only.

Both harnesses (Python pytest + TypeScript vitest) gain a new test
asserting the fixture passes.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(conformance): A8 — update expected fixture set to include 4 new wire-shape fixtures

Adds initialize-with-mcpservers, initialize-with-host-capabilities,
approval-shim-three-error-codes, and resume-with-session-store to the
hardcoded expected-set assertion. Required follow-up to commits
ba19e5c, bf2b4aa, 9732653, da48a86 which added the fixtures themselves.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* chore(release): A9 — bump version to 0.2.0 (wire v0.1.0, MCP threading, doctor --strict, 4 conformance fixtures)

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* docs(release): A9 — CHANGELOG.md for v0.2.0

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* fix(bundle): pin hooks-approval to @main, bump bundle.version to 1.2.1

Root cause: bundle.md line 124 pinned hooks-approval to git tag @v0.1.0,
which does not exist on the upstream repo. git ls-remote --tags shows only
branch refs (main, feat/use-approval-needs-check-capability-callback).
The README's "Version: 0.1.0" label was mistaken for a git tag.

Fix: align hooks-approval source to @main convention used by the other two
foundation modules in the same bundle (context-simple@main, tool-mcp@main).
Bump bundle.version 1.2.0 → 1.2.1 to invalidate any cached prepared bundle
built with the broken pin.

Also in this commit:
- test_bundle_loader.py: rename test_prepared_bundle_declares_context_persistent
  → test_prepared_bundle_declares_context_simple, reflecting current bundle.md
  state (context-simple). The A7-era swap to context-persistent is deferred
  pending Issue 2 (resume-continuity) investigation.
- test_bundle_loader.py: add test_bundle_module_sources_use_main_not_version_tags,
  a defense-in-depth static check asserting no git+ source URL uses a @vX.Y.Z
  tag ref. Future broken-pin regressions will be caught at test time.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* fix(tests): update stale protocol version references from '2026-05-aaa-v0' to '0.1.0'

Root cause: PROTOCOL_VERSION was bumped to '0.1.0' in commit eae95bf (A1 stage,
strict-refuse on protocol version mismatch per design D6), but eight test
fixtures still hardcoded the old version string, triggering AaaError
('protocol_version_mismatch') in every engine test.

Stale references updated:
- tests/test_engine.py: _boot_params() helper, line 70
- tests/test_engine_with_real_bundle.py: _boot_params() helper, line 46
- tests/test_protocol_methods.py: docstring + constant assertion + 4 round-trip fixtures
- tests/test_cli_version_subcommand.py: docstring + assertions + error message strings
- tests/test_protocol_conformance_fixtures.py: inline YAML setup block + assertion

Additional stale fix (required for test_phase_2_1_exit_gate.py to pass):
- tests/test_phase_2_1_exit_gate.py: expected fixture list updated from 5 (D7 only)
  to 9 (D7 + A8), since A8 added four new wire-shape fixtures; the fixture-list
  assertion was not updated at the time
- src/amplifier_agent_lib/protocol/conformance/fixtures/{l14_synthesis,
  subagent_lineage,resume_continuity,capability_negotiation,
  initialize-with-mcpservers}.yaml: protocolVersion updated to '0.1.0'
- src/amplifier_agent_lib/protocol/conformance/fixtures/version_skew.yaml:
  serverVersion in expected error response updated to '0.1.0' (server now speaks
  '0.1.0', not '2026-05-aaa-v0')
- src/amplifier_agent_cli/admin/version_info.py: stale standalone PROTOCOL_VERSION
  constant updated to '0.1.0' (was diverged from methods.py's bump in A1;
  fixing this makes test_cli_version_subcommand.py consistent)
- src/amplifier_agent_lib/protocol/conformance/loader.py: docstring example updated

Intentional skew-test preservation:
- version_skew.yaml setup.protocolVersion stays '2099-12-future-vN' — this is
  the deliberate future-version client testing strict-refuse behavior (design D6
  allowProtocolSkew test).

All 27 target tests now pass. grep -rn '2026-05-aaa-v0' tests/ returns 0 matches.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(engine): A2 — unit test for resume wiring via mount registry (failing, then green)

Three tests in test_runtime_resume_wiring.py reproduce both defects at the
unit level without a live LLM call:

1. test_resume_wiring_uses_mount_registry_for_set_messages — Defect C:
   resume path uses coordinator.get_capability('context.set_messages')
   which returns None (context-simple mounts via coordinator.mount, not
   register_capability), so set_messages is never called.

2. test_hook_registration_uses_mount_registry_for_get_messages — Defect C:
   IncrementalSaveHook is never registered because get_capability(
   'context.get_messages') returns None; tool-call transcripts not saved.

3. test_turn_end_save_persists_transcript_after_execute — Defect A:
   pure conversational turns produce no tool:post events, so the hook
   never fires; transcript not persisted even with Defect C fixed.

All three fail before the corresponding production fixes are applied.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* fix(engine): A2/CR-1 — use mount registry for context access in _runtime.py (Defect C)

Root cause: _runtime.py used coordinator.get_capability('context.set_messages')
and coordinator.get_capability('context.get_messages') to access the context
module.  context-simple (and all standard context modules) mount via
coordinator.mount(), which puts the instance in the module mount registry
(coordinator.get()), NOT the capability registry.  Both guards silently
returned None and the entire resume/save path was dead code.

Fix 1 — resume path (lines 117-126):
  BEFORE: set_messages = coordinator.get_capability('context.set_messages')
          if set_messages is not None: await set_messages(loaded_transcript)
  AFTER:  context_module = coordinator.get('context')
          if context_module is not None and hasattr(context_module, 'set_messages'):
              await context_module.set_messages(loaded_transcript)

Fix 1 — hook registration path (lines 128-149):
  BEFORE: get_messages = coordinator.get_capability('context.get_messages')
          if get_messages is not None: IncrementalSaveHook(get_messages=get_messages)
  AFTER:  context_module = coordinator.get('context')
          if context_module is not None and hasattr(context_module, 'get_messages'):
              IncrementalSaveHook(get_messages=context_module.get_messages)

Defense-in-depth: both paths now log a WARNING when the context module is
absent or missing the expected method, making the silent-no-op condition
visible in operator logs.

test_runtime.py: update Test 8 to stub coordinator.get('context') → context
stub with AsyncMock for set_messages/get_messages (matches new code path).
Add get() method to _FakeCoordinator returning None so Tests 6/7 don't raise
AttributeError when the runtime calls coordinator.get('context').

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* fix(engine): A2/CR-1 — explicit turn-end save mirrors app-cli main_loop (Defect A)

Root cause: IncrementalSaveHook fires only on tool:post events.  Pure
conversational turns ('remember my favourite colour is purple') never
invoke any tool, so tool:post never fires, the hook never runs, and the
transcript is never saved.  Turn 2 (resumed) loads None from SessionStore
and starts with an empty context — agent says 'I don't know'.

Fix: after session.execute() completes, read the final transcript from
the context module and call store.save() explicitly.  This mirrors the
amplifier-app-cli main_loop pattern (session_runner.py) and fires on
every turn regardless of whether tools were called.  The IncrementalSaveHook
is kept for crash-recovery checkpointing between tool calls.

_runtime.py lines 165–173: inside 'async with session:' after execute(),
if session_id and context module has get_messages, call get_messages() and
store.save(session_id, final_transcript, metadata={'last_turn': 'complete'}).

test_runtime.py: update _fake_prepared and Tests 2, 4, 5 to set
coordinator.get.return_value = None (no context module) so the new
turn-end save path skips gracefully.  Update Test 9 to provide an actual
context stub (AsyncMock get_messages) so the await does not TypeError.
Update Test 10 to set coordinator.get.return_value = None.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* docs(design): add v1 NC amplifier-agent provider design + 3 phase implementation plans

Design doc (2026-05-22-aaa-v2-amplifier-agent-nc-provider.md): B+selected-C hybrid
architecture for NanoClaw's in-container amplifier-agent provider adapter. 12 locked
decisions covering binary install, push semantics (B1 buffer), approval shim, session
persistence, MCP threading, capability negotiation, and error taxonomy. 4 critical
risks closed (CR-1..CR-4), 7 significant concerns resolved, 12 v1.x deferrals catalogued.

Three phase implementation plans totaling 42 tasks:
  - Phase 1 (phase1-engine-core): wire protocol bump, SessionStore, IncrementalSaveHook,
    WireApprovalProvider, wrapper BLOCKED_ENV_KEYS and async probeEngineVersion.
  - Phase 2 (phase2-engine-integration): bundle edits, MCP threading in _runtime.py,
    doctor --strict/--quick/--emit-sha flags, four conformance fixtures, v0.2.0 tag.
  - Phase 3 (phase3-nanoclaw-consumption): Dockerfile changes, CI version lint,
    host-side provider registration, in-container adapter, E2E scenarios.

All three plans subsequently executed via SDD recipe on this branch.

Two doc corrections from DTU verification findings folded in pre-commit:
  - F2: uv 0.11+ removed the --bin-dir flag; corrected to UV_TOOL_BIN_DIR env var
    form in all three locations (decision table D10, Dockerfile block in D10 detail,
    Dockerfile block in Phase 3 Task 5).
  - F3: file: package reference for amplifier-agent-client-ts does not resolve in a
    standalone docker build; documented as known constraint with two resolution
    options in both the design doc (§10.8) and the Phase 3 plan.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* docs(design): Mode A pivot amendment 2026-05-24 — replace stdio JSON-RPC with subprocess-driver wire

Amends the locked 2026-05-22 v1 design after DTU verification surfaced that Mode B (stdio JSON-RPC) was specified but never implemented, and that none of the four queued hosts (NC, Paperclip, OpenCode, Claude Code) actually require bidirectional mid-turn streaming for v1. Replaces 6 of 12 locked wire decisions (D1, D3, D4, D6, D9, D12) with a Mode A subprocess driver pattern mirroring Claude Code's per-turn invocation model. Preserves the other 6 decisions plus all CR-1..CR-4 closures. Adversarial review by systems-design-critic surfaced 5 Critical Risks (MCP secret leak via argv, stdout discipline, DisplayEvent shape conflation, integration gate weakness, default output change) and 8 Significant Concerns; all addressed in Phase 7 refinement. Banner headers on 2026-05-20 and 2026-05-22 mark them as amended. Migration estimate ~8-10 working days. The 10 new real-binary conformance fixtures specified in §8.1 A4' are designed to close the integration-test process gap that allowed Mode B to ship as stubbed in the original plan.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* docs(plans): Mode A pivot — three phase implementation plans (A engine, B wrapper+conformance, C NC rebuild)

Three phase plans implementing the 2026-05-24 Mode A pivot amendment. Phase A (15 tasks) extends the engine's `amplifier-agent run` Mode A CLI with mcpServers + host capabilities + env + protocol-version + output flags, adds the JSON envelope with metadata.correlationId, and adds per-turn audit trail writes — all with stdout discipline enforcement (CR-B) and MCP env tmpfile spill (CR-A). Phase B (20 tasks) rewrites the TS+Py wrappers as subprocess drivers, simplifies the DisplayEvent shape (CR-C breaking change), adds PGID cleanup, rejects approval.onRequest loudly (SC-C), implements envelope-wins precedence (SC-D), and authors 10 real-binary conformance fixtures closing the integration-test process gap (R9'/CR-D). Phase C (10 tasks) rebuilds NC's in-container adapter against the simplified DisplayEvent shape, re-applies the F2/F3 Dockerfile fixes from the DTU verification, and re-runs the DTU end-to-end harness left at `aaa-nc-verify-v2`. Every task in every plan explicitly declares whether its tests are (a) unit-with-mocks, (b) real-binary, or (c) hybrid — Phase A has 2 real-binary tests, Phase B has 10 (the full conformance roster), Phase C is end-to-end (b) by definition.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* chore(engine): start Phase A — Mode A pivot engine work

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(engine): A2' — failing test for v2 JSON envelope shape

* feat(engine): A2' — emit Mode A v2 JSON envelope on success

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(engine): A1' — failing test for --output text|json flag

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): A1' — add --output {text,json} flag, default json

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(engine): A1'/D9 — failing tests for --mcp-servers parsing

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): D9' — accept --mcp-servers inline JSON or @path

Adds:
- _emit_argv_envelope(): writes a §4.1-shape error envelope for argv
  validation failures (protocol classification, exit code 2).
- _parse_json_or_atpath(): parses '<json>' or '@<path>' flag values,
  emitting argv_json_malformed / argv_path_unreadable envelopes on
  errors.
- --mcp-servers flag wired through to _TurnSpec.mcp_servers.
- TODO(phase-A-task-7) marker in _execute_turn for engine threading,
  which lives in Task 15's lint cleanup per the 2026-05-22 §4.8 closure.

Tests: 3/3 mcp_servers tests PASS in tests/cli/test_mode_a_v2_envelope.py.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): A1'/D6'/D12' — add --host-capabilities, --env-*, --protocol-version flags

- Add --host-capabilities (inline JSON) threaded into envelope.metadata.hostCapabilities
- Add --env-allowlist (comma-separated) and --env-extra (inline JSON); parsed now, threaded into engine subprocess wiring in a later task (D12')
- Add --protocol-version with self-validation against PROTOCOL_VERSION; emits
  protocol_version_mismatch envelope (exit 2) when wrapper pin diverges, unless
  --allow-protocol-skew (or AMPLIFIER_AGENT_ALLOW_PROTOCOL_SKEW env) is set (D6')
- Extend _emit_argv_envelope with optional remediation field for structured
  error envelopes that wrappers can surface verbatim
- Three new envelope unit tests cover host-capabilities thread-through,
  protocol mismatch, and skew-suppression flow

* test(engine): A2'/CR-B — failing test for stdout discipline

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): A2'/CR-B — enforce stdout discipline via redirect_stdout

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(engine): A2' — failing test for §4.3 error envelope shape

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): A2'/§4.4 — error envelope path with classification-based exit codes

Add _EXIT_CODE_BY_CLASSIFICATION mapping (engine/transport/unknown→1,
protocol→2, approval→3) and _CLASSIFICATION_BY_CODE table mapping known
AaaError codes onto classifications. Rewire the except blocks in
single_turn.run() to build §4.3 error envelopes and emit them on the
real stdout (bypassing the redirect_stdout guard), exiting with the
classification-mapped exit code.

Also relax AaaError.__init__ to accept code and message positionally
(kwargs still work) — required by callers in the wrapper layer
(amplifier_agent_client/__init__.py) and by the §4.3 envelope test.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): A2.1'/SC-H — per-turn audit trail with sha256-digested secrets

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(engine): A4'/R9' — real-binary happy-path integration gate

Launches the actual amplifier-agent binary via subprocess.run with
ANTHROPIC_BASE_URL pointed at an in-process HTTP mock LLM, and asserts
the stdout envelope matches §4.1. Per amendment §8.1 A4', the mock LLM
HTTP server is the only mock allowed in real-binary tests.

The mock returns a complete Anthropic SSE message stream (message_start
through message_stop) because the amplifier-module-provider-anthropic
provider uses streaming by default (use_streaming=True).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(engine): A1'/SC-B — engine self-promotes to session leader for MCP group cleanup

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* chore(wrappers): start Phase B — Mode A pivot wrapper rewrite

* test(wrappers/ts): A3'/CR-C — failing test for simplified DisplayEvent

RED step for A3'/CR-C (2026-05-24 Mode A pivot amendment, §5.2). Pins the
simplified Mode A DisplayEvent shape as a discriminated union with four
variants: init (sessionId), activity (no fields), result (text), error
(code, classification, severity, correlationId, message, retryable,
optional stderrTail). Replaces the locked-design flat interface that
required type/sessionId/turnId/payload on every event.

Verifies (i)–(iv) per the spec via @ts-expect-error directives anchored
to fields that must NOT exist on each variant, structural Object.keys
assertions, and an exhaustive switch with never-narrowing for the union.

State at this commit:
  - tsc --noEmit on this test file fails with the exact errors predicted
    by §5.2 (missing turnId/payload on init/activity/result/error literals;
    unknown code/classification/severity/correlationId/message/retryable
    on error literal; unused @ts-expect-error directives because turnId
    and payload still exist on the locked-design interface).
  - npm test -- session-mode-a-shape passes at runtime because vitest's
    esbuild transform strips all TS-only constructs (the @ts-expect-error
    directives become comments and the type annotations become bare
    object literals). This is a known vitest+esbuild behavior; type-level
    RED is verifiable via tsc directly. GREEN step (next task) rewrites
    src/session.ts to the new union and the type errors clear.

* feat(wrappers/ts)!: A3'/CR-C — simplify DisplayEvent for Mode A v2

Replace the flat DisplayEvent interface with a discriminated union
({ init | activity | result | error }) per the Mode A pivot amendment
§5.2. Removes fields the Mode A wire cannot meaningfully populate:
turnId, parentTurnId, synthesized, payload.

The breaking-change marker (!) reflects that consumers of
SessionHandle.submit()'s yielded events must now narrow via ev.type
and read per-variant typed fields (text, code, classification, …)
instead of the previous turnId / payload bag.

Existing SessionHandle.submit() emitter code (l14.ts, display.ts,
session.ts:217/225) type-errors against the new shape — this is
expected and is rewritten in Task 7/8 once the subprocess driver
lands. A TODO(phase-b-task-8) marker has been added above the
L14 import so the cleanup is recoverable from grep.

The Task 2 RED test (session-mode-a-shape.test.ts, 5 cases) is now
GREEN — DisplayEvent matches the simplified union exactly.

Refs: docs/designs/2026-05-24-aaa-v2-mode-a-pivot-amendment.md §5.2

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers/ts): A3'/SC-C — reject approval.onRequest with typed AaaError

Per amendment §5.3, the Mode A wire has no mid-turn request channel.
The earlier draft of the amendment had spawnAgent() accept the callback
and log a stderr warning; SC-C adversarial review found that warning-only
acceptance ships silent auto-allow to a host author who believed their
callback was wired up. Reject loudly instead.

- Add first guard in spawnAgent() (src/index.ts): if params.approval.onRequest
  is provided, throw AaaError('approval_not_supported_in_v1') with
  classification='protocol', severity='error', BEFORE any subprocess work.
- Add verbatim JSDoc from amendment §5.3 above the SpawnAgentParams.approval
  field documenting the NOT-SUPPORTED-IN-v1 status and the v1.x revival path
  (WG-4 in amendment §6).
- New unit test test/spawn-rejects-approval.test.ts verifies the rejection
  with toMatchObject({name:'AaaError', code:'approval_not_supported_in_v1',
  classification:'protocol'}).

The throw happens before any subprocess work; the host author sees the
failure immediately at spawnAgent() call time instead of at some downstream
point where a tool was supposed to prompt but silently auto-allowed.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers/ts): A3' — pure argv assembler for Mode A v2

Add wrappers/typescript/src/argv-builder.ts exporting assembleArgv() and
AssembleArgvInput. Pure function — no I/O, no env reads — building the
argv array for 'amplifier-agent run' in the canonical order:

  run --session-id <sid> {--fresh|--resume}
  [--cwd] [--provider] [--mcp-servers] [--host-capabilities]
  [--env-allowlist] [--env-extra]
  --output json --protocol-version <ver>
  [--allow-protocol-skew] -y <prompt>

SC-C: wrapper always emits -y to enforce auto-allow at the bundle layer;
approval responsibility lives in the orchestrating host, not the engine.

Tests cover the 5 spec cases: happy-path argv equality, resume mode,
host-capabilities JSON threading, mcp-servers inline JSON threading, and
mcp-servers @path threading (caller pre-spilled).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers/ts): A3'/CR-A — secret-aware MCP tmpfile spill (0600)

Add mcp-spill.ts implementing CR-A's secret-aware MCP servers resolution
for the Mode A v2 wrapper:

- resolveMcpServersFlag(mcpServers, sessionId): returns inline JSON when no
  server has a non-empty env block; otherwise spills the full config to a
  0600 tmpfile under $XDG_RUNTIME_DIR/amplifier-agent/<sessionId>/ (falling
  back to os.tmpdir()) under a 0700 per-session directory and returns
  `@<path>` as the flag value.
- cleanupSpillFile(spillPath): idempotent unlink that swallows ENOENT and
  is a no-op for null input, so callers can call it unconditionally on
  every exit path.

Empty env objects (`env: {}`) do NOT trigger spilling — only env blocks
with at least one key are considered secret-bearing.

Tests: 7 cases covering null/undefined/empty input, inline JSON when no
env, 0600 spill when any env present, mode/contents verification, and
idempotent cleanup (including null and double-unlink).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers/ts): A3'/SC-D — §4.1 envelope parser with precedence rules

Implements parseRunOutput() per §4.1 (run-output envelope schema) and §4.4
(SC-D exit-code/envelope precedence) from the Mode A v2 pivot amendment.

Rule 1 — envelope parseable per §4.1 → envelope is authoritative:
  - error===null  → result event with text=reply (regardless of exit code)
  - error present → error event populated from envelope (retryable=false)

Rule 2 — envelope absent / unparseable / partial → synthesize from outcome:
  - exit 0   → envelope_missing / protocol (engine protocol violation)
  - exit N≠0 → engine_exit_<N> / engine, stderrTail truncated to 4096 chars

Partial JSON is NOT half-parsed; missing required §4.1 fields fall to rule 2.

Tests: 6 SC-D cases (1a/1b/1c/2a/2b/2c) + stderrTail truncation. All pass.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers/ts)!: A3' — rewrite SessionHandle as subprocess driver

Rewrite SessionHandle per the 2026-05-24 Mode A pivot amendment §5.2: the
handle is now a per-submit subprocess driver, not a JSON-RPC client.

Per §5.2 contract:
- SC-1: yield {type:'init', sessionId} synchronously before any async work
- CR-A: resolve --mcp-servers via resolveMcpServersFlag (inline or 0600 spill)
- assembleArgv builds the engine argv (no I/O in the builder)
- SC-B: spawn detached:true so PID == PGID for group signals
- 2s activity ticker pushed into the iterator queue while child runs
- exit promise races a configurable timeout (default 10min)
  - on exit:    parseRunOutput({stdout, stderr, exitCode})
  - on timeout: synthesize {type:'error', code:'engine_hung'} + cancel()
  - on spawn error (ENOENT/EACCES): {type:'error', code:'spawn_failed'}
- cleanup spill file after the drain loop ends (every exit path)
- cancel(): SIGTERM the -pgid, wait 5s, SIGKILL the -pgid if alive;
  then unlink the spill file (idempotent)
- dispose() is an alias for cancel()

spawnAgent (index.ts) is synchronous-in-spirit: validate params, resolve
binary, build env, return new SessionHandle(params). No subprocess at
spawn time. No probeEngineVersion call. The version.ts / checkProtocolVersion
path is unused — flagged TODO for Task-9 cleanup.

Deleted:
- src/l14.ts, src/jsonrpc.ts (Mode B JSON-RPC artifacts — no longer used)
- src/display.ts (display.onEvent consumption path removed in §2.2)
- test/l14.test.ts, test/jsonrpc.test.ts (covered the deleted modules)
- test/display.test.ts (asserted on the removed parentTurnId field)
- test/session.test.ts (asserted on result/delta JSON-RPC notifications)
- test/spawn-agent.test.ts (asserted on the removed _versionProbe seam)

Added:
- test/session-subprocess.test.ts: 9 tests covering the §5.2 contract —
  init-before-spawn, one-shot lifecycle, envelope→result, non-zero-exit→
  engine_exit_<n>, timeout→engine_hung, MCP spill cleanup, getEngineInfo,
  spawn ENOENT→spawn_failed, cancel-before-submit no-op.

BREAKING CHANGE: SessionHandle constructor signature changed from
(rpc, deps, approval?, display?) to (params: SessionHandleParams).
SessionDeps / RpcLike / TERMINAL_NOTIFICATION are removed from the
public surface. spawnAgent's _transportFactory / _versionProbe
test seams are gone — Mode A has no transport to inject.

Verification:
- npx tsc --noEmit: clean
- npm test: 61/61 PASS across 12 files

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* feat(wrappers/py)!: A3' — Python wrapper parity for Mode A subprocess driver

Mirrors Tasks 3-8 (TS) onto wrappers/python/src/amplifier_agent_client/:

- types.py: add DisplayEvent Literal-discriminated TypedDict union
  (DisplayEventInit | DisplayEventActivity | DisplayEventResult | DisplayEventError)
  per amendment §5.2 / CR-C
- spawn.py: AaaError raised on env_injection_rejected (unchanged contract)
- argv_builder.py: NEW — pure argv assembly mirroring TS argv-builder.ts
- mcp_spill.py: NEW — secret-aware spill via asyncio.to_thread for blocking
  file I/O (0700 dir, 0600 file). Mirrors TS mcp-spill.ts (CR-A)
- run_output_parser.py: NEW — §4.1 envelope parser with SC-D precedence rules
- session.py: REWRITE — subprocess driver using asyncio.create_subprocess_exec
  with start_new_session=True (POSIX setsid). cancel() uses
  os.killpg(os.getpgid(proc.pid), SIGTERM) for group signal, with 5s grace
  before SIGKILL. submit() yields DisplayEvents per amendment §5.2.
- __init__.py: REWRITE — spawn_agent() now:
  - rejects approval.on_request LOUDLY with AaaError(approval_not_supported_in_v1)
    (SC-C — no warning-only acceptance)
  - skips agent/initialize, version probe, and Transport (Mode A v2 pivot)
  - returns SessionHandle without spawning subprocess (deferred to submit())

Tests mirror TS:
- test_display_event_shape.py — discriminated-union exhaustiveness (CR-C)
- test_spawn_rejects_approval.py — SC-C loud rejection
- test_argv_builder.py — canonical argv assembly, 5 cases
- test_mcp_spill.py — spill behavior, 0600 mode, idempotent cleanup
- test_run_output_parser.py — Rule 1/Rule 2 precedence, stderr tail truncation

Deleted (per task spec + cleanup of orphans):
- l14.py, jsonrpc.py and their tests (explicit)
- display.py (orphan; DisplayEvent shape no longer has parent_turn_id)
- test_session.py, test_spawn_agent.py, test_display.py (test removed flows)

Verification: cd wrappers/python && uv run pytest && uv run ruff check
src/ tests/ && uv run pyright src/ — all green (48 passed, ruff clean,
pyright clean).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* chore(wrappers): bump to v0.3.0 — Mode A pivot breaking changes (CR-C)

Bump TypeScript wrapper from 0.2.0 to 0.3.0 and Python wrapper to 0.3.0
to mark the Mode A pivot wrapper rewrites (commits d2a3c86, df1c263) as
a breaking-change release. The DisplayEvent type was simplified (CR-C),
SessionHandle was rewritten as a subprocess driver, and the JSON-RPC
transport layer was removed.

Notes:
- Python wrapper version was at 0.0.0 (pre-existing version skew with
  TS wrapper at 0.2.0). Bumped to 0.3.0 to align with TS.
- The pre-pivot conformance parity lint (tests/test_conformance_parity.py)
  is now obsolete — its runners (runner_py.py, runner_ts.ts) import the
  deleted amplifier_agent_client.jsonrpc/ScriptedTransport machinery.
  Per the design (§A4'/CR-D), this lint and its fixtures will be
  replaced by the real-binary fixture suite in Tasks 11–20 of the
  Phase B plan. Not addressed in this task.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

* test(conformance): A4'/SC-C — mode-a-approval-callback-rejected

Author the SC-C real-binary conformance fixture per amendment §8.1 A4'.

The fixture asserts two independent properties:
  (i)  spawnAgent throws AaaError(approval_not_supported_in_v1) with
       classification='protocol' SYNCHRONOUSLY when a non-null
       approval.onRequest callback is supplied.
  (ii) No 'amplifier-agent run' subprocess is started — verified by a
       before/after pgrep snapshot whose counts must match exactly.

Together, these close the SC-C threat surface: the wrapper cannot silently
swallow an approval callback (because it throws) AND cannot do any
subprocess work under the rejected configuration (because pgrep proves no
engine launched).

The corresponding wrapper unit tests already exist
(wrappers/typescript/test/spawn-rejects-approval.test.ts and
wrappers/python/tests/test_spawn_rejects_approval.py from commits
f2ab685 and df1c263); this fixture adds the cross-language conformance
gate and the no-subprocess (pgrep) assertion that the unit tests do not
cover.

Schema note: this fixture uses the v0.3 real-binary fixture schema
(test_type: real-bi…
manojp99 pushed a commit that referenced this pull request Jun 3, 2026
…th (#1)

Engine PR #27 / v0.4.0 added the --config <path> flag and the
host_config layer (approval mode, MCP servers, provider defaults,
allowProtocolSkew, etc.). The wrapper had no surface to forward this,
so callers had to fall back to AMPLIFIER_AGENT_CONFIG in env.extra.

This change:

  - Adds SpawnAgentParams.configPath?: string (public, @public TSDoc).
  - Adds AssembleArgvInput.configPath?: string.
  - assembleArgv emits --config <path> when configPath is set.
  - Threads configPath through SessionHandleParams to the per-submit
    argv assembly.

Also drive-by adds approvalMode field to AssembleArgvInput (used by
#10's commit). The argv-builder now reads input.approvalMode and emits
-y / -n / nothing accordingly. Default remains -y for backward compat
with callers that haven't opted into the approval API.

Closes #1.
manojp99 pushed a commit that referenced this pull request Jun 3, 2026
Mirrors PR #29 / #31 pattern: dist/ is tracked so consumers installing
from the git tarball get the compiled artifacts without a build step.

Regenerated from npm run build after issues #1, #2, #3, #4, #5, #6, #7,
#9, #10 landed.
manojp99 pushed a commit that referenced this pull request Jun 3, 2026
Wrapper hardening release closing 8 consumer-reported gaps at 0.5.0:
  #1  configPath surface
  #2  stderr NDJSON parsing
  #3  runChildProcess injection
  #4  display.onEvent dispatch
  #5  public re-exports
  #6  Transport dead code (root cause of #2/#4)
  #7  getEngineInfo() implementation
  #9  checkProtocolVersion() wired into init path
  #10 approval API mapped to engine -y/-n + approval.mode

Issue #8 in the consumer report was a misread — InitializeParams.
mcpConfigPath is intentionally retained in protocol-0.3.0. No
type change needed; the schema is canonical and correct.

This is a minor bump per 0.x convention even though some changes
are BREAKING — the wrapper hasn't shipped a 1.0 yet, so breaking
changes ride minor bumps. See CHANGELOG for the BREAKING list.

Engine compatibility: requires amplifier-agent >= 0.4.0.
Pinned protocol: 0.3.0.
manojp99 added a commit that referenced this pull request Jun 3, 2026
…, approval, getEngineInfo, +5 more) (#36)

* feat(wrapper-ts): re-export internal helpers from index.ts (#5)

Adds named re-exports from the package entry point so consumers can
import internal helpers without reaching into private deep paths:

  assembleArgv, AssembleArgvInput
  resolveMcpConfigPath, cleanupSpillFile, McpSpillResult
  buildEnv, resolveBinaryPath, probeEngineVersion,
    DEFAULT_ALLOWLIST, BLOCKED_ENV_KEYS,
    ResolveBinaryPathOptions, BuildEnvOptions
  Transport, TransportOptions, ExitInfo
  checkProtocolVersion, VersionCheckResult, VersionCheckOk,
    VersionCheckFail, CheckProtocolVersionOptions
  parseRunOutput, STDERR_TAIL_BYTES, SubprocessOutcome
  makeApprovalHandler, ApprovalAdapter, ApprovalRequest,
    ApprovalHandler

Each export is annotated @public.

Closes #5.

* feat(wrapper-ts): wire checkProtocolVersion() into init path (#9)

spawnAgent() now probes the engine's protocol version once during
initialization (via amplifier-agent version --json) and runs
checkProtocolVersion() against PROTOCOL_VERSION_REQUIRED_BY_WRAPPER
BEFORE constructing a SessionHandle. Mismatch fails fast wrapper-side
with AaaError(protocol_version_mismatch), saving a full subprocess
roundtrip later.

Adds two new SpawnAgentParams fields:
  - allowProtocolSkew?: boolean — bypass the check (mirrors engine's
    host_config.allowProtocolSkew)
  - _engineVersionProbe?: () => Promise<EngineVersionPayload> —
    test-only injection point for the probe

Also bumps PROTOCOL_VERSION_REQUIRED_BY_WRAPPER from "0.2.0" to
"0.3.0" to match the engine's current wire protocol
(amplifier_agent_lib.protocol.methods.PROTOCOL_VERSION). The wrapper
was shipping with a stale pin; the new check would have surfaced this
at startup.

Closes #9.

* feat(wrapper-ts): add runChildProcess injection point (#3)

Adds SpawnAgentParams.runChildProcess?: ChildProcessFactory — a public
seam to substitute the subprocess factory used inside SessionHandle.
When set, the wrapper invokes the factory in place of
child_process.spawn, preserving the same options shape (detached, stdio,
env, optional cwd).

Useful for:
  - Sandboxing (e.g. wrapping the child in a container or namespace)
  - Test doubles (e.g. EventEmitter fakes that drive scripted outputs)
  - Harness wrapping (e.g. observing the subprocess from outside)

ChildProcessFactory is exported as a @public type from index.ts.

Closes #3.

* feat(wrapper-ts)!: wire Transport NDJSON pipeline + dispatch to display.onEvent (#2, #4, #6)

The engine emits one JSON object per line on the child subprocess's
stderr stream for each wire-protocol notification (progress,
result/delta, result/final, thinking/delta, thinking/final,
tool/started, tool/completed, approval/request, approval/timeout,
plus wire-level error). Before this change the wrapper buffered
stderr as raw text and silently dropped every event — the existing
Transport class implemented NDJSON parsing but was never wired
anywhere (dead code).

This change:

  - Adds parseNdjsonStream(stream, {onJson, onNonJson?}) — a
    standalone helper extracted from the parsing logic Transport
    already had. Resolves when the stream emits 'close'. Exported
    @public.

  - Wires parseNdjsonStream onto child.stderr inside
    SessionHandle.makeIterable(). JSON lines are parsed into
    'notification' DisplayEvents and dispatched to
    params.display?.onEvent. Non-JSON lines (and JSON lines, for
    completeness) are still accumulated into stderrBuf so the
    stderrTail surface on parseRunOutput remains diagnostically
    useful.

  - Extends the DisplayEvent discriminated union with a new
    {type: 'notification', method: string, params: unknown}
    variant. **BREAKING**: existing exhaustive switch statements
    on event.type will no longer be exhaustive without a
    notification branch.

  - Threads SpawnAgentParams.display through to SessionHandle so
    the callback that was previously silently dropped is now
    actually fired (Issue #4).

Closes #2, #4, #6.

BREAKING CHANGE: display.onEvent callbacks are now actually invoked
with wire-event notifications. Callers that registered onEvent
expecting it to be a no-op may observe new event flow. The
DisplayEvent union has a new 'notification' variant; exhaustive
switch statements need a corresponding branch.

* feat(wrapper-ts): surface --config flag via SpawnAgentParams.configPath (#1)

Engine PR #27 / v0.4.0 added the --config <path> flag and the
host_config layer (approval mode, MCP servers, provider defaults,
allowProtocolSkew, etc.). The wrapper had no surface to forward this,
so callers had to fall back to AMPLIFIER_AGENT_CONFIG in env.extra.

This change:

  - Adds SpawnAgentParams.configPath?: string (public, @public TSDoc).
  - Adds AssembleArgvInput.configPath?: string.
  - assembleArgv emits --config <path> when configPath is set.
  - Threads configPath through SessionHandleParams to the per-submit
    argv assembly.

Also drive-by adds approvalMode field to AssembleArgvInput (used by
#10's commit). The argv-builder now reads input.approvalMode and emits
-y / -n / nothing accordingly. Default remains -y for backward compat
with callers that haven't opted into the approval API.

Closes #1.

* feat(wrapper-ts)!: wire approval API to engine -y/-n + approval.mode (#10)

Previously, SpawnAgentParams.approval threw AaaError(
approval_not_supported_in_v1) whenever set because it required the
mid-turn onRequest callback that v1 doesn't support.

This change extends SpawnAgentParams.approval to also accept the
static-policy shape { mode: 'yes' | 'no' | 'prompt' }, which maps to
engine argv:

  - 'yes'    -> -y (auto-allow every tool call)
  - 'no'     -> -n (auto-deny every tool call)
  - 'prompt' -> emit no flag; engine falls back to
                host_config.approval.mode or the bundle's TTY-based
                default. This is how a host hands policy resolution
                back to the engine.

The legacy { onRequest, timeoutMs } form still throws
approval_not_supported_in_v1 — the Mode A wire has no mid-turn
channel. Mid-turn callbacks will return when WG-4 lands.

Engine compatibility: { mode: 'prompt' } requires
amplifier-agent >= 0.4.0 (PR #34 added host_config.approval.mode).

Closes #10.

BREAKING CHANGE: SpawnAgentParams.approval is now a union shape;
callers passing { mode } no longer hit approval_not_supported_in_v1.
Callers that defensively catch that error need to remove the try/catch
when migrating to the mode shape.

* feat(wrapper-ts): implement getEngineInfo() — engineVersion + bundleDigest (#7)

Closes the Task-9 TODO: getEngineInfo() now returns the values
captured during the engine version probe that spawnAgent() runs at
init (Issue #9). Previously both fields were hardcoded empty strings.

  - engineVersion populated from `amplifier-agent version --json`
    payload's `version` field.
  - bundleDigest populated from the probe payload's optional
    `bundleDigest` field. The engine's current `version --json`
    output (from admin/version_info.py) only emits {version,
    protocolVersion} — bundleDigest will be empty string until a
    future engine release exposes it. Forward-compatible: when the
    engine adds it, the wrapper picks it up automatically with no
    further changes.

DONE_WITH_CONCERNS for the bundleDigest follow-up: filed as an
engine-side ask for a future PR. The wrapper does what it can with
the data the engine surface exposes today; the contract is wired
so the field will populate the moment the engine emits it.

Closes #7.

* chore(wrapper-ts): rebuild dist after hardening release changes

Mirrors PR #29 / #31 pattern: dist/ is tracked so consumers installing
from the git tarball get the compiled artifacts without a build step.

Regenerated from npm run build after issues #1, #2, #3, #4, #5, #6, #7,
#9, #10 landed.

* chore(release): bump amplifier-agent-ts to 0.6.0 + CHANGELOG

Wrapper hardening release closing 8 consumer-reported gaps at 0.5.0:
  #1  configPath surface
  #2  stderr NDJSON parsing
  #3  runChildProcess injection
  #4  display.onEvent dispatch
  #5  public re-exports
  #6  Transport dead code (root cause of #2/#4)
  #7  getEngineInfo() implementation
  #9  checkProtocolVersion() wired into init path
  #10 approval API mapped to engine -y/-n + approval.mode

Issue #8 in the consumer report was a misread — InitializeParams.
mcpConfigPath is intentionally retained in protocol-0.3.0. No
type change needed; the schema is canonical and correct.

This is a minor bump per 0.x convention even though some changes
are BREAKING — the wrapper hasn't shipped a 1.0 yet, so breaking
changes ride minor bumps. See CHANGELOG for the BREAKING list.

Engine compatibility: requires amplifier-agent >= 0.4.0.
Pinned protocol: 0.3.0.

---------

Co-authored-by: Manoj Prabhakar Paidiparthy <mpaidiparthy@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant