coder · ThomasK33 · Mar 21, 2026 · Mar 20, 2026 · Mar 20, 2026 · Mar 20, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -22,7 +22,7 @@ permissions:
 jobs:
   quality-gates:
     runs-on: ubuntu-latest
-    timeout-minutes: 15
+    timeout-minutes: 20
     steps:
       - name: Check out repository
         uses: actions/checkout@v6
@@ -33,6 +33,9 @@ jobs:
       - name: Install CI dependencies
         run: mise run bootstrap-ci
 
+      - name: Install Playwright Chromium
+        run: npx playwright install chromium
+
       - name: Check formatting
         run: mise run format-check
 

diff --git a/README.md b/README.md
@@ -12,5 +12,5 @@ Node/TypeScript CLI scaffold.
 
 - GitHub Actions uses `mise` as the canonical entrypoint for tool setup and quality gates.
 - The committed workflow in `.github/workflows/ci.yml` is hand-curated. `mise generate github-action` is useful as a scaffold, but the checked-in file is the maintained source of truth because it includes repo-specific triggers, bootstrap behavior, and step-level logs.
-- CI uses `mise run bootstrap-ci` so pull requests get deterministic installs via `npm ci` without the extra Chromium download used by the local `bootstrap` task.
+- CI uses `mise run bootstrap-ci` for deterministic `npm ci` installs, then explicitly runs `npx playwright install chromium` so renderer smoke coverage is exercised on GitHub Actions.
 - For v1, CI intentionally follows the major-version tool pins declared in `mise.toml` (`node = "24"`, `python = "3"`). This repo does not commit a `mise.lock` yet.
diff --git a/WEEK2-GAPS.md b/WEEK2-GAPS.md
@@ -0,0 +1,29 @@
+# Week 2 remaining gaps
+
+The Week 2 renderer-backed inspection slice is complete, but the following work is still intentionally out of scope or not yet delivered:
+
+## Export and packaging
+
+- **Asciicast export** is not implemented yet.
+- **WebM video export** is not implemented yet.
+- **MCP wrapper** is not implemented yet.
+
+## Renderer backends and platform coverage
+
+- **Native renderer adapters** are not implemented yet; the current slice is centered on the reference `ghostty-web` path.
+- **Cross-platform rendering parity** is not guaranteed yet.
+
+## Input and topology
+
+- **Mouse input support** is not implemented yet.
+- **Remote/network sessions** are not implemented yet.
+
+## Fidelity and determinism
+
+- **Screenshot pixel-perfect determinism** is not guaranteed; font rendering can still vary by environment.
+- **Scrollback in snapshots** is not implemented; snapshots currently report the visible viewport only.
+- **Cursor blink animation in screenshots** is not captured; screenshots represent a static frame.
+
+## Security and isolation
+
+- **Renderer CSP trade-off:** the policy for the `ghostty-web` harness currently allows `unsafe-inline` and `unsafe-eval` because the loopback-only renderer still needs inline bootstrap code and eval-style WASM support in current browsers.
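Note on the CSP entry above (illustrative only, not part of this change): a minimal sketch of how such a policy string might be assembled. `buildRendererCsp`, `CspOptions`, and the `http://127.0.0.1:4173` origin are hypothetical names chosen for the example; the repository's actual harness may build its policy differently.

```ts
// Hypothetical helper: none of these names are taken from the repository.
export interface CspOptions {
  /** Loopback origin serving the harness (assumed value in the usage below). */
  origin: string;
  /** Permit inline bootstrap code via 'unsafe-inline'. */
  allowInline: boolean;
  /** Permit eval-style compilation via 'unsafe-eval', which also covers WASM compilation. */
  allowEval: boolean;
}

export function buildRendererCsp(opts: CspOptions): string {
  const scriptSrc = ["'self'", opts.origin];
  if (opts.allowInline) scriptSrc.push("'unsafe-inline'");
  if (opts.allowEval) scriptSrc.push("'unsafe-eval'");

  const directives: Array<[string, string[]]> = [
    ["default-src", ["'self'"]],
    ["script-src", scriptSrc],
    ["connect-src", ["'self'", opts.origin]],
  ];

  // Each directive is "name value value ...", joined with "; ".
  return directives
    .map(([directive, values]) => `${directive} ${values.join(" ")}`)
    .join("; ");
}

// Usage: the relaxed policy described in the gap entry.
const csp = buildRendererCsp({
  origin: "http://127.0.0.1:4173",
  allowInline: true,
  allowEval: true,
});
console.log(csp);
```

Keeping both relaxations behind explicit flags would make the policy easy to tighten later, for example by trying the narrower `'wasm-unsafe-eval'` source keyword where the target browser supports it.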
diff --git a/design/20260319_agent-terminal-v1.md b/design/20260319_agent-terminal-v1.md
@@ -19,6 +19,19 @@ It is designed to let an agent:
 
 This design intentionally describes a **general product**, not a Mux-specific implementation. A future Mux integration should consume `agent-terminal` as an external CLI/runtime rather than baking Mux-specific assumptions into the design.
 
+## Current shipped status (2026-03-21)
+
+The repository now ships the first renderer-backed vertical slice of this design:
+
+- long-lived session hosts,
+- PTY control and append-only event logs,
+- renderer-backed `snapshot` and `wait`,
+- deterministic `screenshot`,
+- artifact manifests,
+- and proof bundles under `dogfood/`.
+
+Replay export artifacts such as asciicast and video remain part of the design direction, but they are still future work rather than shipped functionality.
+
 ## Executive summary
 
 The recommended v1 shape is:
@@ -165,10 +178,10 @@ V1 is successful when an AI agent can:
 4. wait until the screen reaches a target state,
 5. fetch a semantic snapshot of the screen,
 6. capture a PNG screenshot,
-7. export an asciicast,
-8. export a replay video,
-9. destroy the session,
-10. and leave behind an artifact bundle that a human reviewer can inspect.
+7. destroy the session,
+8. and leave behind an artifact bundle that a human reviewer can inspect.
+
+Asciicast and replay-video export remain intended follow-on capabilities rather than current success criteria for the shipped slice.
 
 ## Deliverables in this design set
 
@@ -180,6 +193,7 @@ This design file is the entry point. Detailed supporting docs live in `design/20
 - [04-implementation-plan.md](./20260319_agent-terminal-v1/04-implementation-plan.md)
 - [05-dogfooding-and-validation.md](./20260319_agent-terminal-v1/05-dogfooding-and-validation.md)
 - [06-roadmap-and-week-1-plan.md](./20260319_agent-terminal-v1/06-roadmap-and-week-1-plan.md)
+- [07-week-2-plan.md](./20260319_agent-terminal-v1/07-week-2-plan.md)
 
 ## High-level architecture
 

diff --git a/design/20260319_agent-terminal-v1/03-rendering-and-artifacts.md b/design/20260319_agent-terminal-v1/03-rendering-and-artifacts.md
@@ -35,8 +35,25 @@ V1 should support four artifact classes.
 | ----------------- | ---------------------------------------------------- | -------------- |
 | Semantic snapshot | Structured screen state for reasoning and assertions | Yes            |
 | Screenshot PNG    | Visual verification of layout, color, and wrapping   | Yes            |
-| Asciicast         | Portable terminal replay artifact                    | Yes            |
-| Replay video      | Reviewer-friendly visual playback                    | Yes            |
+| Asciicast         | Portable terminal replay artifact                    | Not yet shipped |
+| Replay video      | Reviewer-friendly visual playback                    | Not yet shipped |
+
+## Current implementation status (2026-03-21)
+
+The current Week 2 implementation ships the first two artifact classes from this design:
+
+- semantic snapshots,
+- and screenshot PNGs.
+
+It does **not** yet ship asciicast export or replay video export; those remain deferred and are tracked in `WEEK2-GAPS.md`.
+
+The current renderer path is:
+
+- host-prepared replay input,
+- lazy `ghostty-web` boot in headless Chromium,
+- viewport-scoped semantic extraction,
+- deterministic screenshot capture,
+- and manifest-backed artifact storage under `artifacts/`.
 
 ## 4. Canonical replay model
 
@@ -50,13 +67,7 @@ Everything visual should be reproducible from:
 ### 4.1 Replay input
 
 ```ts
-export interface ReplayInput {
-  sessionId: string;
-  events: ReplayEvent[];
-  rows: number;
-  cols: number;
-  renderProfile: ResolvedRenderProfile;
-}
+const replayInput = ReplayInputSchema.parse(rawReplayInput);
 ```
 
 ### 4.2 Replay rules
@@ -112,6 +123,20 @@ export interface RenderProfile {
 }
 ```
 
+### 5.2.1 Current Week 2 profile shape
+
+The shipped Week 2 profile shape is intentionally smaller than the fully elaborated interface above. Today it pins:
+
+- profile name,
+- light/dark theme mode,
+- font family,
+- font size,
+- cursor style,
+- foreground color,
+- and background color.
+
+That smaller shape was enough to make screenshot output stable for the reference renderer while leaving room to add richer font/padding/palette metadata later.
+
 ### 5.3 Determinism rules
 
 To keep screenshots reproducible, v1 should:
@@ -282,6 +307,21 @@ For agent reasoning speed, `snapshot --format text` should return only:
 
 That avoids forcing every reasoning step to parse full cell objects.
 
+### 9.4 Current Week 2 snapshot scope
+
+The shipped Week 2 snapshot shape is intentionally viewport-scoped.
+
+It currently records:
+
+- session ID,
+- capture sequence,
+- rows/cols,
+- cursor row/col,
+- alt-screen state,
+- and visible lines.
+
+It does not yet include per-cell styling or scrollback export. Those remain good future extensions, but the lighter snapshot is already sufficient for agent reasoning and renderer-backed waits.
+
 ## 10. Asciicast export
 
 ### 10.1 Why asciicast is mandatory
@@ -371,6 +411,20 @@ export interface ArtifactEntry {
 - artifacts missing from disk are flagged during `inspect` and `doctor`,
 - manifests never point at temp files.
 
+### 12.3 Current Week 2 manifest and layout
+
+The shipped Week 2 implementation currently writes artifacts under:
+
+```text
+artifacts/
+  manifest.json
+  snapshot-<seq>-structured.json
+  snapshot-<seq>-text.json
+  screenshot-<seq>-<profile>.png
+```
+
+That is simpler than the broader naming scheme this design describes, but it already preserves the two most important debugging dimensions: capture sequence and render profile.
+
 ## 13. Future native renderer adapter contract
 
 The reference renderer should not lock out native backends.
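Note on the replay-input hunk (section 4.1) above: the change swaps the `ReplayInput` interface for a `ReplayInputSchema.parse(...)` call, but the schema itself is not shown in this diff. As a rough sketch of the kind of runtime check such a schema implies, assuming the field names from the removed interface still apply, a hand-rolled validator might look like the following. `parseReplayInput` and `ReplayInputSketch` are hypothetical and are not the repository's schema.

```ts
// Field names mirror the interface removed in the hunk; the validator is hypothetical.
export interface ReplayInputSketch {
  sessionId: string;
  events: unknown[]; // ReplayEvent[] in the design doc; element shape is not checked here
  rows: number;
  cols: number;
  renderProfile: Record<string, unknown>; // ResolvedRenderProfile in the design doc
}

function isPositiveInt(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value > 0;
}

export function parseReplayInput(raw: unknown): ReplayInputSketch {
  if (typeof raw !== "object" || raw === null) {
    throw new TypeError("replay input must be an object");
  }
  const { sessionId, events, rows, cols, renderProfile } = raw as Record<string, unknown>;

  if (typeof sessionId !== "string" || sessionId.length === 0) {
    throw new TypeError("sessionId must be a non-empty string");
  }
  if (!Array.isArray(events)) {
    throw new TypeError("events must be an array");
  }
  if (!isPositiveInt(rows) || !isPositiveInt(cols)) {
    throw new TypeError("rows and cols must be positive integers");
  }
  if (typeof renderProfile !== "object" || renderProfile === null || Array.isArray(renderProfile)) {
    throw new TypeError("renderProfile must be an object");
  }

  return {
    sessionId,
    events,
    rows,
    cols,
    renderProfile: renderProfile as Record<string, unknown>,
  };
}
```

Validating at the boundary means the renderer only ever sees a well-formed grid size and event list, which is presumably the motivation for moving from a bare interface to a parsed schema.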

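Note on sections 9.4 and 12.3 in the diff above: the snapshot field list and the artifact file patterns can be sketched together as types plus a naming helper. Only the listed fields and the four file patterns come from the diff; `ViewportSnapshot`, `artifactFileNames`, `snapshotToText`, the property names, and the `reference-dark` profile name are hypothetical, and `<seq>` is assumed to be a plain decimal integer with no padding.

```ts
// Hypothetical shape: one property per field listed in section 9.4.
export interface ViewportSnapshot {
  sessionId: string;
  captureSeq: number;
  rows: number;
  cols: number;
  cursor: { row: number; col: number };
  altScreen: boolean;
  visibleLines: string[]; // visible viewport only; no scrollback, no per-cell styling
}

// Builds file names following the patterns listed in section 12.3.
export function artifactFileNames(seq: number, profile: string) {
  return {
    manifest: "manifest.json",
    structuredSnapshot: `snapshot-${seq}-structured.json`,
    textSnapshot: `snapshot-${seq}-text.json`,
    screenshot: `screenshot-${seq}-${profile}.png`,
  };
}

// A text view of the snapshot: just the visible lines, one per row.
export function snapshotToText(snapshot: ViewportSnapshot): string {
  return snapshot.visibleLines.join("\n");
}

// Usage with made-up values.
const exampleSnapshot: ViewportSnapshot = {
  sessionId: "demo-session",
  captureSeq: 7,
  rows: 2,
  cols: 10,
  cursor: { row: 1, col: 2 },
  altScreen: false,
  visibleLines: ["$ ls", "a b"],
};
const exampleNames = artifactFileNames(exampleSnapshot.captureSeq, "reference-dark");
console.log(exampleNames.screenshot, snapshotToText(exampleSnapshot).length);
```

Keying every file on the capture sequence, and screenshots additionally on the profile, is what lets a reviewer line up a snapshot with the screenshot taken at the same point.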
diff --git a/design/20260319_agent-terminal-v1/05-dogfooding-and-validation.md b/design/20260319_agent-terminal-v1/05-dogfooding-and-validation.md
@@ -6,6 +6,26 @@ It is intentionally prescriptive.
 
 A follow-up AI coding agent should treat this file as the minimum review protocol, not optional guidance.
 
+## Current shipped state (2026-03-21)
+
+This document still describes the *target* dogfooding protocol, but the current shipped product only supports a subset of the artifact expectations below.
+
+Shipped today:
+
+- JSON command outputs,
+- semantic snapshots,
+- PNG screenshots,
+- artifact manifests,
+- and notes / proof bundles under `dogfood/`.
+
+Not yet shipped:
+
+- `.cast` export,
+- replay video export,
+- and some of the richer fixture scenarios listed below.
+
+Read the remainder of this file as the broader validation target, not a claim that every artifact class is already implemented.
+
 ## 1. Dogfooding goals
 
 Dogfooding must prove that an agent can:

diff --git a/design/20260319_agent-terminal-v1/06-roadmap-and-week-1-plan.md b/design/20260319_agent-terminal-v1/06-roadmap-and-week-1-plan.md
@@ -9,6 +9,23 @@ It is intentionally biased toward:
 - proof-heavy validation,
 - and getting to a usable dogfood loop early.
 
+## Status update (2026-03-21)
+
+Week 1 is complete and has been superseded by a shipped Week 2 renderer-backed slice.
+
+What shipped from the Week 1 plan:
+
+- real session creation, inspection, listing, and teardown,
+- a background host process per session,
+- PTY spawn and output capture,
+- input, paste, key, resize, and signal control,
+- append-only event logging,
+- `wait --exit` and `wait --idle-ms`,
+- deterministic fixture coverage,
+- and terminal-only proof bundles.
+
+Week 2 then added renderer-backed snapshots, waits, screenshots, artifact manifests, and browser smoke checks. The Week 1 plan below is preserved as the original execution record, but its outcome and sign-off checklists should now be read as **completed history** rather than future work, apart from any item that is still left unchecked.
+
 ## 1. Current baseline in this repository
 
 As of this draft, the repository already contains a narrow Phase 0 scaffold:
@@ -213,15 +230,15 @@ A coding agent working from this section should treat every unchecked item below
 
 ### Week 1 outcome checklist
 
-- [ ] Real session creation and teardown exist.
-- [ ] A background host process exists and is used for sessions.
-- [ ] PTY spawn and output capture work.
-- [ ] `create`, `list`, `inspect`, and `destroy` are implemented.
-- [ ] `type`, `paste`, `send-keys`, `resize`, and `signal` are implemented.
-- [ ] Append-only event logging exists.
-- [ ] `wait --exit` and `wait --idle-ms` are implemented.
-- [ ] One or two deterministic fixture apps exist.
-- [ ] A terminal-only proof bundle shows that the control plane works.
+- [x] Real session creation and teardown exist.
+- [x] A background host process exists and is used for sessions.
+- [x] PTY spawn and output capture work.
+- [x] `create`, `list`, `inspect`, and `destroy` are implemented.
+- [x] `type`, `paste`, `send-keys`, `resize`, and `signal` are implemented.
+- [x] Append-only event logging exists.
+- [x] `wait --exit` and `wait --idle-ms` are implemented.
+- [x] One or two deterministic fixture apps exist.
+- [x] A terminal-only proof bundle shows that the control plane works.
 
 Renderer work is a stretch goal for week 1, not the baseline commitment.
 
@@ -301,10 +318,10 @@ Renderer work is a stretch goal for week 1, not the baseline commitment.
 
 ### Week 1 sign-off checklist
 
-- [ ] All required implementation and checkpoint checkboxes above are complete.
-- [ ] Relevant tests for the implemented week 1 scope pass.
+- [x] All required implementation and checkpoint checkboxes above are complete.
+- [x] Relevant tests for the implemented week 1 scope pass.
 - [ ] The dogfood bundle contains screenshots and a screen recording.
-- [ ] Remaining gaps are documented explicitly rather than implied.
+- [x] Remaining gaps are documented explicitly rather than implied.
 
 ### Week 1 stretch goals