Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 66 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,8 @@ npm ci
npx playwright install chromium
npm run build

SESSION_ID=$(node dist/cli/main.js create --json --name demo | jq -r '.data.sessionId')
node dist/cli/main.js type "$SESSION_ID" 'echo hello from agent-terminal'
node dist/cli/main.js send-keys "$SESSION_ID" Enter
SESSION_ID=$(node dist/cli/main.js create --json --name demo | jq -r '.result.sessionId')
node dist/cli/main.js run "$SESSION_ID" 'echo hello from agent-terminal'
node dist/cli/main.js inspect "$SESSION_ID" --json
node dist/cli/main.js destroy "$SESSION_ID"
```
Expand All @@ -26,6 +25,43 @@ node dist/cli/main.js destroy "$SESSION_ID"
- Recording export to asciicast (`.cast`) or WebM for artifact bundles.
- Failure recovery via reconciliation, stale-session cleanup, and retained manifests/artifacts.

## 0.1.0 release focus

`agent-terminal` `0.1.0` is the first release aimed at reliable, isolated, reviewable TUI automation.
Week 9 closes the release-readiness bar around the new `run` command, isolated-environment renderer reliability, and isolation-aware `doctor` diagnostics.
For the explicit release contract, see [`RELEASE.md`](./RELEASE.md).
Reviewer-facing proof bundles live under `dogfood/`, including `dogfood/20260326-week9-release-readiness/`, `dogfood/run-command/`, and `dogfood/20260325-week8-contract-locks/`.

## TUI Workflow

For setup-heavy TUI automation, prefer an isolated home plus the higher-level `run` primitive:

```bash
AGENT_HOME="$(mktemp -d)"
agent-terminal --home "$AGENT_HOME" doctor --json
SESSION_ID=$(agent-terminal --home "$AGENT_HOME" create --json -- /bin/bash | jq -r '.result.sessionId')
agent-terminal --home "$AGENT_HOME" run "$SESSION_ID" 'npm install'
agent-terminal --home "$AGENT_HOME" wait "$SESSION_ID" --text 'ready'
agent-terminal --home "$AGENT_HOME" screenshot "$SESSION_ID"
agent-terminal --home "$AGENT_HOME" record export "$SESSION_ID" --format webm
```

Recommended sequence:

1. Create an isolated session home with `create`.
2. Use `run` for shell setup and multiline bootstrap work.
3. Use `wait` for render-visible readiness conditions.
4. Capture screenshots for point-in-time review.
5. Export WebM recordings when reviewers need motion proof.

## Isolation

- `--home <path>` stores manifests, sockets, event logs, and artifacts under an isolated agent-terminal home. Pass the same `--home` value to each command in a workflow.
- `doctor --json` reports whether `agent-terminal` is using the default location or an isolated home, including a `home_isolation` check for whether `--home` produced an isolated environment.
- It also exposes `browser_cache_accessible`, which verifies the Playwright browser cache is discoverable for renderer operations before screenshot/export flows.
- Renderer boot now carries Playwright browser-cache resolution into isolated-home workflows automatically when Chromium is installed in the normal cache or exposed through `PLAYWRIGHT_BROWSERS_PATH`.
- In a new machine, CI job, or container, run `agent-terminal --home <path> doctor --json` before starting screenshot or recording workflows.

## Platform Support

- **Linux** — Tier-1. CI-tested on `ubuntu-latest`. Primary development and testing platform.
Expand All @@ -50,6 +86,7 @@ node dist/cli/main.js destroy "$SESSION_ID"
- `gc`: remove stale or old sessions.
- `type <session-id> [text]`: type text into a session.
- `paste <session-id> [text]`: paste text into a session.
- `run <session-id> [command]`: run a command inside a session with optional completion detection.
- `mark <session-id> <label>`: add a marker event to a session timeline.
- `send-keys <session-id> <keys...>`: send key sequences such as `Enter` or `Ctrl+C`.
- `resize <session-id>`: resize the PTY dimensions.
Expand All @@ -59,6 +96,30 @@ node dist/cli/main.js destroy "$SESSION_ID"
- `record export <session-id>`: export replay artifacts as asciicast or WebM.
- `wait <session-id>`: wait for exit, idleness, text, regex, cursor, or stable-screen conditions.

## Run Command

Basic usage:

```bash
agent-terminal run <session-id> [command]
agent-terminal run <session-id> --file ./setup.sh
agent-terminal run <session-id> 'npm install && npm test' --timeout 60000 --json
agent-terminal run <session-id> 'npm run dev' --no-wait
```

Important flags:

- `--timeout <ms>` — completion timeout in milliseconds. Default: `30000`.
- `--no-wait` — fire-and-forget mode. The command is injected and the CLI returns without waiting for completion.
- `--file <path>` — read command text from a file instead of the positional argument.
- `--json` — emit a machine-readable command envelope.

Use `run` when you want shell-oriented setup inside the existing session, especially for multiline bootstrap scripts or other commands that should preserve shell state.
Use `type` when the target application needs literal interactive typing, `paste` when the target should receive a literal pasted payload, and `send-keys` for discrete control keys such as `Enter`, `Escape`, or `Ctrl+C`.

Under the hood, `run` injects the command through paste-mode and, unless `--no-wait` is set, appends a generated boundary marker that the renderer waits to see in visible output.
That makes shell setup more reliable than simulating long keystroke sequences, but `run` does not capture command output or report an exit status.

## Development setup

```bash
Expand All @@ -82,7 +143,8 @@ That runs formatting, linting, typechecking, unit/e2e tests, and the production

## Design docs

Design and implementation notes live under `design/`, especially `design/20260319_agent-terminal-v1/`. See `design/20260319_agent-terminal-v1/` for architecture, weekly plans, and status docs through Week 5.
Design and implementation notes live under `design/`, especially `design/20260319_agent-terminal-v1/`.
See `design/20260319_agent-terminal-v1/` for architecture, weekly plans, and status docs through Week 9, and see [`RELEASE.md`](./RELEASE.md) for the `0.1.0` contract.

## Repository notes

Expand Down
37 changes: 37 additions & 0 deletions RELEASE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
# agent-terminal 0.1.0 release contract

`agent-terminal` `0.1.0` is the first release that explicitly targets isolated, reviewable terminal automation for real TUI workflows.
The contract below is the bar for what maintainers should feel comfortable supporting at release time.
If a workflow depends on behavior outside this document, it should be treated as future-scope or best-effort rather than a guaranteed `0.1.0` capability.

## What 0.1.0 delivers

- Reliable isolated session lifecycle management: `create`, `inspect`, `destroy`, and `gc` all work against isolated agent-terminal homes.
- Renderer-backed screenshots, semantic snapshots, and WebM export for reviewer-visible proof artifacts.
- The `run` command for robust in-session command execution without having to simulate long shell setup scripts as manual keystrokes.
- `doctor --json` with isolation-aware diagnostics for home resolution, renderer prerequisites, and screenshot viability.
- An append-only event log that remains the canonical replay/export source of truth.
- Schema-locked JSON envelopes across the public CLI surface.

## What 0.1.0 explicitly does not deliver

- Native renderer backends such as Ghostty native or kitty.
- Mouse input support.
- Remote or networked sessions.
- An MCP wrapper.
- Full semantic TUI automation.
- Cross-terminal pixel parity.
- Output capture or exit-code detection from `run`.

## Known limitations

- The renderer is the `ghostty-web` reference backend, not a native-terminal parity guarantee.
- `run` completion detection relies on shell-visible echo of an injected boundary marker.
- Screenshots and WebM export require Playwright/Chromium to be installed and discoverable.
- The reviewed LazyVim workflow currently assumes Neovim `>= 0.11.2` plus its usual prerequisites; older Neovim builds are out of contract for that scenario.

## Validation

- Current release bar: 595 tests across 56 test files.
- Reviewer-facing proof bundles live under `dogfood/`, including `dogfood/20260326-week9-release-readiness/`, `dogfood/run-command/`, and `dogfood/20260325-week8-contract-locks/`.
- Run `npm run verify` for the full validation bar.
7 changes: 5 additions & 2 deletions design/20260319_agent-terminal-v1.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,9 @@ It is designed to let an agent:

This design intentionally describes a **general product**, not a Mux-specific implementation. A future Mux integration should consume `agent-terminal` as an external CLI/runtime rather than baking Mux-specific assumptions into the design.

## Current shipped status (2026-03-25)
## Current shipped status (2026-03-26)

Update (2026-03-26): Week 9 is now complete as the pre-`0.1.0` release-readiness milestone. The shipped surface now includes the new `run` command for robust in-session command execution, renderer/browser-path handling that respects isolated-home workflows, and isolation-aware `doctor --json` diagnostics on top of the earlier lifecycle, snapshot, screenshot, and export work. With the explicit release contract captured in [`../RELEASE.md`](../RELEASE.md), the repository is ready for `0.1.0` once maintainers are satisfied with the documented proof bundles and release checklist; larger asks such as native renderers, mouse input, remote/network sessions, MCP wrapping, and broader semantic TUI automation remain intentionally deferred.

The repository now ships the first three milestones of this design plus Weeks 4–7 of CLI/artifact/lifecycle hardening, config/rendering/platform closeout, contract/introspection reconciliation, and Week 7 contract/doc ratification:

Expand All @@ -38,7 +40,7 @@ The repository now ships the first three milestones of this design plus Weeks 4
- macOS CI validation,
- and proof bundles under `dogfood/`.

Week 7 closed the design-synchronization pass for the shipped v1 surface. The next planned milestone is Week 8, which focuses on runtime capability discovery, richer renderer/session introspection, the remaining lower-priority public-envelope locks, and proof-bundle normalization/validation before broader native-renderer/platform expansion. The intentionally deferred platform/runtime work is tracked separately in [`../WEEK2-GAPS.md`](../WEEK2-GAPS.md). See [14-week-6-status.md](./20260319_agent-terminal-v1/14-week-6-status.md), [15-week-7-plan.md](./20260319_agent-terminal-v1/15-week-7-plan.md), [16-week-8-plan.md](./20260319_agent-terminal-v1/16-week-8-plan.md), and [`../WEEK2-GAPS.md`](../WEEK2-GAPS.md) for the current state.
Week 7 closed the design-synchronization pass for the shipped v1 surface. Week 8 then completed runtime capability discovery, richer renderer/session introspection, the remaining lower-priority public-envelope locks, and proof-bundle normalization/validation. Week 9 then closed the pre-`0.1.0` release-readiness milestone: isolated-environment renderer reliability is now part of the shipped contract, the higher-level in-session `run` primitive is documented and shipped, TUI-focused diagnostics/docs are in place, and the remaining large asks are intentionally deferred future-scope work rather than release blockers. The intentionally deferred platform/runtime work is tracked separately in [`../WEEK2-GAPS.md`](../WEEK2-GAPS.md). See [14-week-6-status.md](./20260319_agent-terminal-v1/14-week-6-status.md), [15-week-7-plan.md](./20260319_agent-terminal-v1/15-week-7-plan.md), [16-week-8-plan.md](./20260319_agent-terminal-v1/16-week-8-plan.md), [17-week-9-plan.md](./20260319_agent-terminal-v1/17-week-9-plan.md), [`../RELEASE.md`](../RELEASE.md), and [`../WEEK2-GAPS.md`](../WEEK2-GAPS.md) for the current state.

## Executive summary

Expand Down Expand Up @@ -211,6 +213,7 @@ This design file is the entry point. Detailed supporting docs live in `design/20
- [14-week-6-status.md](./20260319_agent-terminal-v1/14-week-6-status.md)
- [15-week-7-plan.md](./20260319_agent-terminal-v1/15-week-7-plan.md)
- [16-week-8-plan.md](./20260319_agent-terminal-v1/16-week-8-plan.md)
- [17-week-9-plan.md](./20260319_agent-terminal-v1/17-week-9-plan.md)

## High-level architecture

Expand Down
Loading
Loading