From 93a29d1ad0974848080eccead9e2409c0f8f2078 Mon Sep 17 00:00:00 2001 From: Thomas Kosiewski Date: Fri, 24 Apr 2026 14:59:52 +0200 Subject: [PATCH] docs: restructure repository documentation Change-Id: I294659a8e23dc62538a5a67cd35ed1c108d82b44 Signed-off-by: Thomas Kosiewski --- README.md | 365 ++++++++++-------------------- ROADMAP.md | 2 +- design/ARCHITECTURE.md | 11 +- docs/AGENT-SKILLS.md | 83 +++++++ docs/CONTRIBUTING.md | 1 + docs/INSTALL.md | 124 ++++++++++ docs/README.md | 30 ++- docs/TROUBLESHOOTING.md | 83 +++++++ docs/USAGE.md | 141 ++++++++++++ dogfood/README.md | 2 +- src/cli/commands/doctor.ts | 6 +- test/unit/commands/doctor.test.ts | 6 +- 12 files changed, 589 insertions(+), 265 deletions(-) create mode 100644 docs/AGENT-SKILLS.md create mode 100644 docs/INSTALL.md create mode 100644 docs/TROUBLESHOOTING.md create mode 100644 docs/USAGE.md diff --git a/README.md b/README.md index 22326bf..9bf8112 100644 --- a/README.md +++ b/README.md @@ -1,330 +1,205 @@ # agent-tty -`agent-tty` is a Node/TypeScript CLI for launching, controlling, inspecting, and exporting reviewable terminal sessions. -It is built for agent workflows that need both semantic state and visual artifacts from live or exited TUIs. +`agent-tty` is a CLI-first terminal automation tool for AI agents and humans. +It creates long-lived PTY-backed sessions, lets automation drive and inspect those sessions across CLI invocations, and produces reviewable artifacts such as semantic snapshots, PNG screenshots, asciicast recordings, and WebM exports. -## Installation +The goal is to make terminal and TUI automation inspectable rather than only scriptable. +It is inspired by the `agent-browser` style of agent workflow: give an agent a stateful environment, explicit wait/inspect primitives, and evidence a human can review after the fact. -`agent-tty` currently supports Node `24.x`. -The recommended hosted install path is the npm package `agent-tty`. GitHub Release tarballs remain the registry-independent fallback, and direct git dependency installs remain best-effort because they build from source. +## Why It Exists -### npm installation (recommended) +Most terminal automation falls back to brittle patterns: blind sleeps, long simulated keystroke streams, ad hoc screenshots, or detached `tmux` sessions that are hard for another process to inspect. +`agent-tty` provides a tighter loop: -#### Global install from npm +1. create an isolated terminal session, +2. run setup or send input, +3. wait for observable terminal state, +4. inspect the screen semantically, +5. capture renderer-backed visual evidence, +6. export replay artifacts, +7. clean up the session. -```bash -PACKAGE_VERSION= -npm install -g "agent-tty@${PACKAGE_VERSION}" -agent-tty version --json -agent-tty --home "$(mktemp -d)" doctor --json -``` - -To follow the prerelease channel instead of pinning an exact version, substitute `@beta` (or another dist-tag such as `@rc`) for `@${PACKAGE_VERSION}`. +That loop is useful for AI coding agents, shell scripts, CI smoke tests, TUI dogfooding, and humans debugging terminal workflows locally. -#### Project-local install from npm - -```bash -PACKAGE_VERSION= -npm install "agent-tty@${PACKAGE_VERSION}" -./node_modules/.bin/agent-tty version --json -``` +## What It Provides -### GitHub Release tarball installation +- Long-lived terminal sessions backed by `node-pty`. +- A stable, machine-readable CLI surface with JSON envelopes. +- `run`, `type`, `paste`, `send-keys`, `resize`, and `signal` controls. +- `wait` and `snapshot` primitives for semantic inspection. +- Renderer-backed screenshots and WebM exports using `ghostty-web` under Playwright/Chromium. +- Append-only event logs so snapshots, screenshots, and recordings can be replayed from session history. +- Isolated homes via `--home`, useful for agents, tests, CI, and proof bundles. -#### Direct release asset install +## How It Works -```bash -VERSION= -RELEASE_TAG="v${VERSION}" -RELEASE_TGZ="agent-tty-${VERSION}.tgz" -TARBALL_URL="https://github.com/coder/agent-tty/releases/download/${RELEASE_TAG}/${RELEASE_TGZ}" +The architecture is: -npm install -g "$TARBALL_URL" -agent-tty version --json +```text +agent-tty CLI -> per-session host -> PTY + event log -> ghostty-web renderer -> artifacts ``` -#### Authenticated or private release install +The PTY and append-only event log are the execution truth. +`ghostty-web` is the reference renderer for semantic snapshots, screenshots, and replay video; it is not a native-terminal parity guarantee. +The design keeps rendering behind an adapter so future native renderers can be added without changing the public CLI contract. -```bash -VERSION= -RELEASE_TAG="v${VERSION}" -RELEASE_TGZ="agent-tty-${VERSION}.tgz" - -gh release download "$RELEASE_TAG" --repo coder/agent-tty --pattern "$RELEASE_TGZ" -npm install -g "./$RELEASE_TGZ" -agent-tty version --json -agent-tty --home "$(mktemp -d)" doctor --json -``` +## Quick Start -#### Project-local install from a downloaded tarball +`agent-tty` requires Node `>=24 <26`. +Renderer-backed screenshots and WebM export also require a discoverable Playwright Chromium install. ```bash -VERSION= -RELEASE_TGZ="./agent-tty-${VERSION}.tgz" +npm install -g agent-tty -npm install "$RELEASE_TGZ" -./node_modules/.bin/agent-tty version --json -``` - -### Local tarball build from a source checkout - -When you need a deterministic local artifact before publishing a GitHub Release, build the tarball from a checkout: - -```bash -TARBALL_DIR=$(mktemp -d) -npm ci -npm run pack:private -- --pack-destination "$TARBALL_DIR" +AGENT_HOME="$(mktemp -d)" +agent-tty --home "$AGENT_HOME" doctor --json -INSTALL_PREFIX=$(mktemp -d) -npm install -g --prefix "$INSTALL_PREFIX" "$TARBALL_DIR"/*.tgz -"$INSTALL_PREFIX"/bin/agent-tty version --json -"$INSTALL_PREFIX"/bin/agent-tty --home "$(mktemp -d)" doctor --json +SESSION_ID=$(agent-tty --home "$AGENT_HOME" create --json --name demo -- /bin/bash | jq -r '.result.sessionId') +agent-tty --home "$AGENT_HOME" run "$SESSION_ID" 'printf "hello from agent-tty\n"' --json +agent-tty --home "$AGENT_HOME" wait "$SESSION_ID" --text 'hello from agent-tty' --json +agent-tty --home "$AGENT_HOME" snapshot "$SESSION_ID" --format text --json +agent-tty --home "$AGENT_HOME" screenshot "$SESSION_ID" --json +agent-tty --home "$AGENT_HOME" destroy "$SESSION_ID" --json ``` -`npm run pack:private` always rebuilds `dist/` before packing. Release automation instead uses `npm run pack:release` after the CI-quality build step so GitHub Releases and the npm publish job both reuse the same verified tarball plus a checksum file. - -### Git source installation (best-effort) +If Chromium is missing on a fresh machine, run: ```bash -npm install -g github:coder/agent-tty -agent-tty version --json +npx playwright install chromium ``` -GitHub installs attempt to build from source via npm's `prepare` hook. -Use this only when you explicitly want the latest default-branch snapshot and your npm/git-dependency environment can build native dependencies such as `node-pty` cleanly. -The repository's install smoke treats tarball install as the required path and records the current git-install caveat separately. +For prerelease channels, tarball installs, authenticated GitHub Release installs, and source-checkout tarballs, see [`docs/INSTALL.md`](./docs/INSTALL.md). -If your shell setup injects `mise activate` (or similar trust-checked tooling) into npm lifecycle subprocesses, trust the checkout path first or prefer the release tarball route. -If `doctor --json` reports a missing Playwright browser cache on a fresh machine, run `npx playwright install chromium` once before renderer-backed workflows. +## Common Usage -## Quick start +### Run setup inside a shell ```bash AGENT_HOME="$(mktemp -d)" -agent-tty --home "$AGENT_HOME" doctor --json -SESSION_ID=$(agent-tty --home "$AGENT_HOME" create --json --name demo -- /bin/bash | jq -r '.result.sessionId') -agent-tty --home "$AGENT_HOME" run "$SESSION_ID" 'echo hello from agent-tty' --json +SESSION_ID=$(agent-tty --home "$AGENT_HOME" create --json -- /bin/bash | jq -r '.result.sessionId') + +agent-tty --home "$AGENT_HOME" run "$SESSION_ID" 'pwd && npm test' --json agent-tty --home "$AGENT_HOME" snapshot "$SESSION_ID" --format text --json agent-tty --home "$AGENT_HOME" destroy "$SESSION_ID" --json ``` -## Documentation map - -- [`RELEASE.md`](./RELEASE.md) — the supported release contract for the `0.1.x` line. -- [`ROADMAP.md`](./ROADMAP.md) — intentionally deferred work and post-release direction. -- [`design/README.md`](./design/README.md) — architecture references plus archived week-by-week planning. -- [`dogfood/CATALOG.md`](./dogfood/CATALOG.md) — curated proof bundles and recommended review paths. -- [`docs/README.md`](./docs/README.md) — contributor and maintainer navigation. - -## Feature highlights - -- Full session lifecycle management: create, inspect, list, wait, destroy, and garbage-collect. -- Semantic snapshots for structured or text inspection, including optional scrollback capture. -- Renderer-backed screenshots and replay exports for reviewable visual evidence. -- Recording export to asciicast (`.cast`) or WebM for artifact bundles. -- Failure recovery via reconciliation, stale-session cleanup, and retained manifests/artifacts. - -## Release contract - -The `0.1.x` release line is centered on reliable, isolated, reviewable TUI automation. -For the explicit support contract, see [`RELEASE.md`](./RELEASE.md). For intentionally deferred work, see [`ROADMAP.md`](./ROADMAP.md). -Reviewer-facing proof bundles are curated in [`dogfood/CATALOG.md`](./dogfood/CATALOG.md), with current release-signoff evidence in `dogfood/20260326-week9-release-readiness/` and evergreen workflow coverage such as `dogfood/run-command/`. - -## TUI Workflow - -For setup-heavy TUI automation, prefer an isolated home plus the higher-level `run` primitive: +### Drive an interactive CLI or TUI ```bash AGENT_HOME="$(mktemp -d)" -agent-tty --home "$AGENT_HOME" doctor --json SESSION_ID=$(agent-tty --home "$AGENT_HOME" create --json -- /bin/bash | jq -r '.result.sessionId') -agent-tty --home "$AGENT_HOME" run "$SESSION_ID" 'npm install' -agent-tty --home "$AGENT_HOME" wait "$SESSION_ID" --text 'ready' -agent-tty --home "$AGENT_HOME" screenshot "$SESSION_ID" -agent-tty --home "$AGENT_HOME" record export "$SESSION_ID" --format webm -``` - -Recommended sequence: - -1. Create an isolated session home with `create`. -2. Use `run` for shell setup and multiline bootstrap work. -3. Use `wait` for render-visible readiness conditions. -4. Capture screenshots for point-in-time review. -5. Export WebM recordings when reviewers need motion proof. -6. Destroy the session when done. -## AI agent skills - -`agent-tty` ships two related skill trees in the npm package as well as the GitHub Release tarball: - -- `skills/agent-tty/` is the thin public bootstrap used by TanStack Intent and other skill loaders that discover files directly. -- `skill-data/` contains the canonical runtime skills served by the CLI. -- `agent-tty skills list` discovers the bundled runtime skills, including `agent-tty` and `dogfood-tui`. - -Install `agent-tty` from npm first (or from a GitHub Release tarball when you need a registry-independent fallback), then either copy the bootstrap skill into your agent config or let the CLI print the canonical runtime skill on demand. - -For coding agents that can ingest instructions on demand, `agent-tty skills get ` prints the packaged runtime `SKILL.md` directly to stdout after installation. - -```bash -agent-tty skills get agent-tty +agent-tty --home "$AGENT_HOME" run "$SESSION_ID" '' --no-wait --json +agent-tty --home "$AGENT_HOME" wait "$SESSION_ID" --screen-stable-ms 1000 --json +agent-tty --home "$AGENT_HOME" send-keys "$SESSION_ID" Down Down Enter --json +agent-tty --home "$AGENT_HOME" screenshot "$SESSION_ID" --json +agent-tty --home "$AGENT_HOME" destroy "$SESSION_ID" --json ``` -Use `agent-tty skills list` to discover every bundled runtime skill, and `agent-tty skills get dogfood-tui` when you want the built-in TUI dogfooding skill. - -### TanStack Intent integration - -After installing `agent-tty` in the project, let Intent wire the bootstrap from `skills/agent-tty/` into `AGENTS.md`, `CLAUDE.md`, or another supported agent config file. +### Export reviewer-facing proof ```bash -PACKAGE_VERSION= -npm install "agent-tty@${PACKAGE_VERSION}" -npx @tanstack/intent@latest list -npx @tanstack/intent@latest install -``` - -That workflow keeps the skill version aligned with the installed `agent-tty` package, while the bootstrap stays small and points agents back to the CLI-served runtime skill. - -### Mux skill installation - -After installing the npm package globally, copy the bootstrap skill from `skills/agent-tty/`: +AGENT_HOME="$(mktemp -d)" +SESSION_ID=$(agent-tty --home "$AGENT_HOME" create --json -- /bin/bash | jq -r '.result.sessionId') -```bash -mkdir -p ~/.mux/skills/agent-tty -cp -R "$(npm root -g)/agent-tty/skills/agent-tty/." ~/.mux/skills/agent-tty/ +agent-tty --home "$AGENT_HOME" run "$SESSION_ID" 'printf "artifact proof\n"' --json +agent-tty --home "$AGENT_HOME" wait "$SESSION_ID" --text 'artifact proof' --json +agent-tty --home "$AGENT_HOME" screenshot "$SESSION_ID" --json +agent-tty --home "$AGENT_HOME" record export "$SESSION_ID" --format asciicast --json +agent-tty --home "$AGENT_HOME" record export "$SESSION_ID" --format webm --json +agent-tty --home "$AGENT_HOME" destroy "$SESSION_ID" --json ``` -Mux can then discover the bootstrap normally, and the bootstrap instructs the agent to load the canonical runtime skill with `agent-tty skills get agent-tty`. +For command details, examples, and workflow notes, see [`docs/USAGE.md`](./docs/USAGE.md). -### Direct skill copy for other skill loaders +## Command Surface -After installing the npm package globally, copy the same bootstrap for loaders that read skill files directly: +Global flags: -```bash -mkdir -p ~/.claude/skills/agent-tty -cp -R "$(npm root -g)/agent-tty/skills/agent-tty/." ~/.claude/skills/agent-tty/ -``` +- `--home `: override the agent-tty home directory. +- `--timeout-ms `: apply a shared CLI timeout budget in milliseconds. +- `--no-color`: disable ANSI color in human-readable output. +- `--log-level `: set log level (`debug`, `info`, `warn`, `error`). +- `--profile `: select a default render profile. +- `--json`: available on user-facing commands for structured output. -If your assistant supports repository-backed skills, point it at `coder/agent-tty` and select the `skills/agent-tty/` bootstrap directory. +Command groups: -### Suggested `AGENTS.md` / `CLAUDE.md` snippet +- Environment and skills: `version`, `doctor`, `skills list|get|path`. +- Lifecycle: `create`, `list`, `inspect`, `destroy`, `gc`. +- Input and control: `run`, `type`, `paste`, `send-keys`, `resize`, `signal`, `mark`. +- Observation and artifacts: `wait`, `snapshot`, `screenshot`, `record export`. -```markdown -## Terminal Automation +## Notes And Limitations -Use `agent-tty` for terminal and TUI automation instead of `tmux`, ad hoc PTY wrappers, or external screenshot tools. +- Linux and macOS are tier-1 targets; Windows is tier-2 and not CI-tested. +- Screenshots and WebM export depend on Playwright/Chromium and the bundled `ghostty-web` renderer. +- `ghostty-web` provides reference visual truth, not exact parity with every native terminal emulator. +- `run` is best for shell-oriented setup and bootstrap work. It does not capture command output as a structured result or report the child command's exit status. +- Use `--home ` for isolated automation. Passing the same home to every command keeps manifests, sockets, event logs, and artifacts together. +- Run `agent-tty --home doctor --json` before screenshot or recording workflows in a new machine, CI job, or container. -Preferred workflow: +Troubleshooting notes live in [`docs/TROUBLESHOOTING.md`](./docs/TROUBLESHOOTING.md). -1. Create an isolated home and session with `agent-tty --home "$AGENT_HOME" create --json -- /bin/bash`. -2. Use `agent-tty run` for setup and bootstrap commands. -3. Use `agent-tty wait` for observable readiness instead of blind sleeps. -4. Use `agent-tty snapshot` to inspect the current terminal state. -5. Use `agent-tty screenshot` or `agent-tty record export` for reviewer-facing artifacts. -6. Destroy the session when the task is done. -``` +## Agent Skills -Maintainers can validate the shipped bootstrap skill locally with: +`agent-tty` ships a thin public bootstrap skill under `skills/agent-tty/` and canonical runtime skills under `skill-data/`. +After installing the CLI, coding agents can load the current runtime instructions directly: ```bash -npm run intent:validate +agent-tty skills get agent-tty +agent-tty skills list +agent-tty skills get dogfood-tui ``` -## Isolation +For TanStack Intent, Mux, and direct skill-copy instructions, see [`docs/AGENT-SKILLS.md`](./docs/AGENT-SKILLS.md). -- `--home ` stores manifests, sockets, event logs, and artifacts under an isolated agent-tty home. Pass the same `--home` value to each command in a workflow. -- `doctor --json` reports whether `agent-tty` is using the default location or an isolated home, including a `home_isolation` check for whether `--home` produced an isolated environment. -- It also exposes `browser_cache_accessible`, which verifies the Playwright browser cache is discoverable for renderer operations before screenshot/export flows. -- Renderer boot now carries Playwright browser-cache resolution into isolated-home workflows automatically when Chromium is installed in the normal cache or exposed through `PLAYWRIGHT_BROWSERS_PATH`. -- In a new machine, CI job, or container, run `agent-tty --home doctor --json` before starting screenshot or recording workflows. +## Vision And Roadmap -## Platform Support +The current `0.1.x` line is centered on reliable, isolated, reviewable TUI automation through the CLI. +Future work includes native renderer adapters, broader platform parity, mouse input, richer semantic TUI automation, remote/networked control, and external control layers such as an MCP wrapper. -- **Linux** — Tier-1. CI-tested on `ubuntu-latest`. Primary development and testing platform. -- **macOS** — Tier-1. CI-tested on `macos-latest`. Supported for local development and agent workflows. -- **Windows** — Tier-2. Not CI-tested. PTY uses ConPTY when available; rendering and PTY behavior differences are possible. Community contributions welcome. +- [`RELEASE.md`](./RELEASE.md) defines the supported release contract. +- [`ROADMAP.md`](./ROADMAP.md) tracks intentionally deferred work and post-release direction. +- [`design/ARCHITECTURE.md`](./design/ARCHITECTURE.md) explains the architecture and product intent in more detail. -## CLI-wide flags +## Local Development -- `--home `: override the agent-tty home directory. -- `--timeout-ms `: apply a shared CLI timeout budget in milliseconds. -- `--no-color`: disable ANSI color in human-readable output. -- `--json`: available on user-facing commands to emit structured command envelopes. - -## Commands - -- `version`: print the CLI version. -- `doctor`: validate local environment requirements. -- `create [command...]`: create a session and launch the requested command or shell. -- `list`: list sessions, optionally including exited ones. -- `inspect `: inspect manifest state and artifact metadata for a session. -- `destroy `: tear down a session, with optional forced shutdown. -- `gc`: remove stale or old sessions. -- `type [text]`: type text into a session. -- `paste [text]`: paste text into a session. -- `run [command]`: run a command inside a session with optional completion detection. -- `mark