10 changes: 10 additions & 0 deletions README.md
@@ -70,6 +70,16 @@ npx playwright install chromium

For prerelease channels, tarball installs, authenticated GitHub Release installs, and source-checkout tarballs, see [`docs/INSTALL.md`](./docs/INSTALL.md).

## Agent Demo

This dogfood bundle records Codex and Claude interactive TUIs using `agent-tty` to drive `nvim --clean`, write a file, export proof artifacts, and exit cleanly. The linked WebMs are trimmed review cuts; the bundle also keeps untrimmed outer recordings.

| Codex | Claude |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| [![Codex agent-tty demo](./dogfood/agent-uses-agent-tty/artifacts/codex-thumbnail.png)](./dogfood/agent-uses-agent-tty/artifacts/codex-outer.webm) | [![Claude agent-tty demo](./dogfood/agent-uses-agent-tty/artifacts/claude-thumbnail.png)](./dogfood/agent-uses-agent-tty/artifacts/claude-outer.webm) |

See [`dogfood/agent-uses-agent-tty/`](./dogfood/agent-uses-agent-tty/) for the reproducer, inner Neovim recordings, transcripts, and final file proofs.

## Common Usage

### Run setup inside a shell
1 change: 1 addition & 0 deletions dogfood/CATALOG.md
@@ -15,6 +15,7 @@ Paths below are relative to the repository root.
| Scrollback | Scrollback-aware snapshots, screenshots, and recording export | `dogfood/20260322-dogfood-scrollback/` |
| Unicode | Unicode rendering plus snapshot/export review | `dogfood/20260322-dogfood-unicode/` |
| LazyVim | A real TUI scenario that exercises editor startup and reviewer-visible artifacts | `dogfood/20260322-lazyvim-scenario/` |
| Agent uses TTY | Codex and Claude TUIs using `agent-tty` to drive Neovim and export proof artifacts | `dogfood/agent-uses-agent-tty/` |
| Public skill | The shipped `skills/agent-terminal/` workflow and documentation surface | `dogfood/20260327-public-skill/` |
| Install flows | Pre-public tarball install proof plus the current local git-install caveat evidence | `dogfood/install-flows/` |
| Config parity | Configuration/profile behavior checks that remain useful as a standing scenario | `dogfood/week5-config-parity/` |
2 changes: 1 addition & 1 deletion dogfood/README.md
@@ -8,7 +8,7 @@ Some bundles are evergreen workflow scenarios, some are release/contract validat
1. Read [`CATALOG.md`](./CATALOG.md) for the curated bundle map.
2. For the current release-signoff view, start with `dogfood/20260326-week9-release-readiness/`.
3. For the Phase 5 eval DX token-usage proof from commit `91a571de`, start with `dogfood/token-usage-phase5-proof/`.
4. For evergreen workflows, start with bundles such as `dogfood/run-command/`, `dogfood/20260322-dogfood-hello-prompt/`, and `dogfood/20260322-lazyvim-scenario/`.
4. For evergreen workflows, start with bundles such as `dogfood/agent-uses-agent-tty/`, `dogfood/run-command/`, `dogfood/20260322-dogfood-hello-prompt/`, and `dogfood/20260322-lazyvim-scenario/`.
5. For recovery and hardening behavior, use the recovery section in the catalog.

## How to treat the directory
1 change: 1 addition & 0 deletions dogfood/agent-uses-agent-tty/.gitignore
@@ -0,0 +1 @@
.tmp/
80 changes: 80 additions & 0 deletions dogfood/agent-uses-agent-tty/README.md
@@ -0,0 +1,80 @@
# Agent Uses agent-tty Dogfood Bundle

This evergreen bundle records coding agents using the public `agent-tty` CLI to drive a clean Neovim session. It supports Codex and Claude modes and writes reviewer-facing artifacts under `dogfood/agent-uses-agent-tty/artifacts/`.

## Demo Recordings

| Agent | Outer agent recording | Inner Neovim recording | File proof |
| ------ | ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------ |
| Codex | [![Codex recording thumbnail](./artifacts/codex-thumbnail.png)](./artifacts/codex-outer.webm) | [`codex-inner-nvim.webm`](./artifacts/codex-inner-nvim.webm), [`codex-inner-nvim.cast`](./artifacts/codex-inner-nvim.cast) | [`codex-final-file-proof.txt`](./artifacts/codex-final-file-proof.txt) |
| Claude | [![Claude recording thumbnail](./artifacts/claude-thumbnail.png)](./artifacts/claude-outer.webm) | [`claude-inner-nvim.webm`](./artifacts/claude-inner-nvim.webm), [`claude-inner-nvim.cast`](./artifacts/claude-inner-nvim.cast) | [`claude-final-file-proof.txt`](./artifacts/claude-final-file-proof.txt) |

The outer recording shows the Codex or Claude interactive TUI running inside an `agent-tty` session. The inner recording is the nested `agent-tty` session that the agent created to control `nvim --clean -n demo-note.txt`.
Each thumbnail links to a review cut: the final command/export window of an accelerated replay, slowed down for viewing, so startup waits do not dominate the video. The untrimmed outer WebM, which preserves the recorded timing, is kept as `artifacts/*-outer-full.webm`.

## Reproduce

From the repository root:

```bash
bash dogfood/agent-uses-agent-tty/reproduce.sh --agent codex
bash dogfood/agent-uses-agent-tty/reproduce.sh --agent claude
bash dogfood/agent-uses-agent-tty/reproduce.sh --agent both
```

`--agent both` is the default.
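
As a rough illustration of the flag behaviour described above, the following sketch resolves `--agent` into a list of agents, falling back to `both` when the flag is absent. The function name `resolve_agents` and the error messages are hypothetical; the actual parsing in `reproduce.sh` is not shown here and may differ.

```bash
# Hypothetical sketch: resolve_agents is an illustrative name, not taken from reproduce.sh.
resolve_agents() {
  local agent="both"   # documented default when --agent is omitted
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --agent)
        [ "$#" -ge 2 ] || { echo "--agent needs a value" >&2; return 2; }
        agent="$2"
        shift 2
        ;;
      *)
        echo "unknown argument: $1" >&2
        return 2
        ;;
    esac
  done
  # Expand the selection into a space-separated list of agents to run.
  case "$agent" in
    codex|claude) echo "$agent" ;;
    both) echo "codex claude" ;;
    *) echo "unsupported agent: $agent" >&2; return 2 ;;
  esac
}
```

Rejecting unknown values up front keeps a typo from silently producing an empty run.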

The script builds the local package, packs it, installs the tarball into a temporary prefix, prepends that prefix to `PATH`, and records the demo with public `agent-tty ...` commands. It also writes a checked helper script into the disposable workspace so the nested agent can run one deterministic command while the helper prints and executes the public `agent-tty ...` flow. It does not use repo-local `npx tsx src/cli/main.ts ...` inside the recorded agent runs.
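
The build, pack, install, and `PATH` steps above could look roughly like the sketch below. It is not the script's actual code: the function names are hypothetical, it assumes the package defines a `build` npm script, and it assumes an npm version that supports `--pack-destination`.

```bash
# Sketch only: both function names are hypothetical.
prepend_prefix_to_path() {
  # On Unix-like systems, npm links executables for a --prefix global install under <prefix>/bin.
  PATH="$1/bin:$PATH"
  export PATH
}

install_packed_cli() {
  local prefix="$1" pack_dir tarball
  pack_dir="$(mktemp -d)" || return 1
  npm run build || return 1                                   # assumes a "build" script exists
  npm pack --pack-destination "$pack_dir" >/dev/null || return 1
  # The pack directory started empty, so the only .tgz in it is the fresh tarball.
  tarball="$(find "$pack_dir" -name '*.tgz' | head -n 1)"
  [ -n "$tarball" ] || return 1
  npm install --global --prefix "$prefix" "$tarball" || return 1
  prepend_prefix_to_path "$prefix"
}

# Usage: install_packed_cli "$(mktemp -d)" && command -v agent-tty
```

Installing from the packed tarball rather than the source tree means the recording exercises the same files a user would receive.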

## Prerequisites

- Project dependencies are installed.
- `node`, `npm`, `jq`, `ffmpeg`, `ffprobe`, `nvim`, and `shasum` are available.
- Playwright Chromium is available for screenshot and WebM export.
- Codex mode requires `codex` on `PATH` and `codex login status` to succeed.
- Claude mode requires `claude` on `PATH` and `claude auth status` to succeed.
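
A quick way to check the command-line prerequisites before a run is sketched below. The helper name `require_tools` is hypothetical and not part of `reproduce.sh`; it only checks that each command resolves, not its version.

```bash
# Hypothetical helper: report every missing prerequisite instead of stopping at the first.
require_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing prerequisite: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: require_tools node npm jq ffmpeg ffprobe nvim shasum
```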

The script records only sanitized auth status in `environment.txt`; it does not write Claude account details or Codex credential details into the bundle.
Codex mode uses `codex --dangerously-bypass-approvals-and-sandbox` because the run is already isolated to temporary workspaces and the inner `agent-tty doctor`/WebM checks need normal local browser access.
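
One way to record a sanitized auth status is to keep only the exit status of the check and discard everything it prints. The sketch below assumes that approach; `auth_state` and the `codex_auth`/`claude_auth` keys are hypothetical names, and the real format of `environment.txt` may differ.

```bash
# Hypothetical helper: record only whether an auth check succeeded.
auth_state() {
  # Discard stdout and stderr so account or credential details never reach the bundle.
  if "$@" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "failed"
  fi
}

# Usage (key names are illustrative):
#   echo "codex_auth=$(auth_state codex login status)" >> environment.txt
#   echo "claude_auth=$(auth_state claude auth status)" >> environment.txt
```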

## Isolation And Cleanup

Each agent run uses:

- a temporary `agent-tty` install prefix,
- a temporary outer `agent-tty` home for the agent recording,
- a temporary inner `agent-tty` home for the Neovim session,
- a temporary git workspace,
- isolated Neovim XDG config, data, state, and cache directories.
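
The isolation list above can be approximated with one temporary root per run, as in this sketch. The function names, the `RUN_ROOT` variable, and the directory layout are hypothetical; how the outer and inner homes are handed to `agent-tty` is not shown in this bundle description, so the sketch only creates them. Cleanup honours the `KEEP_AGENT_USES_AGENT_TTY_TEMP` switch that this section documents.

```bash
# Hypothetical sketch of per-run isolation; names and layout are illustrative.
make_isolated_run() {
  RUN_ROOT="$(mktemp -d)" || return 1
  mkdir -p "$RUN_ROOT/prefix" "$RUN_ROOT/outer-home" "$RUN_ROOT/inner-home" \
    "$RUN_ROOT/workspace" "$RUN_ROOT/nvim/config" "$RUN_ROOT/nvim/data" \
    "$RUN_ROOT/nvim/state" "$RUN_ROOT/nvim/cache" || return 1
  # Neovim resolves its config, data, state, and cache paths from these XDG variables.
  export XDG_CONFIG_HOME="$RUN_ROOT/nvim/config"
  export XDG_DATA_HOME="$RUN_ROOT/nvim/data"
  export XDG_STATE_HOME="$RUN_ROOT/nvim/state"
  export XDG_CACHE_HOME="$RUN_ROOT/nvim/cache"
}

cleanup_isolated_run() {
  [ -n "${RUN_ROOT:-}" ] || return 0
  if [ "${KEEP_AGENT_USES_AGENT_TTY_TEMP:-0}" = "1" ]; then
    echo "keeping $RUN_ROOT for debugging" >&2
    return 0
  fi
  rm -rf "$RUN_ROOT"
}

# Usage:
#   make_isolated_run && trap cleanup_isolated_run EXIT
#   git init --quiet "$RUN_ROOT/workspace"
```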

Temporary directories are removed on exit. Set `KEEP_AGENT_USES_AGENT_TTY_TEMP=1` when debugging a failed run.
To tune the linked review cuts, set `AGENT_USES_AGENT_TTY_REVIEW_TAIL_SECONDS`, `AGENT_USES_AGENT_TTY_REVIEW_SLOWDOWN`, `AGENT_USES_AGENT_TTY_REVIEW_CPU_USED`, and `AGENT_USES_AGENT_TTY_REVIEW_CRF`. By default the script exports an accelerated replay of the outer session, keeps the final 6 seconds, and slows that segment by 4x.
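
For intuition, a tail-and-slow-down cut can be expressed with `ffmpeg` roughly as follows. This is a sketch under assumptions, not the script's pipeline: the function name, the `DRY_RUN` switch, the VP9 encoder choice, and the mapping from the environment variables to `ffmpeg` options are all hypothetical. Only the 6-second tail and 4x slowdown defaults come from the text above; no defaults are documented for the CRF or CPU-used settings, so they are passed only when set.

```bash
# Hypothetical sketch: keep the last N seconds of a WebM and slow them down.
make_review_cut() {
  local input="$1" output="$2"
  local tail_seconds="${AGENT_USES_AGENT_TTY_REVIEW_TAIL_SECONDS:-6}"
  local slowdown="${AGENT_USES_AGENT_TTY_REVIEW_SLOWDOWN:-4}"
  # -sseof seeks relative to the end of the input; setpts=N*PTS stretches timestamps N times.
  set -- ffmpeg -y -sseof "-$tail_seconds" -i "$input" \
    -vf "setpts=${slowdown}*PTS" -an -c:v libvpx-vp9
  if [ -n "${AGENT_USES_AGENT_TTY_REVIEW_CRF:-}" ]; then
    set -- "$@" -crf "$AGENT_USES_AGENT_TTY_REVIEW_CRF" -b:v 0
  fi
  if [ -n "${AGENT_USES_AGENT_TTY_REVIEW_CPU_USED:-}" ]; then
    set -- "$@" -cpu-used "$AGENT_USES_AGENT_TTY_REVIEW_CPU_USED"
  fi
  set -- "$@" "$output"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"   # print the command instead of running it
  else
    "$@"
  fi
}

# Usage: DRY_RUN=1 make_review_cut artifacts/codex-outer-full.webm /tmp/codex-review.webm
```

Seeking from the end requires an input whose duration `ffmpeg` can determine, so a stream without duration metadata may need remuxing first.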

## Bundle Contents

- `reproduce.sh` — self-contained generator.
- `prompts/template.md` — prompt template used for the nested agent runs.
- `environment.txt` — generated environment and auth-check summary.
- `*-outer-*.json` — generated CLI envelopes for the outer recording session.
- `artifacts/*-outer.webm` — slowed accelerated-replay review cut of the coding agent command/export window.
- `artifacts/*-outer-full.webm` and `artifacts/*-outer.cast` — untrimmed recorded-timing recordings of the coding agent process.
- `artifacts/*-outer-snapshot.txt` — text snapshot captured from the outer coding agent session.
- `artifacts/*-thumbnail.png` — README thumbnails copied from `agent-tty screenshot`.
- `artifacts/*-inner-nvim.webm` and `artifacts/*-inner-nvim.cast` — artifacts exported by the nested coding agent.
- `artifacts/*-demo-note.txt` and `artifacts/*-final-file-proof.txt` — final file proof.
- `artifacts/*-prompt.md` — rendered prompt given to the coding agent.
- `artifacts/*-agent-transcript.txt` — captured agent output from the outer session snapshot.
- `artifacts/*-recording-summary.txt` — script-generated summary of the recording artifacts.

## Adding Another Agent

1. Extend `selected_agents`, `write_runner`, and argument validation in `reproduce.sh`.
2. Add a README row for the new `artifacts/<agent>-*` outputs.
3. Run `bash dogfood/agent-uses-agent-tty/reproduce.sh --agent <agent>` and confirm the generated file proof, outer recording, inner recording, thumbnail, and transcript are non-empty.
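
The non-empty check in step 3 can be automated with a small helper like the one below. The function name is hypothetical; the file suffixes follow the `artifacts/<agent>-*` names listed under Bundle Contents, and the inner recording is represented here by its WebM only.

```bash
# Hypothetical check: every expected artifact for one agent exists and is non-empty.
check_agent_artifacts() {
  local dir="$1" agent="$2" name status=0
  for name in final-file-proof.txt outer.webm inner-nvim.webm thumbnail.png agent-transcript.txt; do
    # -s is true only for files that exist and have a size greater than zero.
    if [ ! -s "$dir/$agent-$name" ]; then
      echo "missing or empty: $dir/$agent-$name" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage: check_agent_artifacts dogfood/agent-uses-agent-tty/artifacts codex
```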

## References

- [Codex CLI](https://developers.openai.com/codex/cli)
- [Claude Code getting started](https://code.claude.com/docs/en/getting-started)
- [Claude Code CLI reference](https://code.claude.com/docs/en/cli-usage)
- [GitHub attachment file types](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/attaching-files)
1 change: 1 addition & 0 deletions dogfood/agent-uses-agent-tty/artifacts/.gitkeep
@@ -0,0 +1 @@
