sdk/summary: expose turn outcome (stop_reason) on TurnRecord + summary breakdown (#437) by willwashburn · Pull Request #445 · AgentWorkforce/burn

willwashburn · 2026-05-25T13:11:18Z

Closes #437.

Part of the burn 3.0 coordinated bump (see #430–#440 batch — do not merge alone).

Summary

New StopReason enum in reader/types.rs: EndTurn | MaxTokens | PauseTurn | StopSequence | ToolUse | Refusal | Silent. Kebab-case serde wire form. StopReason::from_wire accepts Anthropic snake_case (end_turn, max_tokens, …) and opencode's tool-calls (maps to ToolUse).
TurnRecord.stop_reason changed from Option<String> to Option<StopReason> with a lenient deserializer — unknown strings land in StopReason::Silent so pre-3.0 ledger replays don't error.
Readers parse string → enum at the TurnRecord boundary in claude.rs, opencode.rs, opencode_stream.rs. Codex stays None (no equivalent field); documented.
Schema v1→v2: nullable stop_reason TEXT column on turns + idx_turns_stop_reason partial index. Idempotent ALTER TABLE migration on Ledger::open.
Summary aggregates: StopReasonCounts struct added to both Summary and SummaryGroupedReport. CLI emits:
```
Turn outcomes: 142 end_turn, 3 max_tokens, 1 refusal, 0 pause
```
JSON output: stopReasons block in burn summary --json.
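
The reader-boundary parsing summarized above could look roughly like the sketch below. It is an illustration, not the code in reader/types.rs: the variant names and the `tool-calls` alias come from this description, while the underscore-to-hyphen normalization and the `from_wire_lenient` helper are assumptions made for the example.

```rust
/// Turn outcome, using the variant names listed in the summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    MaxTokens,
    PauseTurn,
    StopSequence,
    ToolUse,
    Refusal,
    Silent,
}

impl StopReason {
    /// Parse a provider wire string; `None` means "not recognized".
    pub fn from_wire(raw: &str) -> Option<Self> {
        // Assumption: snake_case input is folded into the kebab-case wire form
        // by swapping underscores for hyphens before matching.
        let normalized = raw.trim().replace('_', "-");
        match normalized.as_str() {
            "end-turn" => Some(Self::EndTurn),
            "max-tokens" => Some(Self::MaxTokens),
            "pause-turn" => Some(Self::PauseTurn),
            "stop-sequence" => Some(Self::StopSequence),
            // opencode reports tool invocations as `tool-calls`.
            "tool-use" | "tool-calls" => Some(Self::ToolUse),
            "refusal" => Some(Self::Refusal),
            "silent" => Some(Self::Silent),
            _ => None,
        }
    }

    /// Hypothetical helper mirroring the lenient deserializer: unknown
    /// strings land in `Silent` instead of producing an error.
    pub fn from_wire_lenient(raw: &str) -> Self {
        Self::from_wire(raw).unwrap_or(Self::Silent)
    }
}
```

Under these assumptions, `StopReason::from_wire("end_turn")` and `StopReason::from_wire("end-turn")` resolve to the same variant, and a string the enum does not know about only becomes `Silent` on the lenient path.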

Why `Silent` vs `none` are distinct buckets

Silent — row exists, carries an unrecognized stop_reason string (forward compat for future Anthropic values).
none — no stop_reason field on the row at all (Codex, sidechain agents, malformed rows).

Keeping them separate lets future regressions ("we stopped parsing X") stand out instead of vanishing into a combined bucket.
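
To make the two buckets concrete, here is a small sketch of a counter that keeps `silent` and `none` apart. The method names `bump`, `from_turns`, and `is_empty` are the ones mentioned for `StopReasonCounts` in this PR; the field names, and the choice to feed `from_turns` an iterator of `Option<StopReason>` rather than full turn records, are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    MaxTokens,
    PauseTurn,
    StopSequence,
    ToolUse,
    Refusal,
    Silent,
}

/// Per-outcome turn counts. Field names are hypothetical.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct StopReasonCounts {
    pub end_turn: u64,
    pub max_tokens: u64,
    pub pause_turn: u64,
    pub stop_sequence: u64,
    pub tool_use: u64,
    pub refusal: u64,
    /// Row carried a stop_reason string that was not recognized.
    pub silent: u64,
    /// Row carried no stop_reason at all.
    pub none: u64,
}

impl StopReasonCounts {
    /// Count one turn's outcome in the matching bucket.
    pub fn bump(&mut self, reason: Option<StopReason>) {
        match reason {
            Some(StopReason::EndTurn) => self.end_turn += 1,
            Some(StopReason::MaxTokens) => self.max_tokens += 1,
            Some(StopReason::PauseTurn) => self.pause_turn += 1,
            Some(StopReason::StopSequence) => self.stop_sequence += 1,
            Some(StopReason::ToolUse) => self.tool_use += 1,
            Some(StopReason::Refusal) => self.refusal += 1,
            Some(StopReason::Silent) => self.silent += 1,
            None => self.none += 1,
        }
    }

    /// Aggregate a sequence of per-turn outcomes.
    pub fn from_turns<I>(turns: I) -> Self
    where
        I: IntoIterator<Item = Option<StopReason>>,
    {
        let mut counts = Self::default();
        for reason in turns {
            counts.bump(reason);
        }
        counts
    }

    /// True when no turn has been counted in any bucket.
    pub fn is_empty(&self) -> bool {
        *self == Self::default()
    }
}
```

With this shape, a sudden rise in `silent` points at a new upstream value the parser does not know yet, while a rise in `none` points at rows that never carried the field.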

Merge note

This PR bumps the schema v1→v2 by adding turns.stop_reason. #436 (tool-output bytes) also bumps v1→v2, for tool_result_events.output_bytes + output_truncated. Whichever lands second needs to become v3 with a chained ALTER TABLE. Both migrations are idempotent, so the reconcile is mechanical.
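
One way to picture the chained bump is a table of per-version steps, sketched below. Everything here is hypothetical: the `MigrationStep` type, the `pending_sql` helper, the `INTEGER` column types for #436, and the ordering (this PR as v2, #436 as v3) are chosen only to illustrate the idea, since which PR lands second is not decided here.

```rust
/// One schema step: the SQL that upgrades a ledger from `from_version`
/// to `from_version + 1`. Hypothetical shape, for illustration only.
pub struct MigrationStep {
    pub from_version: u32,
    pub sql: &'static [&'static str],
}

/// Steps in ascending version order. The v2 -> v3 entry assumes #436 lands
/// second; the INTEGER column types are assumptions as well.
pub const STEPS: &[MigrationStep] = &[
    MigrationStep {
        from_version: 1,
        sql: &["ALTER TABLE turns ADD COLUMN stop_reason TEXT"],
    },
    MigrationStep {
        from_version: 2,
        sql: &[
            "ALTER TABLE tool_result_events ADD COLUMN output_bytes INTEGER",
            "ALTER TABLE tool_result_events ADD COLUMN output_truncated INTEGER",
        ],
    },
];

/// SQL statements still to run for a ledger currently at `current_version`.
pub fn pending_sql(current_version: u32) -> Vec<&'static str> {
    STEPS
        .iter()
        .filter(|step| step.from_version >= current_version)
        .flat_map(|step| step.sql.iter().copied())
        .collect()
}
```

A v1 ledger would run every statement in order, a v2 ledger only the second step, and a v3 ledger nothing.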

Test plan

Out of scope

stop_reason in hotspots / overhead (separate concerns).
#[non_exhaustive] — not added; this is part of the 3.0 major bump.

Generated by Claude Code

coderabbitai · 2026-05-25T13:11:26Z

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between e999e2a and 989271b.

📒 Files selected for processing (18)

CHANGELOG.md
crates/relayburn-cli/src/commands/summary.rs
crates/relayburn-sdk/src/analyze/quality.rs
crates/relayburn-sdk/src/ledger/db.rs
crates/relayburn-sdk/src/ledger/schema.rs
crates/relayburn-sdk/src/ledger/tests.rs
crates/relayburn-sdk/src/ledger/writer.rs
crates/relayburn-sdk/src/lib.rs
crates/relayburn-sdk/src/query_verbs.rs
crates/relayburn-sdk/src/reader.rs
crates/relayburn-sdk/src/reader/claude.rs
crates/relayburn-sdk/src/reader/opencode.rs
crates/relayburn-sdk/src/reader/opencode/tests.rs
crates/relayburn-sdk/src/reader/types.rs
tests/fixtures/cli-golden/snapshots/state-status-json.stdout.txt
tests/fixtures/cli-golden/snapshots/state-status.stdout.txt
tests/fixtures/cli-golden/snapshots/summary-json.stdout.txt
tests/fixtures/cli-golden/snapshots/summary.stdout.txt

📝 Walkthrough

This PR introduces structured stop-reason tracking across the ledger system: a new StopReason enum replaces string-based outcome fields, the ledger schema advances to v2 with in-place migration, all readers parse wire strings into the enum, query verbs expose aggregated counts, the CLI renders "Turn outcomes" summaries, and quality analysis aligns with the typed representation.

Changes

Stop reason: definition, storage, retrieval, and presentation

| Layer / File(s) | Summary |
| --- | --- |
| Stop reason enum definition and lenient deserialization<br>`crates/relayburn-sdk/src/reader/types.rs`, `crates/relayburn-sdk/src/lib.rs` | Introduces `pub enum StopReason` with kebab-case wire serialization, custom lenient deserializer (maps snake_case/legacy strings, defaults unknowns to `Silent`), and normalization rules (`stop` → `EndTurn`, tool variants → `ToolUse`). Adds `StopReason` to public re-exports. |
| Reader implementations: Claude and OpenCode parsing<br>`crates/relayburn-sdk/src/reader/claude.rs`, `crates/relayburn-sdk/src/reader/opencode.rs`, `crates/relayburn-sdk/src/reader/*.rs`, test files | Claude and OpenCode parsers now convert wire stop-reason strings to `StopReason` enum via `from_wire`, falling back to `Silent` for missing/invalid values. Readers no longer copy raw strings; all assertions updated to compare `StopReason` variants. |
| Ledger schema v2 bump and forward migration<br>`crates/relayburn-sdk/src/ledger/schema.rs`, `crates/relayburn-sdk/src/ledger/db.rs`, `crates/relayburn-sdk/src/ledger/tests.rs` | Schema version bumped to 2; `turns` table adds nullable `stop_reason TEXT` column. New `migrate_burn_schema` runs at `Ledger::open` before verification: reads current version, performs `ALTER TABLE` for v1→v2 (tolerates duplicate-column errors), updates `archive_state` to v2, creates filtered index. Integration test covers legacy v1 ledger migration idempotency. |
| Ledger writer: persist denormalized stop_reason<br>`crates/relayburn-sdk/src/ledger/writer.rs` | `append_turns` extends `INSERT` to include `stop_reason` column, computing wire-string form or `NULL` when absent. |
| Query layer: StopReasonCounts and summary integration<br>`crates/relayburn-sdk/src/query_verbs.rs` | New `StopReasonCounts` struct aggregates per-outcome turn counts with `bump`, `from_turns`, and `is_empty` methods. `Summary` and `SummaryGroupedReport` gain `stop_reasons: StopReasonCounts` field; `compute_summary` populates it from turn set. Acceptance tests validate grouped and slim summary aggregation including `none` bucket. |
| CLI: render stop-reason breakdowns in human and JSON output<br>`crates/relayburn-cli/src/commands/summary.rs` | `burn summary` renders "Turn outcomes: …" line for human (core buckets unconditional, others only when non-zero) and `stopReasons` object for JSON output. Two formatting helpers `format_stop_reasons_line` and `stop_reasons_to_json` added; grouped test fixture updated. |
| Quality analysis: session-ending classification with typed enum<br>`crates/relayburn-sdk/src/analyze/quality.rs` | `ending_role` now matches `last.stop_reason` against `StopReason::EndTurn` instead of string comparison. Test helper `TurnOverrides.stop_reason` changed to `Option<StopReason>`; all quality test scenarios updated with enum variants replacing `"end_turn"` / `"tool_use"` strings. |
| Documentation and golden snapshots<br>`CHANGELOG.md`, `tests/fixtures/cli-golden/snapshots/*` | CHANGELOG documents new `burn summary` "Turn outcomes" feature and breaking `relayburn-sdk` contract for `TurnRecord.stop_reason: Option<StopReason>` with lenient pre-3.0 replay. CLI golden snapshots updated to schema version 2 and new "Turn outcomes" output. |
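
The human-readable rendering rule in the CLI row (core buckets always shown, the rest only when non-zero) can be sketched as below. The function name `format_stop_reasons_line` and the core labels follow the PR text and its sample output; the struct field names and the labels used for the optional buckets are assumptions.

```rust
/// Per-outcome counts. Field names are hypothetical.
#[derive(Debug, Default)]
pub struct StopReasonCounts {
    pub end_turn: u64,
    pub max_tokens: u64,
    pub pause_turn: u64,
    pub stop_sequence: u64,
    pub tool_use: u64,
    pub refusal: u64,
    pub silent: u64,
    pub none: u64,
}

/// Build the "Turn outcomes: …" line for the human-readable summary.
pub fn format_stop_reasons_line(counts: &StopReasonCounts) -> String {
    // Core buckets are always printed, even when zero.
    let mut parts = vec![
        format!("{} end_turn", counts.end_turn),
        format!("{} max_tokens", counts.max_tokens),
        format!("{} refusal", counts.refusal),
        format!("{} pause", counts.pause_turn),
    ];
    // Remaining buckets appear only when non-zero (labels assumed).
    let optional = [
        ("stop_sequence", counts.stop_sequence),
        ("tool_use", counts.tool_use),
        ("silent", counts.silent),
        ("none", counts.none),
    ];
    for (label, n) in optional {
        if n > 0 {
            parts.push(format!("{} {}", n, label));
        }
    }
    format!("Turn outcomes: {}", parts.join(", "))
}
```

Printing the core buckets unconditionally keeps the line shape stable across runs, which is convenient for golden snapshots.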

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

AgentWorkforce/burn#315: Both extend the burn summary implementation and output; #315 scaffolds the presenter while this PR adds stop-reason aggregation and rendering.
AgentWorkforce/burn#386: This PR extends the summary_report/SummaryGroupedReport types moved to the SDK in #386, adding new stop-reason fields and corresponding CLI rendering.
AgentWorkforce/burn#309: This PR implements full burn summary functionality including stop-reason breakdowns; #309 only scaffolds the command stub.

Poem

🐰 A rabbit hops through turns so bright,
Now counting ends both left and right—
Refusals, max-tokens, silent too,
Each outcome tracked, the schema new. 🎯

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ffe553e36

chatgpt-codex-connector · 2026-05-25T14:29:30Z

+            "end-turn" => Some(Self::EndTurn),
+            "max-tokens" | "length" => Some(Self::MaxTokens),
+            "pause-turn" => Some(Self::PauseTurn),
+            "stop-sequence" | "stop" => Some(Self::StopSequence),


Treat stop finish reason as natural completion

The from_wire mapping currently collapses "stop" into StopSequence, but OpenAI-compatible providers commonly use "stop" for normal completion (not just custom stop-sequence hits). In sessions parsed through opencode, this will misclassify successful assistant endings as stop-sequence terminations, which then skews burn summary outcome buckets and any downstream logic that interprets non-EndTurn reasons as user-ended. This alias should map to EndTurn (or remain unknown) instead of StopSequence.

Fixed in 5dab4df (the hash after the rebase). "stop" now maps to StopReason::EndTurn: OpenAI / AI-SDK (and therefore opencode wrapping an OpenAI-shaped provider) emit a bare "stop" for ordinary end-of-turn completions, so collapsing it into StopSequence would have skewed the burn summary outcome buckets. Anthropic's actual stop-sequence outcome (stop_sequence / stop-sequence) still resolves to StopSequence.

The existing stop_reason_from_wire_normalizes_underscored_and_legacy_variants test now asserts both the new "stop" -> EndTurn mapping and that the stop_sequence / stop-sequence aliases still resolve to StopSequence. Branch HEAD is now 10d3cd7 (also includes a sibling fix to the migration pattern + a rebase onto current main).

Generated by Claude Code

cubic-dev-ai

1 issue found across 19 files

…`burn summary` (#437) BREAKING: `TurnRecord.stop_reason` switches from `Option<String>` to `Option<StopReason>`. Anthropic / opencode finish-reason strings are parsed into the enum at the reader boundary; unknown values fall back to `StopReason::Silent` so pre-3.0 ledgers replay cleanly. Schema bumps to v2 with a new denormalized `turns.stop_reason TEXT` column populated on insert; existing v1 ledgers are migrated in place via `ALTER TABLE` on `Ledger::open`. `burn summary` adds a `Turn outcomes: …` line and a `stopReasons` block in JSON output. Codex turns continue to carry `None` (the rollout schema doesn't include a stop reason field). https://claude.ai/code/session_01QEpNZbWEYNwxzqQjTN5LCY

OpenAI / AI-SDK convention emits a bare `"stop"` for ordinary end-of-turn completions, which is what opencode forwards when it wraps an OpenAI-shaped provider. The previous mapping collapsed `"stop"` into `StopSequence` alongside Anthropic's `stop_sequence`, which would misclassify successful assistant endings as stop-sequence terminations and skew the outcome buckets in `burn summary`. `stop_sequence` / `stop-sequence` still resolve to `StopSequence` (the actual Anthropic semantics). Addresses chatgpt-codex-connector review on #445. https://claude.ai/code/session_01QEpNZbWEYNwxzqQjTN5LCY

The v1 -> v2 `ALTER TABLE turns ADD COLUMN stop_reason` previously used a `column_exists` pre-check, which is race-prone under concurrent ledger opens: two processes can both observe the column missing, both issue the `ALTER`, and the second loses. Switch to try-then-swallow on the `duplicate column name` SQLite error so the migration is genuinely idempotent under contention. Deliberately does not catch `SqliteFailure(_, None)` — that shape is too broad and would mask real schema breakage. Keeps the pattern aligned with the matching migration in #444 so the two PRs don't diverge when both land. `column_exists` had no other callers, so the helper is dropped. Addresses cubic-dev-ai review on #445. https://claude.ai/code/session_01QEpNZbWEYNwxzqQjTN5LCY
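
The try-then-swallow pattern from the last commit message can be sketched without a database, as below. `DbError` is a hypothetical stand-in for the driver's error type and `exec` stands in for whatever runs a statement; only the `ALTER TABLE` text and the `duplicate column name` message come from the commit message.

```rust
/// Hypothetical stand-in for a SQLite driver error: an optional message.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DbError {
    pub message: Option<String>,
}

/// True only for the specific "duplicate column name" failure.
fn is_duplicate_column(err: &DbError) -> bool {
    match &err.message {
        Some(msg) => msg.contains("duplicate column name"),
        // A failure without a message is too broad to swallow: it could
        // hide real schema breakage.
        None => false,
    }
}

/// Run the v1 -> v2 ALTER and treat "column already exists" as success,
/// so concurrent openers cannot fail each other.
pub fn add_stop_reason_column<F>(mut exec: F) -> Result<(), DbError>
where
    F: FnMut(&str) -> Result<(), DbError>,
{
    match exec("ALTER TABLE turns ADD COLUMN stop_reason TEXT") {
        Ok(()) => Ok(()),
        Err(err) if is_duplicate_column(&err) => Ok(()),
        Err(err) => Err(err),
    }
}
```

Compared with a `column_exists` pre-check, there is no window between checking and altering: the statement itself is the check, and losing the race just looks like the column already being there.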

willwashburn marked this pull request as ready for review May 25, 2026 14:24

chatgpt-codex-connector Bot reviewed May 25, 2026

cubic-dev-ai Bot reviewed May 25, 2026

Comment thread crates/relayburn-sdk/src/ledger/db.rs Outdated

claude added 3 commits May 25, 2026 14:56

willwashburn force-pushed the claude/burn-3.0-437-stop-reason branch from 7ffe553 to 10d3cd7 Compare May 25, 2026 14:58

Merge branch 'main' into claude/burn-3.0-437-stop-reason

989271b

willwashburn merged commit 6778f77 into main May 25, 2026
11 of 12 checks passed

willwashburn deleted the claude/burn-3.0-437-stop-reason branch May 25, 2026 16:11

This was referenced May 26, 2026

sdk: per-turn span tree as analytical primitive (#430) #451

Merged

chore: restore Rust fmt and clippy cleanliness #463

Merged

willwashburn merged 4 commits into main from claude/burn-3.0-437-stop-reason
