sdk/summary: expose turn outcome (stop_reason) on TurnRecord + summary breakdown (#437) #445

Merged
willwashburn merged 4 commits into main from claude/burn-3.0-437-stop-reason
May 25, 2026
Conversation

@willwashburn

Member

Closes #437.

Part of the burn 3.0 coordinated bump (see the #430 to #440 batch; do not merge alone).

Summary

  • New StopReason enum in reader/types.rs: EndTurn | MaxTokens | PauseTurn | StopSequence | ToolUse | Refusal | Silent. Kebab-case serde wire form. StopReason::from_wire accepts Anthropic snake_case (end_turn, max_tokens, …) and opencode's tool-calls (maps to ToolUse).
  • TurnRecord.stop_reason changed from Option<String> to Option<StopReason> with a lenient deserializer — unknown strings land in StopReason::Silent so pre-3.0 ledger replays don't error.
  • Readers parse string → enum at the TurnRecord boundary in claude.rs, opencode.rs, opencode_stream.rs. Codex stays None (no equivalent field); documented.
  • Schema v1→v2: nullable stop_reason TEXT column on turns + idx_turns_stop_reason partial index. Idempotent ALTER TABLE migration on Ledger::open.
  • Summary aggregates: StopReasonCounts struct added to both Summary and SummaryGroupedReport. CLI emits:
    Turn outcomes: 142 end_turn, 3 max_tokens, 1 refusal, 0 pause
    
  • JSON output: stopReasons block in burn summary --json.
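
The human-readable line can be pictured with the sketch below. It assumes the rendering rule from the review walkthrough (core buckets always printed, the rest only when non-zero). The `OutcomeCounts` struct, the function signature, and the labels and order of the optional buckets are illustrative assumptions, not the PR's actual code.

```rust
/// Hypothetical flat view of the per-outcome counts (the PR's real type is
/// `StopReasonCounts`; these field names are assumptions).
#[derive(Debug, Default, Clone, Copy)]
pub struct OutcomeCounts {
    pub end_turn: u64,
    pub max_tokens: u64,
    pub refusal: u64,
    pub pause_turn: u64,
    pub stop_sequence: u64,
    pub tool_use: u64,
    pub silent: u64,
    pub none: u64,
}

/// Render the "Turn outcomes: ..." line.
pub fn format_stop_reasons_line(c: &OutcomeCounts) -> String {
    // Core buckets are always printed, even when they are zero.
    let mut parts = vec![
        format!("{} end_turn", c.end_turn),
        format!("{} max_tokens", c.max_tokens),
        format!("{} refusal", c.refusal),
        format!("{} pause", c.pause_turn),
    ];
    // Remaining buckets appear only when non-zero.
    // Assumption: these labels and this order are made up for the sketch.
    let optional = [
        (c.stop_sequence, "stop_sequence"),
        (c.tool_use, "tool_use"),
        (c.silent, "silent"),
        (c.none, "none"),
    ];
    for (count, label) in optional {
        if count > 0 {
            parts.push(format!("{} {}", count, label));
        }
    }
    format!("Turn outcomes: {}", parts.join(", "))
}
```

With 142 end_turn, 3 max_tokens, and 1 refusal this reproduces the sample line above; a non-zero optional bucket is appended after the core four.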

Why Silent vs none are distinct buckets

  • Silent — row exists, carries an unrecognized stop_reason string (forward compat for future Anthropic values).
  • none — no stop_reason field on the row at all (Codex, sidechain agents, malformed rows).

Keeping them separate lets future regressions ("we stopped parsing X") stand out instead of vanishing into a combined bucket.
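
A minimal sketch of how the two buckets stay apart during aggregation. The method names `bump`, `from_turns`, and `is_empty` come from the review walkthrough; their signatures, the field names, and the choice to feed `from_turns` an iterator of `Option<StopReason>` rather than full turn records are assumptions made to keep the example self-contained.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    MaxTokens,
    PauseTurn,
    StopSequence,
    ToolUse,
    Refusal,
    Silent,
}

/// Per-outcome turn counts. Field names are illustrative.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct StopReasonCounts {
    pub end_turn: u64,
    pub max_tokens: u64,
    pub pause_turn: u64,
    pub stop_sequence: u64,
    pub tool_use: u64,
    pub refusal: u64,
    /// Row carried a stop_reason string that was not recognized.
    pub silent: u64,
    /// Row carried no stop_reason at all.
    pub none: u64,
}

impl StopReasonCounts {
    /// Increment the bucket matching one turn's outcome.
    pub fn bump(&mut self, reason: Option<StopReason>) {
        let slot = match reason {
            Some(StopReason::EndTurn) => &mut self.end_turn,
            Some(StopReason::MaxTokens) => &mut self.max_tokens,
            Some(StopReason::PauseTurn) => &mut self.pause_turn,
            Some(StopReason::StopSequence) => &mut self.stop_sequence,
            Some(StopReason::ToolUse) => &mut self.tool_use,
            Some(StopReason::Refusal) => &mut self.refusal,
            Some(StopReason::Silent) => &mut self.silent,
            None => &mut self.none,
        };
        *slot += 1;
    }

    /// Aggregate a sequence of per-turn outcomes.
    pub fn from_turns<I>(reasons: I) -> Self
    where
        I: IntoIterator<Item = Option<StopReason>>,
    {
        let mut counts = Self::default();
        for reason in reasons {
            counts.bump(reason);
        }
        counts
    }

    /// True when no turn has been counted in any bucket.
    pub fn is_empty(&self) -> bool {
        *self == Self::default()
    }
}
```

Because `Some(Silent)` and `None` land in different fields, a parser regression shows up as growth in `silent` while `none` stays flat.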

Merge note

This PR bumps schema v1→v2 by adding turns.stop_reason. #436 (tool-output bytes) ALSO bumps v1→v2 for tool_result_events.output_bytes + output_truncated. Whichever lands second needs to become v3 with a chained ALTER TABLE. Both migrations are idempotent — mechanical reconcile.

Test plan

  • stop_reason_serializes_kebab_case
  • stop_reason_from_wire_normalizes_underscored_and_legacy_variants
  • turn_record_stop_reason_deserializes_legacy_strings_into_enum (legacy max_tokens round-trip)
  • turn_record_stop_reason_unknown_string_falls_back_to_silent
  • summary_report_aggregates_stop_reasons_per_outcome (fixture with max_tokens turn surfaces in summary)
  • summary_legacy_surface_includes_stop_reason_counts_with_none_for_missing_field
  • legacy_v1_ledger_migrates_to_v2_on_open_and_adds_stop_reason_column
  • Golden snapshots updated: summary, summary-json, state-status*
  • cargo test --workspace — 794 tests, 0 failed
  • BURN_GOLDEN=1 cargo test --test golden — pass

Out of scope

  • stop_reason in hotspots / overhead (separate concerns).
  • #[non_exhaustive] — not added; this is part of the 3.0 major bump.

Generated by Claude Code

@coderabbitai

coderabbitai Bot commented May 25, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
📥 Commits

Reviewing files that changed from the base of the PR and between e999e2a and 989271b.

📒 Files selected for processing (18)
  • CHANGELOG.md
  • crates/relayburn-cli/src/commands/summary.rs
  • crates/relayburn-sdk/src/analyze/quality.rs
  • crates/relayburn-sdk/src/ledger/db.rs
  • crates/relayburn-sdk/src/ledger/schema.rs
  • crates/relayburn-sdk/src/ledger/tests.rs
  • crates/relayburn-sdk/src/ledger/writer.rs
  • crates/relayburn-sdk/src/lib.rs
  • crates/relayburn-sdk/src/query_verbs.rs
  • crates/relayburn-sdk/src/reader.rs
  • crates/relayburn-sdk/src/reader/claude.rs
  • crates/relayburn-sdk/src/reader/opencode.rs
  • crates/relayburn-sdk/src/reader/opencode/tests.rs
  • crates/relayburn-sdk/src/reader/types.rs
  • tests/fixtures/cli-golden/snapshots/state-status-json.stdout.txt
  • tests/fixtures/cli-golden/snapshots/state-status.stdout.txt
  • tests/fixtures/cli-golden/snapshots/summary-json.stdout.txt
  • tests/fixtures/cli-golden/snapshots/summary.stdout.txt

📝 Walkthrough

Walkthrough

This PR introduces structured stop-reason tracking across the ledger system: a new StopReason enum replaces string-based outcome fields, the ledger schema advances to v2 with in-place migration, all readers parse wire strings into the enum, query verbs expose aggregated counts, the CLI renders "Turn outcomes" summaries, and quality analysis aligns with the typed representation.

Changes

Stop reason: definition, storage, retrieval, and presentation

Layer / File(s) Summary
Stop reason enum definition and lenient deserialization
crates/relayburn-sdk/src/reader/types.rs, crates/relayburn-sdk/src/lib.rs
Introduces pub enum StopReason with kebab-case wire serialization, custom lenient deserializer (maps snake_case/legacy strings, defaults unknowns to Silent), and normalization rules (stop → EndTurn, tool variants → ToolUse). Adds StopReason to public re-exports.
Reader implementations: Claude and OpenCode parsing
crates/relayburn-sdk/src/reader/claude.rs, crates/relayburn-sdk/src/reader/opencode.rs, crates/relayburn-sdk/src/reader/*.rs, test files
Claude and OpenCode parsers now convert wire stop-reason strings to StopReason enum via from_wire, falling back to Silent for missing/invalid values. Readers no longer copy raw strings; all assertions updated to compare StopReason variants.
Ledger schema v2 bump and forward migration
crates/relayburn-sdk/src/ledger/schema.rs, crates/relayburn-sdk/src/ledger/db.rs, crates/relayburn-sdk/src/ledger/tests.rs
Schema version bumped to 2; turns table adds nullable stop_reason TEXT column. New migrate_burn_schema runs at Ledger::open before verification: reads current version, performs ALTER TABLE for v1→v2 (tolerates duplicate-column errors), updates archive_state to v2, creates filtered index. Integration test covers legacy v1 ledger migration idempotency.
Ledger writer: persist denormalized stop_reason
crates/relayburn-sdk/src/ledger/writer.rs
append_turns extends INSERT to include stop_reason column, computing wire-string form or NULL when absent.
Query layer: StopReasonCounts and summary integration
crates/relayburn-sdk/src/query_verbs.rs
New StopReasonCounts struct aggregates per-outcome turn counts with bump, from_turns, and is_empty methods. Summary and SummaryGroupedReport gain stop_reasons: StopReasonCounts field; compute_summary populates it from turn set. Acceptance tests validate grouped and slim summary aggregation including none bucket.
CLI: render stop-reason breakdowns in human and JSON output
crates/relayburn-cli/src/commands/summary.rs
burn summary renders "Turn outcomes: …" line for human (core buckets unconditional, others only when non-zero) and stopReasons object for JSON output. Two formatting helpers format_stop_reasons_line and stop_reasons_to_json added; grouped test fixture updated.
Quality analysis: session-ending classification with typed enum
crates/relayburn-sdk/src/analyze/quality.rs
ending_role now matches last.stop_reason against StopReason::EndTurn instead of string comparison. Test helper TurnOverrides.stop_reason changed to Option<StopReason>; all quality test scenarios updated with enum variants replacing "end_turn" / "tool_use" strings.
Documentation and golden snapshots
CHANGELOG.md, tests/fixtures/cli-golden/snapshots/*
CHANGELOG documents new burn summary "Turn outcomes" feature and breaking relayburn-sdk contract for TurnRecord.stop_reason: Option<StopReason> with lenient pre-3.0 replay. CLI golden snapshots updated to schema version 2 and new "Turn outcomes" output.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • AgentWorkforce/burn#315: Both extend the burn summary implementation and output; #315 scaffolds the presenter while this PR adds stop-reason aggregation and rendering.
  • AgentWorkforce/burn#386: This PR extends the summary_report/SummaryGroupedReport types moved to the SDK in #386, adding new stop-reason fields and corresponding CLI rendering.
  • AgentWorkforce/burn#309: This PR implements full burn summary functionality including stop-reason breakdowns; #309 only scaffolds the command stub.

Poem

🐰 A rabbit hops through turns so bright,
Now counting ends both left and right—
Refusals, max-tokens, silent too,
Each outcome tracked, the schema new. 🎯

@willwashburn willwashburn marked this pull request as ready for review May 25, 2026 14:24

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ffe553e36

"end-turn" => Some(Self::EndTurn),
"max-tokens" | "length" => Some(Self::MaxTokens),
"pause-turn" => Some(Self::PauseTurn),
"stop-sequence" | "stop" => Some(Self::StopSequence),

P2: Treat "stop" finish reason as natural completion

The from_wire mapping currently collapses "stop" into StopSequence, but OpenAI-compatible providers commonly use "stop" for normal completion (not just custom stop-sequence hits). In sessions parsed through opencode, this will misclassify successful assistant endings as stop-sequence terminations, which then skews burn summary outcome buckets and any downstream logic that interprets non-EndTurn reasons as user-ended. This alias should map to EndTurn (or remain unknown) instead of StopSequence.

Member Author

Fixed in 5dab4df (the hash after rebase). "stop" now maps to StopReason::EndTurn: OpenAI / AI-SDK (and therefore opencode wrapping an OpenAI-shaped provider) emit a bare "stop" for ordinary end-of-turn completions, so collapsing it into StopSequence would have skewed the burn summary outcome buckets. Anthropic's actual stop-sequence outcome (stop_sequence / stop-sequence) still resolves to StopSequence.

The existing stop_reason_from_wire_normalizes_underscored_and_legacy_variants test now asserts both the new "stop" -> EndTurn mapping and that the stop_sequence / stop-sequence aliases still resolve to StopSequence. Branch HEAD is now 10d3cd7 (also includes a sibling fix to the migration pattern + a rebase onto current main).
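
For readers following the thread, the mapping after this fix might look roughly like the sketch below. It assumes snake_case is folded to kebab-case before matching (consistent with the match arms quoted in the review) and that `from_wire` returns `None` for unknown strings; `parse_stop_reason` is a hypothetical helper standing in for the lenient fallback at the reader boundary, not a function named in the PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    MaxTokens,
    PauseTurn,
    StopSequence,
    ToolUse,
    Refusal,
    Silent,
}

impl StopReason {
    /// Map a provider finish-reason string to a variant.
    /// Returns `None` for strings this sketch does not recognize.
    pub fn from_wire(raw: &str) -> Option<Self> {
        // Assumption: underscores are normalized to hyphens before matching.
        let key = raw.replace('_', "-");
        match key.as_str() {
            // Bare "stop" is the OpenAI / AI-SDK ordinary completion.
            "end-turn" | "stop" => Some(Self::EndTurn),
            "max-tokens" | "length" => Some(Self::MaxTokens),
            "pause-turn" => Some(Self::PauseTurn),
            // Only Anthropic's explicit stop-sequence outcome lands here.
            "stop-sequence" => Some(Self::StopSequence),
            "tool-use" | "tool-calls" => Some(Self::ToolUse),
            "refusal" => Some(Self::Refusal),
            _ => None,
        }
    }
}

/// Hypothetical reader-boundary helper: an absent field stays `None`,
/// an unrecognized string becomes `Silent`.
pub fn parse_stop_reason(raw: Option<&str>) -> Option<StopReason> {
    raw.map(|s| StopReason::from_wire(s).unwrap_or(StopReason::Silent))
}
```

Keeping `from_wire` strict and applying the `Silent` fallback in a separate step preserves the distinction between an unknown value and a missing field.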


Generated by Claude Code

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 19 files

Comment thread crates/relayburn-sdk/src/ledger/db.rs Outdated
claude added 3 commits May 25, 2026 14:56
…`burn summary` (#437)

BREAKING: `TurnRecord.stop_reason` switches from `Option<String>` to
`Option<StopReason>`. Anthropic / opencode finish-reason strings are
parsed into the enum at the reader boundary; unknown values fall back to
`StopReason::Silent` so pre-3.0 ledgers replay cleanly. Schema bumps to v2
with a new denormalized `turns.stop_reason TEXT` column populated on
insert; existing v1 ledgers are migrated in place via `ALTER TABLE` on
`Ledger::open`. `burn summary` adds a `Turn outcomes: …` line and a
`stopReasons` block in JSON output. Codex turns continue to carry `None`
(the rollout schema doesn't include a stop reason field).

https://claude.ai/code/session_01QEpNZbWEYNwxzqQjTN5LCY

OpenAI / AI-SDK convention emits a bare `"stop"` for ordinary
end-of-turn completions, which is what opencode forwards when it wraps
an OpenAI-shaped provider. The previous mapping collapsed `"stop"` into
`StopSequence` alongside Anthropic's `stop_sequence`, which would
misclassify successful assistant endings as stop-sequence terminations
and skew the outcome buckets in `burn summary`.

`stop_sequence` / `stop-sequence` still resolve to `StopSequence` (the
actual Anthropic semantics).

Addresses chatgpt-codex-connector review on #445.

https://claude.ai/code/session_01QEpNZbWEYNwxzqQjTN5LCY

The v1 -> v2 `ALTER TABLE turns ADD COLUMN stop_reason` previously used
a `column_exists` pre-check, which is race-prone under concurrent
ledger opens: two processes can both observe the column missing, both
issue the `ALTER`, and the second loses. Switch to try-then-swallow
on the `duplicate column name` SQLite error so the migration is
genuinely idempotent under contention.

Deliberately does not catch `SqliteFailure(_, None)` — that shape is
too broad and would mask real schema breakage. Keeps the pattern
aligned with the matching migration in #444 so the two PRs don't
diverge when both land.

`column_exists` had no other callers, so the helper is dropped.

Addresses cubic-dev-ai review on #445.

https://claude.ai/code/session_01QEpNZbWEYNwxzqQjTN5LCY
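
The try-then-swallow pattern described in the last commit can be illustrated without a database. The sketch below uses an in-memory stand-in: `FakeTurnsTable`, `DbError`, and `migrate_v1_to_v2` are hypothetical names, and the real migration inspects the SQLite driver's error for `duplicate column name` rather than a custom enum.

```rust
use std::collections::BTreeSet;

/// Stand-in for a database error (hypothetical; not the driver's type).
#[derive(Debug, PartialEq, Eq)]
pub enum DbError {
    DuplicateColumn(String),
    Other(String),
}

/// Minimal in-memory stand-in for the `turns` table.
#[derive(Debug, Default)]
pub struct FakeTurnsTable {
    columns: BTreeSet<String>,
    /// When set, every statement fails with an unrelated error.
    pub broken: bool,
}

impl FakeTurnsTable {
    /// Behaves like `ALTER TABLE turns ADD COLUMN <name>`.
    pub fn add_column(&mut self, name: &str) -> Result<(), DbError> {
        if self.broken {
            return Err(DbError::Other("disk I/O error".to_string()));
        }
        if !self.columns.insert(name.to_string()) {
            return Err(DbError::DuplicateColumn(name.to_string()));
        }
        Ok(())
    }

    pub fn has_column(&self, name: &str) -> bool {
        self.columns.contains(name)
    }
}

/// Issue the ALTER unconditionally; treat only the duplicate-column
/// failure as success, with no existence pre-check to race on.
pub fn migrate_v1_to_v2(table: &mut FakeTurnsTable) -> Result<(), DbError> {
    match table.add_column("stop_reason") {
        Ok(()) => Ok(()),
        // Another opener (or an earlier run) already added the column.
        Err(DbError::DuplicateColumn(_)) => Ok(()),
        // Anything else may be real schema breakage, so surface it.
        Err(other) => Err(other),
    }
}
```

Running the migration twice succeeds both times, while an unrelated failure still propagates to the caller.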
@willwashburn willwashburn force-pushed the claude/burn-3.0-437-stop-reason branch from 7ffe553 to 10d3cd7 May 25, 2026 14:58
@willwashburn willwashburn merged commit 6778f77 into main May 25, 2026
11 of 12 checks passed
@willwashburn willwashburn deleted the claude/burn-3.0-437-stop-reason branch May 25, 2026 16:11
Development

Successfully merging this pull request may close these issues.

summary: expose turn outcome (stop_reason) — end_turn / max_tokens / refusal / pause

2 participants