Skip to content

Architecture refactor + breaking public-API tightening (→ 4.0.0)#488

Merged
willwashburn merged 15 commits into
mainfrom
refactor/architecture-loop
Jun 23, 2026
Merged

Architecture refactor + breaking public-API tightening (→ 4.0.0)#488
willwashburn merged 15 commits into
mainfrom
refactor/architecture-loop

Conversation

@willwashburn

@willwashburn willwashburn commented Jun 22, 2026

Copy link
Copy Markdown
Member

14 commits: behavior-preserving structural refactors plus a breaking tightening of the published SDK surface. Breaking → major version (4.0.0).

Behavior-preserving (verified equivalent)

  • Decomposed both parser god-functions: parse_codex_buffer (885→170 lines) and claude run_incremental (531→332) into *ParseState structs + per-event handlers. Each verified with an independent golden diff over thousands of real sessions (codex: 350 turns; claude: 3567 turns, 4 output dimensions, byte-identical) plus a line-by-line adversarial equivalence review.
  • Separated mixed concerns: classifier.rs → rule-tables / bash_parse / slash_triads; the two largest CLI commands (summary, hotspots) → args / json / human submodules.
  • Removed genuine duplication: hoisted byte-identical cross-reader helpers into user_turn.rs (left the per-harness ones — unifying them would be leaky).
  • Split oversized files: query_verbs/summary.rs (2056) along an API/engine boundary; moved the claude & codex parse engines into <reader>/incremental.rs so all three readers share one shape.

All moves verified by item-set parity + byte-identity spot-checks + full workspace tests.

Breaking changes (the 4.0.0 driver)

  • relayburn-sdk: stopped re-exporting ~50 low-level analyze-layer internals (detectors/aggregators + helper types like PricingTable, CompareTable, CompareCell). The verb layer (LedgerHandle methods / summary_report / hotspots / compare) is the sole intended public surface. The AnalyzeHotspotsOptions/AnalyzeCompareOptions collision aliases are gone. Only the Rust SDK API breaks — the in-repo embedders (CLI, node bindings) never used these.
  • CLI compare rewired onto the verb (fixed the one CLAUDE.md layering violation), enabling the compare-layer de-publicization. Output byte-identical over a 4-model / 3998-turn corpus.
  • Behavior nuances (affect CLI / MCP / @relayburn/sdk output, not their APIs): compare costs now use canonical decimal rounding (shift by 1 in the last digit at ties); fidelity JSON keys now emit in a stable, deterministic order (fixes a pre-existing HashMap non-determinism bug surfaced during review).

Verification

  • Full workspace (SDK + CLI + node bindings) builds clean; all tests green; clippy --workspace + fmt clean.
  • The two god-function decompositions: golden diffs on real data + adversarial reviews.
  • The breaking changes: a holistic pre-release review (compare edge cases → no regression; API coherence → 2 one-liner stragglers found & fixed).

Release notes

[Unreleased] entries are ready in the root CHANGELOG.md and all three package changelogs (@relayburn/sdk, @relayburn/mcp, relayburn). README and CLAUDE.md were checked and remain accurate. Publish via the Publish Packages workflow with version: major (lockstep → 4.0.0).

🤖 Generated with Claude Code

Review in cubic

willwashburn and others added 14 commits June 21, 2026 23:51
classifier.rs (1915 lines) mixed three distinct concerns. Extract two into
a classifier/ submodule directory, leaving the activity classifier + rule
tables in the parent:

- classifier/bash_parse.rs (516) — the bash-command parser: parse_bash_command
  and its helpers, the binary lookup tables, and ENV_ASSIGN_RE.
- classifier/slash_triads.rs (327) — slash-triad + task-notification row
  detection: detect_slash_triads, is_task_notification, and row helpers.
- classifier.rs (1094) — activity classifier, rule tables, shared types, tests.

Pure code-movement: all moved item bodies are byte-identical, and the item set
is unchanged (138 items before and after). build_re becomes pub(super); the
public surface is preserved via re-exports so crate::reader::classifier::<name>
paths and reader.rs re-exports keep resolving.

Verified: 138/138 item-set parity, byte-identity on representative functions,
warning-free build, clippy clean, 832 SDK tests pass, live summary/hotspots OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
resolve_token_counter and join_nonempty were byte-identical copies in both the
codex and opencode readers. Move them to reader/user_turn.rs (pub(crate)):
resolve_token_counter sits beside the UserTurnTokenizer/HeuristicCounter types
it maps between, and join_nonempty serves combined user-turn text assembly —
both readers' only callers. Per util.rs's convention, reader-only helpers stay
in the reader rather than the cross-module util module.

The other reader helpers the survey flagged (to_usage, merge_usage_coverage,
build_*_fidelity, pick_target, measure_*_tool_output) are genuinely per-harness
(different tool-name conventions and coverage/usage types) and are deliberately
left as-is — unifying them would be a leaky abstraction.

Pure move: bodies byte-identical, all reader tests pass (codex 47, opencode 55,
user_turn 28), clippy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parse_codex_buffer was an 885-line streaming state machine threading ~25
mutable locals plus a parallel committed_* shadow set through one giant
payload-type match. Decompose it without changing behavior:

- CodexParseState owns the 25 working fields; per-event handler methods
  (handle_session_meta / turn_context / compacted / event_msg / response_item)
  hold each arm's logic. The driver loop is now a clean 5-arm dispatch (170
  lines, was 885).
- CommittedSnapshot captures the transaction (commit-on-boundary) mechanism as
  an explicit type. snapshot() is invoked only from handle_event_msg's
  task_complete arm — the sole commit site — and the result is still assembled
  entirely from committed state (finalized_count / user_turns_count / end_offset
  filters), so trailing partial lines are still discarded.

Pure behavior-preserving extract: handler bodies are byte-identical to the
original arms modulo var->self.var, continue->return (loop body ends at the
match, so equivalent), and the commit-block extraction.

Verified: 47 codex tests + full SDK suite pass, clippy clean, warning-free
build, and an independent golden diff (summary/sessions/hotspots byte-identical
over 60 sessions / 350 real codex turns), corroborated by a line-by-line
adversarial equivalence review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
query_verbs/summary.rs was the largest SDK file (2056 lines), mixing the public
report API with its private computation engine. Split into a summary/ module:

- summary/mod.rs (970) — the public surface: option/report types and impls, the
  LedgerHandle dispatchers (summary / summary_report / summary_timeseries), the
  free-function verbs, and the public fidelity-to-value helpers.
- summary/compute.rs (1125) — the private compute engine: query building,
  filtering, tag/model aggregation, by-tool cost attribution, and relationship
  matching. Re-exported via `pub(crate) use compute::*` so existing
  crate::query_verbs::summary::<name> paths (incl. tests.rs) keep resolving.

Pure code-movement: 101/101 item-set parity (none dropped or duplicated),
moved bodies byte-identical, 832 SDK tests pass, clippy clean, live summary OK.
query_verbs/mod.rs: pruned imports the engine took with it (use-block only).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
commands/summary.rs was the largest CLI file (1515 lines), bundling three
presenter concerns. Split into a summary/ module:

- summary/mod.rs (611) — orchestration: SummaryArgs, run, run_inner (dispatch +
  flag-exclusivity validation), arg parsing, ingest, and the test block.
- summary/json.rs (170) — JSON serialization (emit_json, grouped_json_value,
  the *_to_json helpers).
- summary/human.rs (778) — human/table rendering (emit_human, all render_*,
  format_*, coverage cells).

Public surface (SummaryArgs, run) unchanged at crate::commands::summary::*.
Pure code-movement: 47/47 item-set parity, moved bodies byte-identical, all
CLI tests pass, clippy clean, live summary/--by-tool OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mirror the commands/summary split so the two largest CLI commands share one
structure. commands/hotspots.rs (1236 lines) → hotspots/ module:

- hotspots/mod.rs (294) — orchestration: HotspotsArgs, RankBy, run, run_inner,
  pattern-selection, and the test block.
- hotspots/json.rs (280) — JSON serialization (the *_to_json helpers).
- hotspots/human.rs (687) — human/table rendering (emit_human, findings/section
  tables, *_row, sort_*, format_bytes); internal helpers kept private.

Public surface (HotspotsArgs, RankBy, run) unchanged. Pure code-movement:
43/43 item-set parity, moved bodies byte-identical, CLI tests pass, clippy
clean, live hotspots OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
run_incremental was a 531-line streaming parser threading ~25 mutable locals
(plus evidence) through one match-on-line_type loop — the Claude analogue of
the codex god-function. Decompose it without changing behavior:

- ClaudeParseState owns the working fields + evidence; a constructor reproduces
  the setup (incl. the start_offset>0 prescan wiring and the resume_marker_offset
  sentinel derivation) verbatim. Per-line handlers handle_assistant /
  handle_user / handle_system hold each arm's logic. The driver is now setup +
  early return + a 3-way dispatch loop + the unchanged post-loop assembly
  (332 lines, was 531).

Pure behavior-preserving extract: handler bodies are byte-identical to the
original arms modulo var->self.var and parameter threading; the loop preamble
and entire post-loop are byte-identical. The early-return path builds evidence
inline (a clone became a move — new_evidence is a pure constructor, so
unobservable).

Verified: 103 claude tests + full SDK suite pass, clippy clean, warning-free
build, an independent golden diff (summary/by-tool/sessions/hotspots
byte-identical over 58 sessions / 3567 real Claude turns), and a line-by-line
adversarial equivalence review covering the resume/early-return paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…l.rs

claude.rs was the largest SDK source file (2134 lines). Extract the
incremental-parse engine — ClaudeParseState + its handlers, run_incremental,
prescan_nodes, record_root_incremental, PrescanOutput — into a new
claude/incremental.rs submodule, leaving the root as the public API +
parse-primitive helpers. claude.rs drops to 1345; incremental.rs is 844.

Pure code-movement: the moved block is byte-identical (only run_incremental
gained pub(super)); root helpers the engine uses were widened to
pub(in crate::reader::claude) (tightest scope, warning-free). Gives the reader
trio the same shape (each reader is a directory; the engine lives in its own
submodule). 103 claude tests pass, full SDK green, clippy clean, live OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mirror the claude split: extract the codex incremental-parse engine —
Pending<T>, CodexParseState + its handlers, CommittedSnapshot, parse_codex_buffer
— out of the oversized codex.rs (1801) into codex/incremental.rs, leaving the
root as the public API + parse-primitive helpers. codex.rs drops to 831;
incremental.rs is 1014.

Pure code-movement: the moved block is byte-identical (only parse_codex_buffer
gained pub(super)); the engine's root dependencies were widened to
pub(in crate::reader::codex) and now-unused root imports pruned. All three
readers now share one shape — each is a directory with its engine in
incremental.rs. 47 codex tests pass, full SDK green, clippy clean, live OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
BREAKING: the published relayburn-sdk surface re-exported ~44 low-level
analyze-layer items (detector/aggregator fns like detect_patterns,
attribute_hotspots, compute_quality, the aggregate_by_* family; helper types
like PricingTable, ComputeQualityOptions, OverheadFile, ProviderRule,
SessionTotals) that no embedder used — the verb layer (LedgerHandle methods /
summary_report / hotspots / compare) is the intended API. Flip them to
pub(crate) and drop them from lib.rs, removing the confusing
HotspotsOptions-vs-HotspotsOptions collision aliases (AnalyzeHotspotsOptions/
AnalyzeHotspotsResult) the leak forced.

The verb layer reaches analyze via `use crate::analyze::`, not the lib re-export,
so intra-crate plumbing is unaffected. Items that appear in a kept-public
signature (ModelCost/ReasoningMode, OneShotMetrics/SessionOutcome, SpanEvent/
SpanStatus, …) stayed public — the build is the ground truth. Also pruned a few
now-unreachable dead helpers.

44 public names removed from lib.rs. Full workspace (SDK + CLI + node bindings)
builds clean, all tests pass, clippy clean. CLI/MCP/@relayburn/sdk unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…analyze compare layer

The CLI compare command reimplemented the SDK verb inline (a CLAUDE.md layering
violation: presenters should wrap verbs). Rewire the non-bucket path onto
LedgerHandle::compare(): the three renderers (json/csv/tty) now consume the
verb's CompareResult/CompareCellResult instead of building the low-level
analyze CompareTable, and analyzed-turns/fidelity come from the verb. The bucket
path already used the verb (compare_timeseries) and is untouched.

With the CLI off the low-level layer, de-publicize analyze::compare
(CompareOptions/CompareTable/CompareCell/CompareCategory/CompareTotals/
build_compare_table) plus the now-internal load_pricing/provider_for/
has_minimum_fidelity/ProviderFilter; delete the dead compare_from_archive path.

BREAKING (numeric): the verb's round_digits used (n*scale).round()/scale, which
diverged from the {:.N}/toFixed rounding the renderers (and the rest of the
codebase) use. Switched to format!-based rounding so the verb's pre-rounded
cells match the display contract; tie values shift by one in the last digit.

Verified behaviorally: burn compare json/csv/human/fidelity output is
byte-identical over a mixed 4-model / 3998-turn corpus; the bucketed JSON is
equivalent modulo the rounding fix and a pre-existing HashMap key-ordering
non-determinism in the fidelity block. Full workspace (SDK + CLI + node) builds
clean, all tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FidelitySummary stored by_class / by_granularity / missing_coverage as HashMap,
which serde serializes in randomized per-process order — so the `fidelity` block
of `summary`/`compare` JSON changed run-to-run (breaking diffing, caching, and
snapshot tests; surfaced while golden-verifying the compare rewire). Switch the
three maps to BTreeMap and derive Ord on FidelityClass/UsageGranularity so keys
emit in a stable order (fidelity-class declaration order; missing-coverage
alphabetical). Output is now reproducible.

Map .insert/.get call sites are unchanged (BTreeMap shares the API). Full
workspace builds clean, all tests pass, clippy clean; bucketed compare JSON is
now byte-stable across runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Holistic-review follow-up to the public-API tightening:
- TurnProvider was still publicly re-exported but orphaned — it's only produced
  by provider_for/provider_for_with_rules (both pub(crate)) and appears in no
  public signature or field, so a caller could name it but never obtain one.
  Flip the struct to pub(crate) and drop the dead re-export. Completes the
  analyze-internals de-publicization.
- CostBreakdown::model's doc-comment intra-linked to `sum_costs`, which became
  pub(crate) in the same change set (rustdoc warned about a public->private
  doc link). Demote it to a plain code span.

Full workspace builds warning-free, all tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ckage changelogs

The compare canonical-rounding and fidelity-summary key-order changes flow
through to the published npm packages, but only the root CHANGELOG had entries.
Add concise [Unreleased] notes to @relayburn/sdk, @relayburn/mcp, and relayburn
(the CLI wrapper) describing the practical output effect.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@gemini-code-assist

Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 97d2ee15-4d45-4322-b444-1746d689e485

📥 Commits

Reviewing files that changed from the base of the PR and between d3c5122 and 7b20bd3.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • crates/relayburn-cli/src/commands/hotspots/mod.rs
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/relayburn-cli/src/commands/hotspots/mod.rs

📝 Walkthrough

Walkthrough

The PR moves command business logic from the CLI layer into SDK query verbs, adds Rust implementations of burn summary and burn hotspots, refactors burn compare as a thin SDK presenter, narrows the relayburn-sdk analyze-layer public API surface, fixes compare cost rounding to match JS toFixed semantics, makes fidelity summary JSON keys deterministic via BTreeMap, and splits monolithic reader parsers into incremental submodules.

Changes

SDK pipeline ownership & CLI presenter refactor

Layer / File(s) Summary
SDK analyze visibility narrowing & root re-export cleanup
crates/relayburn-sdk/src/analyze.rs, crates/relayburn-sdk/src/lib.rs, crates/relayburn-sdk/src/analyze/claude_md.rs, crates/relayburn-sdk/src/analyze/compare.rs, crates/relayburn-sdk/src/analyze/context_delta.rs, crates/relayburn-sdk/src/analyze/cost.rs, crates/relayburn-sdk/src/analyze/findings.rs, crates/relayburn-sdk/src/analyze/flow_graph.rs, crates/relayburn-sdk/src/analyze/hotspots.rs, crates/relayburn-sdk/src/analyze/overhead.rs, crates/relayburn-sdk/src/analyze/patterns.rs, crates/relayburn-sdk/src/analyze/pricing.rs, crates/relayburn-sdk/src/analyze/provider.rs, crates/relayburn-sdk/src/analyze/provider_reattribution.rs, crates/relayburn-sdk/src/analyze/quality.rs, crates/relayburn-sdk/src/analyze/replacement_savings.rs, crates/relayburn-sdk/src/analyze/subagent_tree.rs, crates/relayburn-sdk/src/analyze/tool_call_patterns.rs, crates/relayburn-sdk/src/analyze/tool_output_bloat.rs, crates/relayburn-sdk/src/query_verbs/mod.rs
Dozens of analyze-layer items are moved from pub to pub(crate) (builders, helper functions, internal types); analyze.rs splits its re-export block into pub use/pub(crate) use; lib.rs removes compare/provider/helper symbols from the SDK root. compare_from_archive and its type are deleted; CompareOptions::new becomes test-only; ProviderRule::regex is removed; filtering functions become #[cfg(test)]-gated.
Fidelity stable ordering & compare rounding fix
crates/relayburn-sdk/src/reader/types.rs, crates/relayburn-sdk/src/analyze/fidelity.rs, crates/relayburn-sdk/src/query_verbs/compare.rs
FidelityClass and UsageGranularity gain Ord/PartialOrd; FidelitySummary's three map fields change from HashMap to BTreeMap for deterministic JSON key order; round_digits is rewritten using format!/parse to match JS toFixed half-to-even semantics.
SDK summary query verb
crates/relayburn-sdk/src/query_verbs/summary/mod.rs, crates/relayburn-sdk/src/query_verbs/summary/compute.rs
New summary query verb module: mod.rs defines all public types (SummaryOptions, SummaryReport, SummaryGroupedReport, SummaryTimeseries, etc.) and LedgerHandle::summary/summary_report/summary_timeseries implementations; compute.rs provides the full internal compute pipeline including by-tool attribution, relationship matching, subagent tree traversal, and aggregation helpers.
CLI burn summary command
crates/relayburn-cli/src/commands/summary/mod.rs, crates/relayburn-cli/src/commands/summary/human.rs, crates/relayburn-cli/src/commands/summary/json.rs
New burn summary CLI command. mod.rs defines SummaryArgs, validates mutually exclusive flags, dispatches to SDK, and contains unit tests. human.rs implements all human-readable renderers (grouped, by-tool, subagent-type, relationship, tree, partial-coverage footer). json.rs implements the JSON serializer including time-series output.
CLI compare refactored as SDK presenter
crates/relayburn-cli/src/commands/compare.rs
Rewrites compare.rs to call handle.compare(CompareOptions) and render directly from CompareResult, removing the prior in-file pipeline (raw turn query, provider filtering, fidelity gate, table building). Adds index_cells/empty_cell helpers; rewrites JSON/CSV/TTY renderers; changes parse_provider_filter return type; removes duplicate deduplication of provider entries.
CLI burn hotspots command restructure
crates/relayburn-cli/src/commands/hotspots/mod.rs, crates/relayburn-cli/src/commands/hotspots/json.rs, crates/relayburn-cli/src/commands/hotspots/human.rs
New mod.rs holds HotspotsArgs, RankBy, pattern validation, ledger lifecycle, and run/run_inner dispatch. New json.rs serializes all HotspotsResult variants and aggregation types, including reorder_fidelity_summary for deterministic key layout. human.rs is reduced to a pure human-table presenter exposing emit_human as pub(super).
Changelogs
CHANGELOG.md, packages/relayburn/CHANGELOG.md, packages/sdk-node/CHANGELOG.md, packages/mcp/CHANGELOG.md
Unreleased entries document the breaking SDK re-export removal, the compare rounding change (last digit may shift at exact ties), and the stable fidelity JSON key order (byClass/byGranularity/missingCoverage).

Reader parser modularization

Layer / File(s) Summary
Shared reader utilities extracted
crates/relayburn-sdk/src/reader/user_turn.rs, crates/relayburn-sdk/src/reader/opencode.rs
Adds resolve_token_counter and join_nonempty to user_turn.rs as pub(crate) helpers; removes duplicate local implementations from opencode.rs by importing from user_turn.
Classifier split into submodules
crates/relayburn-sdk/src/reader/classifier.rs, crates/relayburn-sdk/src/reader/classifier/bash_parse.rs, crates/relayburn-sdk/src/reader/classifier/slash_triads.rs
Removes bash-command parsing (~800 lines) and slash-triad/task-notification detection (~350 lines) from classifier.rs, placing them in new submodules bash_parse.rs and slash_triads.rs; classifier.rs retains classify_activity and re-exports the moved APIs via pub use. bash_parse.rs adds the full heuristic Bash parser; slash_triads.rs adds the three-row caveat→invocation→stdout chain detector.
Claude incremental parser extracted
crates/relayburn-sdk/src/reader/claude.rs, crates/relayburn-sdk/src/reader/claude/incremental.rs
Moves the incremental parse engine (~700 lines) from claude.rs into claude/incremental.rs exposing only run_incremental as pub(super); claude.rs widens 16 intermediate structs/functions to pub(in crate::reader::claude) so the submodule can access them. incremental.rs implements two-phase prescan + streaming parse with ClaudeParseState, triad detection, and end_offset calculation.
Codex incremental parser extracted
crates/relayburn-sdk/src/reader/codex.rs, crates/relayburn-sdk/src/reader/codex/incremental.rs
Moves parse_codex_buffer and the incremental state machine into codex/incremental.rs; codex.rs widens ~25 helpers and turn-assembly structs to pub(in crate::reader::codex). incremental.rs implements CodexParseState, CommittedSnapshot (only advancing at task_complete boundaries), and the full committed-emission pipeline.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(100, 100, 200, 0.5)
        Note over CLI,SDK: burn summary / burn hotspots flow (new)
    end
    participant CLI as CLI run_inner
    participant Handle as LedgerHandle (SDK)
    participant Compute as query_verbs/summary/compute.rs
    participant Analyze as analyze layer (pub(crate))
    participant Ledger as Ledger/SQL

    CLI->>Handle: summary_report(SummaryReportOptions)
    Handle->>Compute: build_summary_report_query(opts)
    Handle->>Ledger: query_enriched_turns(query)
    Ledger-->>Handle: Vec<EnrichedTurn>
    Handle->>Compute: filter_summary_enriched_turns(turns, agent, provider)
    Handle->>Analyze: summarize_fidelity, compute_quality, summarize_replacement_savings
    Analyze-->>Handle: FidelitySummary (BTreeMap, stable order), QualityResult
    Handle->>Compute: compute_summary_by_tool_report / aggregate_by_model
    Compute->>Ledger: load user turns for by-tool attribution
    Compute-->>Handle: SummaryByToolReport
    Handle-->>CLI: SummaryReport variant
    CLI->>CLI: emit_json or emit_human (thin presenter)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • AgentWorkforce/burn#429: Both PRs modify crates/relayburn-sdk/src/analyze.rs public re-export surface by removing or narrowing previously exported analyzer/provider symbols at the same code-level API boundary.
  • AgentWorkforce/burn#483: The main PR refactors the compare CLI to render from LedgerHandle::compare_timeseries results (including bucket mode wiring), which is tightly coupled to that PR's addition of compare_timeseries/summary_timeseries SDK APIs.

Poem

🐇 A rabbit once juggled too many pub things,
Now crate-private helpers wear quieter rings.
The summary verb knows its pipeline by heart,
And fidelity keys stay in BTreeMap's art.
Incremental modules split neatly in two—
Refactor complete! Fewer lines for the brew. 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: a major architecture refactor with breaking API changes driving version 4.0.0, clearly indicating the scope and significance of the pull request.
Description check ✅ Passed The description is comprehensive and directly related to the changeset. It explains the refactoring rationale, verifies equivalence of behavior-preserving changes, documents breaking changes, and provides clear release instructions.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/architecture-loop

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (1)
crates/relayburn-sdk/src/analyze/provider.rs (1)

27-27: 🧹 Nitpick | 🔵 Trivial

Narrow provider_for and provider_for_model_with_rules to pub(crate) to match documented intent.

TurnProvider is crate-private, and lib.rs explicitly documents that provider_for / provider_for_model_with_rules should be pub(crate) internals, not publicly re-exported. However, these functions are still declared pub fn in provider.rs. Narrow them to align with this design decision.

♻️ Proposed visibility cleanup
-pub fn provider_for(turn: &TurnRecord) -> TurnProvider {
+pub(crate) fn provider_for(turn: &TurnRecord) -> TurnProvider {
     provider_for_with_rules(turn, default_rules())
 }
@@
-pub fn provider_for_model_with_rules(
+pub(crate) fn provider_for_model_with_rules(
     model: &str,
     source: Option<SourceKind>,
     rules: &[ProviderRule],
 ) -> TurnProvider {

Also applies to: 139-151

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/analyze/provider.rs` at line 27, The functions
`provider_for` and `provider_for_model_with_rules` in provider.rs are currently
declared as `pub fn` but should be narrowed to `pub(crate) fn` to align with the
crate-private design of `TurnProvider` and the documented intent in lib.rs.
Change the visibility modifier from `pub` to `pub(crate)` for both the
`provider_for` function and the `provider_for_model_with_rules` function to
ensure they are not publicly re-exported.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Line 7: The BREAKING entry at line 7 in CHANGELOG.md is too verbose and
implementation-focused with excessive details about specific function and type
names that were removed. Shorten this entry to a single concise, impact-first
sentence that clearly states: the Rust SDK no longer re-exports low-level
analyze-layer internals, and users should migrate to using the verb layer
(LedgerHandle methods/summary_report/hotspots/compare). Remove the detailed
lists of removed functions and types, and eliminate the implementation
backstory, keeping only what directly impacts users and how they should adapt
their code.

In `@crates/relayburn-cli/src/commands/hotspots/mod.rs`:
- Around line 81-82: Update the documentation comment at lines 81-82 in the
hotspots module to replace the deprecated term "waste-pattern detectors" with
the new "hotspots" terminology. Ensure the help text refers to hotspots
detectors instead of waste-pattern detectors to maintain consistency with the
current API naming conventions.

In `@crates/relayburn-cli/src/commands/summary/json.rs`:
- Around line 63-73: The json! macro on line 70 is treating the unquoted
identifier label_key as a literal field name instead of using the variable
value. The variable label_key (defined from report.group_by.wire_str() at line
56) should be used as a dynamic key in the JSON object. Fix this by wrapping the
label_key variable in parentheses in the json! macro so it evaluates the
variable and uses its string value as the actual field name, rather than
creating a literal "label_key" field. This will ensure non-tag grouped reports
emit the correct dynamic keys like "model" or "provider" instead of a hardcoded
"label_key" field.

In `@crates/relayburn-sdk/src/query_verbs/summary/compute.rs`:
- Around line 154-161: The resolve_summary_agent_session_tree function is using
Query::default() which loses source filtering context when querying
relationships, allowing unrelated sessions to be included. Modify the function
signature to accept a query parameter (likely a Query reference), replace the
Query::default() call in ledger.query_relationships() with this parameter, and
then update the two call sites in
crates/relayburn-sdk/src/query_verbs/summary/mod.rs (at the original lines 592
and 656) to pass the source-scoped query variable (q) instead of letting the
function create a default query.

In `@crates/relayburn-sdk/src/reader/classifier/bash_parse.rs`:
- Around line 116-122: The env command handler in the block where binary equals
"env" loses quoted argument boundaries when calling env_args.join(" "), causing
arguments like "npm test" to be reparsed as separate tokens. Instead of joining
the env_args vector into a string and recursively calling
parse_bash_command_inner with a reparsed string, either refactor
parse_bash_command_inner to accept the token vector directly (avoiding
reparsing), or re-quote tokens in env_args that contain whitespace before
joining them to preserve the token boundaries during reparsing.

In `@crates/relayburn-sdk/src/reader/claude/incremental.rs`:
- Around line 268-278: The prescan_nodes function call rebuilds nodes_by_uuid
but fails to carry forward the slash-triad state needed for resumed parses,
causing assistant messages with parent chains pointing to pre-offset slash
triads to miss the Skill override. Modify prescan_nodes to return or populate a
compact slash-triad state containing only the triad row and skill-UUID
information (not full user row data via obj.clone()). After prescan_nodes
completes when start_offset is greater than zero, capture this compact triad
state and feed it into the detect_slash_triads function before processing the
current pass, ensuring the slash-triad detector has the complete context from
both the prescan phase and the current incremental parse. Apply this fix in all
locations where prescan_nodes is called and detect_slash_triads is subsequently
invoked, including the sections around lines 385-391 and 673-686.

---

Nitpick comments:
In `@crates/relayburn-sdk/src/analyze/provider.rs`:
- Line 27: The functions `provider_for` and `provider_for_model_with_rules` in
provider.rs are currently declared as `pub fn` but should be narrowed to
`pub(crate) fn` to align with the crate-private design of `TurnProvider` and the
documented intent in lib.rs. Change the visibility modifier from `pub` to
`pub(crate)` for both the `provider_for` function and the
`provider_for_model_with_rules` function to ensure they are not publicly
re-exported.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: c2f9b466-80a1-43e1-b68f-1b9615a09cd5

📥 Commits

Reviewing files that changed from the base of the PR and between 30e23b2 and d3c5122.

📒 Files selected for processing (47)
  • CHANGELOG.md
  • crates/relayburn-cli/src/commands/compare.rs
  • crates/relayburn-cli/src/commands/hotspots/human.rs
  • crates/relayburn-cli/src/commands/hotspots/json.rs
  • crates/relayburn-cli/src/commands/hotspots/mod.rs
  • crates/relayburn-cli/src/commands/summary.rs
  • crates/relayburn-cli/src/commands/summary/human.rs
  • crates/relayburn-cli/src/commands/summary/json.rs
  • crates/relayburn-cli/src/commands/summary/mod.rs
  • crates/relayburn-sdk/src/analyze.rs
  • crates/relayburn-sdk/src/analyze/claude_md.rs
  • crates/relayburn-sdk/src/analyze/compare.rs
  • crates/relayburn-sdk/src/analyze/context_delta.rs
  • crates/relayburn-sdk/src/analyze/cost.rs
  • crates/relayburn-sdk/src/analyze/fidelity.rs
  • crates/relayburn-sdk/src/analyze/findings.rs
  • crates/relayburn-sdk/src/analyze/flow_graph.rs
  • crates/relayburn-sdk/src/analyze/hotspots.rs
  • crates/relayburn-sdk/src/analyze/overhead.rs
  • crates/relayburn-sdk/src/analyze/patterns.rs
  • crates/relayburn-sdk/src/analyze/pricing.rs
  • crates/relayburn-sdk/src/analyze/provider.rs
  • crates/relayburn-sdk/src/analyze/provider_reattribution.rs
  • crates/relayburn-sdk/src/analyze/quality.rs
  • crates/relayburn-sdk/src/analyze/replacement_savings.rs
  • crates/relayburn-sdk/src/analyze/subagent_tree.rs
  • crates/relayburn-sdk/src/analyze/tool_call_patterns.rs
  • crates/relayburn-sdk/src/analyze/tool_output_bloat.rs
  • crates/relayburn-sdk/src/lib.rs
  • crates/relayburn-sdk/src/query_verbs/compare.rs
  • crates/relayburn-sdk/src/query_verbs/mod.rs
  • crates/relayburn-sdk/src/query_verbs/summary.rs
  • crates/relayburn-sdk/src/query_verbs/summary/compute.rs
  • crates/relayburn-sdk/src/query_verbs/summary/mod.rs
  • crates/relayburn-sdk/src/reader/classifier.rs
  • crates/relayburn-sdk/src/reader/classifier/bash_parse.rs
  • crates/relayburn-sdk/src/reader/classifier/slash_triads.rs
  • crates/relayburn-sdk/src/reader/claude.rs
  • crates/relayburn-sdk/src/reader/claude/incremental.rs
  • crates/relayburn-sdk/src/reader/codex.rs
  • crates/relayburn-sdk/src/reader/codex/incremental.rs
  • crates/relayburn-sdk/src/reader/opencode.rs
  • crates/relayburn-sdk/src/reader/types.rs
  • crates/relayburn-sdk/src/reader/user_turn.rs
  • packages/mcp/CHANGELOG.md
  • packages/relayburn/CHANGELOG.md
  • packages/sdk-node/CHANGELOG.md
💤 Files with no reviewable changes (2)
  • crates/relayburn-cli/src/commands/summary.rs
  • crates/relayburn-sdk/src/query_verbs/summary.rs

Comment thread CHANGELOG.md Outdated
Comment thread crates/relayburn-cli/src/commands/hotspots/mod.rs Outdated
Comment on lines +63 to +73
let mut row = if report.group_by == SummaryGroupBy::Tag {
json!({
"tag": report.tag_key.as_deref().unwrap_or(""),
"value": report.tag_values.get(idx).cloned().flatten(),
})
} else {
json!({
label_key: r.label,
})
};
let obj = row.as_object_mut().unwrap();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n crates/relayburn-cli/src/commands/summary/json.rs | head -100

Repository: AgentWorkforce/burn

Length of output: 3949


🏁 Script executed:

# Verify the behavior of json!() macro with variable keys by checking how label_key is used
cat -n crates/relayburn-cli/src/commands/summary/json.rs | sed -n '50,90p'

Repository: AgentWorkforce/burn

Length of output: 1728


🏁 Script executed:

# Check if there are any references to old terminology (waste, diagnose, context) in this file
rg -i "waste|diagnose|context" crates/relayburn-cli/src/commands/summary/json.rs

Repository: AgentWorkforce/burn

Length of output: 45


🏁 Script executed:

# Verify what wire_str() returns by checking the SummaryGroupBy enum
rg -A 15 "enum SummaryGroupBy" --type rust

Repository: AgentWorkforce/burn

Length of output: 1216


🏁 Script executed:

# Check the serde_json::json! macro documentation behavior with dynamic keys
# Let's look for similar usage patterns in the codebase to see how dynamic keys are handled elsewhere
rg -B 2 -A 2 "Map::new()" crates/relayburn-cli/src/commands/summary/json.rs

Repository: AgentWorkforce/burn

Length of output: 163


🏁 Script executed:

# Let's verify the complete context around this function to understand the full scope
cat -n crates/relayburn-cli/src/commands/summary/json.rs | sed -n '51,130p'

Repository: AgentWorkforce/burn

Length of output: 3407


Use a true dynamic key for grouped row labels.

Line 70 currently emits a literal "label_key" field, not "model"/"provider", breaking the JSON schema for non-tag grouped reports. The variable label_key (defined line 56 as report.group_by.wire_str()) contains the intended key string but the json! macro treats unquoted identifiers as literal keys.

Suggested fix
         .map(|(idx, r)| {
             let mut row = if report.group_by == SummaryGroupBy::Tag {
                 json!({
                     "tag": report.tag_key.as_deref().unwrap_or(""),
                     "value": report.tag_values.get(idx).cloned().flatten(),
                 })
             } else {
-                json!({
-                    label_key: r.label,
-                })
+                let mut obj = Map::new();
+                obj.insert(label_key.to_string(), json!(r.label));
+                Value::Object(obj)
             };
             let obj = row.as_object_mut().unwrap();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let mut row = if report.group_by == SummaryGroupBy::Tag {
json!({
"tag": report.tag_key.as_deref().unwrap_or(""),
"value": report.tag_values.get(idx).cloned().flatten(),
})
} else {
json!({
label_key: r.label,
})
};
let obj = row.as_object_mut().unwrap();
let mut row = if report.group_by == SummaryGroupBy::Tag {
json!({
"tag": report.tag_key.as_deref().unwrap_or(""),
"value": report.tag_values.get(idx).cloned().flatten(),
})
} else {
let mut obj = Map::new();
obj.insert(label_key.to_string(), json!(r.label));
Value::Object(obj)
};
let obj = row.as_object_mut().unwrap();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-cli/src/commands/summary/json.rs` around lines 63 - 73, The
json! macro on line 70 is treating the unquoted identifier label_key as a
literal field name instead of using the variable value. The variable label_key
(defined from report.group_by.wire_str() at line 56) should be used as a dynamic
key in the JSON object. Fix this by wrapping the label_key variable in
parentheses in the json! macro so it evaluates the variable and uses its string
value as the actual field name, rather than creating a literal "label_key"
field. This will ensure non-tag grouped reports emit the correct dynamic keys
like "model" or "provider" instead of a hardcoded "label_key" field.

Comment on lines +154 to +161
pub(crate) fn resolve_summary_agent_session_tree(
ledger: &crate::ledger::Ledger,
agent_id: &str,
) -> Result<HashSet<String>> {
Ok(collect_summary_agent_session_tree(
&ledger.query_relationships(&Query::default())?,
agent_id,
))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep agent-session expansion source-scoped.

Line 159 uses Query::default(), which drops source filtering and can pull relationship rows from other sources, admitting unrelated sessions into agent-filtered summaries.

Suggested fix
 pub(crate) fn resolve_summary_agent_session_tree(
     ledger: &crate::ledger::Ledger,
+    q: &Query,
     agent_id: &str,
 ) -> Result<HashSet<String>> {
     Ok(collect_summary_agent_session_tree(
-        &ledger.query_relationships(&Query::default())?,
+        &ledger.query_relationships(&Query {
+            source: q.source,
+            ..Default::default()
+        })?,
         agent_id,
     ))
 }

Also update callers in crates/relayburn-sdk/src/query_verbs/summary/mod.rs (Line 592 and Line 656) to pass &q.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/query_verbs/summary/compute.rs` around lines 154 -
161, The resolve_summary_agent_session_tree function is using Query::default()
which loses source filtering context when querying relationships, allowing
unrelated sessions to be included. Modify the function signature to accept a
query parameter (likely a Query reference), replace the Query::default() call in
ledger.query_relationships() with this parameter, and then update the two call
sites in crates/relayburn-sdk/src/query_verbs/summary/mod.rs (at the original
lines 592 and 656) to pass the source-scoped query variable (q) instead of
letting the function create a default query.

Comment on lines +116 to +122
if binary == "env" {
let env_args = env_command_args(&tokens, i + 1);
if env_args.is_empty() {
return None;
}
return parse_bash_command_inner(&env_args.join(" "), depth + 1);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve token boundaries when unwrapping env.

env_args.join(" ") loses quoted arguments, so env bash -lc "npm test" is reparsed as bash -lc npm test and normalizes to npm instead of npm test. Recurse over the remaining token vector directly, or re-quote tokens containing whitespace before reparsing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/reader/classifier/bash_parse.rs` around lines 116 -
122, The env command handler in the block where binary equals "env" loses quoted
argument boundaries when calling env_args.join(" "), causing arguments like "npm
test" to be reparsed as separate tokens. Instead of joining the env_args vector
into a string and recursively calling parse_bash_command_inner with a reparsed
string, either refactor parse_bash_command_inner to accept the token vector
directly (avoiding reparsing), or re-quote tokens in env_args that contain
whitespace before joining them to preserve the token boundaries during
reparsing.

Comment on lines +268 to +278
if start_offset > 0 {
let pre = prescan_nodes(
path,
start_offset,
&mut nodes_by_uuid,
&mut evidence,
&mut tool_result_counters,
)?;
last_assistant_message_id = pre.last_assistant_message_id;
next_event_index = pre.next_event_index;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Carry compact slash-triad state through the resume prescan.

On resumed parses, prescan_nodes rebuilds nodes_by_uuid but does not seed the slash-triad input, so an assistant emitted after start_offset whose parent chain points to a pre-offset slash triad will not get the Skill override. Also, obj.clone() stores full user rows despite the detector needing only UUID/parent/purpose markers. Carry a compact triad row/skill-UUID state from prescan and append compact current-pass rows before detect_slash_triads.

Also applies to: 385-391, 673-686

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/reader/claude/incremental.rs` around lines 268 -
278, The prescan_nodes function call rebuilds nodes_by_uuid but fails to carry
forward the slash-triad state needed for resumed parses, causing assistant
messages with parent chains pointing to pre-offset slash triads to miss the
Skill override. Modify prescan_nodes to return or populate a compact slash-triad
state containing only the triad row and skill-UUID information (not full user
row data via obj.clone()). After prescan_nodes completes when start_offset is
greater than zero, capture this compact triad state and feed it into the
detect_slash_triads function before processing the current pass, ensuring the
slash-triad detector has the complete context from both the prescan phase and
the current incremental parse. Apply this fix in all locations where
prescan_nodes is called and detect_slash_triads is subsequently invoked,
including the sections around lines 385-391 and 673-686.

… changelog

CodeRabbit review follow-up (the two valid nits; the other findings were a
verified false positive plus two pre-existing items left for a separate PR):
- `burn hotspots --patterns` help said "waste-pattern detectors", reintroducing
  the deprecated `waste` term; rename to "hotspot-pattern detectors" per the
  repo's hotspots terminology rule.
- Trim the verbose BREAKING SDK changelog entry to an impact-first summary
  (keeps a couple representative removed types + the verb-layer migration path).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@willwashburn willwashburn merged commit ddcaf88 into main Jun 23, 2026
12 checks passed
@willwashburn willwashburn deleted the refactor/architecture-loop branch June 23, 2026 02:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant