Skip to content

refactor(sdk): dedup + cohesion follow-up across analyze and time handling#487

Merged
willwashburn merged 22 commits into
mainfrom
refactor/sdk-dedup-followup
Jun 22, 2026
Merged

refactor(sdk): dedup + cohesion follow-up across analyze and time handling#487
willwashburn merged 22 commits into
mainfrom
refactor/sdk-dedup-followup

Conversation

@willwashburn

@willwashburn willwashburn commented Jun 22, 2026

Copy link
Copy Markdown
Member

Follow-up cleanup after #486, consolidating remaining duplication, tightening module boundaries, organizing oversized files, removing dead code, and centralizing the SDK's date/time handling. Every change is behavior-preserving and individually verified; the riskier ones were adversarially checked (byte-identity sweeps for the time math, diff-equivalence for the builder/skeleton refactors).

analyze module

Deduplication

  • util::percentile<T> replaces two byte-identical f64 copies; findings::hotspots_action promoted to shared.
  • cost::total_cost_for_turn + sum_turn_costs scalar helpers replace a repeated .map(|c| c.total).unwrap_or(0.0) idiom and hand-rolled per-turn sum loops.
  • WasteFinding::session_cost builder + with_* chainers collapse 10 cost-driven finding adapters.
  • detect_streaks skeleton (StreakOp{Extend,Rotate,Break}) centralizes the bug-prone commit-on-boundary loop shared by the 5 retry/failure/cancellation detectors.
  • util::group_turns_by_session_sorted folds a repeated post-group sort_by_key(turn_index) across 5 detectors.
  • util::first_seen_unique{,_by} replaces 6 hand-rolled first-seen-unique loops.
  • source_kind_str → existing SourceKind::wire_str(); strip_provider_prefix hoisted to one home; truncate_chars extracted.

Abstraction boundary

  • External consumers (flow.rs, reader span_tree builders, tests) now go through crate::analyze's public re-export surface instead of reaching into private submodule paths.

File organization (no logic change)

  • patterns.rs (1564 lines) split into a parent orchestrator + cohesive patterns/{streaks,edits,compaction,skills}.rs.
  • Inline test blocks externalized via #[path] (matching the existing patterns_tests.rs convention): hotspots, tool_output_bloat, subagent_tree, context_delta, ghost_surface.

Dead code

  • Removed the legacy subagent-tree builder (~200 lines). It was dead for any ledger ingested by current code (every session carries an always-emitted Root relationship row), and the relationship path reproduces it from TurnRecord.subagent alone. Verified empirically: all 48 subagent tests pass through the unified path, including the unresolved-sidechain and nested-cost cases. (Old event logs that predate Root emission are not supported without a re-ingest — confirmed acceptable.)

SDK time handling

util::time is now the single home for the wire timestamp format:

  • Dropped two duplicate ISO-8601 parsers (a verbatim copy in reader/inference.rs with a stale justification comment, and a pass-through shim in flow.rs).
  • Centralized the Howard Hinnant ymd_to_days/days_to_ymd civil-date primitives (previously copied across util/time, query_verbs, reader/opencode).
  • Single util::time::format_iso_ms for the YYYY-MM-DDTHH:MM:SS.mmmZ format (was hand-rolled in two places).
  • Generic query_verbs::partition_into_buckets<T> shared by the summary and compare --bucket paths.

Documented why quality::parse_iso8601_ms is deliberately not shared (it's a strict superset — TZ offsets + range rejection — that outcome inference needs).

Verification

  • Build warning-free; clippy --workspace --all-targets clean; 990 workspace tests pass.
  • Timestamp parse/format equivalence proven byte-identical via exhaustive numeric sweeps (the changes feed serialized output: opencode ingest ts, --bucket edges, the --since SQLite filter).
  • Live-tested throughout: summary, hotspots (+findings/subagent), overhead, compare, flow, subagent-tree render.

🤖 Generated with Claude Code

Review in cubic

willwashburn and others added 20 commits June 21, 2026 23:03
Promote the byte-identical f64 percentile in claude_md/subagent_tree to a
single generic util::percentile<T: Copy + Default>. The u64 variant in
tool_output_bloat stays local (it scales p over 0..=100 and sorts internally).

Promote findings::hotspots_action to pub(crate) and drop the duplicate copy
in tool_call_patterns, matching how severity_from_usd is already shared.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… helpers

Two scalar-cost helpers in the cost module replace the repeated
`cost_for_turn(...).map(|c| c.total).unwrap_or(0.0)` idiom and the
hand-rolled per-turn sum loops:

- total_cost_for_turn(turn, pricing) -> f64
- sum_turn_costs(turns, pricing) -> f64  (iteration-order accumulation)

Routed patterns.rs (sum_cost_for_turns now delegates; three idiom/loop
sites) and subagent_tree.rs (three idiom sites). compare/provider keep
cost_for_turn (they need the full breakdown and priced-turn gating);
claude_md sums session totals, not per-turn costs, so it is untouched.

Adding $0 for unpriced turns is bit-identical to the prior skip
(x + 0.0 == x), so the cost-precision fixtures stay green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Collapse the repeated cost-driven finding scaffold (severity_from_usd(cost),
session_id clone, usd_per_session savings, hotspots_action, event_source None)
into a WasteFinding::session_cost(kind, session_id, cost, title, detail)
constructor plus with_severity / with_event_source / with_tokens_per_session
chainers.

Routes the nine in-file pattern adapters and tool_call_pattern_to_finding
through it (net -41 lines). ghost_surface and tool_output_bloat keep struct
literals: their bespoke Paste actions and fixed/decoupled severity would mean
overriding every default. Output is byte-identical (all 27 finding tests pass,
including the compaction zero-tokens guard and the edit-heavy severity cap).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…terns

The five status-pattern detectors (graph + flat retry, graph + flat failure,
graph cancellation) each hand-rolled the same commit-on-boundary streak loop,
the bug-prone part being the commit/clear/re-push dispatch. Extract that
control flow into detect_streaks(elements, classify, commit) with a StreakOp
{ Extend, Rotate, Break } verb; each detector now supplies only its own
classify (what extends/breaks a streak) and commit (how a streak becomes a
finding) over its own element type.

This deliberately abstracts the *mechanism*, not the accessors: the graph vs
flat domain differences (ToolResultEventRef vs ToolCallRef, dedup_defined_turns
vs dedup_turns, event-source coalescing, the failure-run has_non_tool_result
rule) stay explicit in local closures rather than behind a leaky accessor
trait. Behavior is unchanged — all 992 tests pass, including the graph/flat
fallback, cancellation, break-boundary, and retry-vs-failure guard cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Line-wrapping cleanup for the WasteFinding::session_cost calls introduced in
the builder commit; no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…paths

query_verbs/flow.rs, the reader span_tree builders, and the query_verbs tests
reached into crate::analyze::span_tree:: and crate::analyze::context_delta::
directly, bypassing the curated re-export surface in analyze.rs. Route them
through crate::analyze::{TurnSpanTree, OwnerRail, ContextDelta, ContextDeltaOpts,
deltas_for_session, AttrValue, SpanKind, SpanNode, SpanStatus} instead, adding
the missing names to the shared query_verbs import block.

No external code references analyze submodule paths anymore — the re-export
list is now the single entry point. Pure path change; all 992 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…modules

patterns.rs had grown to 1564 lines bundling ~10 independent detectors plus
shared plumbing. Split the detector families into cohesive submodules under
patterns/, leaving the shared plumbing and the detect_patterns orchestrator
in the parent:

  patterns.rs        755  consts, tool-name helpers, DetectPatternsOptions,
                          detect_patterns, ref structs, ContentIndex,
                          flatten/dedup/cost helpers, detect_streaks/StreakOp,
                          graph event helpers, build_summaries
  patterns/streaks.rs    372  retry / failure / cancellation (graph + flat) + signatures
  patterns/edits.rs      157  edit-revert, edit-heavy
  patterns/compaction.rs 150  compaction-loss + window summary
  patterns/skills.rs     157  skill recall-dup, pruning-protection, system-prompt-tax

Pure code movement. Each submodule reaches the parent's shared plumbing via
`use super::*`. The only non-movement edits are widening GraphStatusPatterns /
detect_graph_status_patterns_for_session / detect_compaction_losses from
private to pub(super) so the parent orchestrator can call across the boundary.
detect_patterns's body is unchanged. All 992 tests pass, clippy clean, build
warning-free; the moved function bodies are byte-identical to the originals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hotspots.rs was 2451 lines, but ~1426 of those were the inline
`#[cfg(test)] mod tests` block — the code itself is a cohesive ~1024-line
attribution + aggregation module. Move the test block verbatim into
hotspots_tests.rs, included via `#[cfg(test)] #[path = "hotspots_tests.rs"]
mod tests;`, mirroring the existing patterns_tests.rs convention.

The source file is now 1027 lines of pure code. The module name stays `tests`
so test paths (analyze::hotspots::tests::*) are unchanged; the 24 hotspots
tests pass identically. Pure relocation — no code or test logic touched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tool_output_bloat.rs was 1676 lines, ~1072 of them the inline test block.
Move the tests verbatim into tool_output_bloat_tests.rs via
`#[cfg(test)] #[path = ...] mod tests;`, matching the patterns_tests.rs /
hotspots_tests.rs convention. Source file is now 606 lines of pure code;
the 29 tests pass identically (paths unchanged). Pure relocation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Apply the patterns_tests.rs / hotspots_tests.rs convention to the two
remaining large analyze files with a clean single inline test block:

  subagent_tree.rs  1277 -> 826 code  (tests -> subagent_tree_tests.rs)
  context_delta.rs  1077 -> 658 code  (tests -> context_delta_tests.rs)

Tests moved verbatim, module name kept `tests` so paths are unchanged.
All tests pass, clippy clean. Pure relocation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- ghost_surface::source_kind_str was byte-identical to the existing public
  SourceKind::wire_str(); delete it and call wire_str() at the two sites.
- strip_provider_prefix was defined verbatim in both cost.rs and provider.rs;
  hoist a single pub(crate) copy into provider_reattribution.rs (the provider-
  normalization module both already import from) and route both callers there.
- extract_error_signature and truncate_for_preview shared an identical
  "char-count, else take(max-1) + ellipsis" tail; extract util::truncate_chars
  and delegate from both.

Behavior-preserving (identical outputs); all 992 tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…st-group sort

Five detectors (patterns, hotspots, quality, tool_call_patterns, claude_md)
grouped turns by session and then repeated
`bucket.sort_by_key(|t| t.turn_index)` inside the loop. Add a
group_turns_by_session_sorted util variant that does the stable sort once and
route all five through it. The sort is stable, so per-session ordering is
bit-identical; all 992 tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The "keep first occurrence per key, in input order" loop was reimplemented six
times: dedup_strings / dedup_numbers (tool_call_patterns), dedup_turns
(patterns), and three inline tools_involved builders in patterns/streaks.rs.
Add util::first_seen_unique_by (key fn) + first_seen_unique (value-keyed) and
route all six through them. dedup_numbers keeps its trailing sort_unstable at
the call site; the streaks tools_involved order (fixture-gated) is preserved
since first-seen order is identical. All 992 tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A survey flagged quality.rs's ISO parser as a missed consolidation against
util::time::parse_iso_ms. It is a deliberate superset (applies ±HH:MM offsets,
rejects out-of-range components and trailing garbage) that outcome inference
needs for correct is_recent classification, whereas the shared parser stays
lean for the reader/ledger ingest path. Document the divergence so it is not
naively merged later.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ghost_surface.rs was 1235 lines with a ~700-line inline test block. Move the
main `mod tests` block verbatim to ghost_surface_tests.rs via
`#[cfg(test)] #[path] mod tests;` (the test-only `use adapters::{...}` import
stays inline). Source file is now 533 lines of code; the 30 tests pass
identically. Pure relocation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
build_subagent_tree dispatched to a separate build_legacy_subagent_trees
fallback whenever no relationship rows were supplied. That path was dead for
any ledger ingested by current code (every session now carries an always-
emitted Root relationship row), and build_relationship_trees already folds in
the per-turn TurnRecord.subagent data via add_legacy_subagent_gaps +
ensure_turn_session_roots — so it reconstructs the same tree from subagent
fields alone when handed an empty relationship slice.

Collapse the dispatch to the single relationship path (empty slice when no
rows), and delete the now-dead build_legacy_subagent_trees, build_session_tree,
assign_depth, and resolve_parent_or_root (~200 lines). The equivalence guard
test now asserts the no-relationship tree matches the with-relationship tree —
the invariant that lets the fallback stand in for the removed builder.

Per owner decision: old event logs that predate Root emission are not supported
without a re-ingest. All 992 tests pass (incl. the nested-subagent and
unresolved-sidechain cases), clippy clean, live subagent-tree render verified.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
reader/inference.rs::parse_iso_ms was a byte-for-byte copy of
util::time::parse_iso_ms; its "keep reader free of a query_verbs dependency"
comment is stale (the canonical parser now lives in crate::util::time, which
sibling reader submodules already import). Delete the copy, import the shared
one, and drop its two unit tests (covered by util/time's own tests).

query_verbs/flow.rs::parse_iso_ms_compat was a pure pass-through wrapper around
the same shared helper; inline it at the two call sites and delete the shim.

Behavior-preserving; build warning-free, clippy clean, tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The days-from-civil / civil-from-days primitives were copied across
util::time (inlined in parse_iso_ms), query_verbs::mod, and reader::opencode.
Promote ymd_to_days + days_to_ymd to pub(crate) in util::time and route the
callers there:

- parse_iso_ms now calls ymd_to_days instead of inlining it.
- query_verbs::ymd_to_days keeps only its out-of-range guard, deferring the
  math to the shared primitive; its days_to_ymd copy is replaced by the import.
- reader::opencode::ms_to_iso uses the shared days_to_ymd (the {:04} year
  format works on the i64 year identically to the old i32).

Identical math, behavior-preserving; build warning-free, clippy clean, 990
tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…p format

The canonical `YYYY-MM-DDTHH:MM:SS.mmmZ` formatter was hand-rolled twice
(query_verbs::format_iso_z_ms and reader::opencode::ms_to_iso), each pairing
days_to_ymd with the same format string. Add util::time::format_iso_ms as the
single source of truth:

- opencode's six ms_to_iso call sites route to format_iso_ms; the local copy
  is deleted.
- format_iso_z_ms keeps its (seconds, millis) convenience signature but is now
  a thin adapter onto format_iso_ms, so the template lives in one place.

Byte-identical output (same days_to_ymd + same template); avoids the heavier
`time`-crate migration. Build warning-free, clippy clean, 990 tests pass, live
bucket/ingest timestamps verified.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
summary and compare each hand-rolled the same --bucket partition: anchor →
ensure_bucket_span → Buckets::new → allocate per-bucket vecs → loop pushing by
iso_z_to_epoch_secs/index_for. They differed only in element type (TurnRecord
vs EnrichedTurn) and ts accessor (t.ts vs t.turn.ts). Extract a generic
query_verbs::partition_into_buckets<T>(items, since, bucket_secs, ts_of) that
returns Ok(None) when there's no anchor, letting each caller keep its own
empty-timeseries return shape. Behavior-preserving; 990 tests pass, clippy
clean, both bucketed surfaces verified live.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@gemini-code-assist

Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@willwashburn, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 35 minutes and 26 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4eb89970-fd82-4162-a78c-11161801638a

📥 Commits

Reviewing files that changed from the base of the PR and between 003fe5a and 263e7fa.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs
  • crates/relayburn-sdk/src/reader/opencode.rs
📝 Walkthrough

Walkthrough

This PR centralizes shared utilities (time formatting, deduplication, percentile, session grouping, cost helpers, provider-prefix stripping) into util.rs and util/time.rs, then propagates them across the codebase by removing duplicate local implementations. Pattern detection logic from patterns.rs is extracted into four submodules (compaction, edits, skills, streaks). A WasteFinding builder API is introduced in findings.rs. The subagent tree always uses graph-based construction. A generic partition_into_buckets helper consolidates timeseries bucketing. Large inline test blocks are moved to dedicated *_tests.rs files.

Changes

relayburn-sdk analyze module refactoring

Layer / File(s) Summary
Shared analyze and time utilities
crates/relayburn-sdk/src/analyze/util.rs, crates/relayburn-sdk/src/util/time.rs
Adds first_seen_unique_by/first_seen_unique, group_turns_by_session_sorted, truncate_chars, and percentile to util.rs. Adds ymd_to_days, days_to_ymd, and format_iso_ms to util/time.rs; refactors parse_iso_ms to delegate date math to ymd_to_days.
Centralized cost and provider-prefix helpers
crates/relayburn-sdk/src/analyze/provider_reattribution.rs, crates/relayburn-sdk/src/analyze/cost.rs, crates/relayburn-sdk/src/analyze/provider.rs
Moves strip_provider_prefix into provider_reattribution.rs and removes duplicate implementations from cost.rs and provider.rs. Adds total_cost_for_turn and sum_turn_costs as new public helpers to cost.rs.
WasteFinding builder API and adapter refactors
crates/relayburn-sdk/src/analyze/findings.rs
Introduces WasteFinding::session_cost constructor and with_severity, with_event_source, with_tokens_per_session chainers; promotes hotspots_action to crate-visible. Refactors all per-pattern adapter functions (retry_loop_to_finding, failure_run_to_finding, cancellation_run_to_finding, compaction_loss_to_finding, edit_revert_to_finding, edit_heavy_to_finding, skill adapters, system-prompt-tax) to use the new builder.
Pattern detection modularization into submodules
crates/relayburn-sdk/src/analyze/patterns.rs, crates/relayburn-sdk/src/analyze/patterns/compaction.rs, crates/relayburn-sdk/src/analyze/patterns/edits.rs, crates/relayburn-sdk/src/analyze/patterns/skills.rs, crates/relayburn-sdk/src/analyze/patterns/streaks.rs
Extracts compaction-loss, edit-revert/edit-heavy, skill-recall/pruning/system-prompt-tax, and graph/non-graph retry/failure/cancellation detectors into four new submodules. patterns.rs adopts group_turns_by_session_sorted, sum_turn_costs, truncate_chars, first_seen_unique_by, and introduces the generic StreakOp/detect_streaks helper.
Subagent tree graph-only construction and depth refactor
crates/relayburn-sdk/src/analyze/subagent_tree.rs, crates/relayburn-sdk/src/analyze/subagent_tree_tests.rs
Removes build_legacy_subagent_trees/build_session_tree; build_subagent_tree always calls build_relationship_trees with empty-slice fallback. Integrates BFS depth with cycle protection into finalize_tree. Switches turn cost to total_cost_for_turn. Extracts tests to new subagent_tree_tests.rs covering cumulative fold, unresolved nodes, relationship equivalence, and aggregation stats.
Sorted session grouping adoption across analyze consumers
crates/relayburn-sdk/src/analyze/claude_md.rs, crates/relayburn-sdk/src/analyze/hotspots.rs, crates/relayburn-sdk/src/analyze/quality.rs, crates/relayburn-sdk/src/analyze/ghost_surface.rs, crates/relayburn-sdk/src/analyze/tool_call_patterns.rs
Replaces group_turns_by_session + inline sort with group_turns_by_session_sorted in all consumer functions. Removes local percentile from claude_md.rs. Switches ghost_surface_to_finding to wire_str() for source labels. Modernizes tool_call_patterns.rs deduplication and tool_call_pattern_to_finding to use the WasteFinding builder.
Shared partition_into_buckets helper and query verb type simplification
crates/relayburn-sdk/src/query_verbs/mod.rs, crates/relayburn-sdk/src/query_verbs/compare.rs, crates/relayburn-sdk/src/query_verbs/summary.rs, crates/relayburn-sdk/src/query_verbs/flow.rs, crates/relayburn-sdk/src/query_verbs/tests.rs
Adds generic partition_into_buckets helper to query_verbs/mod.rs and delegates format_iso_z_ms/ymd_to_days to util::time. compare_timeseries and summary_timeseries each replace their manual bucketing logic with one partition_into_buckets call. flow.rs drops fully-qualified type paths and removes parse_iso_ms_compat. Test import paths updated to match new module locations.
Reader modules adopt shared time utilities
crates/relayburn-sdk/src/reader/opencode.rs, crates/relayburn-sdk/src/reader/inference.rs, crates/relayburn-sdk/src/reader/claude/span_tree.rs, crates/relayburn-sdk/src/reader/codex/span_tree.rs
Replaces local ms_to_iso/days_to_ymd in opencode.rs and local parse_iso_ms in inference.rs with shared crate::util::time equivalents at all call sites. Span-tree readers updated to import types from crate::analyze instead of crate::analyze::span_tree.
Test file extractions for context delta, ghost surface, hotspots, and tool output bloat
crates/relayburn-sdk/src/analyze/context_delta.rs, crates/relayburn-sdk/src/analyze/context_delta_tests.rs, crates/relayburn-sdk/src/analyze/ghost_surface.rs, crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs, crates/relayburn-sdk/src/analyze/hotspots.rs, crates/relayburn-sdk/src/analyze/hotspots_tests.rs, crates/relayburn-sdk/src/analyze/tool_output_bloat.rs, crates/relayburn-sdk/src/analyze/tool_output_bloat_tests.rs
Moves inline #[cfg(test)] blocks from four production files to dedicated *_tests.rs modules via #[path = "..."] mod tests;. New test files provide conformance coverage for context delta edge cases (compaction clamping, subagent isolation, min_delta filtering, JSON wire format), ghost surface enumeration and finding conversion, hotspots attribution semantics (sibling capping, cross-model pricing, output-byte tracking), and tool output bloat static/observed/fixture-driven detection.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • AgentWorkforce/burn#452: Directly related — refactors context_delta.rs test suite into context_delta_tests.rs, the same pattern this PR applies to four additional modules.
  • AgentWorkforce/burn#483: Directly related — modifies query-verb bucketing plumbing for compare_timeseries/summary_timeseries, which this PR consolidates further into the shared partition_into_buckets helper.
  • AgentWorkforce/burn#486: Directly related — updates attribute_claude_md_refs in claude_md.rs to use group_turns_by_session, which this PR further replaces with group_turns_by_session_sorted.

Poem

🐇 Hop hop, the helpers are shared now at last,
No more copy-paste ghosts from the codebase's past.
The patterns all shuffled to their own cozy rooms,
The builder constructs findings — no manual looms!
The buckets partition with one tidy call,
A little refactor to tidy it all. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: a refactoring consolidating code deduplication and cohesion improvements across analyze and time handling modules, which accurately reflects the changeset's primary focus.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, detailing deduplication efforts, module boundary improvements, file organization, dead code removal, and SDK time handling centralization with verification steps.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/sdk-dedup-followup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 003fe5a78d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +67 to +68
let relationships = opts.relationships.unwrap_or(&[]);
build_relationship_trees(turns, relationships, opts.pricing)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve legacy roots for orphaned subagent parents

When BuildSubagentTreeOptions::new() is used without relationship rows, this now always runs the relationship graph builder instead of the legacy path. In the no-relationship case, a turn whose subagent.parent_agent_id points at a missing/cyclic agent now causes ensure_turn_session_roots/resolve_graph_parent to leave the synthetic parent as a separate root (or skip cycle attachment), so build_subagent_tree returns extra top-level trees and the main session's cumulative turns/cost omit those sidechain turns. The removed legacy builder explicitly reattached these sparse parent chains to the session root, which matters for older ledgers that only have TurnRecord.subagent metadata.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs`:
- Around line 4-705: The entire test module from the use statements through all
helper functions and test cases (such as make_inputs, observed, observed_multi,
count_map, user_text, claude_enumerates_agents_skills_commands,
codex_enumerates_prompts_skills_rules_memories, and all other tests) has
improper indentation with leading whitespace when they should be at the module
level. Remove all leading indentation from every line in this test module so
that top-level items like use statements, function definitions, and test
attributes start at column 0 without any preceding whitespace.

In `@crates/relayburn-sdk/src/analyze/patterns.rs`:
- Around line 622-623: The dedup_turns function uses a string format with pipe
separator ("{}|{}") as the dedup key, which is not collision-safe because if the
session_id or message_id contain the pipe character, distinct turns can
incorrectly be merged together, undercounting the contributing-turn cost.
Replace the format! string with a tuple key (t.session_id, t.message_id) passed
to first_seen_unique_by to ensure collision-free deduplication based on the
combined session and message IDs.

In `@crates/relayburn-sdk/src/analyze/util.rs`:
- Around line 126-131: The truncate_chars function performs a subtraction
operation max - 1 on line 130 without guarding against the case where max equals
zero, which will cause an integer underflow panic. Add an early return guard
check at the beginning of the truncate_chars function that returns an empty
string when max is 0, before any other logic executes, to prevent the underflow
from occurring during the take operation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 767b4524-397a-4edb-9b23-49eb57c30af0

📥 Commits

Reviewing files that changed from the base of the PR and between c08236a and 003fe5a.

📒 Files selected for processing (33)
  • crates/relayburn-sdk/src/analyze/claude_md.rs
  • crates/relayburn-sdk/src/analyze/context_delta.rs
  • crates/relayburn-sdk/src/analyze/context_delta_tests.rs
  • crates/relayburn-sdk/src/analyze/cost.rs
  • crates/relayburn-sdk/src/analyze/findings.rs
  • crates/relayburn-sdk/src/analyze/ghost_surface.rs
  • crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs
  • crates/relayburn-sdk/src/analyze/hotspots.rs
  • crates/relayburn-sdk/src/analyze/hotspots_tests.rs
  • crates/relayburn-sdk/src/analyze/patterns.rs
  • crates/relayburn-sdk/src/analyze/patterns/compaction.rs
  • crates/relayburn-sdk/src/analyze/patterns/edits.rs
  • crates/relayburn-sdk/src/analyze/patterns/skills.rs
  • crates/relayburn-sdk/src/analyze/patterns/streaks.rs
  • crates/relayburn-sdk/src/analyze/provider.rs
  • crates/relayburn-sdk/src/analyze/provider_reattribution.rs
  • crates/relayburn-sdk/src/analyze/quality.rs
  • crates/relayburn-sdk/src/analyze/subagent_tree.rs
  • crates/relayburn-sdk/src/analyze/subagent_tree_tests.rs
  • crates/relayburn-sdk/src/analyze/tool_call_patterns.rs
  • crates/relayburn-sdk/src/analyze/tool_output_bloat.rs
  • crates/relayburn-sdk/src/analyze/tool_output_bloat_tests.rs
  • crates/relayburn-sdk/src/analyze/util.rs
  • crates/relayburn-sdk/src/query_verbs/compare.rs
  • crates/relayburn-sdk/src/query_verbs/flow.rs
  • crates/relayburn-sdk/src/query_verbs/mod.rs
  • crates/relayburn-sdk/src/query_verbs/summary.rs
  • crates/relayburn-sdk/src/query_verbs/tests.rs
  • crates/relayburn-sdk/src/reader/claude/span_tree.rs
  • crates/relayburn-sdk/src/reader/codex/span_tree.rs
  • crates/relayburn-sdk/src/reader/inference.rs
  • crates/relayburn-sdk/src/reader/opencode.rs
  • crates/relayburn-sdk/src/util/time.rs

Comment thread crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs Outdated
Comment on lines +622 to +623
fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {
first_seen_unique_by(turns, |t| format!("{}|{}", t.session_id, t.message_id))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a collision-safe dedup key for turns.

format!("{}|{}", session_id, message_id) can merge distinct turns when IDs contain |, which undercounts contributing-turn cost in findings. Use a tuple key instead.

Suggested fix
 fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {
-    first_seen_unique_by(turns, |t| format!("{}|{}", t.session_id, t.message_id))
+    first_seen_unique_by(turns, |t| (t.session_id.as_str(), t.message_id.as_str()))
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {
first_seen_unique_by(turns, |t| format!("{}|{}", t.session_id, t.message_id))
fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {
first_seen_unique_by(turns, |t| (t.session_id.as_str(), t.message_id.as_str()))
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/analyze/patterns.rs` around lines 622 - 623, The
dedup_turns function uses a string format with pipe separator ("{}|{}") as the
dedup key, which is not collision-safe because if the session_id or message_id
contain the pipe character, distinct turns can incorrectly be merged together,
undercounting the contributing-turn cost. Replace the format! string with a
tuple key (t.session_id, t.message_id) passed to first_seen_unique_by to ensure
collision-free deduplication based on the combined session and message IDs.

Comment on lines +126 to +131
pub(crate) fn truncate_chars(s: &str, max: usize) -> String {
if s.chars().count() <= max {
return s.to_string();
}
let truncated: String = s.chars().take(max - 1).collect();
format!("{truncated}…")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify current callsites and whether any max constants can be zero.
rg -nP --type=rs '\btruncate_chars\s*\('
rg -nP --type=rs '\b(ERROR_SIGNATURE_MAX_CHARS|SAMPLE_PREVIEW_MAX_CHARS)\b'

Repository: AgentWorkforce/burn

Length of output: 117


🏁 Script executed:

#!/bin/bash
# Verify current callsites and whether any max constants can be zero.
rg -nP --type=rust '\btruncate_chars\s*\('
rg -nP --type=rust '\b(ERROR_SIGNATURE_MAX_CHARS|SAMPLE_PREVIEW_MAX_CHARS)\b'

Repository: AgentWorkforce/burn

Length of output: 1021


🏁 Script executed:

#!/bin/bash
# Check the visibility of truncate_chars and if it's used elsewhere
cat -n crates/relayburn-sdk/src/analyze/util.rs | sed -n '120,135p'
cat -n crates/relayburn-sdk/src/analyze/patterns.rs | sed -n '65,85p'

Repository: AgentWorkforce/burn

Length of output: 1952


Guard zero-length truncation before subtracting one.

Line 130 does max - 1; if max == 0, this underflows and can panic. Add an explicit zero guard before the subtraction.

🔧 Suggested fix
 pub(crate) fn truncate_chars(s: &str, max: usize) -> String {
+    if max == 0 {
+        return String::new();
+    }
     if s.chars().count() <= max {
         return s.to_string();
     }
     let truncated: String = s.chars().take(max - 1).collect();
     format!("{truncated}…")
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub(crate) fn truncate_chars(s: &str, max: usize) -> String {
if s.chars().count() <= max {
return s.to_string();
}
let truncated: String = s.chars().take(max - 1).collect();
format!("{truncated}…")
pub(crate) fn truncate_chars(s: &str, max: usize) -> String {
if max == 0 {
return String::new();
}
if s.chars().count() <= max {
return s.to_string();
}
let truncated: String = s.chars().take(max - 1).collect();
format!("{truncated}…")
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/analyze/util.rs` around lines 126 - 131, The
truncate_chars function performs a subtraction operation max - 1 on line 130
without guarding against the case where max equals zero, which will cause an
integer underflow panic. Add an early return guard check at the beginning of the
truncate_chars function that returns an empty string when max is 0, before any
other logic executes, to prevent the underflow from occurring during the take
operation.

willwashburn and others added 2 commits June 21, 2026 23:17
ghost_surface_tests.rs was extracted from the inline `mod tests` block with its
original 4-space indentation; as a `#[path]`-included module file its items
belong at column 0 (matching patterns_tests.rs). Run cargo fmt to fix it plus a
small wrap in opencode.rs. Pure formatting; no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@willwashburn willwashburn merged commit 30e23b2 into main Jun 22, 2026
2 checks passed
@willwashburn willwashburn deleted the refactor/sdk-dedup-followup branch June 22, 2026 03:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant