refactor(sdk): dedup + cohesion follow-up across analyze and time handling by willwashburn · Pull Request #487 · AgentWorkforce/burn

willwashburn · 2026-06-22T03:06:06Z

Follow-up cleanup after #486, consolidating remaining duplication, tightening module boundaries, organizing oversized files, removing dead code, and centralizing the SDK's date/time handling. Every change is behavior-preserving and individually verified; the riskier ones were adversarially checked (byte-identity sweeps for the time math, diff-equivalence for the builder/skeleton refactors).

analyze module

Deduplication

util::percentile<T> replaces two byte-identical f64 copies; findings::hotspots_action promoted to shared.
cost::total_cost_for_turn + sum_turn_costs scalar helpers replace a repeated .map(|c| c.total).unwrap_or(0.0) idiom and hand-rolled per-turn sum loops.
WasteFinding::session_cost builder + with_* chainers collapse 10 cost-driven finding adapters.
detect_streaks skeleton (StreakOp{Extend,Rotate,Break}) centralizes the bug-prone commit-on-boundary loop shared by the 5 retry/failure/cancellation detectors.
util::group_turns_by_session_sorted folds a repeated post-group sort_by_key(turn_index) across 5 detectors.
util::first_seen_unique{,_by} replaces 6 hand-rolled first-seen-unique loops.
source_kind_str → existing SourceKind::wire_str(); strip_provider_prefix hoisted to one home; truncate_chars extracted.

Abstraction boundary

External consumers (flow.rs, reader span_tree builders, tests) now go through crate::analyze's public re-export surface instead of reaching into private submodule paths.

File organization (no logic change)

patterns.rs (1564 lines) split into a parent orchestrator + cohesive patterns/{streaks,edits,compaction,skills}.rs.
Inline test blocks externalized via #[path] (matching the existing patterns_tests.rs convention): hotspots, tool_output_bloat, subagent_tree, context_delta, ghost_surface.

Dead code

Removed the legacy subagent-tree builder (~200 lines). It was dead for any ledger ingested by current code (every session carries an always-emitted Root relationship row), and the relationship path reproduces it from TurnRecord.subagent alone. Verified empirically: all 48 subagent tests pass through the unified path, including the unresolved-sidechain and nested-cost cases. (Old event logs that predate Root emission are not supported without a re-ingest — confirmed acceptable.)

SDK time handling

util::time is now the single home for the wire timestamp format:

Dropped two duplicate ISO-8601 parsers (a verbatim copy in reader/inference.rs with a stale justification comment, and a pass-through shim in flow.rs).
Centralized the Howard Hinnant ymd_to_days/days_to_ymd civil-date primitives (previously copied across util/time, query_verbs, reader/opencode).
Single util::time::format_iso_ms for the YYYY-MM-DDTHH:MM:SS.mmmZ format (was hand-rolled in two places).
Generic query_verbs::partition_into_buckets<T> shared by the summary and compare --bucket paths.

Documented why quality::parse_iso8601_ms is deliberately not shared (it's a strict superset — TZ offsets + range rejection — that outcome inference needs).

Verification

Build warning-free; clippy --workspace --all-targets clean; 990 workspace tests pass.
Timestamp parse/format equivalence proven byte-identical via exhaustive numeric sweeps (the changes feed serialized output: opencode ingest ts, --bucket edges, the --since SQLite filter).
Live-tested throughout: summary, hotspots (+findings/subagent), overhead, compare, flow, subagent-tree render.

🤖 Generated with Claude Code

Promote the byte-identical f64 percentile in claude_md/subagent_tree to a single generic util::percentile<T: Copy + Default>. The u64 variant in tool_output_bloat stays local (it scales p over 0..=100 and sorts internally). Promote findings::hotspots_action to pub(crate) and drop the duplicate copy in tool_call_patterns, matching how severity_from_usd is already shared. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… helpers Two scalar-cost helpers in the cost module replace the repeated `cost_for_turn(...).map(|c| c.total).unwrap_or(0.0)` idiom and the hand-rolled per-turn sum loops: - total_cost_for_turn(turn, pricing) -> f64 - sum_turn_costs(turns, pricing) -> f64 (iteration-order accumulation) Routed patterns.rs (sum_cost_for_turns now delegates; three idiom/loop sites) and subagent_tree.rs (three idiom sites). compare/provider keep cost_for_turn (they need the full breakdown and priced-turn gating); claude_md sums session totals, not per-turn costs, so it is untouched. Adding $0 for unpriced turns is bit-identical to the prior skip (x + 0.0 == x), so the cost-precision fixtures stay green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collapse the repeated cost-driven finding scaffold (severity_from_usd(cost), session_id clone, usd_per_session savings, hotspots_action, event_source None) into a WasteFinding::session_cost(kind, session_id, cost, title, detail) constructor plus with_severity / with_event_source / with_tokens_per_session chainers. Routes the nine in-file pattern adapters and tool_call_pattern_to_finding through it (net -41 lines). ghost_surface and tool_output_bloat keep struct literals: their bespoke Paste actions and fixed/decoupled severity would mean overriding every default. Output is byte-identical (all 27 finding tests pass, including the compaction zero-tokens guard and the edit-heavy severity cap). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…terns The five status-pattern detectors (graph + flat retry, graph + flat failure, graph cancellation) each hand-rolled the same commit-on-boundary streak loop, the bug-prone part being the commit/clear/re-push dispatch. Extract that control flow into detect_streaks(elements, classify, commit) with a StreakOp { Extend, Rotate, Break } verb; each detector now supplies only its own classify (what extends/breaks a streak) and commit (how a streak becomes a finding) over its own element type. This deliberately abstracts the *mechanism*, not the accessors: the graph vs flat domain differences (ToolResultEventRef vs ToolCallRef, dedup_defined_turns vs dedup_turns, event-source coalescing, the failure-run has_non_tool_result rule) stay explicit in local closures rather than behind a leaky accessor trait. Behavior is unchanged — all 992 tests pass, including the graph/flat fallback, cancellation, break-boundary, and retry-vs-failure guard cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Line-wrapping cleanup for the WasteFinding::session_cost calls introduced in the builder commit; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…paths query_verbs/flow.rs, the reader span_tree builders, and the query_verbs tests reached into crate::analyze::span_tree:: and crate::analyze::context_delta:: directly, bypassing the curated re-export surface in analyze.rs. Route them through crate::analyze::{TurnSpanTree, OwnerRail, ContextDelta, ContextDeltaOpts, deltas_for_session, AttrValue, SpanKind, SpanNode, SpanStatus} instead, adding the missing names to the shared query_verbs import block. No external code references analyze submodule paths anymore — the re-export list is now the single entry point. Pure path change; all 992 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…modules patterns.rs had grown to 1564 lines bundling ~10 independent detectors plus shared plumbing. Split the detector families into cohesive submodules under patterns/, leaving the shared plumbing and the detect_patterns orchestrator in the parent: patterns.rs 755 consts, tool-name helpers, DetectPatternsOptions, detect_patterns, ref structs, ContentIndex, flatten/dedup/cost helpers, detect_streaks/StreakOp, graph event helpers, build_summaries patterns/streaks.rs 372 retry / failure / cancellation (graph + flat) + signatures patterns/edits.rs 157 edit-revert, edit-heavy patterns/compaction.rs 150 compaction-loss + window summary patterns/skills.rs 157 skill recall-dup, pruning-protection, system-prompt-tax Pure code movement. Each submodule reaches the parent's shared plumbing via `use super::*`. The only non-movement edits are widening GraphStatusPatterns / detect_graph_status_patterns_for_session / detect_compaction_losses from private to pub(super) so the parent orchestrator can call across the boundary. detect_patterns's body is unchanged. All 992 tests pass, clippy clean, build warning-free; the moved function bodies are byte-identical to the originals. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

hotspots.rs was 2451 lines, but ~1426 of those were the inline `#[cfg(test)] mod tests` block — the code itself is a cohesive ~1024-line attribution + aggregation module. Move the test block verbatim into hotspots_tests.rs, included via `#[cfg(test)] #[path = "hotspots_tests.rs"] mod tests;`, mirroring the existing patterns_tests.rs convention. The source file is now 1027 lines of pure code. The module name stays `tests` so test paths (analyze::hotspots::tests::*) are unchanged; the 24 hotspots tests pass identically. Pure relocation — no code or test logic touched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tool_output_bloat.rs was 1676 lines, ~1072 of them the inline test block. Move the tests verbatim into tool_output_bloat_tests.rs via `#[cfg(test)] #[path = ...] mod tests;`, matching the patterns_tests.rs / hotspots_tests.rs convention. Source file is now 606 lines of pure code; the 29 tests pass identically (paths unchanged). Pure relocation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Apply the patterns_tests.rs / hotspots_tests.rs convention to the two remaining large analyze files with a clean single inline test block: subagent_tree.rs 1277 -> 826 code (tests -> subagent_tree_tests.rs) context_delta.rs 1077 -> 658 code (tests -> context_delta_tests.rs) Tests moved verbatim, module name kept `tests` so paths are unchanged. All tests pass, clippy clean. Pure relocation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- ghost_surface::source_kind_str was byte-identical to the existing public SourceKind::wire_str(); delete it and call wire_str() at the two sites. - strip_provider_prefix was defined verbatim in both cost.rs and provider.rs; hoist a single pub(crate) copy into provider_reattribution.rs (the provider- normalization module both already import from) and route both callers there. - extract_error_signature and truncate_for_preview shared an identical "char-count, else take(max-1) + ellipsis" tail; extract util::truncate_chars and delegate from both. Behavior-preserving (identical outputs); all 992 tests pass, clippy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…st-group sort Five detectors (patterns, hotspots, quality, tool_call_patterns, claude_md) grouped turns by session and then repeated `bucket.sort_by_key(|t| t.turn_index)` inside the loop. Add a group_turns_by_session_sorted util variant that does the stable sort once and route all five through it. The sort is stable, so per-session ordering is bit-identical; all 992 tests pass, clippy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The "keep first occurrence per key, in input order" loop was reimplemented six times: dedup_strings / dedup_numbers (tool_call_patterns), dedup_turns (patterns), and three inline tools_involved builders in patterns/streaks.rs. Add util::first_seen_unique_by (key fn) + first_seen_unique (value-keyed) and route all six through them. dedup_numbers keeps its trailing sort_unstable at the call site; the streaks tools_involved order (fixture-gated) is preserved since first-seen order is identical. All 992 tests pass, clippy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A survey flagged quality.rs's ISO parser as a missed consolidation against util::time::parse_iso_ms. It is a deliberate superset (applies ±HH:MM offsets, rejects out-of-range components and trailing garbage) that outcome inference needs for correct is_recent classification, whereas the shared parser stays lean for the reader/ledger ingest path. Document the divergence so it is not naively merged later. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ghost_surface.rs was 1235 lines with a ~700-line inline test block. Move the main `mod tests` block verbatim to ghost_surface_tests.rs via `#[cfg(test)] #[path] mod tests;` (the test-only `use adapters::{...}` import stays inline). Source file is now 533 lines of code; the 30 tests pass identically. Pure relocation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

build_subagent_tree dispatched to a separate build_legacy_subagent_trees fallback whenever no relationship rows were supplied. That path was dead for any ledger ingested by current code (every session now carries an always- emitted Root relationship row), and build_relationship_trees already folds in the per-turn TurnRecord.subagent data via add_legacy_subagent_gaps + ensure_turn_session_roots — so it reconstructs the same tree from subagent fields alone when handed an empty relationship slice. Collapse the dispatch to the single relationship path (empty slice when no rows), and delete the now-dead build_legacy_subagent_trees, build_session_tree, assign_depth, and resolve_parent_or_root (~200 lines). The equivalence guard test now asserts the no-relationship tree matches the with-relationship tree — the invariant that lets the fallback stand in for the removed builder. Per owner decision: old event logs that predate Root emission are not supported without a re-ingest. All 992 tests pass (incl. the nested-subagent and unresolved-sidechain cases), clippy clean, live subagent-tree render verified. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

reader/inference.rs::parse_iso_ms was a byte-for-byte copy of util::time::parse_iso_ms; its "keep reader free of a query_verbs dependency" comment is stale (the canonical parser now lives in crate::util::time, which sibling reader submodules already import). Delete the copy, import the shared one, and drop its two unit tests (covered by util/time's own tests). query_verbs/flow.rs::parse_iso_ms_compat was a pure pass-through wrapper around the same shared helper; inline it at the two call sites and delete the shim. Behavior-preserving; build warning-free, clippy clean, tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The days-from-civil / civil-from-days primitives were copied across util::time (inlined in parse_iso_ms), query_verbs::mod, and reader::opencode. Promote ymd_to_days + days_to_ymd to pub(crate) in util::time and route the callers there: - parse_iso_ms now calls ymd_to_days instead of inlining it. - query_verbs::ymd_to_days keeps only its out-of-range guard, deferring the math to the shared primitive; its days_to_ymd copy is replaced by the import. - reader::opencode::ms_to_iso uses the shared days_to_ymd (the {:04} year format works on the i64 year identically to the old i32). Identical math, behavior-preserving; build warning-free, clippy clean, 990 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…p format The canonical `YYYY-MM-DDTHH:MM:SS.mmmZ` formatter was hand-rolled twice (query_verbs::format_iso_z_ms and reader::opencode::ms_to_iso), each pairing days_to_ymd with the same format string. Add util::time::format_iso_ms as the single source of truth: - opencode's six ms_to_iso call sites route to format_iso_ms; the local copy is deleted. - format_iso_z_ms keeps its (seconds, millis) convenience signature but is now a thin adapter onto format_iso_ms, so the template lives in one place. Byte-identical output (same days_to_ymd + same template); avoids the heavier `time`-crate migration. Build warning-free, clippy clean, 990 tests pass, live bucket/ingest timestamps verified. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

summary and compare each hand-rolled the same --bucket partition: anchor → ensure_bucket_span → Buckets::new → allocate per-bucket vecs → loop pushing by iso_z_to_epoch_secs/index_for. They differed only in element type (TurnRecord vs EnrichedTurn) and ts accessor (t.ts vs t.turn.ts). Extract a generic query_verbs::partition_into_buckets<T>(items, since, bucket_secs, ts_of) that returns Ok(None) when there's no anchor, letting each caller keep its own empty-timeseries return shape. Behavior-preserving; 990 tests pass, clippy clean, both bucketed surfaces verified live. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gemini-code-assist · 2026-06-22T03:06:10Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-06-22T03:06:21Z

Warning

Review limit reached

@willwashburn, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 35 minutes and 26 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4eb89970-fd82-4162-a78c-11161801638a

📥 Commits

Reviewing files that changed from the base of the PR and between 003fe5a and 263e7fa.

📒 Files selected for processing (3)

CHANGELOG.md
crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs
crates/relayburn-sdk/src/reader/opencode.rs

📝 Walkthrough

Walkthrough

This PR centralizes shared utilities (time formatting, deduplication, percentile, session grouping, cost helpers, provider-prefix stripping) into util.rs and util/time.rs, then propagates them across the codebase by removing duplicate local implementations. Pattern detection logic from patterns.rs is extracted into four submodules (compaction, edits, skills, streaks). A WasteFinding builder API is introduced in findings.rs. The subagent tree always uses graph-based construction. A generic partition_into_buckets helper consolidates timeseries bucketing. Large inline test blocks are moved to dedicated *_tests.rs files.

Changes

relayburn-sdk analyze module refactoring

Layer / File(s)	Summary
Shared analyze and time utilities `crates/relayburn-sdk/src/analyze/util.rs`, `crates/relayburn-sdk/src/util/time.rs`	Adds `first_seen_unique_by/first_seen_unique`, `group_turns_by_session_sorted`, `truncate_chars`, and `percentile` to `util.rs`. Adds `ymd_to_days`, `days_to_ymd`, and `format_iso_ms` to `util/time.rs`; refactors `parse_iso_ms` to delegate date math to `ymd_to_days`.
Centralized cost and provider-prefix helpers `crates/relayburn-sdk/src/analyze/provider_reattribution.rs`, `crates/relayburn-sdk/src/analyze/cost.rs`, `crates/relayburn-sdk/src/analyze/provider.rs`	Moves `strip_provider_prefix` into `provider_reattribution.rs` and removes duplicate implementations from `cost.rs` and `provider.rs`. Adds `total_cost_for_turn` and `sum_turn_costs` as new public helpers to `cost.rs`.
WasteFinding builder API and adapter refactors `crates/relayburn-sdk/src/analyze/findings.rs`	Introduces `WasteFinding::session_cost` constructor and `with_severity`, `with_event_source`, `with_tokens_per_session` chainers; promotes `hotspots_action` to crate-visible. Refactors all per-pattern adapter functions (`retry_loop_to_finding`, `failure_run_to_finding`, `cancellation_run_to_finding`, `compaction_loss_to_finding`, `edit_revert_to_finding`, `edit_heavy_to_finding`, skill adapters, system-prompt-tax) to use the new builder.
Pattern detection modularization into submodules `crates/relayburn-sdk/src/analyze/patterns.rs`, `crates/relayburn-sdk/src/analyze/patterns/compaction.rs`, `crates/relayburn-sdk/src/analyze/patterns/edits.rs`, `crates/relayburn-sdk/src/analyze/patterns/skills.rs`, `crates/relayburn-sdk/src/analyze/patterns/streaks.rs`	Extracts compaction-loss, edit-revert/edit-heavy, skill-recall/pruning/system-prompt-tax, and graph/non-graph retry/failure/cancellation detectors into four new submodules. `patterns.rs` adopts `group_turns_by_session_sorted`, `sum_turn_costs`, `truncate_chars`, `first_seen_unique_by`, and introduces the generic `StreakOp`/`detect_streaks` helper.
Subagent tree graph-only construction and depth refactor `crates/relayburn-sdk/src/analyze/subagent_tree.rs`, `crates/relayburn-sdk/src/analyze/subagent_tree_tests.rs`	Removes `build_legacy_subagent_trees`/`build_session_tree`; `build_subagent_tree` always calls `build_relationship_trees` with empty-slice fallback. Integrates BFS depth with cycle protection into `finalize_tree`. Switches turn cost to `total_cost_for_turn`. Extracts tests to new `subagent_tree_tests.rs` covering cumulative fold, unresolved nodes, relationship equivalence, and aggregation stats.
Sorted session grouping adoption across analyze consumers `crates/relayburn-sdk/src/analyze/claude_md.rs`, `crates/relayburn-sdk/src/analyze/hotspots.rs`, `crates/relayburn-sdk/src/analyze/quality.rs`, `crates/relayburn-sdk/src/analyze/ghost_surface.rs`, `crates/relayburn-sdk/src/analyze/tool_call_patterns.rs`	Replaces `group_turns_by_session` + inline sort with `group_turns_by_session_sorted` in all consumer functions. Removes local `percentile` from `claude_md.rs`. Switches `ghost_surface_to_finding` to `wire_str()` for source labels. Modernizes `tool_call_patterns.rs` deduplication and `tool_call_pattern_to_finding` to use the WasteFinding builder.
Shared `partition_into_buckets` helper and query verb type simplification `crates/relayburn-sdk/src/query_verbs/mod.rs`, `crates/relayburn-sdk/src/query_verbs/compare.rs`, `crates/relayburn-sdk/src/query_verbs/summary.rs`, `crates/relayburn-sdk/src/query_verbs/flow.rs`, `crates/relayburn-sdk/src/query_verbs/tests.rs`	Adds generic `partition_into_buckets` helper to `query_verbs/mod.rs` and delegates `format_iso_z_ms`/`ymd_to_days` to `util::time`. `compare_timeseries` and `summary_timeseries` each replace their manual bucketing logic with one `partition_into_buckets` call. `flow.rs` drops fully-qualified type paths and removes `parse_iso_ms_compat`. Test import paths updated to match new module locations.
Reader modules adopt shared time utilities `crates/relayburn-sdk/src/reader/opencode.rs`, `crates/relayburn-sdk/src/reader/inference.rs`, `crates/relayburn-sdk/src/reader/claude/span_tree.rs`, `crates/relayburn-sdk/src/reader/codex/span_tree.rs`	Replaces local `ms_to_iso`/`days_to_ymd` in `opencode.rs` and local `parse_iso_ms` in `inference.rs` with shared `crate::util::time` equivalents at all call sites. Span-tree readers updated to import types from `crate::analyze` instead of `crate::analyze::span_tree`.
Test file extractions for context delta, ghost surface, hotspots, and tool output bloat `crates/relayburn-sdk/src/analyze/context_delta.rs`, `crates/relayburn-sdk/src/analyze/context_delta_tests.rs`, `crates/relayburn-sdk/src/analyze/ghost_surface.rs`, `crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs`, `crates/relayburn-sdk/src/analyze/hotspots.rs`, `crates/relayburn-sdk/src/analyze/hotspots_tests.rs`, `crates/relayburn-sdk/src/analyze/tool_output_bloat.rs`, `crates/relayburn-sdk/src/analyze/tool_output_bloat_tests.rs`	Moves inline `#[cfg(test)]` blocks from four production files to dedicated `*_tests.rs` modules via `#[path = "..."] mod tests;`. New test files provide conformance coverage for context delta edge cases (compaction clamping, subagent isolation, `min_delta` filtering, JSON wire format), ghost surface enumeration and finding conversion, hotspots attribution semantics (sibling capping, cross-model pricing, output-byte tracking), and tool output bloat static/observed/fixture-driven detection.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

AgentWorkforce/burn#452: Directly related — refactors context_delta.rs test suite into context_delta_tests.rs, the same pattern this PR applies to four additional modules.
AgentWorkforce/burn#483: Directly related — modifies query-verb bucketing plumbing for compare_timeseries/summary_timeseries, which this PR consolidates further into the shared partition_into_buckets helper.
AgentWorkforce/burn#486: Directly related — updates attribute_claude_md_refs in claude_md.rs to use group_turns_by_session, which this PR further replaces with group_turns_by_session_sorted.

Poem

🐇 Hop hop, the helpers are shared now at last,
No more copy-paste ghosts from the codebase's past.
The patterns all shuffled to their own cozy rooms,
The builder constructs findings — no manual looms!
The buckets partition with one tidy call,
A little refactor to tidy it all. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: a refactoring consolidating code deduplication and cohesion improvements across analyze and time handling modules, which accurately reflects the changeset's primary focus.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, detailing deduplication efforts, module boundary improvements, file organization, dead code removal, and SDK time handling centralization with verification steps.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch refactor/sdk-dedup-followup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 003fe5a78d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-22T03:11:12Z

+    let relationships = opts.relationships.unwrap_or(&[]);
+    build_relationship_trees(turns, relationships, opts.pricing)


Preserve legacy roots for orphaned subagent parents

When BuildSubagentTreeOptions::new() is used without relationship rows, this now always runs the relationship graph builder instead of the legacy path. In the no-relationship case, a turn whose subagent.parent_agent_id points at a missing/cyclic agent now causes ensure_turn_session_roots/resolve_graph_parent to leave the synthetic parent as a separate root (or skip cycle attachment), so build_subagent_tree returns extra top-level trees and the main session's cumulative turns/cost omit those sidechain turns. The removed legacy builder explicitly reattached these sparse parent chains to the session root, which matters for older ledgers that only have TurnRecord.subagent metadata.

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs`:
- Around line 4-705: The entire test module from the use statements through all
helper functions and test cases (such as make_inputs, observed, observed_multi,
count_map, user_text, claude_enumerates_agents_skills_commands,
codex_enumerates_prompts_skills_rules_memories, and all other tests) has
improper indentation with leading whitespace when they should be at the module
level. Remove all leading indentation from every line in this test module so
that top-level items like use statements, function definitions, and test
attributes start at column 0 without any preceding whitespace.

In `@crates/relayburn-sdk/src/analyze/patterns.rs`:
- Around line 622-623: The dedup_turns function uses a string format with pipe
separator ("{}|{}") as the dedup key, which is not collision-safe because if the
session_id or message_id contain the pipe character, distinct turns can
incorrectly be merged together, undercounting the contributing-turn cost.
Replace the format! string with a tuple key (t.session_id, t.message_id) passed
to first_seen_unique_by to ensure collision-free deduplication based on the
combined session and message IDs.

In `@crates/relayburn-sdk/src/analyze/util.rs`:
- Around line 126-131: The truncate_chars function performs a subtraction
operation max - 1 on line 130 without guarding against the case where max equals
zero, which will cause an integer underflow panic. Add an early return guard
check at the beginning of the truncate_chars function that returns an empty
string when max is 0, before any other logic executes, to prevent the underflow
from occurring during the take operation.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 767b4524-397a-4edb-9b23-49eb57c30af0

📥 Commits

Reviewing files that changed from the base of the PR and between c08236a and 003fe5a.

📒 Files selected for processing (33)

crates/relayburn-sdk/src/analyze/claude_md.rs
crates/relayburn-sdk/src/analyze/context_delta.rs
crates/relayburn-sdk/src/analyze/context_delta_tests.rs
crates/relayburn-sdk/src/analyze/cost.rs
crates/relayburn-sdk/src/analyze/findings.rs
crates/relayburn-sdk/src/analyze/ghost_surface.rs
crates/relayburn-sdk/src/analyze/ghost_surface_tests.rs
crates/relayburn-sdk/src/analyze/hotspots.rs
crates/relayburn-sdk/src/analyze/hotspots_tests.rs
crates/relayburn-sdk/src/analyze/patterns.rs
crates/relayburn-sdk/src/analyze/patterns/compaction.rs
crates/relayburn-sdk/src/analyze/patterns/edits.rs
crates/relayburn-sdk/src/analyze/patterns/skills.rs
crates/relayburn-sdk/src/analyze/patterns/streaks.rs
crates/relayburn-sdk/src/analyze/provider.rs
crates/relayburn-sdk/src/analyze/provider_reattribution.rs
crates/relayburn-sdk/src/analyze/quality.rs
crates/relayburn-sdk/src/analyze/subagent_tree.rs
crates/relayburn-sdk/src/analyze/subagent_tree_tests.rs
crates/relayburn-sdk/src/analyze/tool_call_patterns.rs
crates/relayburn-sdk/src/analyze/tool_output_bloat.rs
crates/relayburn-sdk/src/analyze/tool_output_bloat_tests.rs
crates/relayburn-sdk/src/analyze/util.rs
crates/relayburn-sdk/src/query_verbs/compare.rs
crates/relayburn-sdk/src/query_verbs/flow.rs
crates/relayburn-sdk/src/query_verbs/mod.rs
crates/relayburn-sdk/src/query_verbs/summary.rs
crates/relayburn-sdk/src/query_verbs/tests.rs
crates/relayburn-sdk/src/reader/claude/span_tree.rs
crates/relayburn-sdk/src/reader/codex/span_tree.rs
crates/relayburn-sdk/src/reader/inference.rs
crates/relayburn-sdk/src/reader/opencode.rs
crates/relayburn-sdk/src/util/time.rs

coderabbitai · 2026-06-22T03:16:41Z

+fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {
+    first_seen_unique_by(turns, |t| format!("{}|{}", t.session_id, t.message_id))


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a collision-safe dedup key for turns.

format!("{}|{}", session_id, message_id) can merge distinct turns when IDs contain |, which undercounts contributing-turn cost in findings. Use a tuple key instead.

Suggested fix

fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> { - first_seen_unique_by(turns, |t| format!("{}|{}", t.session_id, t.message_id)) + first_seen_unique_by(turns, |t| (t.session_id.as_str(), t.message_id.as_str())) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {

first_seen_unique_by(turns, |t| format!("{}|{}", t.session_id, t.message_id))

fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {

first_seen_unique_by(turns, |t| (t.session_id.as_str(), t.message_id.as_str()))

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/relayburn-sdk/src/analyze/patterns.rs` around lines 622 - 623, The dedup_turns function uses a string format with pipe separator ("{}|{}") as the dedup key, which is not collision-safe because if the session_id or message_id contain the pipe character, distinct turns can incorrectly be merged together, undercounting the contributing-turn cost. Replace the format! string with a tuple key (t.session_id, t.message_id) passed to first_seen_unique_by to ensure collision-free deduplication based on the combined session and message IDs.

coderabbitai · 2026-06-22T03:16:41Z

+pub(crate) fn truncate_chars(s: &str, max: usize) -> String {
+    if s.chars().count() <= max {
+        return s.to_string();
+    }
+    let truncated: String = s.chars().take(max - 1).collect();
+    format!("{truncated}…")


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify current callsites and whether any max constants can be zero. rg -nP --type=rs '\btruncate_chars\s*\(' rg -nP --type=rs '\b(ERROR_SIGNATURE_MAX_CHARS|SAMPLE_PREVIEW_MAX_CHARS)\b'

Repository: AgentWorkforce/burn

Length of output: 117

🏁 Script executed:

#!/bin/bash # Verify current callsites and whether any max constants can be zero. rg -nP --type=rust '\btruncate_chars\s*\(' rg -nP --type=rust '\b(ERROR_SIGNATURE_MAX_CHARS|SAMPLE_PREVIEW_MAX_CHARS)\b'

Repository: AgentWorkforce/burn

Length of output: 1021

🏁 Script executed:

#!/bin/bash # Check the visibility of truncate_chars and if it's used elsewhere cat -n crates/relayburn-sdk/src/analyze/util.rs | sed -n '120,135p' cat -n crates/relayburn-sdk/src/analyze/patterns.rs | sed -n '65,85p'

Repository: AgentWorkforce/burn

Length of output: 1952

Guard zero-length truncation before subtracting one.

Line 130 does max - 1; if max == 0, this underflows and can panic. Add an explicit zero guard before the subtraction.

🔧 Suggested fix

pub(crate) fn truncate_chars(s: &str, max: usize) -> String { + if max == 0 { + return String::new(); + } if s.chars().count() <= max { return s.to_string(); } let truncated: String = s.chars().take(max - 1).collect(); format!("{truncated}…") }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

pub(crate) fn truncate_chars(s: &str, max: usize) -> String {

if s.chars().count() <= max {

return s.to_string();

}

let truncated: String = s.chars().take(max - 1).collect();

format!("{truncated}…")

pub(crate) fn truncate_chars(s: &str, max: usize) -> String {

if max == 0 {

return String::new();

}

if s.chars().count() <= max {

return s.to_string();

}

let truncated: String = s.chars().take(max - 1).collect();

format!("{truncated}…")

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/relayburn-sdk/src/analyze/util.rs` around lines 126 - 131, The truncate_chars function performs a subtraction operation max - 1 on line 130 without guarding against the case where max equals zero, which will cause an integer underflow panic. Add an early return guard check at the beginning of the truncate_chars function that returns an empty string when max is 0, before any other logic executes, to prevent the underflow from occurring during the take operation.

ghost_surface_tests.rs was extracted from the inline `mod tests` block with its original 4-space indentation; as a `#[path]`-included module file its items belong at column 0 (matching patterns_tests.rs). Run cargo fmt to fix it plus a small wrap in opencode.rs. Pure formatting; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

willwashburn and others added 20 commits June 21, 2026 23:03

style(sdk/analyze): rustfmt the session_cost finding adapters

4ce1229

Line-wrapping cleanup for the WasteFinding::session_cost calls introduced in the builder commit; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector Bot reviewed Jun 22, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 22, 2026

View reviewed changes

willwashburn and others added 2 commits June 21, 2026 23:17

docs(changelog): note subagent-tree legacy path removal

263e7fa

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

willwashburn merged commit 30e23b2 into main Jun 22, 2026
2 checks passed

willwashburn deleted the refactor/sdk-dedup-followup branch June 22, 2026 03:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(sdk): dedup + cohesion follow-up across analyze and time handling#487

refactor(sdk): dedup + cohesion follow-up across analyze and time handling#487
willwashburn merged 22 commits into
mainfrom
refactor/sdk-dedup-followup

willwashburn commented Jun 22, 2026 •

edited by cubic-dev-ai Bot

Loading

Uh oh!

gemini-code-assist Bot commented Jun 22, 2026

Uh oh!

coderabbitai Bot commented Jun 22, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 22, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 22, 2026

Uh oh!

coderabbitai Bot Jun 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		let relationships = opts.relationships.unwrap_or(&[]);
		build_relationship_trees(turns, relationships, opts.pricing)

		fn dedup_turns(turns: Vec<&TurnRecord>) -> Vec<&TurnRecord> {
		first_seen_unique_by(turns, \|t\| format!("{}\|{}", t.session_id, t.message_id))

Conversation

willwashburn commented Jun 22, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

analyze module

SDK time handling

Verification

Uh oh!

gemini-code-assist Bot commented Jun 22, 2026

Uh oh!

coderabbitai Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

willwashburn commented Jun 22, 2026 •

edited by cubic-dev-ai Bot

Loading

coderabbitai Bot commented Jun 22, 2026 •

edited

Loading