Skip to content

refactor(sdk/analyze): consolidate duplication + split oversized modules#486

Merged
willwashburn merged 9 commits into
mainfrom
refactor/analyze-dedup
Jun 21, 2026
Merged

refactor(sdk/analyze): consolidate duplication + split oversized modules#486
willwashburn merged 9 commits into
mainfrom
refactor/analyze-dedup

Conversation

@willwashburn

@willwashburn willwashburn commented Jun 21, 2026

Copy link
Copy Markdown
Member

Survey-driven cleanup of relayburn-sdk's analyze module: collapse the duplicated helpers the port had scattered across submodules, split two oversized files along cohesive seams, and fix one latent TS-parity bug surfaced along the way.

Each commit is self-contained and was individually verified: cargo build/clippy/fmt clean and the full workspace test suite held at its 992-test baseline at every step, with the real CLI (summary/hotspots/overhead/ingest) live-tested per change.

Commits

Deduplication (behavior-preserving):

  • consolidate duplicated cost/severity/usd helpersfmt_usd/severity_from_usd/PER_MILLION had 4/2/2 copies → single shared definitions.
  • consolidate approx-token heuristic — the ~4 bytes/token heuristic was reimplemented 6× under 4 names → util::tokens_from_bytes / tokens_from_utf16_len / bytes_from_tokens, preserving the UTF-16-vs-bytes distinction (context_delta's intentional floor division left alone).
  • single group_turns_by_session helper — 6 hand-rolled session groupings (inline IndexMaps, HashMap+order-vec, Vec+index-map) → one generic helper.
  • single crate::util::home_dir() — home resolution hand-rolled 5× with 3 behaviors; unify on HOME→USERPROFILE→., fixing a latent Windows bug (ghost_surface + ingest were missing the USERPROFILE fallback).
  • share output-byte accumulation across rollups — the total/max/truncated fold was copy-pasted in 4 hotspots rollups → accumulate_output_bytes.

Cohesion splits (pure code moves):

  • extract shell tokenizer to patterns/shell.rspatterns.rs 1851 → 1578.
  • extract ghost_surface adapters to submoduleghost_surface.rs 1571 → 1246; the 3 harness adapters + their filesystem helpers move to ghost_surface/adapters.rs.

Bug fix:

  • fix: unify tool-result stringify on the TS-faithful versionhotspots and patterns each carried a copy of stringifyToolResult; the TS originals (recovered from a573de5~1) are byte-identical, but patterns' port diverged at the array catch-all, JSON-stringifying bare scalar blocks that TS skips. Unified on the faithful version in util::stringify_tool_result. No observable change on real input — Claude/Codex tool_result.content is never an array of bare scalars, so the corrected arm is unreachable in practice and no test exercises it.

Verification

  • cargo build --workspace, cargo clippy --workspace, cargo fmt --all --check: all clean.
  • cargo test --workspace: 992 passed (== pre-refactor baseline) on every commit.

Notes for reviewers

  • The only behavior change is the stringify fix, and it is unreachable on real harness data (see commit body). Everything else is behavior-preserving.
  • Deliberately left out as marginal/diminishing-returns: a price_tokens consolidation (only 2 of 5 candidate sites are cleanly foldable; the rest are entangled with model-counting or compute a per-token rate) and a hotspots.rs types/attribution/aggregation split.

🤖 Generated with Claude Code

Review in cubic

willwashburn and others added 8 commits June 21, 2026 00:28
The analyze submodules each carried private copies of the same small
helpers. Consolidate them so the shared behavior has one definition:

- fmt_usd → analyze::util (removed 4 identical copies)
- severity_from_usd + SEVERITY_*_USD thresholds → reuse the existing
  pub(crate) findings::severity_from_usd (removed 2 copies + 4 consts)
- PER_MILLION → cost::PER_MILLION, now pub(crate) (removed 2 copies)

No behavior change: every removed copy was byte-for-byte equivalent.
992/992 workspace tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The bytes-per-token (~4, ceil) heuristic was reimplemented six times under
four different names/constants across analyze submodules, with the UTF-16 vs
raw-byte distinction left implicit. Centralize it as named primitives:

- util::tokens_from_bytes      — ceil(bytes / 4)
- util::tokens_from_utf16_len  — ceil(utf16_units / 4); TS string.length parity
- util::bytes_from_tokens      — inverse (tokens * 4)

Routed hotspots (utf16), ghost_surface, claude_md, and tool_output_bloat
through them. context_delta intentionally keeps floor division for its own
approx_tokens field and is left untouched (noted in util). No behavior
change; 992/992 workspace tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Grouping turns into per-session buckets in first-seen order (the TS
Map<sessionId, TurnRecord[]> iteration contract that fixtures depend on) was
hand-rolled six different ways: inline IndexMaps, HashMap + a parallel
order-vec, and a Vec + index-map. Three of those reimplemented from scratch
what IndexMap already does.

Add util::group_turns_by_session, generic over IntoIterator<Item=&TurnRecord>
so both &[TurnRecord] and &[&TurnRecord] (claude_md) callers share it, and
route hotspots, subagent_tree, claude_md, patterns, quality, and
tool_call_patterns through it. Per-session turn_index sorts are unchanged.
No behavior change; 992/992 workspace tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Home-directory resolution was hand-rolled five times with three different
behaviors. summary, flow, and tool_output_bloat used the canonical "HOME, then
USERPROFILE (Node os.homedir parity), then '.'" form; ghost_surface and ingest
resolved HOME only and silently fell back to "." on Windows where HOME is
usually unset — a latent bug.

Add crate::util::home_dir() with the canonical semantics and route all five
through it, so the Windows USERPROFILE fallback now applies uniformly.
ledger::ledger_home (RELAYBURN_HOME data root, HOME-only by design) is left
separate and noted in the helper's docs.

Common path (HOME set) is unchanged everywhere; the only deltas are the
now-uniform USERPROFILE fallback and harmless relative-path prefixes on the
both-env-unset edge. 992/992 workspace tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
patterns.rs carried a ~250-line self-contained shell-command parser (segment
splitting, quote-aware tokenization, redirect/operand detection) that only the
edit-heavy / codex-read detectors use through one entry point. Move it to a
patterns/shell.rs submodule exposing just shell_command_has_file_read; the
codex read-command vocabulary (is_codex_shell_read_command) moves with it since
nothing else references it. patterns.rs drops from 1851 to 1578 lines.

Pure code move (byte-identical apart from the one pub(super) on the entry fn);
992/992 workspace tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The total/max/truncated output-byte fold was copy-pasted verbatim in four
hotspots rollups (file, bash, bash-verb, subagent). Extract it into
accumulate_output_bytes so the saturating-add / running-max / truncation-count
logic has one definition and the four rollups can't drift. McpServer is
untouched (it carries no byte fields).

Pure extraction (logic byte-identical); 992/992 workspace tests pass. Also
rustfmt-normalizes the home_dir chain in user_claude_settings_path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ghost_surface.rs bundled the per-harness GhostSurfaceAdapter implementations
(Claude/Codex/OpenCode) and their filesystem-enumeration helpers in with the
public types, orchestrator, and finding adapter. Move the adapter cluster —
the three adapters, the DirEntry/list_dir_files directory walker, the
is_markdown/is_plain_text_surface predicates, the OpenCode catalog reader, and
the default_ghost_adapters registry — into ghost_surface/adapters.rs, exposed
to the parent as pub(super). The trait, public types, slash-command miners,
orchestrator, finding adapter, and tests stay in the parent.

ghost_surface.rs drops from 1571 to 1246 lines. Pure move (no logic change);
992/992 workspace tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hotspots and patterns each carried a copy of stringifyToolResult. The TS
originals (recovered from a573de5~1) are byte-identical, but patterns' Rust
port diverged at the array catch-all: it JSON-stringified bare scalar blocks
(number/bool/null) that TS — and the hotspots port — skip, because such a
block is neither `typeof === 'object'` nor `typeof === 'string'`.

Move the TS-faithful version (hotspots') into analyze::util::stringify_tool_result
and call it from both, deleting the two copies. This realigns patterns with
the TS source and removes the drift permanently.

No observable change on real input: Claude/Codex tool_result.content is a bare
string or an array of typed content-block objects, never bare scalars, so the
corrected arm is unreachable in practice and no test exercises it. 992/992
workspace tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 21, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@willwashburn, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 19 minutes and 43 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 228bd7a1-81d3-431a-8270-8feaaed6ef24

📥 Commits

Reviewing files that changed from the base of the PR and between 2a3ff58 and e044d4a.

📒 Files selected for processing (3)
  • crates/relayburn-sdk/src/analyze/ghost_surface/adapters.rs
  • crates/relayburn-sdk/src/analyze/patterns/shell.rs
  • crates/relayburn-sdk/src/analyze/util.rs
📝 Walkthrough

Walkthrough

Shared utility helpers (group_turns_by_session, tokens_from_bytes, bytes_from_tokens, tokens_from_utf16_len, fmt_usd, stringify_tool_result, home_dir) are centralized into analyze/util.rs and util.rs. Ghost-surface harness adapters move to a new ghost_surface/adapters.rs submodule, and shell file-read detection moves to patterns/shell.rs. All analyze, ingest, and query-verb callsites are updated to use the shared implementations, removing duplicate local definitions.

Changes

Utility Centralization and Adapter Extraction

Layer / File(s) Summary
Shared utility helpers
crates/relayburn-sdk/src/analyze/util.rs, crates/relayburn-sdk/src/util.rs, crates/relayburn-sdk/src/analyze/cost.rs
Adds group_turns_by_session, fmt_usd, tokens_from_bytes, bytes_from_tokens, tokens_from_utf16_len, and stringify_tool_result to analyze/util.rs; adds home_dir() to top-level util.rs; makes PER_MILLION pub(crate) in cost.rs.
Ghost surface adapter submodule
crates/relayburn-sdk/src/analyze/ghost_surface/adapters.rs, crates/relayburn-sdk/src/analyze/ghost_surface.rs
Moves DirEntry, list_dir_files, file-type predicates, ClaudeGhostAdapter, CodexGhostAdapter, OpenCodeGhostAdapter, enumerate_opencode_project, and default_ghost_adapters into a new adapters submodule; removes the corresponding 364 lines from ghost_surface.rs and updates its imports, severity helpers, and test imports.
Shell file-read tokenizer submodule
crates/relayburn-sdk/src/analyze/patterns/shell.rs, crates/relayburn-sdk/src/analyze/patterns.rs
Adds a POSIX-ish shell tokenizer/detector for cat/head/tail file-operand detection in patterns/shell.rs; removes the equivalent in-file implementation from patterns.rs and wires in shell::shell_command_has_file_read plus shared utilities.
Hotspots: shared helpers and accumulate_output_bytes
crates/relayburn-sdk/src/analyze/hotspots.rs
Imports PER_MILLION, stringify_tool_result, and tokens_from_utf16_len; replaces manual session bucketing with group_turns_by_session; adds accumulate_output_bytes helper; updates aggregate_by_file, aggregate_by_bash, aggregate_by_bash_verb, and aggregate_by_subagent to use it.
Callsite updates
crates/relayburn-sdk/src/analyze/claude_md.rs, findings.rs, quality.rs, subagent_tree.rs, tool_call_patterns.rs, tool_output_bloat.rs, crates/relayburn-sdk/src/ingest/ingest.rs, crates/relayburn-sdk/src/query_verbs/flow.rs, crates/relayburn-sdk/src/query_verbs/summary.rs
Replaces local group_by_session, fmt_usd, bytes_to_tokens, BYTES_PER_TOKEN, ceil_div, CHARS_PER_TOKEN, and home_dir implementations with imports from shared utility modules.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 Hop hop! The warren's tidy now,
No more duplicate burrows, I vow!
tokens_from_bytes and fmt_usd in one den,
home_dir shared among all the warren's ten.
Shell tokenizer neatly in its own hole—
One util to rule them, wholesome and whole! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main focus: refactoring to consolidate duplicated code and split oversized modules in the analyze submodule.
Description check ✅ Passed The description comprehensively explains the refactoring scope, lists all deduplication and code-motion changes, documents verification, and notes the unreachable bug fix.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/analyze-dedup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the relayburn-sdk analysis modules by centralizing shared utilities (such as turn grouping, money formatting, token-byte heuristics, and tool-result stringification) into analyze/util.rs, moving ghost surface adapters and shell-parsing helpers into dedicated submodules, and unifying home directory resolution. Feedback on these changes suggests optimizing group_turns_by_session to reduce string allocations, simplifying redundant error handling in list_dir_files, removing redundant state variables in shell-parsing helpers, and refactoring is_signed_integer to use more idiomatic Rust.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +22 to +25
let mut by_session: IndexMap<String, Vec<&'a TurnRecord>> = IndexMap::new();
for t in turns {
by_session.entry(t.session_id.clone()).or_default().push(t);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To avoid cloning session_id on every single turn, we can first check if the session already exists in the map using get_mut. This reduces the number of string allocations from $O(N)$ (where $N$ is the number of turns) to $O(S)$ (where $S$ is the number of unique sessions).

    let mut by_session: IndexMap<String, Vec<&'a TurnRecord>> = IndexMap::new();
    for t in turns {
        if let Some(vec) = by_session.get_mut(&t.session_id) {
            vec.push(t);
        } else {
            by_session.insert(t.session_id.clone(), vec![t]);
        }
    }

Comment on lines +41 to +49
Err(err) => {
// ENOENT / ENOTDIR: surface no entries.
if err.kind() == std::io::ErrorKind::NotFound
|| err.kind() == std::io::ErrorKind::NotADirectory
{
return Vec::new();
}
return Vec::new();
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error handling here is redundant because both branches of the if statement, as well as the fallback, return Vec::new(). We can simplify this by discarding the error and returning Vec::new() directly.

        Err(_) => return Vec::new(),

Comment on lines +197 to +220
fn is_redirect_open(token: &str) -> bool {
// matches `^\d*>` (zero or more digits followed by '>')
let mut chars = token.chars();
let mut saw_any = false;
let mut found_gt = false;
let mut leading_digits = 0_usize;
for c in chars.by_ref() {
if c.is_ascii_digit() && !found_gt {
leading_digits += 1;
continue;
}
if c == '>' {
found_gt = true;
saw_any = true;
break;
}
break;
}
let _ = leading_digits;
if found_gt {
return saw_any;
}
token.starts_with('>')
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This function can be significantly simplified. The leading_digits variable is unused (and currently suppressed with let _ = leading_digits;), and the state tracking variables saw_any and found_gt are redundant. We can return true directly when > is encountered after zero or more digits.

fn is_redirect_open(token: &str) -> bool {
    // matches `^\d*>` (zero or more digits followed by '>')
    let mut chars = token.chars();
    for c in chars.by_ref() {
        if c.is_ascii_digit() {
            continue;
        }
        if c == '>'; {
            return true;
        }
        break;
    }
    token.starts_with('>')
}

Comment on lines +222 to +241
fn is_pure_redirect(token: &str) -> bool {
// matches `/^\d*>+$/` or `/^>+$/`
let mut i = 0_usize;
let bytes = token.as_bytes();
while i < bytes.len() && bytes[i].is_ascii_digit() {
i += 1;
}
if i == bytes.len() {
return false;
}
let mut saw_gt = false;
while i < bytes.len() {
if bytes[i] != b'>' {
return false;
}
saw_gt = true;
i += 1;
}
saw_gt
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The saw_gt variable is redundant. Since i < bytes.len() is guaranteed to be true if the function doesn't return early at line 229, the loop will run at least once and only complete if all remaining characters are b'>'. Thus, we can simply return true at the end of the loop.

fn is_pure_redirect(token: &str) -> bool {
    // matches `/^\d*>+$/` or `/^>+$/`
    let mut i = 0_usize;
    let bytes = token.as_bytes();
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        i += 1;
    }
    if i == bytes.len() {
        return false;
    }
    while i < bytes.len() {
        if bytes[i] != b'>' {
            return false;
        }
        i += 1;
    }
    true
}

Comment on lines +243 to +263
fn is_signed_integer(token: &str) -> bool {
// matches `/^[+-]?\d+$/`
let bytes = token.as_bytes();
if bytes.is_empty() {
return false;
}
let mut i = 0_usize;
if bytes[0] == b'+' || bytes[0] == b'-' {
i = 1;
}
if i == bytes.len() {
return false;
}
while i < bytes.len() {
if !bytes[i].is_ascii_digit() {
return false;
}
i += 1;
}
true
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

We can simplify this function and make it much more idiomatic Rust by leveraging starts_with and chars().all instead of manual index tracking and byte slicing.

fn is_signed_integer(token: &str) -> bool {
    // matches `/^[+-]?\d+$/`
    let mut s = token;
    if s.starts_with('+') || s.starts_with('-') {
        s = &s[1..];
    }
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

Open in Devin Review

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
crates/relayburn-sdk/src/analyze/ghost_surface/adapters.rs (1)

39-50: 💤 Low value

Optional: Simplify redundant error handling.

Both branches of the conditional return Vec::new(), making the if statement redundant. If the intent is to silently handle all errors (which the comment at lines 248-251 in enumerate_opencode_project suggests), consider simplifying:

♻️ Suggested simplification
     let entries = match fs::read_dir(dir) {
         Ok(e) => e,
-        Err(err) => {
-            // ENOENT / ENOTDIR: surface no entries.
-            if err.kind() == std::io::ErrorKind::NotFound
-                || err.kind() == std::io::ErrorKind::NotADirectory
-            {
-                return Vec::new();
-            }
+        Err(_) => {
+            // ENOENT / ENOTDIR / permission errors: surface no entries.
             return Vec::new();
         }
     };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/analyze/ghost_surface/adapters.rs` around lines 39 -
50, The error handling in the Err branch of the match statement contains
redundant logic where both the if condition (when error kind is NotFound or
NotADirectory) and the implicit else path return Vec::new(). Since all error
cases return the same value, remove the if statement entirely and replace the
entire Err(err) block body with a single return Vec::new() statement, making the
code more concise while maintaining the same behavior.
crates/relayburn-sdk/src/analyze/patterns/shell.rs (1)

197-220: 💤 Low value

Unused variable and redundant logic.

leading_digits is computed but never used (line 215 just suppresses the warning). Additionally, saw_any will always equal found_gt in this function, making the return on line 217 redundant.

♻️ Simplified implementation
 fn is_redirect_open(token: &str) -> bool {
     // matches `^\d*>` (zero or more digits followed by '>')
-    let mut chars = token.chars();
-    let mut saw_any = false;
-    let mut found_gt = false;
-    let mut leading_digits = 0_usize;
-    for c in chars.by_ref() {
-        if c.is_ascii_digit() && !found_gt {
-            leading_digits += 1;
-            continue;
-        }
-        if c == '>' {
-            found_gt = true;
-            saw_any = true;
-            break;
-        }
-        break;
-    }
-    let _ = leading_digits;
-    if found_gt {
-        return saw_any;
-    }
-    token.starts_with('>')
+    let mut chars = token.chars().peekable();
+    while chars.peek().map_or(false, |c| c.is_ascii_digit()) {
+        chars.next();
+    }
+    chars.next() == Some('>')
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-sdk/src/analyze/patterns/shell.rs` around lines 197 - 220,
The is_redirect_open function has unused and redundant logic. Remove the unused
leading_digits variable completely (eliminating the counter increment and the
`let _ = leading_digits;` line that suppresses the warning). Also remove the
saw_any variable since it will always have the same value as found_gt (saw_any
is only set to true when found_gt is set to true). Simplify the final return
statement to directly return found_gt instead of saw_any, keeping only the
fallback check for token.starts_with('>').
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/relayburn-sdk/src/analyze/ghost_surface/adapters.rs`:
- Around line 39-50: The error handling in the Err branch of the match statement
contains redundant logic where both the if condition (when error kind is
NotFound or NotADirectory) and the implicit else path return Vec::new(). Since
all error cases return the same value, remove the if statement entirely and
replace the entire Err(err) block body with a single return Vec::new()
statement, making the code more concise while maintaining the same behavior.

In `@crates/relayburn-sdk/src/analyze/patterns/shell.rs`:
- Around line 197-220: The is_redirect_open function has unused and redundant
logic. Remove the unused leading_digits variable completely (eliminating the
counter increment and the `let _ = leading_digits;` line that suppresses the
warning). Also remove the saw_any variable since it will always have the same
value as found_gt (saw_any is only set to true when found_gt is set to true).
Simplify the final return statement to directly return found_gt instead of
saw_any, keeping only the fallback check for token.starts_with('>').

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 331a5ea1-9cac-4e6d-a00a-507ace48c69b

📥 Commits

Reviewing files that changed from the base of the PR and between 1c8695c and 2a3ff58.

📒 Files selected for processing (17)
  • crates/relayburn-sdk/src/analyze/claude_md.rs
  • crates/relayburn-sdk/src/analyze/cost.rs
  • crates/relayburn-sdk/src/analyze/findings.rs
  • crates/relayburn-sdk/src/analyze/ghost_surface.rs
  • crates/relayburn-sdk/src/analyze/ghost_surface/adapters.rs
  • crates/relayburn-sdk/src/analyze/hotspots.rs
  • crates/relayburn-sdk/src/analyze/patterns.rs
  • crates/relayburn-sdk/src/analyze/patterns/shell.rs
  • crates/relayburn-sdk/src/analyze/quality.rs
  • crates/relayburn-sdk/src/analyze/subagent_tree.rs
  • crates/relayburn-sdk/src/analyze/tool_call_patterns.rs
  • crates/relayburn-sdk/src/analyze/tool_output_bloat.rs
  • crates/relayburn-sdk/src/analyze/util.rs
  • crates/relayburn-sdk/src/ingest/ingest.rs
  • crates/relayburn-sdk/src/query_verbs/flow.rs
  • crates/relayburn-sdk/src/query_verbs/summary.rs
  • crates/relayburn-sdk/src/util.rs

Nitpicks from the PR bots (Gemini, CodeRabbit), all behavior-preserving:

- group_turns_by_session: clone the session id once per session via get_mut
  instead of on every turn via entry() — restores O(S) key clones for the
  HashMap-based callers this consolidation had bumped to O(N).
- list_dir_files: collapse the dead `if` whose branches both returned
  Vec::new() into a single Err(_) arm.
- is_redirect_open / is_pure_redirect: drop redundant state (leading_digits,
  saw_any, saw_gt) that the control flow made unconditional.
- is_signed_integer: rewrite with strip_prefix + bytes().all instead of manual
  index tracking.

992/992 workspace tests pass; build/clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@willwashburn

Copy link
Copy Markdown
Member Author

Thanks for the reviews. Devin reported no issues; the Gemini + CodeRabbit suggestions were all behavior-preserving nitpicks, and I've applied all five in e044d4a:

  • group_turns_by_session — switched from entry(id.clone()) to get_mut/insert so the session id is cloned once per session, not once per turn (restores O(S) clones for the callers this consolidation had bumped to O(N)).
  • list_dir_files — collapsed the dead if (both branches returned Vec::new()) into a single Err(_) => return Vec::new().
  • is_redirect_open — dropped the unused leading_digits and the always-equal saw_any/found_gt; the starts_with('>') fallback was only ever reached as false, so the loop now returns directly.
  • is_pure_redirect — removed the redundant saw_gt (always true if the loop completes past the length guard) and return true directly.
  • is_signed_integer — rewritten with strip_prefix(['+','-']) + bytes().all(...).

Each is equivalent on all inputs (the redirect/integer rewrites were traced case-by-case), and the full workspace suite still passes 992/992 with clippy/fmt clean.

@willwashburn willwashburn merged commit f3a26de into main Jun 21, 2026
12 checks passed
@willwashburn willwashburn deleted the refactor/analyze-dedup branch June 21, 2026 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant