
docs: update build performance benchmarks (3.4.0)#625

Merged
carlos-alm merged 4 commits into main from benchmark/build-v3.4.0-20260326-073715
Mar 26, 2026

Conversation

@github-actions
Contributor

Automated build benchmark update for 3.4.0 from workflow run #462.

@greptile-apps
Contributor

greptile-apps bot commented Mar 26, 2026

Greptile Summary

This automated PR records CI-measured build performance benchmarks for version 3.4.0 against the 473-file self-hosted codebase. All figures in README.md and the benchmark table are internally consistent and verified against the raw JSON payload (per-file normalizations, percentage deltas, and 50k-file extrapolations all check out).

Key 3.4.0 highlights:

  • Native build speed improved — 6.1 → 5.3 ms/file (−13%)
  • Query time regressed — 7ms → 12ms (+63% for both engines)
  • WASM build speed regressed — 11.2 → 12.2 ms/file (+9%)
  • 1-file incremental rebuild regressed on both engines (~22%) — native 353ms → 432ms, WASM 506ms → 621ms
  • Engine parity (nodes/edges/DB size) — pre-existing divergence tracked in #613 ("bug(native): native engine under-extracts ~68 call edges vs WASM"), acknowledged in prior review threads; not new to this PR
  • New finding not tracked elsewhere — WASM CFG phase regressed 139% (154.8ms → 370.5ms) and WASM complexity regressed 61% (215.5ms → 348.1ms), while native CFG held flat and native complexity improved 85%
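The per-file normalizations, percentage deltas, and 50k-file extrapolations behind these highlights are simple arithmetic. A minimal sketch of how they can be cross-checked, using figures quoted in this PR (the helper names are illustrative, not part of the benchmark tooling):

```python
FILES = 473  # self-hosted codebase size used for the benchmark run

def per_file_ms(total_ms: float, files: int = FILES) -> float:
    """Normalize a total build time to milliseconds per file."""
    return total_ms / files

def delta_pct(old: float, new: float) -> float:
    """Percentage change relative to the previous release."""
    return (new - old) / old * 100

def extrapolate_s(ms_per_file: float, files: int = 50_000) -> float:
    """Linear extrapolation of build time to a larger codebase, in seconds."""
    return ms_per_file * files / 1000

assert round(delta_pct(6.1, 5.3)) == -13      # native build speed, -13%
assert round(delta_pct(11.2, 12.2)) == 9      # WASM build speed, +9%
assert round(delta_pct(154.8, 370.5)) == 139  # WASM CFG phase, +139%
assert round(delta_pct(215.5, 348.1)) == 62   # WASM complexity phase, ~+61%
assert round(extrapolate_s(5.3)) == 265       # ~265 s at 50k files (README)
```

Note the extrapolation is linear in file count, which matches how the README's 50k-file figure is derived from the per-file rate.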

Confidence Score: 4/5

Safe to merge as a benchmark snapshot; the data accurately reflects CI output, but the WASM CFG/complexity regression deserves a follow-up tracking issue before 3.5.0.

All numbers are internally consistent and correctly derived from the raw JSON payload. The pre-existing engine parity issue is tracked in #613 and acknowledged. The new WASM CFG/complexity phase divergence is a real signal in the data that isn't yet tracked, but this PR's purpose is to record measurements, not fix them — the fix belongs in the engine. Score stops at 4 rather than 5 because the untracked regression (WASM CFG 2.4× slower than native with no matching issue) should at minimum be filed before it gets buried in future benchmark updates.

generated/benchmarks/BUILD-BENCHMARKS.md — the phase breakdown table contains the CFG/complexity divergence data

Important Files Changed

| Filename | Overview |
| --- | --- |
| README.md | Performance metrics updated correctly from 3.3.1 to 3.4.0; all figures cross-check against the benchmark JSON (native 5.3 ms/file, WASM 12.2 ms/file, query 12ms, 1-file rebuild 432ms, ~265s at 50k files) |
| generated/benchmarks/BUILD-BENCHMARKS.md | 3.4.0 benchmark rows added correctly with consistent per-file normalizations, percentage deltas, extrapolations, and raw JSON payload; new WASM CFG (+139%) and complexity (+61%) phase regressions not tracked in any existing issue |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Native["Native Engine (3.4.0) — 2499ms total"]
        N1[Parse 1348ms] --> N2[Insert nodes 192ms]
        N2 --> N3[Resolve imports 13ms]
        N3 --> N4[Build edges 145ms]
        N4 --> N5[Structure + Roles 90ms]
        N5 --> N6[AST nodes 338ms]
        N6 --> N7[Complexity ✅ 26ms]
        N7 --> N8[CFG ✅ 160ms]
        N8 --> N9[Dataflow 117ms]
    end
    subgraph WASM["WASM Engine (3.4.0) — 5752ms total"]
        W1[Parse 2438ms] --> W2[Insert nodes 210ms]
        W2 --> W3[Resolve imports 15ms]
        W3 --> W4[Build edges 149ms]
        W4 --> W5[Structure + Roles 104ms]
        W5 --> W6[AST nodes 217ms]
        W6 --> W7[Complexity ⚠️ 348ms +61%]
        W7 --> W8[CFG ⚠️ 371ms +139%]
        W8 --> W9[Dataflow 120ms]
    end
    style N7 fill:#90EE90,color:#000
    style N8 fill:#90EE90,color:#000
    style W7 fill:#FFB347,color:#000
    style W8 fill:#FF6B6B,color:#000
```

Reviews (2): Last reviewed commit: "Merge branch 'main' into benchmark/build..."

Comment on lines +8 to +9
| 3.4.0 | native | 2026-03-26 | 473 | 5.3 ↓13% | 11.6 ↑63% | 23.2 ↑32% | 44.2 ↑21% | 55041 ↑13% |
| 3.4.0 | wasm | 2026-03-26 | 473 | 12.2 ↑9% | 11.7 ↑62% | 23.1 ↑32% | 44.4 ↑22% | 40198 ↓15% |

P1 Engine parity violation: nodes and edges diverge between native and WASM

The 3.4.0 data shows a measurable divergence in extraction results between the two engines:

| Metric | Native | WASM | Delta |
| --- | --- | --- | --- |
| Nodes | 10,963 | 10,937 | −26 |
| Edges | 20,921 | 21,022 | +101 |

Per CLAUDE.md, both engines must produce identical results — if they diverge, the less-accurate engine has a bug that should be fixed rather than documented:

> Both engines must produce identical results. If they diverge, the less-accurate engine has a bug — fix it, don't document the gap.

This divergence was also present in 3.3.1 (17.6 vs 17.5 nodes/file, 36.5 vs 36.4 edges/file), so it is pre-existing rather than introduced here. However, recording it in a new release benchmark entry without a corresponding fix or a tracked issue means the gap is effectively being accepted. The underlying extraction or resolution layer that produces the incorrect counts should be investigated and fixed.
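For illustration, the raw counts quoted above reconcile with the per-file rates in the 3.4.0 benchmark rows (a quick sketch; the variable names are ours, not the benchmark schema's):

```python
FILES = 473  # files in the benchmarked codebase

# Raw 3.4.0 extraction counts from the table above
nodes = {"native": 10_963, "wasm": 10_937}
edges = {"native": 20_921, "wasm": 21_022}

# Engine deltas called out in this comment
assert nodes["wasm"] - nodes["native"] == -26
assert edges["wasm"] - edges["native"] == 101

# Per-file rates match the benchmark table rows (nodes/file, edges/file)
assert round(nodes["native"] / FILES, 1) == 23.2
assert round(edges["native"] / FILES, 1) == 44.2
assert round(nodes["wasm"] / FILES, 1) == 23.1
assert round(edges["wasm"] / FILES, 1) == 44.4
```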

Context Used: CLAUDE.md (source)


Acknowledged — this divergence is a pre-existing bug already tracked in #613 ("bug(native): native engine under-extracts ~68 call edges vs WASM"). That issue covers both the node count delta (+26) and edge count delta (+101) between engines.

This PR is an automated benchmark recording from CI. The purpose is to capture the raw numbers as-is so we can measure progress as #613 is fixed. Suppressing or adjusting the data would hide the regression signal. The fix belongs in the extraction/resolution layer, not in the benchmark docs.

Comment on lines +8 to +9
| 3.4.0 | native | 2026-03-26 | 473 | 5.3 ↓13% | 11.6 ↑63% | 23.2 ↑32% | 44.2 ↑21% | 55041 ↑13% |
| 3.4.0 | wasm | 2026-03-26 | 473 | 12.2 ↑9% | 11.7 ↑62% | 23.1 ↑32% | 44.4 ↑22% | 40198 ↓15% |

P1 New DB size divergence between engines

In 3.3.1, native and WASM DB sizes were nearly identical (48,707 vs 47,317 bytes/file — a ~3% gap consistent with minor overhead differences). In 3.4.0 they have split dramatically:

| Engine | DB (bytes/file) | Change from 3.3.1 |
| --- | --- | --- |
| native | 55,041 | ↑13% |
| wasm | 40,198 | ↓15% |

That is a ~37% difference in DB footprint for the same 473-file codebase. Since the DB is derived from extracted nodes and edges, and the node/edge counts already diverge (see comment above), the engines are clearly persisting different data. This is a new regression introduced in 3.4.0 and is consistent with one engine gaining a feature (or fixing an omission) that the other has not yet received. Per CLAUDE.md, this should be treated as a bug to fix — not just documented.
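The widening of the gap is easy to reproduce from the per-file DB sizes quoted in this thread (a sanity-check sketch; `gap_pct` is an illustrative helper, not part of the tooling):

```python
# DB size per file (bytes), from the benchmark tables quoted above
db_331 = {"native": 48_707, "wasm": 47_317}  # 3.3.1
db_340 = {"native": 55_041, "wasm": 40_198}  # 3.4.0

def gap_pct(sizes: dict) -> float:
    """Relative native-vs-WASM DB footprint gap, in percent."""
    return (sizes["native"] - sizes["wasm"]) / sizes["wasm"] * 100

assert round(gap_pct(db_331)) == 3   # ~3% gap in 3.3.1
assert round(gap_pct(db_340)) == 37  # ~37% gap in 3.4.0
```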

Context Used: CLAUDE.md (source)


Agreed that the DB size gap widening from ~3% to ~37% is concerning and worth investigating. This is a downstream consequence of the node/edge divergence tracked in #613.

The 3.4.0 release added new AST-store and complexity data phases, which likely amplify the extraction differences into larger DB footprint deltas. I've added a comment on #613 noting the DB size regression so it's tracked alongside the root cause.

As above, this PR records the raw CI benchmark output — the fix belongs in the engine parity work in #613, not in the benchmark docs.

@carlos-alm
Contributor

@greptileai

@carlos-alm carlos-alm merged commit e29fb8b into main Mar 26, 2026
13 checks passed
@carlos-alm carlos-alm deleted the benchmark/build-v3.4.0-20260326-073715 branch March 26, 2026 22:40
@github-actions github-actions bot locked and limited conversation to collaborators Mar 26, 2026