docs: update build performance benchmarks (3.4.0) #625
Conversation
**Greptile Summary**

This automated PR records CI-measured build performance benchmarks for version 3.4.0 against the 473-file self-hosted codebase.
**Confidence Score: 4/5**

Safe to merge as a benchmark snapshot; the data accurately reflects CI output, but the WASM CFG/complexity regression deserves a follow-up tracking issue before 3.5.0.

- All numbers are internally consistent and correctly derived from the raw JSON payload.
- The pre-existing engine parity issue is tracked in #613 and acknowledged.
- The new WASM CFG/complexity phase divergence is a real signal in the data that isn't yet tracked, but this PR's purpose is to record measurements, not fix them; the fix belongs in the engine.
- The score stops at 4 rather than 5 because the untracked regression (WASM CFG 2.4× slower than native, with no matching issue) should at minimum be filed before it gets buried in future benchmark updates.

**Important Files Changed**

`generated/benchmarks/BUILD-BENCHMARKS.md`: the phase breakdown table contains the CFG/complexity divergence data
**Flowchart**

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Native["Native Engine (3.4.0) — 2499ms total"]
        N1[Parse 1348ms] --> N2[Insert nodes 192ms]
        N2 --> N3[Resolve imports 13ms]
        N3 --> N4[Build edges 145ms]
        N4 --> N5[Structure + Roles 90ms]
        N5 --> N6[AST nodes 338ms]
        N6 --> N7[Complexity ✅ 26ms]
        N7 --> N8[CFG ✅ 160ms]
        N8 --> N9[Dataflow 117ms]
    end
    subgraph WASM["WASM Engine (3.4.0) — 5752ms total"]
        W1[Parse 2438ms] --> W2[Insert nodes 210ms]
        W2 --> W3[Resolve imports 15ms]
        W3 --> W4[Build edges 149ms]
        W4 --> W5[Structure + Roles 104ms]
        W5 --> W6[AST nodes 217ms]
        W6 --> W7[Complexity ⚠️ 348ms +61%]
        W7 --> W8[CFG ⚠️ 371ms +139%]
        W8 --> W9[Dataflow 120ms]
    end
    style N7 fill:#90EE90,color:#000
    style N8 fill:#90EE90,color:#000
    style W7 fill:#FFB347,color:#000
    style W8 fill:#FF6B6B,color:#000
```
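One way to surface which phases drive the WASM gap is to compute per-phase slowdown ratios from the timings in the flowchart. A minimal sketch, not part of any engine tooling; the dictionaries simply restate the flowchart numbers, and the phase keys are informal labels:

```python
# Per-phase timings (ms) from the 3.4.0 benchmark flowchart above.
NATIVE = {
    "parse": 1348, "insert_nodes": 192, "resolve_imports": 13,
    "build_edges": 145, "structure_roles": 90, "ast_nodes": 338,
    "complexity": 26, "cfg": 160, "dataflow": 117,
}
WASM = {
    "parse": 2438, "insert_nodes": 210, "resolve_imports": 15,
    "build_edges": 149, "structure_roles": 104, "ast_nodes": 217,
    "complexity": 348, "cfg": 371, "dataflow": 120,
}

def slowdown_ratios(native, wasm):
    """WASM-over-native time ratio per phase, worst first."""
    ratios = {phase: wasm[phase] / native[phase] for phase in native}
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)

for phase, ratio in slowdown_ratios(NATIVE, WASM):
    flag = " <-- divergent" if ratio >= 2.0 else ""
    print(f"{phase:16s} {ratio:5.2f}x{flag}")
```

Note these ratios compare the two engines directly, while the ⚠️ percentages in the chart annotate change between releases; the CFG ratio lands at roughly 2.3×, in line with the CFG regression called out in the summary.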
Reviews (2). Last reviewed commit: "Merge branch 'main' into benchmark/build..."
| | 3.4.0 | native | 2026-03-26 | 473 | 5.3 ↓13% | 11.6 ↑63% | 23.2 ↑32% | 44.2 ↑21% | 55041 ↑13% |
| | 3.4.0 | wasm | 2026-03-26 | 473 | 12.2 ↑9% | 11.7 ↑62% | 23.1 ↑32% | 44.4 ↑22% | 40198 ↓15% |
Engine parity violation: nodes and edges diverge between native and WASM
The 3.4.0 data shows a measurable divergence in extraction results between the two engines:
| Metric | Native | WASM | Delta |
|---|---|---|---|
| Nodes | 10,963 | 10,937 | −26 |
| Edges | 20,921 | 21,022 | +101 |
Per CLAUDE.md, both engines must produce identical results — if they diverge, the less-accurate engine has a bug that should be fixed rather than documented:
> Both engines must produce identical results. If they diverge, the less-accurate engine has a bug — fix it, don't document the gap.
This divergence was also present in 3.3.1 (17.6 vs 17.5 nodes/file, 36.5 vs 36.4 edges/file), so it is pre-existing rather than introduced here. However, recording it in a new release benchmark entry without a corresponding fix or a tracked issue means the gap is effectively being accepted. The underlying extraction or resolution layer that produces the incorrect counts should be investigated and fixed.
Context Used: CLAUDE.md (source)
Acknowledged — this divergence is a pre-existing bug already tracked in #613 ("bug(native): native engine under-extracts ~68 call edges vs WASM"). That issue covers both the node count delta (+26) and edge count delta (+101) between engines.
This PR is an automated benchmark recording from CI. The purpose is to capture the raw numbers as-is so we can measure progress as #613 is fixed. Suppressing or adjusting the data would hide the regression signal. The fix belongs in the extraction/resolution layer, not in the benchmark docs.
| | 3.4.0 | native | 2026-03-26 | 473 | 5.3 ↓13% | 11.6 ↑63% | 23.2 ↑32% | 44.2 ↑21% | 55041 ↑13% |
| | 3.4.0 | wasm | 2026-03-26 | 473 | 12.2 ↑9% | 11.7 ↑62% | 23.1 ↑32% | 44.4 ↑22% | 40198 ↓15% |
New DB size divergence between engines
In 3.3.1, native and WASM DB sizes were nearly identical (48,707 vs 47,317 bytes/file — a ~3% gap consistent with minor overhead differences). In 3.4.0 they have split dramatically:
| Engine | DB (bytes/file) | Change from 3.3.1 |
|---|---|---|
| native | 55,041 | ↑ 13% |
| wasm | 40,198 | ↓ 15% |
That is a ~37% difference in DB footprint for the same 473-file codebase. Since the DB is derived from extracted nodes and edges, and the node/edge counts already diverge (see comment above), the engines are clearly persisting different data. This is a new regression introduced in 3.4.0 and is consistent with one engine gaining a feature (or fixing an omission) that the other has not yet received. Per CLAUDE.md, this should be treated as a bug to fix — not just documented.
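The quoted percentages can be reproduced directly from the bytes-per-file figures in this thread. A minimal sketch restating the 3.3.1 and 3.4.0 numbers:

```python
# DB footprint (bytes/file) for the 473-file codebase, per release and engine,
# as quoted in this review thread.
DB_BYTES_PER_FILE = {
    "3.3.1": {"native": 48_707, "wasm": 47_317},
    "3.4.0": {"native": 55_041, "wasm": 40_198},
}

def engine_gap_pct(sizes):
    """Native-vs-WASM footprint gap, as a percent of the smaller engine."""
    lo, hi = sorted(sizes.values())
    return 100.0 * (hi - lo) / lo

def release_change_pct(engine):
    """Percent change for one engine from 3.3.1 to 3.4.0."""
    old = DB_BYTES_PER_FILE["3.3.1"][engine]
    new = DB_BYTES_PER_FILE["3.4.0"][engine]
    return 100.0 * (new - old) / old
```

This confirms the figures as quoted: a ~3% inter-engine gap in 3.3.1 widening to ~37% in 3.4.0, with native up 13% and WASM down 15% release-over-release.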
Context Used: CLAUDE.md (source)
Agreed that the DB size gap widening from ~3% to ~37% is concerning and worth investigating. This is a downstream consequence of the node/edge divergence tracked in #613.
The 3.4.0 release added new AST-store and complexity data phases, which likely amplify the extraction differences into larger DB footprint deltas. I've added a comment on #613 noting the DB size regression so it's tracked alongside the root cause.
As above, this PR records the raw CI benchmark output — the fix belongs in the engine parity work in #613, not in the benchmark docs.
Automated build benchmark update for 3.4.0 from workflow run #462.