fix(parity): restore call AST node extraction in WASM engine#705
fix(parity): restore call AST node extraction in WASM engine#705carlos-alm merged 17 commits intomainfrom
Conversation
Extract shared patterns from 9 language extractors into 4 reusable
helpers in helpers.ts, reducing per-language boilerplate by ~30 lines:
- findParentNode: replaces 6 findParent*/findCurrentImpl functions
- extractBodyMembers: replaces 5 body-iteration patterns for enums/structs
- stripQuotes: replaces inline .replace(/"/g,'') across 3 extractors
- lastPathSegment: replaces inline .split('.').pop() across 6 extractors
Net: +77 helper lines, -159 extractor lines = -82 lines total.
Add clarifying comment to empty self_parameter if-block in rust.ts and document anchored-vs-global semantics in stripQuotes JSDoc.
PR #686 incorrectly removed call_expression extraction from the WASM ast-store-visitor and added filters to strip call nodes from native output — documenting a parity gap as expected behavior rather than fixing it. This restores full parity: - Restore call_expression in JS astTypes map - Restore call extraction logic in ast-store-visitor with receiver extraction, argument-only recursion, and dedup tracking - Remove three call-kind filters (engine.ts, ast.ts bulk path, ast.ts JS fallback path) that stripped native call nodes - Clear stale engineOpts.nativeDb reference after pipeline closes the NativeDatabase to prevent stages from using a closed connection - Update ast-nodes test to assert calls ARE stored
|
Claude finished @carlos-alm's task —— View job Comprehensive PR Review Complete ✅Reviewing PR #705: "fix(parity): restore call AST node extraction in WASM engine" Root Cause AnalysisProblem Being Solved: PR #686 incorrectly removed call AST node extraction from WASM engine while filtering native call nodes, creating an engine parity gap. The previous PR documented this gap as "expected behavior" rather than fixing the underlying issue. Why This Approach: This PR restores true engine parity by ensuring both native and WASM engines extract identical call AST nodes. The approach mirrors native engine strategy exactly (argument-only recursion, receiver extraction, deduplication). Risk Assessment: Low risk - this is a genuine bug fix restoring intended behavior, not a workaround. Backlog Compliance
Critical ConcernsNone identified. This is a clean, well-implemented fix that:
Particularly Strong Points:
Final Recommendation
This PR should be merged immediately. |
Greptile SummaryThis PR restores full WASM/native engine parity for Key changes:
Confidence Score: 5/5Safe to merge — the parity fix is well-tested (2131 passing tests including 4 parity tests), and the only remaining observation is a minor P2 inconsistency in how a null receiver is normalised between the bulk and JS fallback paths. All P0/P1 concerns from prior review rounds are addressed. The single remaining finding is P2: null ?? '' in the bulk path stores an empty string for calls with no receiver, while the JS fallback stores null. Both values semantically mean no receiver and the parity test passes, so this does not block merging. src/features/ast.ts — the receiver: n.receiver ?? '' normalisation differs from the JS fallback path's n.receiver || null Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[runAnalyses] --> B{Native bulk path\nnativeDb available?}
B -- Yes --> C[walkWithVisitors WASM\nenterNode for all kinds incl. call]
B -- No --> D[JS fallback\nwalkAst via createAstStoreVisitor]
C --> E[symbols.astNodes set\nwith call nodes included]
D --> E2[rows collected\nwith call nodes included]
E --> F{buildAstNodes\nnativeDb bulk path?}
E2 --> G[JS INSERT\nreceiver → null if falsy]
F -- Yes --> H[bulkInsertAstNodes\nreceiver ?? null\nall kinds incl. call]
F -- No --> G
H --> I[(ast_nodes table)]
G --> I
subgraph pipeline.ts fix
J[ctx.nativeDb.close] --> K[ctx.nativeDb = undefined]
K --> L[engineOpts.nativeDb = undefined\nPREVENTS stale ref throw]
L --> M[Reopen nativeDb\nbefore runAnalyses]
M --> N[engineOpts.nativeDb = new ref]
end
Reviews (3): Last reviewed commit: "fix: remove parity-diag console.error lo..." | Re-trigger Greptile |
| enterNode(node: TreeSitterNode, _context: VisitorContext): EnterNodeResult | undefined { | ||
| if (matched.has(node.id)) return; |
There was a problem hiding this comment.
enterNode returns undefined for matched nodes — walker still descends children
When matched.has(node.id) is true, enterNode returns undefined without setting skipChildren. The walkWithVisitors walker only halts descent when result?.skipChildren is truthy; an undefined return means the walker will recurse into the node's children via the normal DFS path.
In the current logic this is not a bug: every node that can be in matched before enterNode is called on it sits inside a call_expression argument subtree, whose parent call already returned { skipChildren: true }, so the walker never actually reaches those pre-collected nodes. However the safety assumption is subtle and not documented, and a future change to walkCallArguments (e.g. walking the function field as well) could break it silently.
Consider making the guard explicit:
| enterNode(node: TreeSitterNode, _context: VisitorContext): EnterNodeResult | undefined { | |
| if (matched.has(node.id)) return; | |
| enterNode(node: TreeSitterNode, _context: VisitorContext): EnterNodeResult | undefined { | |
| if (matched.has(node.id)) return { skipChildren: true }; |
This makes the invariant hold unconditionally without relying on the parent's skipChildren.
There was a problem hiding this comment.
Fixed — \ now returns \ for already-matched nodes instead of . This makes the invariant hold unconditionally without relying on the parent's . See commit 173ab12.
There was a problem hiding this comment.
Update: reverted the skipChildren guard. Tree-sitter \ is a memory address that can be reused across different nodes in the same parse tree. Adding \ on an ID collision suppresses an unrelated subtree, which broke WASM ast_node extraction in the build-parity test on CI (native v3.5.0 produced 19 ast_nodes, WASM produced 0). The original \ is the correct behavior: it prevents re-collection without suppressing children, and the parent call's \ handles the intended dedup case. Added a comment documenting this invariant. See commit 0f4259e.
| import { getNodeId } from '../../../../db/index.js'; | ||
| import { debug } from '../../../../infrastructure/logger.js'; | ||
| import { loadNative } from '../../../../infrastructure/native.js'; |
There was a problem hiding this comment.
Undocumented conversion from
#-alias imports to relative paths
build-edges.ts, build-structure.ts, and collect-files.ts all have their #infrastructure/…, #db/…, #shared/…, and #types imports converted to ../../../../… relative paths. This change isn't mentioned in the PR description and isn't consistent with how the rest of the codebase uses these path aliases (e.g. pipeline.ts and other stage files still use the #-aliases).
If this was done to fix a broken alias in a specific build or test context, a brief comment or follow-up ticket would help. If it was accidental, it should be reverted so import style stays uniform.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Investigated — the #-alias to relative-path conversion is actually consistent with the rest of the stage files. On main, finalize.ts, detect-changes.ts, resolve-imports.ts, insert-nodes.ts, parse-files.ts, and run-analyses.ts already use relative paths. Only build-edges.ts, build-structure.ts, and collect-files.ts still had the #-aliases. This PR normalizes the remaining 3 files to match the convention used by the other 6 stage files. No revert needed.
#705) When enterNode encountered an already-matched node it returned undefined, relying on the parent call's skipChildren to prevent redundant descent. This was correct but fragile — a future change to walkCallArguments could silently break the invariant. Now returns { skipChildren: true } explicitly.
Replace loose toBeGreaterThanOrEqual(1) with exact toBe(3) and verify each expected call name (eval, result.set, console.log). The fixture produces exactly 3 call nodes — fetch is captured as kind:await instead. Pinning the count catches future parity regressions immediately.
|
Addressed Greptile review feedback:
|
… skips (#705) Reverts the { skipChildren: true } guard for matched nodes. Tree-sitter node.id is a memory address that can be reused across different nodes in the same tree. Returning skipChildren on an ID collision suppresses an unrelated subtree, breaking WASM ast_node extraction in the build-parity test. The original undefined return is correct: it prevents re-collection without suppressing children, and the parent call's skipChildren handles the intended dedup case.
When native and WASM ast_node counts diverge, log both counts and the distinct kinds present in each. This will reveal whether the CI-only failure is caused by native extracting calls that WASM misses, or a deeper issue with the WASM walker path.
The prebuilt native binary (v3.5.0) expects `receiver` as `String` (not `Option<String>`), so passing JS `null` from WASM-extracted ast_nodes crashes the bulk insert with: Failed to convert JavaScript value `Null` into rust type `String` Coerce `null`/`undefined` receiver to `""` before passing to the native `bulkInsertAstNodes` path. This fixes the build-parity test (native 19 vs WASM 0 ast_nodes) and the typed-method-call test.
Remove temporary diagnostic logging added to trace the WASM ast_nodes CI failure. The root cause (null receiver in NAPI bulk insert) is now fixed, so these stderr diagnostics are no longer needed.
|
Merged main into the branch and fixed the CI failure caused by the merge. Root cause: PR #696 (merged to main) reopens Fixes applied:
All 2131 tests pass locally, lint clean (1 pre-existing warning). |
Summary
Fixes #697. PR #686 incorrectly removed call-AST extraction from the WASM engine and filtered call nodes from native output — documenting a parity gap as expected behavior rather than fixing it. This restores full engine parity for
ast_nodes.call_expression: 'call'inastTypesmap (rules/javascript.ts)ast-store-visitor.tswithextractCallName,extractCallReceiver, argument-only recursion (walkCallArguments), and node dedup tracking (matchedset)engine.ts,ast.tsbulk path,ast.tsJS fallback path)engineOpts.nativeDbreference — pipeline closes the NativeDatabase but left a dangling reference inengineOpts, causingbulkInsertAstNodesto throw on a closed connectionast-nodes.test.tsnow asserts calls ARE stored (was incorrectly asserting 0 calls)Test plan
build-parity.test.ts— all 4 tests pass (nodes, edges, roles, ast_nodes)ast-nodes.test.ts— all 19 tests pass (WASM + native)