Skip to content

test: incremental edge parity CI check#539

Merged
carlos-alm merged 7 commits intomainfrom
test/incremental-edge-parity
Mar 20, 2026
Merged

test: incremental edge parity CI check#539
carlos-alm merged 7 commits intomainfrom
test/incremental-edge-parity

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • Adds tests/integration/incremental-edge-parity.test.js — a CI gate that verifies incremental rebuilds produce identical edges to clean full builds
  • Tests four mutation scenarios against the sample-project fixture (CJS, classes, cross-file calls):
    1. Comment-only touch — no semantic change, edges must be identical
    2. Body edit — implementation change with same exports, edges must be identical
    3. New export added — structural change with new consumer, edges must match full rebuild
    4. File deletion — stale edges must be purged, matching a clean build

Test plan

  • All 10 tests pass locally (4 scenarios × 2-3 assertions each)
  • CI passes on all platforms (ubuntu, macos, windows) × Node 20/22

Four mutation scenarios verify that incremental rebuilds produce
identical edges to clean full builds: comment-only touch, body edit,
new export addition, and file deletion. Each scenario compares node
and edge identity between incremental and full builds.
@claude
Copy link
Copy Markdown

claude bot commented Mar 19, 2026

Claude finished @carlos-alm's task —— View job


PR Review Complete

Reviewing PR: test: incremental edge parity CI check

Todo List:

  • Read and analyze the new test file
  • Phase 1: Root Cause Analysis
  • Phase 2: Configuration Change Detection
  • Phase 3: Test Integrity Assessment
  • Phase 4: Code Quality Review
  • Phase 5: Backlog Compliance Evaluation
  • Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: Addresses a critical reliability gap in incremental builds. Codegraph's key differentiator (P1: "The graph is always current") depends on incremental rebuilds being mathematically equivalent to clean builds. Without this test, regressions could silently corrupt the graph while appearing to work.

Why This Approach: Uses comprehensive mutation testing across 4 scenarios (comment, body edit, new export, file deletion) with edge-by-edge comparison. This catches both false positives (spurious edges) and false negatives (missing edges) that could break dependency analysis.

Risk Assessment: LOW RISK - This is a defensive CI gate that validates existing functionality rather than changing behavior.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies, uses existing test infrastructure
  • Foundation-aligned: ✓ — Directly supports P1 ("The graph is always current") by validating incremental accuracy
  • Problem-fit: 5 — Critical infrastructure test preventing AI agents from receiving corrupted dependency data
  • Breaking: No — Purely additive test with zero user-facing changes
  • Tier: 1

Critical Concerns

None identified. The test implementation is well-designed and follows established patterns.

Code Quality Assessment ✅

Strengths:

  • Comprehensive scenarios: Tests the four critical mutation types that could break incremental builds
  • Mathematical rigor: Edge-by-edge comparison with diagnostic output for failures
  • Proper isolation: Uses temporary directories with cleanup, following existing patterns
  • Diagnostic helpers: edgeKey() function and detailed failure messages (lines 178-189) aid debugging
  • Consistent with existing tests: Uses same patterns as incremental-parity.test.js and build-parity.test.js
  • Appropriate timeouts: 60s timeout for I/O-heavy operations is consistent across integration tests

Minor observations:

  • ✅ Uses skipRegistry: true consistently to avoid cross-test interference
  • ✅ Test names clearly describe expected behavior ("edges are identical")
  • ✅ Good use of beforeAll for expensive setup operations
  • ✅ Proper CommonJS fixture usage matching the sample-project structure

No security, performance, or correctness issues identified.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE WITH CONDITIONS
  • Reasoning: Excellent defensive test that validates critical functionality. The test is well-implemented and addresses a real gap in CI coverage. Minor deduction for being infrastructure rather than user-facing functionality.

Conditions for approval:

  • CI must pass on all platforms (ubuntu, macos, windows) × Node 20/22 as mentioned in the test plan
  • Consider adding this test to the critical path CI checks given its importance to incremental build reliability

This PR directly supports Codegraph's core value proposition by ensuring incremental builds maintain mathematical equivalence with full rebuilds, preventing silent corruption that could mislead AI agents.


@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR adds tests/integration/incremental-edge-parity.test.js, a new CI gate that validates incremental rebuild edge correctness across four mutation scenarios (comment-only touch, body edit, new export, and file deletion) against the sample-project fixture.

Key observations:

  • The buildAndCompare helper correctly isolates each scenario in its own temp directory and cleans up via a try/finally block, preventing disk leaks even on failure.
  • Both readEdges and readNodes wrap their SQLite queries in try/finally to ensure db.close() is always called — consistent with the pattern used in incremental-parity.test.js.
  • All four mutation scenarios include post-replace assertions to guard against silent fixture drift (addressed in prior review).
  • The JSDoc header accurately lists all four scenarios (addressed in prior review).
  • Scenario 3 includes a useful diagnostic block that shows exactly which edges are missing or extra when the edge count diverges; Scenarios 2 and 4 omit this, making CI failures harder to triage.
  • The ORDER BY n1.name, n2.name, e.kind in readEdges lacks a tiebreaker, which is a latent flakiness risk if the fixture ever gains nodes with duplicate names.

Confidence Score: 4/5

  • This PR is safe to merge; it adds test coverage only with no production code changes.
  • The new test file is well-structured and addresses feedback from prior review cycles (try/finally for DB handles, post-replace mutation guards, complete JSDoc). The two remaining items — missing diagnostic output in Scenarios 2/4 and a potential ORDER BY tiebreaker — are minor style/robustness gaps that don't affect correctness today given the current fixture. No production code is touched.
  • No files require special attention beyond the two P2 style notes in the new test file.

Important Files Changed

Filename Overview
tests/integration/incremental-edge-parity.test.js New integration test file adding a CI gate for incremental build edge parity across four mutation scenarios. Correctly uses try/finally for DB handles, applies post-replace assertions to guard against silent fixture drift, and cleans up temp directories inside buildAndCompare's finally block. Minor gaps: Scenarios 2 and 4 lack the diagnostic output present in Scenario 3, and the edge ORDER BY could flake under ties.

Sequence Diagram

sequenceDiagram
    participant T as Test (beforeAll)
    participant BAC as buildAndCompare()
    participant FS as Filesystem (tmpdir)
    participant BG as buildGraph()
    participant DB as SQLite (graph.db)

    T->>BAC: call with fixtureDir + mutate fn
    BAC->>FS: mkdtempSync → fullDir / incrDir
    BAC->>FS: copyDirSync(fixture → fullDir)
    BAC->>FS: copyDirSync(fixture → incrDir)

    BAC->>BG: buildGraph(incrDir, incremental=false)
    Note over BG: Baseline hashes written to incrDir DB

    BAC->>FS: mutate(fullDir)
    BAC->>FS: mutate(incrDir)

    BAC->>BG: buildGraph(fullDir, incremental=false)
    Note over BG: Clean full rebuild on mutated source

    BAC->>BG: buildGraph(incrDir, incremental=true)
    Note over BG: Incremental rebuild — must match full

    BAC->>DB: readEdges(fullDir DB)
    BAC->>DB: readEdges(incrDir DB)
    BAC->>DB: readNodes(fullDir DB)
    BAC->>DB: readNodes(incrDir DB)

    BAC->>FS: rmSync(tmpBase) [finally]
    BAC-->>T: { fullEdges, incrEdges, fullNodes, incrNodes }

    T->>T: it('edge count matches') → length check
    T->>T: it('edges are identical') → toEqual check
Loading

Comments Outside Diff (2)

  1. tests/integration/incremental-edge-parity.test.js, line 113-119 (link)

    Missing diagnostic output in Scenarios 2 and 4

    Scenario 3 (lines 183–194) includes a helpful diagnostic block that fires when edge counts differ — it lists the specific edges missing from incremental vs. the extra edges present. Scenarios 2 and 4 omit this, so a failure in CI shows only a raw array diff without indicating which edges diverged, making root-cause analysis significantly harder.

    Consider extracting the diagnostic into a shared helper and calling it in both places:

    function assertEdgeParity(incrEdges, fullEdges) {
      if (incrEdges.length !== fullEdges.length) {
        const fullSet = new Set(fullEdges.map(edgeKey));
        const incrSet = new Set(incrEdges.map(edgeKey));
        const missingInIncr = [...fullSet].filter((k) => !incrSet.has(k));
        const extraInIncr = [...incrSet].filter((k) => !fullSet.has(k));
        expect.fail(
          `Edge mismatch:\n  Missing in incremental: ${missingInIncr.join(', ') || 'none'}\n  Extra in incremental: ${extraInIncr.join(', ') || 'none'}`,
        );
      }
      expect(incrEdges).toEqual(fullEdges);
    }

    The same issue applies to the "edges are identical" test in Scenario 4 (lines 243–245).

  2. tests/integration/incremental-edge-parity.test.js, line 36-45 (link)

    Edge ORDER BY may not be fully deterministic

    The current ORDER BY n1.name, n2.name, e.kind does not include a tiebreaker. In the sample-project fixture all function names are unique across files, so in practice this is fine. However, if the fixture ever gains two different nodes with the same name that both call a common target with the same kind, the order between them could differ between the full build and the incremental build (SQLite does not guarantee insertion-order stability for equal-key rows). This would cause toEqual to flake even though the edge sets are semantically identical.

    Adding e.rowid (or any stable column) as a final sort key eliminates the ambiguity at negligible cost:

Last reviewed commit: "test: use per-replac..."

Comment on lines +125 to +126
src = src.replace('return a + b;', 'return b + a;');
fs.writeFileSync(p, src);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Silent mutation failure could cause false-positive passes

The .replace() call in Scenario 2 (and several others) doesn't verify that it actually found and replaced the target string. If the fixture math.js is ever updated so it no longer contains the literal 'return a + b;' (e.g., reformatted to return a+b;, or the add function is renamed), the mutation silently does nothing. Both fullDir and incrDir then contain identical, untouched source, so the comparison trivially passes — the test now gives a green result while providing zero coverage.

The same silent-failure risk affects:

  • Line 148-151: src.replace('module.exports = { add, multiply, square };', ...) in Scenario 3
  • Line 157-160: indexSrc.replace("const { add } = require('./math');", ...) in Scenario 3
  • Line 161-164: indexSrc.replace('console.log(add(1, 2));', ...) in Scenario 3
  • Line 203-206: The four src.replace(...) calls in Scenario 4

Add a post-replace assertion so that a drift in the fixture is caught immediately rather than silently:

Suggested change
src = src.replace('return a + b;', 'return b + a;');
fs.writeFileSync(p, src);
src = src.replace('return a + b;', 'return b + a;');
if (!src.includes('return b + a;')) throw new Error('Mutation failed: target string not found in math.js');
fs.writeFileSync(p, src);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added post-replace assertions to all mutation scenarios (2, 3, and 4). Each replace now validates its result: Scenario 2 checks for the replacement string, Scenario 3 checks each of the three replacements independently, and Scenario 4 compares against a before snapshot to catch the case where none of the four replacements matched.

Comment on lines +33 to +46
function readEdges(dbPath) {
const db = new Database(dbPath, { readonly: true });
const edges = db
.prepare(
`SELECT n1.name AS source_name, n2.name AS target_name, e.kind
FROM edges e
JOIN nodes n1 ON e.source_id = n1.id
JOIN nodes n2 ON e.target_id = n2.id
ORDER BY n1.name, n2.name, e.kind`,
)
.all();
db.close();
return edges;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing try/finally for db.close()

If db.prepare(...).all() throws (e.g., the graph.db file is absent, schema mismatch, or a disk error), the db.close() call on line 44 is skipped and the SQLite handle leaks. The existing readAnalysisTables function in incremental-parity.test.js already uses a try/finally for exactly this reason.

The same issue applies to readNodes (lines 48–53).

Suggested change
function readEdges(dbPath) {
const db = new Database(dbPath, { readonly: true });
const edges = db
.prepare(
`SELECT n1.name AS source_name, n2.name AS target_name, e.kind
FROM edges e
JOIN nodes n1 ON e.source_id = n1.id
JOIN nodes n2 ON e.target_id = n2.id
ORDER BY n1.name, n2.name, e.kind`,
)
.all();
db.close();
return edges;
}
function readEdges(dbPath) {
const db = new Database(dbPath, { readonly: true });
try {
const edges = db
.prepare(
`SELECT n1.name AS source_name, n2.name AS target_name, e.kind
FROM edges e
JOIN nodes n1 ON e.source_id = n1.id
JOIN nodes n2 ON e.target_id = n2.id
ORDER BY n1.name, n2.name, e.kind`,
)
.all();
return edges;
} finally {
db.close();
}
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — both readEdges and readNodes now wrap the query in try/finally to ensure db.close() runs even if the prepare/all call throws.

Comment on lines +1 to +12
/**
* Incremental edge parity CI check.
*
* Verifies that incremental rebuilds produce exactly the same edges as a
* clean full build, across multiple mutation scenarios:
* 1. Comment-only touch (no semantic change)
* 2. Body edit (change implementation, keep exports)
* 3. New export added (structural change)
*
* Uses the sample-project fixture (CJS, classes, cross-file calls) for
* broader edge coverage than the barrel-project fixture.
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 JSDoc header missing Scenario 4 (file deletion)

The module-level JSDoc lists only three mutation scenarios, but the file implements four. Scenario 4 — file deletion — is missing from the list.

Suggested change
/**
* Incremental edge parity CI check.
*
* Verifies that incremental rebuilds produce exactly the same edges as a
* clean full build, across multiple mutation scenarios:
* 1. Comment-only touch (no semantic change)
* 2. Body edit (change implementation, keep exports)
* 3. New export added (structural change)
*
* Uses the sample-project fixture (CJS, classes, cross-file calls) for
* broader edge coverage than the barrel-project fixture.
*/
/**
* Incremental edge parity CI check.
*
* Verifies that incremental rebuilds produce exactly the same edges as a
* clean full build, across multiple mutation scenarios:
* 1. Comment-only touch (no semantic change)
* 2. Body edit (change implementation, keep exports)
* 3. New export added (structural change)
* 4. File deletion (stale edges must be purged)
*
* Uses the sample-project fixture (CJS, classes, cross-file calls) for
* broader edge coverage than the barrel-project fixture.
*/

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added Scenario 4 (file deletion) to the JSDoc header listing.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

All three Greptile review items addressed:

  1. JSDoc header — Added missing Scenario 4 (file deletion) to the doc comment listing (5c5ba8f)
  2. SQLite handle leaks — Wrapped readEdges and readNodes in try/finally so db.close() runs even if the query throws (e640a4b)
  3. Silent mutation failures — Added post-replace assertions to all .replace() calls in Scenarios 2, 3, and 4. Each mutation now validates its result immediately, so fixture drift surfaces as a clear error instead of a false-positive pass (83d02c2)

All 10 tests pass. Lint clean.

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed Greptile's remaining observation: Scenario 4 now uses per-replacement assertions (e4be883) instead of the weaker collective src === before guard. Each of the 4 .replace() calls validates independently, so a partial mutation failure (e.g., only 2 of 4 matched) is caught immediately rather than silently passing.

All 10 tests pass. Lint clean.

@greptileai

@carlos-alm carlos-alm merged commit 089bfdb into main Mar 20, 2026
15 checks passed
@carlos-alm carlos-alm deleted the test/incremental-edge-parity branch March 20, 2026 06:39
@github-actions github-actions bot locked and limited conversation to collaborators Mar 20, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant