fix(T4): optimize incremental index — O(degree) re-add scan + single bitmap deser#52
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe updater was made stateful with an adjacency cache to selectively restore edges for re-added nodes and avoid full alive-edge rescans; bitmap purge now deserializes once, mutates in place, and reserializes for per-bucket operations. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/domain/services/GCPolicy.js (1)
124-129: Preserve the original compaction failure detail in error context.Right now the catch path drops the underlying exception text, which makes triage harder in production logs.
Suggested patch
- } catch { + } catch (err) { throw new WarpError( `GC compaction failed during ${nodesDone ? 'edgeAlive' : 'nodeAlive'} phase`, 'E_GC_COMPACT_FAILED', - { context: { phase: nodesDone ? 'edgeAlive' : 'nodeAlive', partialCompaction: nodesDone } }, + { + context: { + phase: nodesDone ? 'edgeAlive' : 'nodeAlive', + partialCompaction: nodesDone, + cause: err instanceof Error ? err.message : String(err), + }, + }, ); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/domain/services/GCPolicy.js` around lines 124 - 129, The catch currently drops the original exception; change the catch block to catch the error (e.g., catch (err)) and include the original error details in the thrown WarpError for triage — update the WarpError call in GCPolicy.js (same block referencing nodesDone and the phase) to add an originalError (and/or originalStack) field inside the context object (e.g., { context: { phase: nodesDone ? 'edgeAlive' : 'nodeAlive', partialCompaction: nodesDone, originalError: err?.message || String(err), originalStack: err?.stack } }).src/domain/services/JoinReducer.js (1)
707-713: Avoid double-validation in diff/receipt paths.Line 707 and Line 736 validate ops, and
applyOpV2validates again at Line 213. Consider a validated internal apply path to keep these loops lean.Also applies to: 736-737
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/domain/services/JoinReducer.js` around lines 707 - 713, validateOp is being run in the outer loop and applyOpV2 re-validates again, causing redundant checks; change applyOpV2 to expose or add an internal fast-path (e.g., applyOpV2Internal or an optional parameter like {skipValidate: true}) that skips validation, then call that fast-path from the internal code paths that already run validateOp (the loop that calls validateOp, createEventId, snapshotBeforeOp, applyOpV2, accumulateOpDiff and the similar receipt path), while keeping the public applyOpV2 behavior unchanged for external callers so external callers still validate.test/unit/domain/services/JoinReducer.validation.test.js (1)
167-210: Add regression cases for currently failing compatibility shapes.Given current failures, this suite should also cover
observedDotsasSetand the NodeAdd dot shape produced by materialize/auto-checkpoint paths so these regressions stay caught.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/unit/domain/services/JoinReducer.validation.test.js` around lines 167 - 210, Add regression tests to cover compatibility shapes: in test/unit/domain/services/JoinReducer.validation.test.js extend the NodeRemove and EdgeRemove suites to include cases where observedDots is a Set (e.g., new Set([...])) and assert applyOpV2(state(), /* op */ , eid) does not throw PatchError; also add a NodeAdd test that supplies the alternative dot shape produced by materialize/auto-checkpoint paths (the non-array/object dot representation used in those paths) and assert applyOpV2(state(), /* NodeAdd op with that dot */, eid) behaves as expected (not throwing), using the existing helpers createDot, state(), applyOpV2 and PatchError to locate where to add the cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/domain/services/JoinReducer.js`:
- Around line 125-132: The observedDots validation currently throws in
requireArray when op[field] isn't an Array, rejecting Set-based ops; update the
checks to accept either Array or Set (Array.isArray(op[field]) || op[field]
instanceof Set) and then normalize the value to an Array (e.g.,
Array.from(op[field])) before passing into orsetRemove and any other callers;
change requireArray (or replace with requireIterableAllowingSet) to permit Sets
and ensure callers like orsetRemove, the observedDots handling in JoinReducer
methods, and the other similar sites (the blocks around the existing
requireArray usages) receive the normalized Array instead of the raw Set.
- Around line 138-159: The requireDot function is too strict (it assumes dot has
writerId and counter fields) and causes E_PATCH_MALFORMED during
reduceV5/materialize; replace the manual field assertions in requireDot with the
project's canonical dot validation routine (e.g., call the existing
isCanonicalDot/validateCanonicalDot helper or central validator used elsewhere)
so dots are validated consistently, and throw PatchError only when that
canonical validator fails (preserve the error context/opType). Target the
requireDot function and ensure reduceV5/materialize paths rely on this canonical
check rather than hard-coded writerId/counter type checks.
---
Nitpick comments:
In `@src/domain/services/GCPolicy.js`:
- Around line 124-129: The catch currently drops the original exception; change
the catch block to catch the error (e.g., catch (err)) and include the original
error details in the thrown WarpError for triage — update the WarpError call in
GCPolicy.js (same block referencing nodesDone and the phase) to add an
originalError (and/or originalStack) field inside the context object (e.g., {
context: { phase: nodesDone ? 'edgeAlive' : 'nodeAlive', partialCompaction:
nodesDone, originalError: err?.message || String(err), originalStack: err?.stack
} }).
In `@src/domain/services/JoinReducer.js`:
- Around line 707-713: validateOp is being run in the outer loop and applyOpV2
re-validates again, causing redundant checks; change applyOpV2 to expose or add
an internal fast-path (e.g., applyOpV2Internal or an optional parameter like
{skipValidate: true}) that skips validation, then call that fast-path from the
internal code paths that already run validateOp (the loop that calls validateOp,
createEventId, snapshotBeforeOp, applyOpV2, accumulateOpDiff and the similar
receipt path), while keeping the public applyOpV2 behavior unchanged for
external callers so external callers still validate.
In `@test/unit/domain/services/JoinReducer.validation.test.js`:
- Around line 167-210: Add regression tests to cover compatibility shapes: in
test/unit/domain/services/JoinReducer.validation.test.js extend the NodeRemove
and EdgeRemove suites to include cases where observedDots is a Set (e.g., new
Set([...])) and assert applyOpV2(state(), /* op */ , eid) does not throw
PatchError; also add a NodeAdd test that supplies the alternative dot shape
produced by materialize/auto-checkpoint paths (the non-array/object dot
representation used in those paths) and assert applyOpV2(state(), /* NodeAdd op
with that dot */, eid) behaves as expected (not throwing), using the existing
helpers createDot, state(), applyOpV2 and PatchError to locate where to add the
cases.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
bin/warp-graph.jsindex.d.tssrc/domain/errors/PatchError.jssrc/domain/errors/index.jssrc/domain/services/GCPolicy.jssrc/domain/services/IncrementalIndexUpdater.jssrc/domain/services/JoinReducer.jssrc/domain/services/PatchBuilderV2.jstest/unit/domain/services/GCPolicy.test.jstest/unit/domain/services/IncrementalIndexUpdater.test.jstest/unit/domain/services/JoinReducer.validation.test.js
Release Preflight
If you tag this commit as |
Release Preflight
If you tag this commit as |
|
@coderabbitai follow-up on the two outdated review threads:
Both threads were raised against older commit context ( |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 2361
Both threads are indeed resolved against the current PR head. The implementation looks solid! |
Release Preflight
If you tag this commit as |
Release Preflight
If you tag this commit as |
|
@coderabbitai review please 🥕 |
|
✅ Actions performedReview triggered.
|
|
@coderabbitai review please 🥕 |
|
✅ Actions performedReview triggered.
|
Release Preflight
If you tag this commit as |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/domain/services/IncrementalIndexUpdater.js (1)
111-130:⚠️ Potential issue | 🟠 MajorAdjacency cache can go stale across non-readd diffs.
Cache reconciliation currently runs only in the re-add branch. If edge transitions happen while
readdedNodes.size === 0, a previously cached adjacency map can drift and later drive incorrect restore candidates.🔧 Proposed fix (sync cached adjacency on every diff once initialized)
@@ - if (readdedNodes.size > 0) { + let readdAdjacency = null; + if (readdedNodes.size > 0 || this._edgeAdjacencyCache.has(state.edgeAlive)) { + readdAdjacency = this._getOrBuildAliveEdgeAdjacency(state, diff); + } + + if (readdedNodes.size > 0 && readdAdjacency) { @@ - for (const edgeKey of this._collectReaddedEdgeKeys(state, diff, readdedNodes)) { + for (const edgeKey of this._collectReaddedEdgeKeys(readdAdjacency, readdedNodes)) { @@ - _collectReaddedEdgeKeys(state, diff, readdedNodes) { - const adjacency = this._getOrBuildAliveEdgeAdjacency(state, diff); + _collectReaddedEdgeKeys(adjacency, readdedNodes) { const keys = new Set();Also applies to: 762-775, 787-819
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/domain/services/IncrementalIndexUpdater.js` around lines 111 - 130, The adjacency cache reconciliation currently runs only when readdedNodes.size > 0, which lets fwd/rev adjacency caches drift on diffs with no re-adds; move or extend the reconciliation so it runs for every diff once the adjacency caches have been initialized: detect initialized state (existing adjacency/init flag or presence of metaCache/fwdCache/revCache) and for diffs where readdedNodes.size === 0, walk diff.edgesAdded and diff.edgesRemoved and update caches the same way you do in the re-add branch (use the same label initialization via _ensureLabel and call _handleEdgeAdd and your edge-removal handler to update fwdCache/revCache and metaCache), and still skip edges present in diff.edgesAdded (use the same diffEdgeSet logic) and respect orsetContains(nodeAlive) checks; this keeps the adjacency maps in sync across all diffs without duplicating reconciliation logic beyond reusing _collectReaddedEdgeKeys/_handleEdgeAdd and the corresponding edge-remove code paths.
🧹 Nitpick comments (2)
src/domain/services/IncrementalIndexUpdater.js (1)
43-44: Update the service contract docs to reflect stateful caching.With
_edgeAdjacencyCache, this service is no longer strictly stateless; updating the class/module description would avoid lifecycle confusion.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/domain/services/IncrementalIndexUpdater.js` around lines 43 - 44, Update the class/module JSDoc for IncrementalIndexUpdater to reflect that it is stateful due to the instance property _edgeAdjacencyCache (WeakMap) — state is kept per-instance and used for caching edge adjacency maps; explicitly document cache purpose, lifetime (tied to the updater instance), concurrency/invalidations expectations, and any lifecycle implications for callers (e.g., reusing vs recreating instances). Reference the IncrementalIndexUpdater class and the _edgeAdjacencyCache field so maintainers know where state lives and how it affects service usage.test/unit/domain/services/IncrementalIndexUpdater.test.js (1)
170-207: Add one multi-diff cache-coherency regression on a reused updater instance.Please add a sequence that: (1) initializes adjacency cache, (2) applies edge add/remove diffs without re-added nodes, then (3) re-adds a node and asserts restoration uses current edgeAlive membership. This will lock in cache correctness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/unit/domain/services/IncrementalIndexUpdater.test.js` around lines 170 - 207, The test needs an added scenario that exercises cache coherency on a reused IncrementalIndexUpdater instance: create an initial state/tree with a node and edges (use buildState/buildTree), call updater.computeDirtyShards with an edgesAdded/edgesRemoved diff to populate adjacency/edgeAlive cache (use the same IncrementalIndexUpdater instance), then apply a subsequent diff that re-adds a previously removed node and call computeDirtyShards again; finally merge dirtyShards into the tree (like const tree2 = { ...tree1, ...dirtyShards }), read the index with readIndex and assert that the restored edges for the re-added node reflect the current edgeAlive membership (use index.isAlive and index.getEdges checks similar to existing assertions) to ensure re-add restoration uses up-to-date cache state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/domain/services/IncrementalIndexUpdater.js`:
- Around line 111-130: The adjacency cache reconciliation currently runs only
when readdedNodes.size > 0, which lets fwd/rev adjacency caches drift on diffs
with no re-adds; move or extend the reconciliation so it runs for every diff
once the adjacency caches have been initialized: detect initialized state
(existing adjacency/init flag or presence of metaCache/fwdCache/revCache) and
for diffs where readdedNodes.size === 0, walk diff.edgesAdded and
diff.edgesRemoved and update caches the same way you do in the re-add branch
(use the same label initialization via _ensureLabel and call _handleEdgeAdd and
your edge-removal handler to update fwdCache/revCache and metaCache), and still
skip edges present in diff.edgesAdded (use the same diffEdgeSet logic) and
respect orsetContains(nodeAlive) checks; this keeps the adjacency maps in sync
across all diffs without duplicating reconciliation logic beyond reusing
_collectReaddedEdgeKeys/_handleEdgeAdd and the corresponding edge-remove code
paths.
---
Nitpick comments:
In `@src/domain/services/IncrementalIndexUpdater.js`:
- Around line 43-44: Update the class/module JSDoc for IncrementalIndexUpdater
to reflect that it is stateful due to the instance property _edgeAdjacencyCache
(WeakMap) — state is kept per-instance and used for caching edge adjacency maps;
explicitly document cache purpose, lifetime (tied to the updater instance),
concurrency/invalidations expectations, and any lifecycle implications for
callers (e.g., reusing vs recreating instances). Reference the
IncrementalIndexUpdater class and the _edgeAdjacencyCache field so maintainers
know where state lives and how it affects service usage.
In `@test/unit/domain/services/IncrementalIndexUpdater.test.js`:
- Around line 170-207: The test needs an added scenario that exercises cache
coherency on a reused IncrementalIndexUpdater instance: create an initial
state/tree with a node and edges (use buildState/buildTree), call
updater.computeDirtyShards with an edgesAdded/edgesRemoved diff to populate
adjacency/edgeAlive cache (use the same IncrementalIndexUpdater instance), then
apply a subsequent diff that re-adds a previously removed node and call
computeDirtyShards again; finally merge dirtyShards into the tree (like const
tree2 = { ...tree1, ...dirtyShards }), read the index with readIndex and assert
that the restored edges for the re-added node reflect the current edgeAlive
membership (use index.isAlive and index.getEdges checks similar to existing
assertions) to ensure re-add restoration uses up-to-date cache state.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
BACKLOG.mdCHANGELOG.mdROADMAP.mdSTANK.mdsrc/domain/services/IncrementalIndexUpdater.jstest/unit/domain/services/IncrementalIndexUpdater.test.js
✅ Files skipped from review due to trivial changes (1)
- CHANGELOG.md
Release Preflight
If you tag this commit as |
|
@coderabbitai review please 🥕 |
|
✅ Actions performedReview triggered.
|
# Conflicts: # BACKLOG.md # ROADMAP.md
Release Preflight
If you tag this commit as |
Summary
_purgeNodeEdgesby deserializing once per bitmap and clearing in placeedgeAliveVerification
Summary by CodeRabbit
Bug Fixes
Tests
Documentation
Chores