feat: [ENG-2060] Consolidate operation — merge, temporal update, cross-reference #387
danhdoan merged 3 commits into proj/dreaming
…s-reference
First LLM-driven dream operation. Groups changed files by domain, finds related files via BM25 search + path siblings, classifies relationships through LLM, and executes actions:
- MERGE: combine redundant files, delete sources, add consolidated_from and consolidated_at frontmatter programmatically
- TEMPORAL_UPDATE: rewrite contradictory content with temporal narrative, confidence-gated review (< 0.7 → needsReview)
- CROSS_REFERENCE: add bidirectional related links in YAML frontmatter
- SKIP: no-op for unrelated files
Also fixes three bugs found during interactive testing:
- Use global storagePath for curate log store (not projectPath/.brv)
- Extract filePath (absolute) from curate log ops, not path (domain)
- First dream (lastDreamAt=null) scans all curate logs, not none
Schema change: added optional confidence field to ConsolidationActionSchema.
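The confidence gate on TEMPORAL_UPDATE can be sketched roughly as below. Only the action names, the optional confidence field, the 0.7 threshold, and the needsReview name come from the commit message. The type shapes, the `files` field, and the choice to treat a missing confidence as low confidence are assumptions made for illustration.

```typescript
// Hypothetical shapes: only the action names, the optional `confidence`
// field, and the 0.7 threshold come from the commit message.
type ConsolidationActionType = 'MERGE' | 'TEMPORAL_UPDATE' | 'CROSS_REFERENCE' | 'SKIP';

interface ConsolidationAction {
  type: ConsolidationActionType;
  files: string[]; // assumed field, for illustration only
  confidence?: number; // optional, as in the schema change
}

const REVIEW_THRESHOLD = 0.7;

// Assumption: a TEMPORAL_UPDATE without a confidence value is treated as
// low confidence, so it is flagged for review rather than silently applied.
function needsReview(action: ConsolidationAction): boolean {
  if (action.type !== 'TEMPORAL_UPDATE') return false;
  return action.confidence === undefined || action.confidence < REVIEW_THRESHOLD;
}
```

Keeping the gate in one small predicate makes the threshold easy to test at its boundary: 0.7 itself passes, anything below is flagged.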
Claude finished @RyanNg1403's task in 4m 13s
PR Review: feat: [ENG-2060] Consolidate operation
Summary
Score: 3.5 / 5
Solid first implementation of the consolidate operation: the domain-grouping approach, BM25 + sibling discovery, atomic writes, and partial-failure resilience are all well designed. The three bug fixes (curate log path,
There are two correctness issues that should be fixed before merge, plus a security concern and a few minor items.
Issues (must fix)
1. Timeout in
2. Silent empty-file writes on missing LLM content (
Security concern
3. Path traversal in
if (!absolutePath.startsWith('/')) return absolutePath
Returns relative paths from curate logs without validation.
Minor / suggestions
4. CRLF body-start calculation is off (
When the CRLF branch is taken,
5. Frontmatter parsing logic duplicated (
The same ~20-line block for parsing/mutating YAML frontmatter appears in both
6. If
7. Stale comment in
The step 4 comment still reads "NO-OP stubs in Phase 1 — consolidate/synthesize/prune added later". It should be updated to reflect that consolidate is now implemented.
8. No payload size guard on
For a large domain with many sibling files,
9. Test at line 309 tests the wrong thing (
The test titled "continues processing when file read fails for an action" actually succeeds:
What's done well
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86ef47586f
- Guard MERGE/TEMPORAL_UPDATE against missing content (prevent data loss)
- Wrap createTaskSession in try/catch (preserve fail-open contract)
- Normalize Windows paths and reject path traversal in toContextTreeRelative
- Cap LLM sandbox payload at 200K chars (matches curate task limit)
- Fix misleading test name
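As a rough illustration of the path fix in this commit, the sketch below normalizes Windows separators and rejects traversal before returning a path relative to the context tree. The real toContextTreeRelative is not shown in this thread, so the signature, the `contextTreeRoot` parameter, and the `null` return for rejected paths are all assumptions.

```typescript
// Hypothetical signature: the real toContextTreeRelative is not shown here.
// Returns a '/'-separated path relative to the root, or null when rejected.
function toContextTreeRelative(filePath: string, contextTreeRoot: string): string | null {
  // Normalize Windows separators and drop trailing slashes so the same
  // checks work on both platforms.
  const normalize = (p: string): string => p.replace(/\\/g, '/').replace(/\/+$/, '');
  const root = normalize(contextTreeRoot);
  const candidate = normalize(filePath);

  let relative: string;
  if (candidate.startsWith(root + '/')) {
    relative = candidate.slice(root.length + 1);
  } else if (candidate.startsWith('/') || /^[A-Za-z]:\//.test(candidate)) {
    return null; // absolute, but outside the context tree
  } else {
    relative = candidate; // already relative; still validated below
  }

  // Reject any '..' segment outright instead of trying to resolve it.
  const segments = relative.split('/').filter((s) => s !== '' && s !== '.');
  if (segments.length === 0 || segments.some((s) => s === '..')) return null;
  return segments.join('/');
}
```

Rejecting `..` segments rather than resolving them is the stricter choice: a curate log entry has no legitimate reason to point above the context tree.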
Claude finished @RyanNg1403's task in 3m 24s
PR Review: feat: [ENG-2060] Consolidate operation (follow-up review)
Score: 4 / 5
Good progress in
What was fixed ✅
Still open (must fix)
Timeout is inoperative (
Remaining nitpicks / suggestions
CRLF body-start off-by-one (
When the CRLF branch is taken
Duplicated frontmatter parsing (
The same ~25-line parse/mutate/serialize block exists in
If
Stale JSDoc header (
The class-level
What's done well
The 5-minute timeout in DreamExecutor created an AbortController but never passed the signal to consolidate(), making the budget inoperative.
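The fix described in this commit can be sketched as a sequential loop that checks the signal at each domain boundary. This is a minimal sketch, not the code from consolidate.ts: `SignalLike` is a structural stand-in that a real AbortSignal satisfies, and `processDomain` is a hypothetical placeholder for the per-domain work.

```typescript
// Minimal structural type; a real AbortSignal satisfies it.
interface SignalLike {
  readonly aborted: boolean;
}

// Hypothetical deps shape: only the optional `signal` is named in the thread.
interface ConsolidateDeps {
  signal?: SignalLike;
  processDomain: (domain: string) => Promise<void>; // stand-in for per-domain work
}

// Processes domains sequentially and stops at the next domain boundary once
// the signal is aborted. Returns the domains that were actually processed.
async function consolidate(domains: string[], deps: ConsolidateDeps): Promise<string[]> {
  const processed: string[] = [];
  for (const domain of domains) {
    if (deps.signal?.aborted) break; // budget exhausted: skip remaining domains
    await deps.processDomain(domain);
    processed.push(domain);
  }
  return processed;
}
```

Note that the loop returns normally after an abort, so the caller cannot tell a timeout from a full run unless it inspects the signal or the returned list.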
Claude finished @RyanNg1403's task in 2m 55s
PR Review: feat: [ENG-2060] Consolidate operation (third review)
| Previous issue | Status |
|---|---|
| Timeout inoperative: signal never reached consolidate() | Fixed. signal: controller.signal is now passed; loop breaks on signal.aborted |
One remaining item (should fix)
partial log entry is still unreachable (dream-executor.ts line 116 — inline comment posted)
The signal is now checked correctly between domain iterations, but when the budget fires and the loop breaks, consolidate() returns normally (no exception). The catch block is never entered, so the partial log branch remains dead code. The dream finishes with status: 'completed' even when half the domains were skipped due to timeout.
Fix: add an explicit if (controller.signal.aborted) check after consolidate() returns to write the partial entry and return early. See inline suggestion.
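The suggested fix might look roughly like the sketch below. The actual inline suggestion is not reproduced in this thread, so `runConsolidateStep`, `runConsolidate`, and `writeLog` are hypothetical names standing in for the executor step, the consolidate() call, and the dream log store.

```typescript
// Minimal structural type; a real AbortSignal satisfies it.
interface SignalLike {
  readonly aborted: boolean;
}

type DreamStatus = 'completed' | 'partial';

// Hypothetical executor step: `runConsolidate` and `writeLog` are stand-ins
// for the real consolidate() call and the dream log store.
async function runConsolidateStep(
  signal: SignalLike,
  runConsolidate: (signal: SignalLike) => Promise<void>,
  writeLog: (status: DreamStatus) => void,
): Promise<DreamStatus> {
  await runConsolidate(signal);
  // consolidate() returns normally when it stops on abort, so a catch block
  // never sees the timeout. Check the signal explicitly instead.
  if (signal.aborted) {
    writeLog('partial');
    return 'partial';
  }
  writeLog('completed');
  return 'completed';
}
```

With this check in place, a run that skipped domains because the budget fired is recorded as partial instead of completed.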
Carry-over nitpicks (unchanged from previous reviews)
| Item | Location | Status |
|---|---|---|
| CRLF bodyStart off-by-one | consolidate.ts lines 173, 475 | Still open (inline comment posted) |
| Duplicated frontmatter parsing | consolidate.ts lines 166–192 and 468–495 | Still open (inline comment posted) |
| atomicWrite orphans .tmp on rename failure | consolidate.ts line 122 | Still open (inline comment posted) |
| Stale JSDoc header (NO-OP stubs in Phase 1) | dream-executor.ts line 8 | Still open (not in diff; manual edit) |
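Two of the carry-over items (the CRLF bodyStart off-by-one and the duplicated frontmatter parsing) could plausibly be addressed by one shared helper. The code at the cited lines is not shown in this thread, so the sketch below is an assumption about the general shape of such a helper, not the project's implementation. `splitFrontmatter` and `SplitDocument` are hypothetical names, and files with mixed line endings are not handled.

```typescript
interface SplitDocument {
  frontmatter: string | null; // raw YAML between the fences, without the fences
  body: string;
}

// Splits a document into YAML frontmatter and body, for LF or CRLF files.
function splitFrontmatter(content: string): SplitDocument {
  // Detect the newline style from the opening fence.
  const newline = content.startsWith('---\r\n') ? '\r\n' : content.startsWith('---\n') ? '\n' : null;
  if (newline === null) return { frontmatter: null, body: content };

  const open = '---' + newline;
  const close = newline + '---';

  // Find a closing fence that is followed by a newline or end of file.
  // Searching from index 3 lets an empty frontmatter block match as well.
  let closeIndex = content.indexOf(close, 3);
  while (closeIndex !== -1) {
    const end = closeIndex + close.length;
    if (end === content.length || content.startsWith(newline, end)) break;
    closeIndex = content.indexOf(close, closeIndex + 1);
  }
  if (closeIndex === -1) return { frontmatter: null, body: content };

  const fenceEnd = closeIndex + close.length;
  // Skip the newline after the closing fence using the detected newline
  // length, so CRLF files do not leave a stray '\n' at the start of the body.
  const bodyStart = fenceEnd === content.length ? fenceEnd : fenceEnd + newline.length;
  return {
    frontmatter: content.slice(open.length, closeIndex),
    body: content.slice(bodyStart),
  };
}
```

Deriving every offset from the detected newline string avoids hard-coded `+ 1` or `+ 4` adjustments, which is where an off-by-one on CRLF input typically creeps in.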
What's done well
- The signal check at line 53 (if (deps.signal?.aborted) break) is the right place: domains are sequential, so this is a natural, safe preemption point
- The new test (stops processing domains when signal is aborted) correctly verifies the abort is honoured between domains and uses createTaskSession.callCount as a clean proxy
- ConsolidateDeps.signal is optional (signal?: AbortSignal), preserving backwards compatibility for callers that don't need cancellation
- All previous must-fix items (data-loss guards, path traversal, payload cap, session leak) remain solid

Summary
- consolidate.ts (~430 LoC): domain grouping, BM25 search + sibling discovery, LLM classification via agent sessions, action execution (MERGE, TEMPORAL_UPDATE, CROSS_REFERENCE, SKIP)
- MERGE: adds consolidated_from + consolidated_at frontmatter programmatically
- CROSS_REFERENCE: adds bidirectional related links in YAML frontmatter
- Added searchService to DreamExecutorDeps, threaded through agent-process.ts
- Curate log store now uses global storagePath (not projectPath/.brv)
- Fixed findChangedFilesSinceLastDream to use op.filePath (absolute) and scan all logs on first dream
- Added optional confidence field to ConsolidationActionSchema
Type of change
Scope (select all touched areas)
Linked issues
Root cause (bug fixes only, otherwise write N/A)
Three bugs found during interactive testing:
- FileCurateLogStore was instantiated with projectPath/.brv but curate logs are stored in global storagePath
- findChangedFilesSinceLastDream used op.path (domain path like "security/authentication") instead of op.filePath (absolute file path)
- lastDreamAt=null short-circuited to empty set instead of scanning all curate logs
Test plan
- test/unit/infra/dream/operations/consolidate.test.ts, test/unit/infra/executor/dream-executor.test.ts
User-visible changes
- brv dream now performs actual consolidation: merges redundant files, updates contradictory content, adds cross-references
- consolidated_from and consolidated_at YAML frontmatter
- related links
Evidence
Checklist
- (npm test)
- (npm run lint)
- (npm run typecheck)
- (npm run build)
- proj/dreaming
Risks and mitigations