Skip to content

feat: [ENG-2060] Consolidate operation — merge, temporal update, cross-reference#387

Merged
danhdoan merged 3 commits intoproj/dreamingfrom
feat/ENG-2060
Apr 13, 2026
Merged

feat: [ENG-2060] Consolidate operation — merge, temporal update, cross-reference#387
danhdoan merged 3 commits intoproj/dreamingfrom
feat/ENG-2060

Conversation

@RyanNg1403
Copy link
Copy Markdown
Collaborator

Summary

  • Problem: Dream executor had stub operations — no actual memory consolidation happened
  • Why it matters: Consolidate is the first LLM-driven operation and the foundation for synthesize and prune. Without it, dreaming produces empty results.
  • What changed:
    • New consolidate.ts (~430 LoC): domain grouping, BM25 search + sibling discovery, LLM classification via agent sessions, action execution (MERGE, TEMPORAL_UPDATE, CROSS_REFERENCE, SKIP)
    • MERGE: writes merged content, deletes sources, adds consolidated_from + consolidated_at frontmatter programmatically
    • TEMPORAL_UPDATE: rewrites with temporal narrative, confidence-gated review (< 0.7 threshold)
    • CROSS_REFERENCE: bidirectional related links in YAML frontmatter
    • Wired into DreamExecutor step 4 (replaces stub)
    • Added searchService to DreamExecutorDeps, threaded through agent-process.ts
    • Fixed curate log store to use global storagePath (not projectPath/.brv)
    • Fixed findChangedFilesSinceLastDream to use op.filePath (absolute) and scan all logs on first dream
    • Added optional confidence field to ConsolidationActionSchema
  • What did NOT change (scope boundary): Synthesize and prune remain stubs (ENG-2061/ENG-2062). No review system integration. No idle trigger wiring.

Type of change

  • New feature

Scope (select all touched areas)

  • Server / Daemon
  • Agent / Tools
  • CLI Commands (oclif)

Linked issues

  • Related ENG-2060 (child of ENG-2059)
  • Depends on ENG-2065, ENG-2066, ENG-2067, ENG-2068

Root cause (bug fixes only, otherwise write N/A)

Three bugs found during interactive testing:

  • Root cause 1: FileCurateLogStore was instantiated with projectPath/.brv but curate logs are stored in global storagePath
  • Root cause 2: findChangedFilesSinceLastDream used op.path (domain path like "security/authentication") instead of op.filePath (absolute file path)
  • Root cause 3: First dream returned empty changed files because lastDreamAt=null short-circuited to empty set instead of scanning all curate logs

Test plan

  • Coverage added:
    • Unit test
    • Manual verification only
  • Test file(s): test/unit/infra/dream/operations/consolidate.test.ts, test/unit/infra/executor/dream-executor.test.ts
  • Key scenario(s) covered:
    • Unit (16 new tests): empty input, domain grouping, BM25 search, MERGE/TEMPORAL_UPDATE/CROSS_REFERENCE/SKIP execution, previousTexts, needsReview rules (MERGE=always, TEMPORAL_UPDATE=confidence, CROSS_REF=core maturity), frontmatter metadata, LLM failure resilience, file read failure, session cleanup, sibling discovery, high-confidence skip
    • Interactive (2 dream cycles): 10 curations across 4 domains, verified MERGE file deletion + frontmatter, CROSS_REFERENCE bidirectional links, TEMPORAL_UPDATE enrichment, "since last dream" scoping, gate logic, dream state lifecycle

User-visible changes

  • brv dream now performs actual consolidation: merges redundant files, updates contradictory content, adds cross-references
  • Dream log entries contain real operation details (action, inputFiles, reason, needsReview, previousTexts)
  • Merged files include consolidated_from and consolidated_at YAML frontmatter
  • Cross-referenced files include bidirectional related links

Evidence

  • 148 dream tests passing (14 consolidate + 11 executor + 121 existing + 2 new)
  • Interactive: Dream Cycle 1 (8 curations → 6 ops, 73s), Dream Cycle 2 (2 curations → 4 ops, 54s), zero errors

Checklist

  • Tests added or updated and passing (npm test)
  • Lint passes (npm run lint)
  • Type check passes (npm run typecheck)
  • Build succeeds (npm run build)
  • Commits follow Conventional Commits format
  • No breaking changes (or clearly documented above)
  • Branch is up to date with proj/dreaming

Risks and mitigations

  • Risk: LLM may merge derived artifacts (.abstract.md, .overview.md) into main files but leave orphaned derivatives from other files
    • Mitigation: Prune operation (ENG-2062) will handle cleanup of orphaned derived artifacts
  • Risk: Consolidate prompt quality affects operation accuracy
    • Mitigation: All MERGE operations are flagged for human review (needsReview=true)

…s-reference

First LLM-driven dream operation. Groups changed files by domain, finds
related files via BM25 search + path siblings, classifies relationships
through LLM, and executes actions:

- MERGE: combine redundant files, delete sources, add consolidated_from
  and consolidated_at frontmatter programmatically
- TEMPORAL_UPDATE: rewrite contradictory content with temporal narrative,
  confidence-gated review (< 0.7 → needsReview)
- CROSS_REFERENCE: add bidirectional related links in YAML frontmatter
- SKIP: no-op for unrelated files

Also fixes three bugs found during interactive testing:
- Use global storagePath for curate log store (not projectPath/.brv)
- Extract filePath (absolute) from curate log ops, not path (domain)
- First dream (lastDreamAt=null) scans all curate logs, not none

Schema change: added optional confidence field to ConsolidationActionSchema.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 13, 2026

Claude finished @RyanNg1403's task in 4m 13s —— View job


PR Review: feat: [ENG-2060] Consolidate operation

Tasks

  • Gather context and read changed files
  • Review consolidate.ts (core implementation)
  • Review dream-executor.ts (integration)
  • Review agent-process.ts (deps wiring)
  • Review test files
  • Review dream-response-schemas.ts
  • Post inline comments and summary

Summary

Score: 3.5 / 5

Solid first implementation of the consolidate operation — the domain-grouping approach, BM25 + sibling discovery, atomic writes, and partial-failure resilience are all well-designed. The three bug fixes (curate log path, op.filePath vs op.path, first-dream empty set) are correct. Test coverage is good for the happy paths.

There are two correctness issues that should be fixed before merge, plus a security concern and a few minor items.


Issues (must fix)

1. Timeout in dream-executor.ts is inoperative (dream-executor.ts lines 73–74)

AbortController is created and controller.abort() is scheduled, but consolidate() and all other awaited operations never receive controller.signal. The 5-minute budget sets a flag and nothing else — the partial log-entry branch (line 159) is unreachable under normal timeout-exceeded conditions. Either pass the signal to consolidate() and honour it between domain iterations, or race the main body against a rejection promise.

2. Silent empty-file writes on missing LLM content (consolidate.ts lines 309, 356 — inline comments posted)

executeMerge does action.mergedContent ?? '' and executeTemporalUpdate does action.updatedContent ?? ''. Both fields are optional() in the Zod schema, so the LLM can legally omit them. The result is the target file gets overwritten with just frontmatter and no body — and source files are deleted — leaving a blank file with no way to recover. Guard before writing and throw (which the surrounding try/catch will swallow gracefully).


Security concern

3. Path traversal in toContextTreeRelative (dream-executor.ts line 262 — inline comment posted)

if (!absolutePath.startsWith('/')) return absolutePath

Returns relative paths from curate logs without validation. path.join(contextTreeDir, '../../secret.md') resolves outside the tree. Remove the relative-path passthrough; only accept absolute paths rooted under contextTreeDir.


Minor / suggestions

4. CRLF body-start calculation is off (consolidate.ts line 128 — inline comment posted)

When the CRLF branch is taken, bodyStart lands on the closing --- delimiter rather than after it, so body includes ---\r\n<real body>. Since the function already normalises output to LF, the CRLF handling is mostly cosmetic — but either fix it or remove it to avoid a misleading false sense of cross-platform support.

5. Frontmatter parsing logic duplicated (consolidate.ts lines 120–147 and 415–442 — inline comment posted)

The same ~20-line block for parsing/mutating YAML frontmatter appears in both addFrontmatterFields and addRelatedLinks. The CRLF bug mentioned above exists identically in both copies. Extract a shared parseFrontmatter(content) helper.

6. atomicWrite doesn't clean up .tmp on rename failure (consolidate.ts line 114 — inline comment posted)

If rename throws after a successful writeFile, the .tmp file is orphaned in the same directory. Add a try/catch with unlink(tmpPath).catch(() => {}) cleanup.

7. Stale comment in dream-executor.ts (line 4 — couldn't post inline; the line wasn't in the diff)

The step 4 comment still reads "NO-OP stubs in Phase 1 — consolidate/synthesize/prune added later". Should be updated to reflect that consolidate is now implemented.

8. No payload size guard on setSandboxVariableOnSession (consolidate.ts line 80 — inline comment posted)

For a large domain with many sibling files, filesPayload can grow beyond model context limits. A simple byte-cap (e.g., 200 KB total) with eviction of low-relevance files would prevent silent truncation or latency spikes.

9. Test at line 309 tests the wrong thing (consolidate.test.ts — inline comment posted)

The test titled "continues processing when file read fails for an action" actually succeeds: atomicWrite creates the missing parent directory and writes the output file. The MERGE completes successfully. The assertion expect(results).to.be.an('array') is always true. Tighten the assertion or inject a real write failure to test the intended resilience path.


What's done well

  • Domain-grouping and sequential execution correctly prevents concurrent file-write races
  • agent.deleteTaskSession is called in finally — session leak is impossible
  • determineNeedsReview logic (MERGE=always, TEMPORAL_UPDATE=confidence-gated, CROSS_REF=maturity-gated) is clean and well-tested
  • The three bug fixes in findChangedFilesSinceLastDream and FileCurateLogStore path are correct
  • 16 unit tests cover the main execution paths; the parseDreamResponse two-strategy fallback handles LLM formatting variance well

@RyanNg1403 RyanNg1403 requested review from danhdoan April 13, 2026 12:03
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread src/server/infra/dream/operations/consolidate.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 86ef47586f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/server/infra/dream/operations/consolidate.ts Outdated
Comment thread src/server/infra/dream/operations/consolidate.ts Outdated
Comment thread src/server/infra/dream/operations/consolidate.ts Outdated
Comment thread src/server/infra/executor/dream-executor.ts Outdated
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread src/server/infra/executor/dream-executor.ts
Comment thread src/server/infra/dream/operations/consolidate.ts Outdated
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread test/unit/infra/dream/operations/consolidate.test.ts Outdated
- Guard MERGE/TEMPORAL_UPDATE against missing content (prevent data loss)
- Wrap createTaskSession in try/catch (preserve fail-open contract)
- Normalize Windows paths and reject path traversal in toContextTreeRelative
- Cap LLM sandbox payload at 200K chars (matches curate task limit)
- Fix misleading test name
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 13, 2026

Claude finished @RyanNg1403's task in 3m 24s —— View job


PR Review: feat: [ENG-2060] Consolidate operation (follow-up review)

Tasks

  • Gather context from PR and previous review
  • Read changed files from latest commit (e7d7f81)
  • Verify previous review issues were addressed
  • Review consolidate.ts — correctness, security, robustness
  • Review dream-executor.ts — integration, timeout, path traversal fix
  • Review test files — coverage quality
  • Post inline comments and summary

Score: 4 / 5

Good progress in e7d7f81 — the data-loss issues, path traversal, payload cap, and createTaskSession error handling are all fixed. One must-fix item from the first review remains open.


What was fixed ✅

Previous issue Status
Silent data loss on missing mergedContent / updatedContent Fixed — throws before writing
Path traversal in toContextTreeRelative Fixed — normalises separators, rejects ..
Payload size guard on LLM sandbox Fixed — 200 K char cap with changed-file priority
createTaskSession error not isolated Fixed — wrapped in try/catch, returns []
Misleading test name (file read fails) Fixed — renamed to match actual behaviour

Still open (must fix)

Timeout is inoperative (dream-executor.ts line 109 — inline comment posted)

AbortController is still created and scheduled but the signal is never passed to consolidate(). The 5-minute budget has zero effect on an in-flight LLM call. The partial log-entry branch remains unreachable unless consolidate coincidentally throws at the same moment the timer fires. The inline comment includes a minimal fix.


Remaining nitpicks / suggestions

CRLF body-start off-by-one (consolidate.ts line 171 — inline comment posted)

When the CRLF branch is taken bodyStart lands on the closing --- delimiter, so body includes ---\r\n<real content>. Since output is always normalised to \n anyway, either fix the arithmetic (suggestion in inline comment) or remove the CRLF branch entirely. The same bug exists in addRelatedLinks.

Duplicated frontmatter parsing (consolidate.ts lines 466–493 — inline comment posted)

The same ~25-line parse/mutate/serialize block exists in addFrontmatterFields and addRelatedLinks. Any fix (including the CRLF one above) needs to be applied twice. Extract a parseFrontmatter() helper.

atomicWrite orphans .tmp on rename failure (consolidate.ts line 121 — inline comment posted)

If rename throws after a successful writeFile, the .tmp file is left on disk. One try/catch + unlink(tmpPath).catch(() => {}) fixes it.

Stale JSDoc header (dream-executor.ts line 8)

The class-level @remarks still says "NO-OP stubs in Phase 1 — consolidate/synthesize/prune added later". Couldn't post an inline comment as this line isn't in the diff, but it should be updated to reflect that consolidate is now live.


What's done well

  • createTaskSession + deleteTaskSession are now correctly bracketed in try/finally — no session leaks possible
  • toContextTreeRelative normalises backslashes and validates .. traversal for both absolute and relative inputs
  • capPayloadSize correctly prioritises changed files and evicts siblings/search-results first
  • All three previously flagged bug fixes (FileCurateLogStore path, op.filePath, first-dream empty set) are solid
  • Test suite is clean and the renamed test now accurately describes what it covers

Comment thread src/server/infra/executor/dream-executor.ts
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread src/server/infra/dream/operations/consolidate.ts
The 5-minute timeout in DreamExecutor created an AbortController but
never passed the signal to consolidate(), making the budget inoperative.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 13, 2026

Claude finished @RyanNg1403's task in 2m 55s —— View job


PR Review: feat: [ENG-2060] Consolidate operation (third review)

Tasks

  • Gather context from PR and previous reviews
  • Read latest commit changes (9a02eb4)
  • Verify timeout fix (previously must-fix)
  • Check remaining nitpicks from second review
  • Review full changed files for any new issues
  • Post inline comments and summary

Score: 4.5 / 5

Good fix — the signal is now correctly wired from DreamExecutor through ConsolidateDeps into the domain-iteration loop. One small follow-up is needed to make the partial log entry reachable.


What was fixed ✅

Previous issue Status
Timeout inoperative — signal never reached consolidate() Fixedsignal: controller.signal is now passed; loop breaks on signal.aborted

One remaining item (should fix)

partial log entry is still unreachable (dream-executor.ts line 116 — inline comment posted)

The signal is now checked correctly between domain iterations, but when the budget fires and the loop breaks, consolidate() returns normally (no exception). The catch block is never entered, so the partial log branch remains dead code. The dream finishes with status: 'completed' even when half the domains were skipped due to timeout.

Fix: add an explicit if (controller.signal.aborted) check after consolidate() returns to write the partial entry and return early. See inline suggestion.


Carry-over nitpicks (unchanged from previous reviews)

Item Location Status
CRLF bodyStart off-by-one consolidate.ts lines 173, 475 Still open (inline comment posted)
Duplicated frontmatter parsing consolidate.ts lines 166–192 and 468–495 Still open (inline comment posted)
atomicWrite orphans .tmp on rename failure consolidate.ts line 122 Still open (inline comment posted)
Stale JSDoc header (NO-OP stubs in Phase 1) dream-executor.ts line 8 Still open (not in diff — manual edit)

What's done well

  • The signal check at line 53 (if (deps.signal?.aborted) break) is the right place — domains are sequential, so this is a natural, safe preemption point
  • The new test (stops processing domains when signal is aborted) correctly verifies the abort is honoured between domains and uses createTaskSession.callCount as a clean proxy
  • ConsolidateDeps.signal is optional (signal?: AbortSignal), preserving backwards compatibility for callers that don't need cancellation
  • All previous must-fix items (data-loss guards, path traversal, payload cap, session leak) remain solid

Comment thread src/server/infra/executor/dream-executor.ts
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread src/server/infra/dream/operations/consolidate.ts
Comment thread src/server/infra/dream/operations/consolidate.ts
Copy link
Copy Markdown
Collaborator

@danhdoan danhdoan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lsgtm

@danhdoan danhdoan merged commit 98b05e9 into proj/dreaming Apr 13, 2026
6 checks passed
@RyanNg1403 RyanNg1403 deleted the feat/ENG-2060 branch April 14, 2026 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants