Skip to content

feat: [ENG-2062] prune operation — staleness scoring and archival#414

Merged
danhdoan merged 3 commits intoproj/dreamingfrom
feat/ENG-2062
Apr 15, 2026
Merged

feat: [ENG-2062] prune operation — staleness scoring and archival#414
danhdoan merged 3 commits intoproj/dreamingfrom
feat/ENG-2062

Conversation

@RyanNg1403
Copy link
Copy Markdown
Collaborator

Summary

  • Implement the Prune dreaming operation: two-signal candidate detection (importance decay + mtime staleness), LLM review, and three decision types (ARCHIVE / KEEP / MERGE_INTO)
  • Wire prune into dream executor as Step 4 alongside consolidate and synthesize
  • Fix dream-undo ARCHIVE path: compute stub path from original, wire archiveService + projectRoot into CLI undo

Test plan

  • 21 unit tests covering all candidate detection, LLM interaction, decision execution, dedup, thresholds, and error handling
  • Interactive tests: KEEP bumps mtime, ARCHIVE creates stub+full and deletes original, MERGE_INTO writes pendingMerges, undo restores archived files
  • Verify core files never appear as candidates regardless of age
  • Verify candidate cap at 20 (stalest first)
  • Verify LLM failure returns [] gracefully without crashing dream
  • Full test suite passes (6135 tests), typecheck clean, lint clean

Two-signal candidate detection (importance decay + mtime staleness),
LLM review (ARCHIVE/KEEP/MERGE_INTO), archive via existing service,
defer merge suggestions to next dream cycle via pendingMerges.

Bugs found and fixed during interactive testing:
- Inline candidate summaries in prompt (LLM exhausted maxIterations on code_exec)
- Re-read dream state at Step 7 to preserve pendingMerges written by prune
- Wire archiveService + projectRoot into CLI undo for archive restoration
- Compute stub path from original path in undo (restoreEntry expects stub)
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 15, 2026

Claude finished @RyanNg1403's task in 2m 33s —— View job


PR Review: feat: [ENG-2062] prune operation — staleness scoring and archival

  • Gather context and read changed files
  • Review prune.ts implementation
  • Review prune.test.ts test coverage
  • Review integration changes (dream-executor.ts, dream-undo.ts, dream.ts, agent-process.ts)
  • Post inline comments
  • Post overall summary

Summary

Score: 4 / 5

Solid, well-structured implementation. The two-signal candidate detection, the dedup+cap logic, the never-throw contract, and the sequential executeDecisions (correct for state consistency) are all well-thought-out. The re-read of dreamStateService before writing the final dream state is a particularly nice detail that avoids a subtle clobber race.

Issues found

Severity Location Issue
Standards prune.ts:107 candidateMap.get(path)! — non-null assertion forbidden by repo standards; safe if (existing) guard is the right fix
Issue prune.ts:363 + dream-undo.ts:253 Stub path is reconstructed at undo time by re-encoding archive naming conventions. Storing stubPath in the DreamOperation record would break this coupling
Issue prune.ts:268 countActiveFiles duplicates walkContextTree traversal logic; a shared walker helper would keep them in sync
Issue prune.test.ts:346 No test for MERGE_INTO with absent mergeTarget — behaviour is observable (op emitted vs. dropped) and should be pinned
Nitpick prune.ts:391 MERGE_INTO with no mergeTarget silently emits a SUGGEST_MERGE op with mergeTarget: undefined; better to return undefined (drop the op)
Nitpick prune.ts:333 "mergeTarget?" in prompt template is TypeScript notation, not valid JSON — minor LLM clarity issue

What's working well

  • Two-signal detection with clean dedup via Map + 'both' marking
  • Core-file guard is enforced in both walkContextTree and readCandidateInfo (Signal A path)
  • Hallucination guard (candidateSet.has) in executeDecisions is correct and tested
  • Session cleanup in finally with .catch(() => {}) — no leak risk
  • never-throws contract for the whole operation, tested with LLM failure case
  • 21 tests cover the happy paths, thresholds, dedup, and error paths thoroughly

@RyanNg1403 RyanNg1403 requested a review from danhdoan April 15, 2026 03:16
Comment thread src/server/infra/dream/operations/prune.ts Outdated
Comment thread src/server/infra/dream/operations/prune.ts
Comment thread src/server/infra/dream/operations/prune.ts Outdated
Comment thread src/server/infra/dream/operations/prune.ts
Comment thread src/server/infra/dream/operations/prune.ts
Comment thread test/unit/infra/dream/operations/prune.test.ts
Comment thread src/server/infra/dream/dream-undo.ts Outdated
Comment thread src/server/infra/executor/dream-executor.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74cd41bf33

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/server/infra/dream/operations/prune.ts Outdated
- Remove ! non-null assertion, guard with if-check instead
- Extract shared walkMdFiles helper, eliminating duplicated walk logic
- Drop MERGE_INTO op when mergeTarget is absent (return undefined)
- Store stubPath in ARCHIVE operation, undo reads it directly (no fragile path recomputation)
- Fix mergeTarget? in JSON prompt example to valid format
- Add test for MERGE_INTO with absent mergeTarget
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 15, 2026

Claude finished @RyanNg1403's task in 1m 33s —— View job


PR Review (follow-up): feat: [ENG-2062] prune operation — staleness scoring and archival

  • Gather context
  • Read changed files (commits: 74cd41b, 6a47d70)
  • Review fixes applied since last review
  • Post updated summary

Follow-up Review — Addressing Reviewer Feedback

The fix commit (6a47d70) cleanly addresses every issue raised in the previous review. Here's the verdict on each:

Prior Issue Resolution
candidateMap.get(path)! non-null assertion ✅ Replaced with if (existing) guard — standards-compliant
walkContextTree / countActiveFiles duplication walkMdFiles helper extracted (lines 177–211); both callers now delegate to it
MERGE_INTO with absent mergeTarget emits op if (!decision.mergeTarget) return undefined — op is dropped
stubPath fragile reconstruction in undoPrune stubPath added to PruneOperationSchema; stored at archive time, read directly at undo — no path convention re-encoding
"mergeTarget?" in prompt JSON template ✅ Fixed to "mergeTarget": "path (only for MERGE_INTO)"
Missing test: MERGE_INTO with no mergeTarget 'drops MERGE_INTO op when mergeTarget is absent' covers the exact edge case, and asserts dreamStateService.write not called

Notable quality of the walkMdFiles refactor

The extracted helper correctly captures all filtering rules in one place: _/. directory skip, _-prefixed file skip, .md-only, and isExcludedFromSync. Both findStaleFiles and countActiveFiles benefit. countActiveFiles is now just 3 lines:

async function countActiveFiles(contextTreeDir: string): Promise<number> {
  let count = 0
  await walkMdFiles(contextTreeDir, async () => { count++ })
  return count
}

The undoPrune improvement is particularly solid

The undo path now asserts op.stubPath presence and throws a clear error if missing — much better than the previous silent wrong-path failure. The imports cleanup (ARCHIVE_DIR, STUB_EXTENSION, extname, join) confirms no dead code was left behind.

One remaining observation (not blocking)

The Codex review flagged path separator normalization: relativePath in walkMdFiles is built via fullPath.slice(contextTreeDir.length + 1), which uses the OS separator. On the current Linux deployment this is a non-issue (/-separated throughout). If Windows support is ever added, normalizing to Unix paths before the candidateMap key will be needed. A comment or a toUnixPath call at that line would future-proof it. Not a blocker for this PR.


Score: 5 / 5

All six previously identified issues have been resolved correctly. The implementation is clean, standards-compliant, and well-tested. The walkMdFiles extraction in particular improves maintainability beyond what was strictly required. Ready to merge.

Use toUnixPath on walkMdFiles relative path output to ensure consistent
forward-slash separators across platforms. Without this, Signal A/B dedup
and candidateSet hallucination guard would fail on Windows due to
backslash vs forward-slash mismatch.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 15, 2026

Claude finished @RyanNg1403's task in 1m 14s —— View job


PR Review (follow-up #2): feat: [ENG-2062] prune operation — staleness scoring and archival

  • Gather context
  • Read new commit 572732e (Windows path normalization fix)
  • Verify the fix addresses the flagged path separator issue
  • Post updated summary

Follow-up Review — Windows Path Normalization

The third commit (572732e) addresses the final remaining observation from the previous review (also flagged as P1 by Codex): path separator inconsistency that would silently break Signal A/B dedup and the candidateSet hallucination guard on Windows.

The fix (prune.ts:203)

-        const relativePath = fullPath.slice(contextTreeDir.length + 1)
+        const relativePath = toUnixPath(fullPath.slice(contextTreeDir.length + 1))

toUnixPath is re-exported from context-tree/path-utils.ts and resolves to normalizeForComparison in agent/infra/process/path-utils.ts — it replaces all \ with / and lowercases the drive letter on Windows. This is the correct utility for cross-platform path normalization already used elsewhere in the codebase.

Why the placement is correct

The normalization happens inside walkMdFiles at the single point of path generation, so all callers (findStaleFiles for Signal B dedup, countActiveFiles) receive consistent Unix-style paths automatically — no need to patch each call site individually.

  • Signal A (findArchiveCandidates) already returns Unix-style paths (confirmed by prior review)
  • Signal B via walkMdFiles now also returns Unix-style paths
  • candidateMap keys are consistent across both signals → dedup works on Windows
  • candidateSet.has(decision.file) in executeDecisions will correctly match LLM-returned paths (which see Unix-style paths in the prompt) → hallucination guard works on Windows

Verdict

Prior Observation Resolution
Path separator mismatch in walkMdFiles — Signal A/B dedup and candidateSet guard would fail on Windows toUnixPath applied at the single generation point inside walkMdFiles; fix is minimal, idiomatic, and covers all callers

The fix is correct, minimal, and uses the project's established utility. No issues found.


Score: 5 / 5

All issues from all prior reviews have been resolved. The implementation is clean, standards-compliant, well-tested, and now cross-platform safe. Ready to merge.

@danhdoan danhdoan merged commit d3d2607 into proj/dreaming Apr 15, 2026
6 checks passed
@RyanNg1403 RyanNg1403 deleted the feat/ENG-2062 branch April 15, 2026 05:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants