Skip to content

feat: Phase 5 — search/store hardening#26

Merged
EtanHey merged 4 commits into
mainfrom
feat/phase-5-search-hardening
Feb 23, 2026
Merged

feat: Phase 5 — search/store hardening#26
EtanHey merged 4 commits into
mainfrom
feat/phase-5-search-hardening

Conversation

@EtanHey
Copy link
Copy Markdown
Owner

@EtanHey EtanHey commented Feb 23, 2026

Summary

  • Project scoping: auto-detect project from CWD via ~/.config/brainlayer/scopes.yaml — brain_search auto-scopes when project param is None
  • Auto-importance: keyword-based heuristic scoring in brain_store (architectural +3, prohibition +2, length +1, file ref +1, baseline 3, cap 10)
  • Decision tracking: confidence_score, outcome, reversibility, files_changed fields in brain_store for type=decision
  • phase_commits table: SQLite table for commit history with phase/session/project linkage
  • Post-commit hook: hooks/post-commit.py auto-stores git commits into BrainLayer
  • Store queue buffer: DB lock → queue to pending-stores.jsonl, auto-flush on next successful store
  • CLI: brainlayer hooks install (symlinks hook), brainlayer flush (manual queue drain)
  • Scripts: scope-config.py (generate scopes.yaml from git repos), phase-boundaries.py (phase commit report)

Test plan

  • 25 new tests in tests/test_phase5.py — all pass
  • 258 total tests passing (0 failures, 2 skipped)
  • Ruff lint clean on all modified files
  • No regressions in existing test suite

🤖 Generated with Claude Code


Note

Medium Risk
Touches core MCP search/store paths and adds new persistence/queueing behavior around SQLite locking; failures are mostly best-effort, but regressions could affect write reliability and project scoping.

Overview
Hardens BrainLayer search/store behavior and adds phase/decision tracking. brain_search now auto-scopes project from CWD via a new scoping.py + scopes.yaml mapping, while brain_store gains keyword-based auto-importance when importance is omitted.

Extends stored-memory metadata and improves write reliability. store_memory and the MCP brain_store schema now accept decision-tracking fields (confidence_score, outcome, reversibility, files_changed), and stores that hit SQLite “locked/busy” errors are buffered to pending-stores.jsonl and auto-flushed on the next successful write (with a new brainlayer flush CLI command).

Adds commit/phase tooling. Introduces a phase_commits table in the DB, a hooks/post-commit.py hook (installable via brainlayer hooks install) to store commit summaries into BrainLayer, and helper scripts to generate scopes.yaml and report phase boundaries; adds Phase 5 tests covering these behaviors.

Written by Cursor Bugbot for commit 5917a1d. This will update automatically on new commits. Configure here.

Summary by CodeRabbit

Release Notes

  • New Features
    • Automatic Git commit tracking via post-commit hook integration
    • CLI commands to install hooks and flush pending operations
    • Decision and phase tracking with confidence scores and outcome metadata
    • Automatic project detection based on repository scopes
    • Auto-importance scoring for stored items
    • Improved database resilience with queuing for locked state scenarios

…ks, store queue)

- Project scoping: auto-detect project from CWD via ~/.config/brainlayer/scopes.yaml
- Auto-importance: keyword-based heuristic scoring (arch +3, prohibition +2, length +1, file ref +1)
- Decision tracking: confidence_score, outcome, reversibility, files_changed fields in brain_store
- phase_commits table: commit history with phase/session/project linkage
- Post-commit hook: auto-store git commits into BrainLayer (hooks/post-commit.py)
- Store queue buffer: DB lock → queue to pending-stores.jsonl, flush on next success
- CLI: brainlayer hooks install, brainlayer flush
- Scripts: scope-config.py (generate scopes.yaml), phase-boundaries.py (report)
- 25 new tests (all pass), 258 total passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment thread src/brainlayer/mcp/__init__.py Outdated
Comment thread src/brainlayer/scoping.py Outdated
if remaining:
path.write_text("\n".join(remaining) + "\n")
else:
path.unlink(missing_ok=True)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flush can lose concurrently queued store entries

Low Severity

_flush_pending_stores reads the entire JSONL queue file, processes entries, then rewrites or deletes it. Between the read at line 1813 and the write at line 1844, a concurrent _queue_store call (from another process hitting a DB lock) can append new entries. The subsequent write_text or unlink overwrites those entries, silently losing queued stores.

Fix in Cursor Fix in Web

Comment thread hooks/post-commit.py Outdated
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 23, 2026

📝 Walkthrough

Walkthrough

This PR introduces Phase 5 features for BrainLayer, adding commit tracking via post-commit hooks, project scoping from YAML configuration, auto-importance scoring for stored items, DB-lock resilience through request queueing, enhanced decision tracking metadata, phase boundary analysis tooling, and comprehensive test coverage across new and existing functionality.

Changes

Cohort / File(s) Summary
Commit Tracking
hooks/post-commit.py
New Git post-commit hook that automatically captures commit metadata (hash, message, changed files) and stores it in BrainLayer with journal type and commit/git tags, with graceful error handling for unavailable tools.
Analysis & Configuration Scripts
scripts/phase-boundaries.py, scripts/scope-config.py
Two new utility scripts: phase-boundaries queries and reports phase_commits table statistics; scope-config scans for Git repos and generates scopes.yaml mapping with support for dry-run and custom output paths.
Project Scoping Module
src/brainlayer/scoping.py
New module implementing project scope resolution from ~/.config/brainlayer/scopes.yaml with PyYAML support, fallback simple parser, longest-prefix matching, and CWD-based heuristics for automatic project detection.
Core Storage & Metadata Enhancement
src/brainlayer/store.py, src/brainlayer/vector_store.py
Extended store\_memory signature with optional confidence\_score, outcome, reversibility, and files\_changed parameters; added phase\_commits table to track decisions with indexed fields on project and phase\_name.
MCP Enhancements
src/brainlayer/mcp/__init__.py
Auto-importance scoring from keywords/length/file refs, JSONL-backed queuing for locked databases, auto-scoping in brain\_search, new metadata field propagation through storage pipelines, and improved resilience to DB locks with queued operation recovery.
CLI Commands
src/brainlayer/cli/__init__.py
Two new commands: hooks (installs post-commit hook with symlink/backup support) and flush (processes pending-stores.jsonl queue with embedding-based flushing and status reporting).
Phase 5 Tests
tests/test_phase5.py
Comprehensive test module covering auto-importance scoring, auto-type detection, scoping resolution with config/caching, phase\_commits schema, decision metadata persistence, and integration validation across new features.

Sequence Diagram

sequenceDiagram
    participant Client as Client/MCP
    participant Store as Brain Store
    participant DB as VectorStore/DB
    participant Queue as Pending Queue<br/>(JSONL)
    participant Embed as Embedding Service

    Client->>Store: brain_store request<br/>(content, metadata)
    Store->>Store: _auto_importance()<br/>score if not provided
    Store->>DB: Attempt insert to chunks<br/>with metadata
    
    alt DB Lock/Timeout
        DB-->>Store: Lock Error
        Store->>Queue: _queue_store()<br/>(request to JSONL)
        Store-->>Client: return<br/>queued=true
    else Success
        DB-->>Store: Insert OK
        Store->>Embed: Send to embedding model<br/>(with metadata fields)
        Embed-->>Store: Embedding complete
        Store->>Queue: _flush_pending_stores()<br/>process queued items
        Queue-->>DB: Retry queued inserts
        Store-->>Client: return<br/>queued=false,<br/>flushed count
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 Commits now hop into the store,
Auto-scores keep track of more,
Projects scope with perfect aim,
DB locks? We queue the same!
Phase boundaries bloom so clear,
Quality's the vibe this year! 🌱

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.04% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: Phase 5 — search/store hardening' partially relates to the changeset. While the PR includes search/store enhancements (auto-scoping, auto-importance, decision tracking), it encompasses much broader functionality: git hooks, CLI commands, database schema changes, queueing/reliability mechanisms, and utility scripts. The title focuses narrowly on search/store aspects rather than capturing the main point of the overall change.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/phase-5-search-hardening

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/brainlayer/store.py (1)

41-98: ⚠️ Potential issue | 🟠 Major

Validate decision metadata before persisting

The new decision fields are documented with ranges/enums, but store_memory accepts any values. That can pollute analytics and break expectations when data is ingested from non‑MCP callers. Add lightweight validation (and optional type checks for files_changed).

🛠 Suggested fix
     if memory_type not in VALID_MEMORY_TYPES:
         raise ValueError(f"type must be one of: {', '.join(VALID_MEMORY_TYPES)}")
+
+    if confidence_score is not None and not (0.0 <= confidence_score <= 1.0):
+        raise ValueError("confidence_score must be between 0 and 1")
+    if outcome is not None and outcome not in {"pending", "validated", "reversed"}:
+        raise ValueError("outcome must be one of: pending, validated, reversed")
+    if reversibility is not None and reversibility not in {"easy", "hard", "destructive"}:
+        raise ValueError("reversibility must be one of: easy, hard, destructive")
+    if files_changed is not None and not all(isinstance(f, str) for f in files_changed):
+        raise ValueError("files_changed must be a list of file path strings")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/store.py` around lines 41 - 98, In store_memory, validate the
new decision metadata before adding it to meta: check confidence_score is a
number between 0 and 1 (raise ValueError otherwise), ensure outcome is one of
("pending","validated","reversed"), ensure reversibility is one of
("easy","hard","destructive"), and ensure files_changed is either None or a list
of strings (raise ValueError if any element is not str); perform these checks
after clamping importance and before building the meta dict (refer to function
store_memory, variables confidence_score, outcome, reversibility, files_changed)
and return clear ValueError messages on invalid input so bad values never get
persisted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hooks/post-commit.py`:
- Around line 59-66: The hook calls an executable named "brainlayer-store" that
isn't packaged; change the subprocess command construction in
hooks/post-commit.py (the cmd list used with subprocess.run) to use the
canonical CLI form ["brainlayer", "store", ...] (i.e. replace the single-word
entrypoint with the "brainlayer" binary plus "store" subcommand and keep the
same flags/arguments), or alternatively add a "brainlayer-store" console_script
entry to pyproject.toml so the original cmd resolves; adjust whichever path you
choose and keep the existing subprocess.run usage and error handling.

In `@scripts/phase-boundaries.py`:
- Around line 61-80: The recent-commits block and phase-summary formatting treat
nullable fields and zero-confidence incorrectly: explicitly check for None
rather than using truthiness. In the rows loop (variable rows, avg_conf) change
the conf_display logic to render "N/A" only when avg_conf is None and render
"0.00" when avg_conf == 0.0; in the recent loop (variable recent,
commit_message/msg) guard against msg being None before slicing (e.g., use msg
if msg is not None else "" or assign msg = msg or "" before msg[:50]) and
similarly ensure phase/outcome formatting uses explicit None checks so
slicing/formatting never receives None.

In `@scripts/scope-config.py`:
- Around line 52-55: The YAML key (display_path) is unquoted which can produce
invalid YAML when the path contains ":" or special chars; in the loop that
builds lines (variables: repos, display_path, lines) change the list append to
quote the key and escape any internal quotes so keys become valid YAML — e.g.
compute an escaped_display_path = display_path.replace('"', '\\"') and use
lines.append(f'  "{escaped_display_path}": "{name}"') (or alternatively use a
YAML dumper like yaml.safe_dump to emit the mapping) so the generated YAML keys
are always quoted.
- Around line 14-15: Remove the unused imports by deleting the top-level import
statements for os and subprocess in scope-config.py; the file already uses
pathlib.Path for path handling so you should keep the pathlib import (if
present) and any other used imports but remove references to the symbols os and
subprocess to eliminate the unused-import warnings.
- Around line 94-96: The current write step uses args.output.write_text(content)
without an encoding and will silently overwrite existing files; update the logic
around args.output.write_text to (1) explicitly pass encoding="utf-8" when
writing, and (2) detect if args.output.exists() before writing and warn the user
and preserve the prior file (e.g., copy/rename the existing path to a
timestamped backup or .bak) or require an explicit --force flag before
overwriting; adjust the messages printed (the print(f"Wrote {args.output}
({len(repos)} repos)") call) to reflect backup/overwrite behavior and reference
args.output.parent.mkdir, args.output.exists, args.output.write_text, and repos
when implementing these changes.
- Around line 84-86: The script currently prints "No git repos found..." then
returns which yields exit code 0; modify the block under the if not repos: check
to call sys.exit(1) (or raise SystemExit(1)) after printing so the process exits
with a non-zero code; ensure the module imports sys at top if not already
present and update the if not repos: branch in scripts/scope-config.py
accordingly.

In `@src/brainlayer/cli/__init__.py`:
- Around line 1592-1594: Wrap the call to hook_source.chmod(0o755) in a
try/except that catches PermissionError and logs a warning (using rprint or the
existing logger) instead of letting the exception abort the installation; ensure
the symlink creation (target.symlink_to(hook_source.resolve())) remains
unchanged and that rprint still reports success (e.g., keep the
rprint(f"[green]Installed post-commit hook: {target} →
{hook_source.resolve()}[/]") call) so installs in read‑only locations succeed
with a warning about permissions.
- Around line 1544-1594: The hooks command currently hardcodes repo_root /
".git" / "hooks" which breaks for worktrees/submodules; update the hooks()
implementation to call git (subprocess.check_output) with ["git", "rev-parse",
"--git-path", "hooks"] to obtain the correct hooks directory path (fall back to
repo_root / ".git" / "hooks" only if that command fails), use that value for
hooks_dir, and handle errors similarly to the existing git check; also stop
calling hook_source.chmod(0o755) on the package source file—instead either
remove that chmod call or apply permissions to the created symlink
(target.chmod(...)) if you need the hook to be executable.

In `@src/brainlayer/mcp/__init__.py`:
- Around line 1921-1932: The queued item uses the raw project variable instead
of the normalized_project, causing inconsistent project names after flush;
update the call to _queue_store in the DB-locked handler to pass
normalized_project (the same value used on success) for the "project" field,
ensuring normalized_project is available in the scope where _queue_store is
invoked.
- Around line 1787-1848: _flush_pending_stores currently reads and rewrites the
live pending-stores.jsonl which can lose concurrent _queue_store appends; change
to perform an atomic swap: compute the path from _get_pending_store_path(),
atomically rename/move the original file to a temp file (e.g.,
pending-stores.jsonl.processing using os.rename/pathlib.Path.rename) so new
writers via _queue_store continue appending to a fresh file, read/process the
temp file lines calling store_memory from _flush_pending_stores, collect failed
lines, then append any failed lines to the current live file (open original in
"a" and write failures + newline) and finally remove the temp file (or unlink
with missing_ok); ensure all rename/read/append operations handle missing file
races and exceptions so no writes are lost.

In `@src/brainlayer/scoping.py`:
- Around line 95-105: The current prefix check using cwd.startswith(expanded)
can produce false positives; in the loop that builds matches (variables
scope_map, prefix, project, expanded, cwd, matches) replace the startswith logic
with a path-aware check: normalize/resolve both expanded and cwd
(Path(expanded).expanduser().resolve() and Path(cwd).resolve()) and use
Path.is_relative_to(resolved_expanded) (or, if compatibility required, compare
os.path.commonpath([resolved_cwd, resolved_expanded]) == str(resolved_expanded))
so only true directory/subpath relationships are matched before appending to
matches and returning the longest match.

---

Outside diff comments:
In `@src/brainlayer/store.py`:
- Around line 41-98: In store_memory, validate the new decision metadata before
adding it to meta: check confidence_score is a number between 0 and 1 (raise
ValueError otherwise), ensure outcome is one of
("pending","validated","reversed"), ensure reversibility is one of
("easy","hard","destructive"), and ensure files_changed is either None or a list
of strings (raise ValueError if any element is not str); perform these checks
after clamping importance and before building the meta dict (refer to function
store_memory, variables confidence_score, outcome, reversibility, files_changed)
and return clear ValueError messages on invalid input so bad values never get
persisted.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 851b17d and efc0cbf.

📒 Files selected for processing (9)
  • hooks/post-commit.py
  • scripts/phase-boundaries.py
  • scripts/scope-config.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
  • src/brainlayer/scoping.py
  • src/brainlayer/store.py
  • src/brainlayer/vector_store.py
  • tests/test_phase5.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.13)
  • GitHub Check: Cursor Bugbot
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use ruff check src/ for linting and ruff format src/ for code formatting

Files:

  • src/brainlayer/scoping.py
  • src/brainlayer/store.py
  • src/brainlayer/vector_store.py
  • src/brainlayer/mcp/__init__.py
  • src/brainlayer/cli/__init__.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Run tests using pytest from the project root

Files:

  • tests/test_phase5.py
src/brainlayer/vector_store.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use sqlite-vec with APSW for vector storage, WAL mode, and PRAGMA busy_timeout = 5000 for concurrent multi-process safety

Files:

  • src/brainlayer/vector_store.py
src/brainlayer/mcp/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names

Files:

  • src/brainlayer/mcp/__init__.py
src/brainlayer/cli/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Enable project-specific indexing with brainlayer index --project <project_name> and incremental indexing with brainlayer index-fast

Files:

  • src/brainlayer/cli/__init__.py
🧠 Learnings (2)
📚 Learning: 2026-02-23T16:51:38.300Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.300Z
Learning: Applies to src/brainlayer/mcp/**/*.py : Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names

Applied to files:

  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-02-23T16:51:38.300Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.300Z
Learning: Applies to src/brainlayer/cli/**/*.py : Enable project-specific indexing with `brainlayer index --project <project_name>` and incremental indexing with `brainlayer index-fast`

Applied to files:

  • src/brainlayer/cli/__init__.py
🧬 Code graph analysis (4)
tests/test_phase5.py (4)
src/brainlayer/vector_store.py (1)
  • VectorStore (72-1910)
src/brainlayer/mcp/__init__.py (5)
  • _embed (1565-1566)
  • _embed (1612-1613)
  • _embed (1872-1873)
  • _auto_importance (303-327)
  • _detect_memory_type (285-291)
src/brainlayer/scoping.py (3)
  • _cwd_heuristic (110-124)
  • _parse_scopes_simple (44-76)
  • resolve_project_scope (79-107)
src/brainlayer/store.py (1)
  • store_memory (33-142)
src/brainlayer/mcp/__init__.py (2)
src/brainlayer/scoping.py (1)
  • resolve_project_scope (79-107)
src/brainlayer/store.py (1)
  • store_memory (33-142)
src/brainlayer/cli/__init__.py (3)
src/brainlayer/embeddings.py (3)
  • get_embedding_model (109-114)
  • embed_query (87-102)
  • embed_query (128-131)
src/brainlayer/mcp/__init__.py (1)
  • _flush_pending_stores (1802-1848)
src/brainlayer/vector_store.py (1)
  • VectorStore (72-1910)
scripts/phase-boundaries.py (2)
src/brainlayer/vector_store.py (2)
  • VectorStore (72-1910)
  • close (1901-1904)
tests/test_phase5.py (1)
  • store (24-29)
🔇 Additional comments (11)
src/brainlayer/cli/__init__.py (1)

1597-1629: Flush command integrates cleanly with queue handling
Nice, straightforward flow and clear messaging.

src/brainlayer/vector_store.py (1)

322-339: Phase commits schema + indexes look good

Schema matches the new metadata fields and the indices align with planned filters.

tests/test_phase5.py (6)

46-107: Auto‑importance coverage looks solid
Covers baseline, boosts, cap, and regression cases cleanly.


112-143: Auto‑type detection tests are clear and direct
Good signal coverage for todo/mistake/decision/learning and default note.


148-200: Project scoping tests are thorough
Nice coverage of CWD heuristic and config parsing paths.


205-250: phase_commits table tests look good
Schema and basic insert coverage are exactly what’s needed.


255-327: Decision‑field persistence tests are well‑targeted
Verifies metadata round‑trip and schema enums.


332-344: Auto‑importance schema test is concise and effective
Good guard for the inputSchema description.

src/brainlayer/mcp/__init__.py (3)

294-327: Auto‑importance heuristic is clear and bounded
Simple scoring with a cap keeps behavior predictable.


631-653: Decision‑field plumbing and auto‑importance integration are consistent
Schema, call‑site wiring, and _store_new all align cleanly.

Also applies to: 810-813, 1124-1143


945-951: Auto‑scope fallback is non‑blocking
Good use of best‑effort scoping without failing searches.

Comment thread hooks/post-commit.py Outdated
Comment on lines +61 to +80
for phase, count, outcomes, avg_conf in rows:
phase_display = phase or "(unlinked)"
conf_display = f"{avg_conf:.2f}" if avg_conf else "N/A"
print(f"{phase_display:<25} {count:>8} {outcomes or 'N/A':<30} {conf_display:>8}")

# Recent commits
recent = list(
cursor.execute(
f"""SELECT commit_hash, commit_message, phase_name, project,
outcome, created_at
FROM phase_commits {where_sql}
ORDER BY created_at DESC LIMIT 10""",
params,
)
)
if recent:
print(f"\nRecent commits:")
for hash_, msg, phase, proj, outcome, ts in recent:
ts_short = (ts or "")[:19]
print(f" {hash_[:8]} {msg[:50]:<50} {phase or '':<15} {outcome or '':<10} {ts_short}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle NULL commit_message and 0.0 confidence cleanly

commit_message is nullable in the schema; slicing msg[:50] will raise when it’s NULL. Also, avg_conf == 0.0 currently renders as “N/A”. Use explicit None checks.

🛠 Suggested fix
-        conf_display = f"{avg_conf:.2f}" if avg_conf else "N/A"
+        conf_display = f"{avg_conf:.2f}" if avg_conf is not None else "N/A"
...
-            print(f"  {hash_[:8]} {msg[:50]:<50} {phase or '':<15} {outcome or '':<10} {ts_short}")
+            safe_msg = (msg or "")
+            print(f"  {hash_[:8]} {safe_msg[:50]:<50} {phase or '':<15} {outcome or '':<10} {ts_short}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/phase-boundaries.py` around lines 61 - 80, The recent-commits block
and phase-summary formatting treat nullable fields and zero-confidence
incorrectly: explicitly check for None rather than using truthiness. In the rows
loop (variable rows, avg_conf) change the conf_display logic to render "N/A"
only when avg_conf is None and render "0.00" when avg_conf == 0.0; in the recent
loop (variable recent, commit_message/msg) guard against msg being None before
slicing (e.g., use msg if msg is not None else "" or assign msg = msg or ""
before msg[:50]) and similarly ensure phase/outcome formatting uses explicit
None checks so slicing/formatting never receives None.

Comment thread scripts/scope-config.py
Comment on lines +14 to +15
import os
import subprocess
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Remove unused imports os and subprocess.

Neither os nor subprocess is referenced anywhere in this file; all path work is done via pathlib.Path.

🧹 Proposed fix
 import argparse
-import os
-import subprocess
 from pathlib import Path
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import os
import subprocess
import argparse
from pathlib import Path
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/scope-config.py` around lines 14 - 15, Remove the unused imports by
deleting the top-level import statements for os and subprocess in
scope-config.py; the file already uses pathlib.Path for path handling so you
should keep the pathlib import (if present) and any other used imports but
remove references to the symbols os and subprocess to eliminate the
unused-import warnings.

Comment thread scripts/scope-config.py
Comment on lines +52 to +55
for path, name in repos:
# Use ~ shorthand for home directory
display_path = str(path).replace(str(Path.home()), "~")
lines.append(f' {display_path}: "{name}"')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

YAML keys (path strings) must be quoted to avoid invalid output.

An unquoted key that contains : (e.g., a Windows drive letter or an unusual directory name) or starts with a YAML-special character produces structurally invalid YAML that a downstream parser (e.g., scoping.py) will fail to load.

🛡️ Proposed fix
-        lines.append(f'  {display_path}: "{name}"')
+        lines.append(f'  "{display_path}": "{name}"')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for path, name in repos:
# Use ~ shorthand for home directory
display_path = str(path).replace(str(Path.home()), "~")
lines.append(f' {display_path}: "{name}"')
for path, name in repos:
# Use ~ shorthand for home directory
display_path = str(path).replace(str(Path.home()), "~")
lines.append(f' "{display_path}": "{name}"')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/scope-config.py` around lines 52 - 55, The YAML key (display_path) is
unquoted which can produce invalid YAML when the path contains ":" or special
chars; in the loop that builds lines (variables: repos, display_path, lines)
change the list append to quote the key and escape any internal quotes so keys
become valid YAML — e.g. compute an escaped_display_path =
display_path.replace('"', '\\"') and use lines.append(f' 
"{escaped_display_path}": "{name}"') (or alternatively use a YAML dumper like
yaml.safe_dump to emit the mapping) so the generated YAML keys are always
quoted.

Comment thread scripts/scope-config.py
Comment on lines +84 to +86
if not repos:
print(f"No git repos found in {args.gits_dir}")
return
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Exit with a non-zero code when no repos are found.

Returning normally yields exit code 0, which makes the "nothing found" case indistinguishable from success in CI pipelines or shell scripts.

🧹 Proposed fix
+import sys
 ...
     if not repos:
         print(f"No git repos found in {args.gits_dir}")
-        return
+        sys.exit(1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/scope-config.py` around lines 84 - 86, The script currently prints
"No git repos found..." then returns which yields exit code 0; modify the block
under the if not repos: check to call sys.exit(1) (or raise SystemExit(1)) after
printing so the process exits with a non-zero code; ensure the module imports
sys at top if not already present and update the if not repos: branch in
scripts/scope-config.py accordingly.

Comment on lines +1544 to +1594
@app.command("hooks")
def hooks(
action: str = typer.Argument(..., help="Action: install"),
) -> None:
"""Manage BrainLayer git hooks."""
if action != "install":
rprint(f"[red]Unknown action: {action}. Use 'install'.[/]")
raise typer.Exit(1)

import subprocess

# Find repo root
try:
repo_root = Path(
subprocess.check_output(
["git", "rev-parse", "--show-toplevel"], stderr=subprocess.DEVNULL
)
.decode()
.strip()
)
except (subprocess.CalledProcessError, FileNotFoundError):
rprint("[red]Not in a git repository.[/]")
raise typer.Exit(1)

hooks_dir = repo_root / ".git" / "hooks"
if not hooks_dir.exists():
rprint(f"[red]Git hooks directory not found: {hooks_dir}[/]")
raise typer.Exit(1)

# Find our post-commit hook
hook_source = Path(__file__).parent.parent.parent.parent / "hooks" / "post-commit.py"
if not hook_source.exists():
# Try relative to package
import brainlayer

pkg_root = Path(brainlayer.__file__).parent.parent.parent
hook_source = pkg_root / "hooks" / "post-commit.py"

if not hook_source.exists():
rprint("[red]Cannot find hooks/post-commit.py in BrainLayer repo.[/]")
raise typer.Exit(1)

target = hooks_dir / "post-commit"
if target.exists() or target.is_symlink():
rprint(f"[yellow]Existing post-commit hook found at {target}[/]")
rprint("[yellow]Backing up to post-commit.bak[/]")
target.rename(hooks_dir / "post-commit.bak")

target.symlink_to(hook_source.resolve())
hook_source.chmod(0o755)
rprint(f"[green]Installed post-commit hook: {target} → {hook_source.resolve()}[/]")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check the actual file and its context
head -20 src/brainlayer/cli/__init__.py
wc -l src/brainlayer/cli/__init__.py

Repository: EtanHey/brainlayer

Length of output: 515


🏁 Script executed:

# Read the specific lines mentioned in the review
sed -n '1544,1600p' src/brainlayer/cli/__init__.py

Repository: EtanHey/brainlayer

Length of output: 2089


🏁 Script executed:

# Test the git commands to understand the concern about .git/hooks
# Create test scenarios to validate the fix
cd /tmp
mkdir -p test_repo test_worktree

# Test 1: Normal repo
cd test_repo
git init
git rev-parse --is-inside-work-tree
git rev-parse --git-path hooks
ls -ld .git .git/hooks 2>&1 || true

# Test 2: Worktree scenario simulation
cd /tmp
mkdir -p worktree_test
cd worktree_test
git init main_repo
cd main_repo
git rev-parse --git-path hooks
git worktree list 2>&1 || echo "Worktree not available in this environment"

Repository: EtanHey/brainlayer

Length of output: 1485


Use git rev-parse --git-path hooks to locate the hooks directory

Hardcoding <repo>/.git/hooks fails in git worktrees and submodules where .git is a file, not a directory. The current code will exit even in valid repos in these configurations. Resolve the hooks path via Git to handle both normal repos and worktrees correctly.

Additionally, hook_source.chmod(0o755) modifies the package source file itself rather than making the symlink executable. Either remove this line (the symlink inherits permissions from its target) or use target.chmod() if executable permissions on the symlink are required.

🛠 Suggested fix for hooks path
-    hooks_dir = repo_root / ".git" / "hooks"
+    hooks_path = (
+        subprocess.check_output(["git", "rev-parse", "--git-path", "hooks"], stderr=subprocess.DEVNULL)
+        .decode()
+        .strip()
+    )
+    hooks_dir = Path(hooks_path)
+    if not hooks_dir.is_absolute():
+        hooks_dir = repo_root / hooks_dir
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/cli/__init__.py` around lines 1544 - 1594, The hooks command
currently hardcodes repo_root / ".git" / "hooks" which breaks for
worktrees/submodules; update the hooks() implementation to call git
(subprocess.check_output) with ["git", "rev-parse", "--git-path", "hooks"] to
obtain the correct hooks directory path (fall back to repo_root / ".git" /
"hooks" only if that command fails), use that value for hooks_dir, and handle
errors similarly to the existing git check; also stop calling
hook_source.chmod(0o755) on the package source file—instead either remove that
chmod call or apply permissions to the created symlink (target.chmod(...)) if
you need the hook to be executable.

Comment on lines +1592 to +1594
target.symlink_to(hook_source.resolve())
hook_source.chmod(0o755)
rprint(f"[green]Installed post-commit hook: {target} → {hook_source.resolve()}[/]")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Guard chmod for read‑only installs

hook_source.chmod(0o755) can raise PermissionError when the hook file lives in site‑packages or a read‑only repo, aborting install after the symlink is created. Catch and warn instead of failing.

🛠 Suggested fix
-    hook_source.chmod(0o755)
+    try:
+        hook_source.chmod(0o755)
+    except PermissionError:
+        rprint(f"[yellow]Warning:[/] Unable to chmod hook source: {hook_source}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
target.symlink_to(hook_source.resolve())
hook_source.chmod(0o755)
rprint(f"[green]Installed post-commit hook: {target}{hook_source.resolve()}[/]")
target.symlink_to(hook_source.resolve())
try:
hook_source.chmod(0o755)
except PermissionError:
rprint(f"[yellow]Warning:[/] Unable to chmod hook source: {hook_source}")
rprint(f"[green]Installed post-commit hook: {target}{hook_source.resolve()}[/]")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/cli/__init__.py` around lines 1592 - 1594, Wrap the call to
hook_source.chmod(0o755) in a try/except that catches PermissionError and logs a
warning (using rprint or the existing logger) instead of letting the exception
abort the installation; ensure the symlink creation
(target.symlink_to(hook_source.resolve())) remains unchanged and that rprint
still reports success (e.g., keep the rprint(f"[green]Installed post-commit
hook: {target} → {hook_source.resolve()}[/]") call) so installs in read‑only
locations succeed with a warning about permissions.

Comment on lines +1787 to +1848
def _get_pending_store_path():
"""Path for the store queue buffer file."""
from ..paths import DEFAULT_DB_PATH
return DEFAULT_DB_PATH.parent / "pending-stores.jsonl"


def _queue_store(item: dict) -> None:
"""Buffer a store request to JSONL when DB is locked."""
import json as _json
path = _get_pending_store_path()
path.parent.mkdir(parents=True, exist_ok=True)
with open(path, "a") as f:
f.write(_json.dumps(item) + "\n")


def _flush_pending_stores(store, embed_fn) -> int:
"""Flush pending-stores.jsonl (FIFO). Returns count flushed."""
import json as _json

from ..store import store_memory

path = _get_pending_store_path()
if not path.exists():
return 0

try:
lines = path.read_text().strip().splitlines()
except Exception:
return 0

if not lines:
return 0

flushed = 0
remaining = []
for line in lines:
try:
item = _json.loads(line)
store_memory(
store=store,
embed_fn=embed_fn,
content=item["content"],
memory_type=item["memory_type"],
project=item.get("project"),
tags=item.get("tags"),
importance=item.get("importance"),
confidence_score=item.get("confidence_score"),
outcome=item.get("outcome"),
reversibility=item.get("reversibility"),
files_changed=item.get("files_changed"),
)
flushed += 1
except Exception:
remaining.append(line)

# Rewrite file with only failed items
if remaining:
path.write_text("\n".join(remaining) + "\n")
else:
path.unlink(missing_ok=True)

return flushed
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Queue flush can drop concurrent writes

_flush_pending_stores reads and rewrites the queue without an atomic swap or lock. If another process appends during the flush, new entries can be lost or overwritten. Use an atomic rename (or file lock) to isolate the batch being processed and then append any remaining failures to the current queue file.

🛠 Suggested fix (atomic swap)
 def _flush_pending_stores(store, embed_fn) -> int:
@@
     path = _get_pending_store_path()
     if not path.exists():
         return 0
 
-    try:
-        lines = path.read_text().strip().splitlines()
-    except Exception:
-        return 0
+    tmp_path = path.with_suffix(".jsonl.processing")
+    try:
+        path.rename(tmp_path)
+    except FileNotFoundError:
+        return 0
+
+    try:
+        lines = tmp_path.read_text().strip().splitlines()
+    except Exception:
+        tmp_path.unlink(missing_ok=True)
+        return 0
@@
-    if remaining:
-        path.write_text("\n".join(remaining) + "\n")
-    else:
-        path.unlink(missing_ok=True)
+    if remaining:
+        with open(path, "a", encoding="utf-8") as f:
+            f.write("\n".join(remaining) + "\n")
+    tmp_path.unlink(missing_ok=True)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _get_pending_store_path():
"""Path for the store queue buffer file."""
from ..paths import DEFAULT_DB_PATH
return DEFAULT_DB_PATH.parent / "pending-stores.jsonl"
def _queue_store(item: dict) -> None:
"""Buffer a store request to JSONL when DB is locked."""
import json as _json
path = _get_pending_store_path()
path.parent.mkdir(parents=True, exist_ok=True)
with open(path, "a") as f:
f.write(_json.dumps(item) + "\n")
def _flush_pending_stores(store, embed_fn) -> int:
"""Flush pending-stores.jsonl (FIFO). Returns count flushed."""
import json as _json
from ..store import store_memory
path = _get_pending_store_path()
if not path.exists():
return 0
try:
lines = path.read_text().strip().splitlines()
except Exception:
return 0
if not lines:
return 0
flushed = 0
remaining = []
for line in lines:
try:
item = _json.loads(line)
store_memory(
store=store,
embed_fn=embed_fn,
content=item["content"],
memory_type=item["memory_type"],
project=item.get("project"),
tags=item.get("tags"),
importance=item.get("importance"),
confidence_score=item.get("confidence_score"),
outcome=item.get("outcome"),
reversibility=item.get("reversibility"),
files_changed=item.get("files_changed"),
)
flushed += 1
except Exception:
remaining.append(line)
# Rewrite file with only failed items
if remaining:
path.write_text("\n".join(remaining) + "\n")
else:
path.unlink(missing_ok=True)
return flushed
def _flush_pending_stores(store, embed_fn) -> int:
"""Flush pending-stores.jsonl (FIFO). Returns count flushed."""
import json as _json
from ..store import store_memory
path = _get_pending_store_path()
if not path.exists():
return 0
tmp_path = path.with_suffix(".jsonl.processing")
try:
path.rename(tmp_path)
except FileNotFoundError:
return 0
try:
lines = tmp_path.read_text().strip().splitlines()
except Exception:
tmp_path.unlink(missing_ok=True)
return 0
if not lines:
return 0
flushed = 0
remaining = []
for line in lines:
try:
item = _json.loads(line)
store_memory(
store=store,
embed_fn=embed_fn,
content=item["content"],
memory_type=item["memory_type"],
project=item.get("project"),
tags=item.get("tags"),
importance=item.get("importance"),
confidence_score=item.get("confidence_score"),
outcome=item.get("outcome"),
reversibility=item.get("reversibility"),
files_changed=item.get("files_changed"),
)
flushed += 1
except Exception:
remaining.append(line)
# Rewrite file with only failed items
if remaining:
with open(path, "a", encoding="utf-8") as f:
f.write("\n".join(remaining) + "\n")
tmp_path.unlink(missing_ok=True)
return flushed
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/mcp/__init__.py` around lines 1787 - 1848,
_flush_pending_stores currently reads and rewrites the live pending-stores.jsonl
which can lose concurrent _queue_store appends; change to perform an atomic
swap: compute the path from _get_pending_store_path(), atomically rename/move
the original file to a temp file (e.g., pending-stores.jsonl.processing using
os.rename/pathlib.Path.rename) so new writers via _queue_store continue
appending to a fresh file, read/process the temp file lines calling store_memory
from _flush_pending_stores, collect failed lines, then append any failed lines
to the current live file (open original in "a" and write failures + newline) and
finally remove the temp file (or unlink with missing_ok); ensure all
rename/read/append operations handle missing file races and exceptions so no
writes are lost.

Comment thread src/brainlayer/mcp/__init__.py Outdated
Comment thread src/brainlayer/scoping.py
…zation, path boundary

- post-commit.py: use Python API instead of nonexistent brainlayer-store CLI
- _store queue: normalize project name before queuing
- scoping.py: add path boundary check to prevent false prefix matches

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@EtanHey EtanHey merged commit e7e491d into main Feb 23, 2026
6 checks passed
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.


# Prohibition/imperative keywords: +2 (once)
if any(kw in lower for kw in _PROHIBITION_KEYWORDS):
score += 2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Substring keyword matching causes false importance score boosts

Medium Severity

_auto_importance uses plain in substring matching (kw in lower) for keyword detection, but short keywords like "api" and "auth" in _ARCH_KEYWORDS match inside common words — "capital", "rapid", "author" — triggering a +3 boost. Similarly, "must" and "never" in _PROHIBITION_KEYWORDS match "mustard" and "whenever" for +2. Since importance_min is used in brain_search to filter results, inflated scores cause irrelevant content to appear in filtered searches. The existing _TYPE_RULES in the same file already demonstrates the correct approach using \b word boundaries with re.search.

Fix in Cursor Fix in Web

rprint(f"[green]Flushed {flushed} item(s).[/]")
remaining = len(lines) - flushed
if remaining > 0:
rprint(f"[yellow]{remaining} item(s) still pending (DB may still be locked).[/]")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flush CLI uses stale count for remaining calculation

Low Severity

The flush command reads pending-stores.jsonl at line 1603 to count items, then _flush_pending_stores reads the same file again independently at its own line 1817. The remaining = len(lines) - flushed calculation at line 1622 mixes counts from two separate reads. The initial read is entirely redundant since _flush_pending_stores handles existence/emptiness checks internally.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant