feat: Phase 5 — search/store hardening#26
Conversation
…ks, store queue) - Project scoping: auto-detect project from CWD via ~/.config/brainlayer/scopes.yaml - Auto-importance: keyword-based heuristic scoring (arch +3, prohibition +2, length +1, file ref +1) - Decision tracking: confidence_score, outcome, reversibility, files_changed fields in brain_store - phase_commits table: commit history with phase/session/project linkage - Post-commit hook: auto-store git commits into BrainLayer (hooks/post-commit.py) - Store queue buffer: DB lock → queue to pending-stores.jsonl, flush on next success - CLI: brainlayer hooks install, brainlayer flush - Scripts: scope-config.py (generate scopes.yaml), phase-boundaries.py (report) - 25 new tests (all pass), 258 total passing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| if remaining: | ||
| path.write_text("\n".join(remaining) + "\n") | ||
| else: | ||
| path.unlink(missing_ok=True) |
There was a problem hiding this comment.
Flush can lose concurrently queued store entries
Low Severity
_flush_pending_stores reads the entire JSONL queue file, processes entries, then rewrites or deletes it. Between the read at line 1813 and the write at line 1844, a concurrent _queue_store call (from another process hitting a DB lock) can append new entries. The subsequent write_text or unlink overwrites those entries, silently losing queued stores.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR introduces Phase 5 features for BrainLayer, adding commit tracking via post-commit hooks, project scoping from YAML configuration, auto-importance scoring for stored items, DB-lock resilience through request queueing, enhanced decision tracking metadata, phase boundary analysis tooling, and comprehensive test coverage across new and existing functionality. Changes
Sequence DiagramsequenceDiagram
participant Client as Client/MCP
participant Store as Brain Store
participant DB as VectorStore/DB
participant Queue as Pending Queue<br/>(JSONL)
participant Embed as Embedding Service
Client->>Store: brain_store request<br/>(content, metadata)
Store->>Store: _auto_importance()<br/>score if not provided
Store->>DB: Attempt insert to chunks<br/>with metadata
alt DB Lock/Timeout
DB-->>Store: Lock Error
Store->>Queue: _queue_store()<br/>(request to JSONL)
Store-->>Client: return<br/>queued=true
else Success
DB-->>Store: Insert OK
Store->>Embed: Send to embedding model<br/>(with metadata fields)
Embed-->>Store: Embedding complete
Store->>Queue: _flush_pending_stores()<br/>process queued items
Queue-->>DB: Retry queued inserts
Store-->>Client: return<br/>queued=false,<br/>flushed count
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/brainlayer/store.py (1)
41-98:⚠️ Potential issue | 🟠 MajorValidate decision metadata before persisting
The new decision fields are documented with ranges/enums, but
store_memoryaccepts any values. That can pollute analytics and break expectations when data is ingested from non‑MCP callers. Add lightweight validation (and optional type checks forfiles_changed).🛠 Suggested fix
if memory_type not in VALID_MEMORY_TYPES: raise ValueError(f"type must be one of: {', '.join(VALID_MEMORY_TYPES)}") + + if confidence_score is not None and not (0.0 <= confidence_score <= 1.0): + raise ValueError("confidence_score must be between 0 and 1") + if outcome is not None and outcome not in {"pending", "validated", "reversed"}: + raise ValueError("outcome must be one of: pending, validated, reversed") + if reversibility is not None and reversibility not in {"easy", "hard", "destructive"}: + raise ValueError("reversibility must be one of: easy, hard, destructive") + if files_changed is not None and not all(isinstance(f, str) for f in files_changed): + raise ValueError("files_changed must be a list of file path strings")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/brainlayer/store.py` around lines 41 - 98, In store_memory, validate the new decision metadata before adding it to meta: check confidence_score is a number between 0 and 1 (raise ValueError otherwise), ensure outcome is one of ("pending","validated","reversed"), ensure reversibility is one of ("easy","hard","destructive"), and ensure files_changed is either None or a list of strings (raise ValueError if any element is not str); perform these checks after clamping importance and before building the meta dict (refer to function store_memory, variables confidence_score, outcome, reversibility, files_changed) and return clear ValueError messages on invalid input so bad values never get persisted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@hooks/post-commit.py`:
- Around line 59-66: The hook calls an executable named "brainlayer-store" that
isn't packaged; change the subprocess command construction in
hooks/post-commit.py (the cmd list used with subprocess.run) to use the
canonical CLI form ["brainlayer", "store", ...] (i.e. replace the single-word
entrypoint with the "brainlayer" binary plus "store" subcommand and keep the
same flags/arguments), or alternatively add a "brainlayer-store" console_script
entry to pyproject.toml so the original cmd resolves; adjust whichever path you
choose and keep the existing subprocess.run usage and error handling.
In `@scripts/phase-boundaries.py`:
- Around line 61-80: The recent-commits block and phase-summary formatting treat
nullable fields and zero-confidence incorrectly: explicitly check for None
rather than using truthiness. In the rows loop (variable rows, avg_conf) change
the conf_display logic to render "N/A" only when avg_conf is None and render
"0.00" when avg_conf == 0.0; in the recent loop (variable recent,
commit_message/msg) guard against msg being None before slicing (e.g., use msg
if msg is not None else "" or assign msg = msg or "" before msg[:50]) and
similarly ensure phase/outcome formatting uses explicit None checks so
slicing/formatting never receives None.
In `@scripts/scope-config.py`:
- Around line 52-55: The YAML key (display_path) is unquoted which can produce
invalid YAML when the path contains ":" or special chars; in the loop that
builds lines (variables: repos, display_path, lines) change the list append to
quote the key and escape any internal quotes so keys become valid YAML — e.g.
compute an escaped_display_path = display_path.replace('"', '\\"') and use
lines.append(f' "{escaped_display_path}": "{name}"') (or alternatively use a
YAML dumper like yaml.safe_dump to emit the mapping) so the generated YAML keys
are always quoted.
- Around line 14-15: Remove the unused imports by deleting the top-level import
statements for os and subprocess in scope-config.py; the file already uses
pathlib.Path for path handling so you should keep the pathlib import (if
present) and any other used imports but remove references to the symbols os and
subprocess to eliminate the unused-import warnings.
- Around line 94-96: The current write step uses args.output.write_text(content)
without an encoding and will silently overwrite existing files; update the logic
around args.output.write_text to (1) explicitly pass encoding="utf-8" when
writing, and (2) detect if args.output.exists() before writing and warn the user
and preserve the prior file (e.g., copy/rename the existing path to a
timestamped backup or .bak) or require an explicit --force flag before
overwriting; adjust the messages printed (the print(f"Wrote {args.output}
({len(repos)} repos)") call) to reflect backup/overwrite behavior and reference
args.output.parent.mkdir, args.output.exists, args.output.write_text, and repos
when implementing these changes.
- Around line 84-86: The script currently prints "No git repos found..." then
returns which yields exit code 0; modify the block under the if not repos: check
to call sys.exit(1) (or raise SystemExit(1)) after printing so the process exits
with a non-zero code; ensure the module imports sys at top if not already
present and update the if not repos: branch in scripts/scope-config.py
accordingly.
In `@src/brainlayer/cli/__init__.py`:
- Around line 1592-1594: Wrap the call to hook_source.chmod(0o755) in a
try/except that catches PermissionError and logs a warning (using rprint or the
existing logger) instead of letting the exception abort the installation; ensure
the symlink creation (target.symlink_to(hook_source.resolve())) remains
unchanged and that rprint still reports success (e.g., keep the
rprint(f"[green]Installed post-commit hook: {target} →
{hook_source.resolve()}[/]") call) so installs in read‑only locations succeed
with a warning about permissions.
- Around line 1544-1594: The hooks command currently hardcodes repo_root /
".git" / "hooks" which breaks for worktrees/submodules; update the hooks()
implementation to call git (subprocess.check_output) with ["git", "rev-parse",
"--git-path", "hooks"] to obtain the correct hooks directory path (fall back to
repo_root / ".git" / "hooks" only if that command fails), use that value for
hooks_dir, and handle errors similarly to the existing git check; also stop
calling hook_source.chmod(0o755) on the package source file—instead either
remove that chmod call or apply permissions to the created symlink
(target.chmod(...)) if you need the hook to be executable.
In `@src/brainlayer/mcp/__init__.py`:
- Around line 1921-1932: The queued item uses the raw project variable instead
of the normalized_project, causing inconsistent project names after flush;
update the call to _queue_store in the DB-locked handler to pass
normalized_project (the same value used on success) for the "project" field,
ensuring normalized_project is available in the scope where _queue_store is
invoked.
- Around line 1787-1848: _flush_pending_stores currently reads and rewrites the
live pending-stores.jsonl which can lose concurrent _queue_store appends; change
to perform an atomic swap: compute the path from _get_pending_store_path(),
atomically rename/move the original file to a temp file (e.g.,
pending-stores.jsonl.processing using os.rename/pathlib.Path.rename) so new
writers via _queue_store continue appending to a fresh file, read/process the
temp file lines calling store_memory from _flush_pending_stores, collect failed
lines, then append any failed lines to the current live file (open original in
"a" and write failures + newline) and finally remove the temp file (or unlink
with missing_ok); ensure all rename/read/append operations handle missing file
races and exceptions so no writes are lost.
In `@src/brainlayer/scoping.py`:
- Around line 95-105: The current prefix check using cwd.startswith(expanded)
can produce false positives; in the loop that builds matches (variables
scope_map, prefix, project, expanded, cwd, matches) replace the startswith logic
with a path-aware check: normalize/resolve both expanded and cwd
(Path(expanded).expanduser().resolve() and Path(cwd).resolve()) and use
Path.is_relative_to(resolved_expanded) (or, if compatibility required, compare
os.path.commonpath([resolved_cwd, resolved_expanded]) == str(resolved_expanded))
so only true directory/subpath relationships are matched before appending to
matches and returning the longest match.
---
Outside diff comments:
In `@src/brainlayer/store.py`:
- Around line 41-98: In store_memory, validate the new decision metadata before
adding it to meta: check confidence_score is a number between 0 and 1 (raise
ValueError otherwise), ensure outcome is one of
("pending","validated","reversed"), ensure reversibility is one of
("easy","hard","destructive"), and ensure files_changed is either None or a list
of strings (raise ValueError if any element is not str); perform these checks
after clamping importance and before building the meta dict (refer to function
store_memory, variables confidence_score, outcome, reversibility, files_changed)
and return clear ValueError messages on invalid input so bad values never get
persisted.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (9)
hooks/post-commit.pyscripts/phase-boundaries.pyscripts/scope-config.pysrc/brainlayer/cli/__init__.pysrc/brainlayer/mcp/__init__.pysrc/brainlayer/scoping.pysrc/brainlayer/store.pysrc/brainlayer/vector_store.pytests/test_phase5.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
- GitHub Check: Cursor Bugbot
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use
ruff check src/for linting andruff format src/for code formatting
Files:
src/brainlayer/scoping.pysrc/brainlayer/store.pysrc/brainlayer/vector_store.pysrc/brainlayer/mcp/__init__.pysrc/brainlayer/cli/__init__.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Run tests using
pytestfrom the project root
Files:
tests/test_phase5.py
src/brainlayer/vector_store.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use sqlite-vec with APSW for vector storage, WAL mode, and
PRAGMA busy_timeout = 5000for concurrent multi-process safety
Files:
src/brainlayer/vector_store.py
src/brainlayer/mcp/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names
Files:
src/brainlayer/mcp/__init__.py
src/brainlayer/cli/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Enable project-specific indexing with
brainlayer index --project <project_name>and incremental indexing withbrainlayer index-fast
Files:
src/brainlayer/cli/__init__.py
🧠 Learnings (2)
📚 Learning: 2026-02-23T16:51:38.300Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.300Z
Learning: Applies to src/brainlayer/mcp/**/*.py : Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-02-23T16:51:38.300Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.300Z
Learning: Applies to src/brainlayer/cli/**/*.py : Enable project-specific indexing with `brainlayer index --project <project_name>` and incremental indexing with `brainlayer index-fast`
Applied to files:
src/brainlayer/cli/__init__.py
🧬 Code graph analysis (4)
tests/test_phase5.py (4)
src/brainlayer/vector_store.py (1)
VectorStore(72-1910)src/brainlayer/mcp/__init__.py (5)
_embed(1565-1566)_embed(1612-1613)_embed(1872-1873)_auto_importance(303-327)_detect_memory_type(285-291)src/brainlayer/scoping.py (3)
_cwd_heuristic(110-124)_parse_scopes_simple(44-76)resolve_project_scope(79-107)src/brainlayer/store.py (1)
store_memory(33-142)
src/brainlayer/mcp/__init__.py (2)
src/brainlayer/scoping.py (1)
resolve_project_scope(79-107)src/brainlayer/store.py (1)
store_memory(33-142)
src/brainlayer/cli/__init__.py (3)
src/brainlayer/embeddings.py (3)
get_embedding_model(109-114)embed_query(87-102)embed_query(128-131)src/brainlayer/mcp/__init__.py (1)
_flush_pending_stores(1802-1848)src/brainlayer/vector_store.py (1)
VectorStore(72-1910)
scripts/phase-boundaries.py (2)
src/brainlayer/vector_store.py (2)
VectorStore(72-1910)close(1901-1904)tests/test_phase5.py (1)
store(24-29)
🔇 Additional comments (11)
src/brainlayer/cli/__init__.py (1)
1597-1629: Flush command integrates cleanly with queue handling
Nice, straightforward flow and clear messaging.src/brainlayer/vector_store.py (1)
322-339: Phase commits schema + indexes look goodSchema matches the new metadata fields and the indices align with planned filters.
tests/test_phase5.py (6)
46-107: Auto‑importance coverage looks solid
Covers baseline, boosts, cap, and regression cases cleanly.
112-143: Auto‑type detection tests are clear and direct
Good signal coverage for todo/mistake/decision/learning and default note.
148-200: Project scoping tests are thorough
Nice coverage of CWD heuristic and config parsing paths.
205-250: phase_commits table tests look good
Schema and basic insert coverage are exactly what’s needed.
255-327: Decision‑field persistence tests are well‑targeted
Verifies metadata round‑trip and schema enums.
332-344: Auto‑importance schema test is concise and effective
Good guard for the inputSchema description.src/brainlayer/mcp/__init__.py (3)
294-327: Auto‑importance heuristic is clear and bounded
Simple scoring with a cap keeps behavior predictable.
631-653: Decision‑field plumbing and auto‑importance integration are consistent
Schema, call‑site wiring, and _store_new all align cleanly.Also applies to: 810-813, 1124-1143
945-951: Auto‑scope fallback is non‑blocking
Good use of best‑effort scoping without failing searches.
| for phase, count, outcomes, avg_conf in rows: | ||
| phase_display = phase or "(unlinked)" | ||
| conf_display = f"{avg_conf:.2f}" if avg_conf else "N/A" | ||
| print(f"{phase_display:<25} {count:>8} {outcomes or 'N/A':<30} {conf_display:>8}") | ||
|
|
||
| # Recent commits | ||
| recent = list( | ||
| cursor.execute( | ||
| f"""SELECT commit_hash, commit_message, phase_name, project, | ||
| outcome, created_at | ||
| FROM phase_commits {where_sql} | ||
| ORDER BY created_at DESC LIMIT 10""", | ||
| params, | ||
| ) | ||
| ) | ||
| if recent: | ||
| print(f"\nRecent commits:") | ||
| for hash_, msg, phase, proj, outcome, ts in recent: | ||
| ts_short = (ts or "")[:19] | ||
| print(f" {hash_[:8]} {msg[:50]:<50} {phase or '':<15} {outcome or '':<10} {ts_short}") |
There was a problem hiding this comment.
Handle NULL commit_message and 0.0 confidence cleanly
commit_message is nullable in the schema; slicing msg[:50] will raise when it’s NULL. Also, avg_conf == 0.0 currently renders as “N/A”. Use explicit None checks.
🛠 Suggested fix
- conf_display = f"{avg_conf:.2f}" if avg_conf else "N/A"
+ conf_display = f"{avg_conf:.2f}" if avg_conf is not None else "N/A"
...
- print(f" {hash_[:8]} {msg[:50]:<50} {phase or '':<15} {outcome or '':<10} {ts_short}")
+ safe_msg = (msg or "")
+ print(f" {hash_[:8]} {safe_msg[:50]:<50} {phase or '':<15} {outcome or '':<10} {ts_short}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/phase-boundaries.py` around lines 61 - 80, The recent-commits block
and phase-summary formatting treat nullable fields and zero-confidence
incorrectly: explicitly check for None rather than using truthiness. In the rows
loop (variable rows, avg_conf) change the conf_display logic to render "N/A"
only when avg_conf is None and render "0.00" when avg_conf == 0.0; in the recent
loop (variable recent, commit_message/msg) guard against msg being None before
slicing (e.g., use msg if msg is not None else "" or assign msg = msg or ""
before msg[:50]) and similarly ensure phase/outcome formatting uses explicit
None checks so slicing/formatting never receives None.
| import os | ||
| import subprocess |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Remove unused imports os and subprocess.
Neither os nor subprocess is referenced anywhere in this file; all path work is done via pathlib.Path.
🧹 Proposed fix
import argparse
-import os
-import subprocess
from pathlib import Path📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import os | |
| import subprocess | |
| import argparse | |
| from pathlib import Path |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/scope-config.py` around lines 14 - 15, Remove the unused imports by
deleting the top-level import statements for os and subprocess in
scope-config.py; the file already uses pathlib.Path for path handling so you
should keep the pathlib import (if present) and any other used imports but
remove references to the symbols os and subprocess to eliminate the
unused-import warnings.
| for path, name in repos: | ||
| # Use ~ shorthand for home directory | ||
| display_path = str(path).replace(str(Path.home()), "~") | ||
| lines.append(f' {display_path}: "{name}"') |
There was a problem hiding this comment.
YAML keys (path strings) must be quoted to avoid invalid output.
An unquoted key that contains : (e.g., a Windows drive letter or an unusual directory name) or starts with a YAML-special character produces structurally invalid YAML that a downstream parser (e.g., scoping.py) will fail to load.
🛡️ Proposed fix
- lines.append(f' {display_path}: "{name}"')
+ lines.append(f' "{display_path}": "{name}"')📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for path, name in repos: | |
| # Use ~ shorthand for home directory | |
| display_path = str(path).replace(str(Path.home()), "~") | |
| lines.append(f' {display_path}: "{name}"') | |
| for path, name in repos: | |
| # Use ~ shorthand for home directory | |
| display_path = str(path).replace(str(Path.home()), "~") | |
| lines.append(f' "{display_path}": "{name}"') |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/scope-config.py` around lines 52 - 55, The YAML key (display_path) is
unquoted which can produce invalid YAML when the path contains ":" or special
chars; in the loop that builds lines (variables: repos, display_path, lines)
change the list append to quote the key and escape any internal quotes so keys
become valid YAML — e.g. compute an escaped_display_path =
display_path.replace('"', '\\"') and use lines.append(f'
"{escaped_display_path}": "{name}"') (or alternatively use a YAML dumper like
yaml.safe_dump to emit the mapping) so the generated YAML keys are always
quoted.
| if not repos: | ||
| print(f"No git repos found in {args.gits_dir}") | ||
| return |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Exit with a non-zero code when no repos are found.
Returning normally yields exit code 0, which makes the "nothing found" case indistinguishable from success in CI pipelines or shell scripts.
🧹 Proposed fix
+import sys
...
if not repos:
print(f"No git repos found in {args.gits_dir}")
- return
+ sys.exit(1)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/scope-config.py` around lines 84 - 86, The script currently prints
"No git repos found..." then returns which yields exit code 0; modify the block
under the if not repos: check to call sys.exit(1) (or raise SystemExit(1)) after
printing so the process exits with a non-zero code; ensure the module imports
sys at top if not already present and update the if not repos: branch in
scripts/scope-config.py accordingly.
| @app.command("hooks") | ||
| def hooks( | ||
| action: str = typer.Argument(..., help="Action: install"), | ||
| ) -> None: | ||
| """Manage BrainLayer git hooks.""" | ||
| if action != "install": | ||
| rprint(f"[red]Unknown action: {action}. Use 'install'.[/]") | ||
| raise typer.Exit(1) | ||
|
|
||
| import subprocess | ||
|
|
||
| # Find repo root | ||
| try: | ||
| repo_root = Path( | ||
| subprocess.check_output( | ||
| ["git", "rev-parse", "--show-toplevel"], stderr=subprocess.DEVNULL | ||
| ) | ||
| .decode() | ||
| .strip() | ||
| ) | ||
| except (subprocess.CalledProcessError, FileNotFoundError): | ||
| rprint("[red]Not in a git repository.[/]") | ||
| raise typer.Exit(1) | ||
|
|
||
| hooks_dir = repo_root / ".git" / "hooks" | ||
| if not hooks_dir.exists(): | ||
| rprint(f"[red]Git hooks directory not found: {hooks_dir}[/]") | ||
| raise typer.Exit(1) | ||
|
|
||
| # Find our post-commit hook | ||
| hook_source = Path(__file__).parent.parent.parent.parent / "hooks" / "post-commit.py" | ||
| if not hook_source.exists(): | ||
| # Try relative to package | ||
| import brainlayer | ||
|
|
||
| pkg_root = Path(brainlayer.__file__).parent.parent.parent | ||
| hook_source = pkg_root / "hooks" / "post-commit.py" | ||
|
|
||
| if not hook_source.exists(): | ||
| rprint("[red]Cannot find hooks/post-commit.py in BrainLayer repo.[/]") | ||
| raise typer.Exit(1) | ||
|
|
||
| target = hooks_dir / "post-commit" | ||
| if target.exists() or target.is_symlink(): | ||
| rprint(f"[yellow]Existing post-commit hook found at {target}[/]") | ||
| rprint("[yellow]Backing up to post-commit.bak[/]") | ||
| target.rename(hooks_dir / "post-commit.bak") | ||
|
|
||
| target.symlink_to(hook_source.resolve()) | ||
| hook_source.chmod(0o755) | ||
| rprint(f"[green]Installed post-commit hook: {target} → {hook_source.resolve()}[/]") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, check the actual file and its context
head -20 src/brainlayer/cli/__init__.py
wc -l src/brainlayer/cli/__init__.pyRepository: EtanHey/brainlayer
Length of output: 515
🏁 Script executed:
# Read the specific lines mentioned in the review
sed -n '1544,1600p' src/brainlayer/cli/__init__.pyRepository: EtanHey/brainlayer
Length of output: 2089
🏁 Script executed:
# Test the git commands to understand the concern about .git/hooks
# Create test scenarios to validate the fix
cd /tmp
mkdir -p test_repo test_worktree
# Test 1: Normal repo
cd test_repo
git init
git rev-parse --is-inside-work-tree
git rev-parse --git-path hooks
ls -ld .git .git/hooks 2>&1 || true
# Test 2: Worktree scenario simulation
cd /tmp
mkdir -p worktree_test
cd worktree_test
git init main_repo
cd main_repo
git rev-parse --git-path hooks
git worktree list 2>&1 || echo "Worktree not available in this environment"Repository: EtanHey/brainlayer
Length of output: 1485
Use git rev-parse --git-path hooks to locate the hooks directory
Hardcoding <repo>/.git/hooks fails in git worktrees and submodules where .git is a file, not a directory. The current code will exit even in valid repos in these configurations. Resolve the hooks path via Git to handle both normal repos and worktrees correctly.
Additionally, hook_source.chmod(0o755) modifies the package source file itself rather than making the symlink executable. Either remove this line (the symlink inherits permissions from its target) or use target.chmod() if executable permissions on the symlink are required.
🛠 Suggested fix for hooks path
- hooks_dir = repo_root / ".git" / "hooks"
+ hooks_path = (
+ subprocess.check_output(["git", "rev-parse", "--git-path", "hooks"], stderr=subprocess.DEVNULL)
+ .decode()
+ .strip()
+ )
+ hooks_dir = Path(hooks_path)
+ if not hooks_dir.is_absolute():
+ hooks_dir = repo_root / hooks_dir🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/brainlayer/cli/__init__.py` around lines 1544 - 1594, The hooks command
currently hardcodes repo_root / ".git" / "hooks" which breaks for
worktrees/submodules; update the hooks() implementation to call git
(subprocess.check_output) with ["git", "rev-parse", "--git-path", "hooks"] to
obtain the correct hooks directory path (fall back to repo_root / ".git" /
"hooks" only if that command fails), use that value for hooks_dir, and handle
errors similarly to the existing git check; also stop calling
hook_source.chmod(0o755) on the package source file—instead either remove that
chmod call or apply permissions to the created symlink (target.chmod(...)) if
you need the hook to be executable.
| target.symlink_to(hook_source.resolve()) | ||
| hook_source.chmod(0o755) | ||
| rprint(f"[green]Installed post-commit hook: {target} → {hook_source.resolve()}[/]") |
There was a problem hiding this comment.
Guard chmod for read‑only installs
hook_source.chmod(0o755) can raise PermissionError when the hook file lives in site‑packages or a read‑only repo, aborting install after the symlink is created. Catch and warn instead of failing.
🛠 Suggested fix
- hook_source.chmod(0o755)
+ try:
+ hook_source.chmod(0o755)
+ except PermissionError:
+ rprint(f"[yellow]Warning:[/] Unable to chmod hook source: {hook_source}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| target.symlink_to(hook_source.resolve()) | |
| hook_source.chmod(0o755) | |
| rprint(f"[green]Installed post-commit hook: {target} → {hook_source.resolve()}[/]") | |
| target.symlink_to(hook_source.resolve()) | |
| try: | |
| hook_source.chmod(0o755) | |
| except PermissionError: | |
| rprint(f"[yellow]Warning:[/] Unable to chmod hook source: {hook_source}") | |
| rprint(f"[green]Installed post-commit hook: {target} → {hook_source.resolve()}[/]") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/brainlayer/cli/__init__.py` around lines 1592 - 1594, Wrap the call to
hook_source.chmod(0o755) in a try/except that catches PermissionError and logs a
warning (using rprint or the existing logger) instead of letting the exception
abort the installation; ensure the symlink creation
(target.symlink_to(hook_source.resolve())) remains unchanged and that rprint
still reports success (e.g., keep the rprint(f"[green]Installed post-commit
hook: {target} → {hook_source.resolve()}[/]") call) so installs in read‑only
locations succeed with a warning about permissions.
| def _get_pending_store_path(): | ||
| """Path for the store queue buffer file.""" | ||
| from ..paths import DEFAULT_DB_PATH | ||
| return DEFAULT_DB_PATH.parent / "pending-stores.jsonl" | ||
|
|
||
|
|
||
| def _queue_store(item: dict) -> None: | ||
| """Buffer a store request to JSONL when DB is locked.""" | ||
| import json as _json | ||
| path = _get_pending_store_path() | ||
| path.parent.mkdir(parents=True, exist_ok=True) | ||
| with open(path, "a") as f: | ||
| f.write(_json.dumps(item) + "\n") | ||
|
|
||
|
|
||
| def _flush_pending_stores(store, embed_fn) -> int: | ||
| """Flush pending-stores.jsonl (FIFO). Returns count flushed.""" | ||
| import json as _json | ||
|
|
||
| from ..store import store_memory | ||
|
|
||
| path = _get_pending_store_path() | ||
| if not path.exists(): | ||
| return 0 | ||
|
|
||
| try: | ||
| lines = path.read_text().strip().splitlines() | ||
| except Exception: | ||
| return 0 | ||
|
|
||
| if not lines: | ||
| return 0 | ||
|
|
||
| flushed = 0 | ||
| remaining = [] | ||
| for line in lines: | ||
| try: | ||
| item = _json.loads(line) | ||
| store_memory( | ||
| store=store, | ||
| embed_fn=embed_fn, | ||
| content=item["content"], | ||
| memory_type=item["memory_type"], | ||
| project=item.get("project"), | ||
| tags=item.get("tags"), | ||
| importance=item.get("importance"), | ||
| confidence_score=item.get("confidence_score"), | ||
| outcome=item.get("outcome"), | ||
| reversibility=item.get("reversibility"), | ||
| files_changed=item.get("files_changed"), | ||
| ) | ||
| flushed += 1 | ||
| except Exception: | ||
| remaining.append(line) | ||
|
|
||
| # Rewrite file with only failed items | ||
| if remaining: | ||
| path.write_text("\n".join(remaining) + "\n") | ||
| else: | ||
| path.unlink(missing_ok=True) | ||
|
|
||
| return flushed |
There was a problem hiding this comment.
Queue flush can drop concurrent writes
_flush_pending_stores reads and rewrites the queue without an atomic swap or lock. If another process appends during the flush, new entries can be lost or overwritten. Use an atomic rename (or file lock) to isolate the batch being processed and then append any remaining failures to the current queue file.
🛠 Suggested fix (atomic swap)
def _flush_pending_stores(store, embed_fn) -> int:
@@
path = _get_pending_store_path()
if not path.exists():
return 0
- try:
- lines = path.read_text().strip().splitlines()
- except Exception:
- return 0
+ tmp_path = path.with_suffix(".jsonl.processing")
+ try:
+ path.rename(tmp_path)
+ except FileNotFoundError:
+ return 0
+
+ try:
+ lines = tmp_path.read_text().strip().splitlines()
+ except Exception:
+ tmp_path.unlink(missing_ok=True)
+ return 0
@@
- if remaining:
- path.write_text("\n".join(remaining) + "\n")
- else:
- path.unlink(missing_ok=True)
+ if remaining:
+ with open(path, "a", encoding="utf-8") as f:
+ f.write("\n".join(remaining) + "\n")
+ tmp_path.unlink(missing_ok=True)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _get_pending_store_path(): | |
| """Path for the store queue buffer file.""" | |
| from ..paths import DEFAULT_DB_PATH | |
| return DEFAULT_DB_PATH.parent / "pending-stores.jsonl" | |
| def _queue_store(item: dict) -> None: | |
| """Buffer a store request to JSONL when DB is locked.""" | |
| import json as _json | |
| path = _get_pending_store_path() | |
| path.parent.mkdir(parents=True, exist_ok=True) | |
| with open(path, "a") as f: | |
| f.write(_json.dumps(item) + "\n") | |
| def _flush_pending_stores(store, embed_fn) -> int: | |
| """Flush pending-stores.jsonl (FIFO). Returns count flushed.""" | |
| import json as _json | |
| from ..store import store_memory | |
| path = _get_pending_store_path() | |
| if not path.exists(): | |
| return 0 | |
| try: | |
| lines = path.read_text().strip().splitlines() | |
| except Exception: | |
| return 0 | |
| if not lines: | |
| return 0 | |
| flushed = 0 | |
| remaining = [] | |
| for line in lines: | |
| try: | |
| item = _json.loads(line) | |
| store_memory( | |
| store=store, | |
| embed_fn=embed_fn, | |
| content=item["content"], | |
| memory_type=item["memory_type"], | |
| project=item.get("project"), | |
| tags=item.get("tags"), | |
| importance=item.get("importance"), | |
| confidence_score=item.get("confidence_score"), | |
| outcome=item.get("outcome"), | |
| reversibility=item.get("reversibility"), | |
| files_changed=item.get("files_changed"), | |
| ) | |
| flushed += 1 | |
| except Exception: | |
| remaining.append(line) | |
| # Rewrite file with only failed items | |
| if remaining: | |
| path.write_text("\n".join(remaining) + "\n") | |
| else: | |
| path.unlink(missing_ok=True) | |
| return flushed | |
| def _flush_pending_stores(store, embed_fn) -> int: | |
| """Flush pending-stores.jsonl (FIFO). Returns count flushed.""" | |
| import json as _json | |
| from ..store import store_memory | |
| path = _get_pending_store_path() | |
| if not path.exists(): | |
| return 0 | |
| tmp_path = path.with_suffix(".jsonl.processing") | |
| try: | |
| path.rename(tmp_path) | |
| except FileNotFoundError: | |
| return 0 | |
| try: | |
| lines = tmp_path.read_text().strip().splitlines() | |
| except Exception: | |
| tmp_path.unlink(missing_ok=True) | |
| return 0 | |
| if not lines: | |
| return 0 | |
| flushed = 0 | |
| remaining = [] | |
| for line in lines: | |
| try: | |
| item = _json.loads(line) | |
| store_memory( | |
| store=store, | |
| embed_fn=embed_fn, | |
| content=item["content"], | |
| memory_type=item["memory_type"], | |
| project=item.get("project"), | |
| tags=item.get("tags"), | |
| importance=item.get("importance"), | |
| confidence_score=item.get("confidence_score"), | |
| outcome=item.get("outcome"), | |
| reversibility=item.get("reversibility"), | |
| files_changed=item.get("files_changed"), | |
| ) | |
| flushed += 1 | |
| except Exception: | |
| remaining.append(line) | |
| # Rewrite file with only failed items | |
| if remaining: | |
| with open(path, "a", encoding="utf-8") as f: | |
| f.write("\n".join(remaining) + "\n") | |
| tmp_path.unlink(missing_ok=True) | |
| return flushed |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/brainlayer/mcp/__init__.py` around lines 1787 - 1848,
_flush_pending_stores currently reads and rewrites the live pending-stores.jsonl
which can lose concurrent _queue_store appends; change to perform an atomic
swap: compute the path from _get_pending_store_path(), atomically rename/move
the original file to a temp file (e.g., pending-stores.jsonl.processing using
os.rename/pathlib.Path.rename) so new writers via _queue_store continue
appending to a fresh file, read/process the temp file lines calling store_memory
from _flush_pending_stores, collect failed lines, then append any failed lines
to the current live file (open original in "a" and write failures + newline) and
finally remove the temp file (or unlink with missing_ok); ensure all
rename/read/append operations handle missing file races and exceptions so no
writes are lost.
…zation, path boundary - post-commit.py: use Python API instead of nonexistent brainlayer-store CLI - _store queue: normalize project name before queuing - scoping.py: add path boundary check to prevent false prefix matches Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
|
|
||
| # Prohibition/imperative keywords: +2 (once) | ||
| if any(kw in lower for kw in _PROHIBITION_KEYWORDS): | ||
| score += 2 |
There was a problem hiding this comment.
Substring keyword matching causes false importance score boosts
Medium Severity
_auto_importance uses plain in substring matching (kw in lower) for keyword detection, but short keywords like "api" and "auth" in _ARCH_KEYWORDS match inside common words — "capital", "rapid", "author" — triggering a +3 boost. Similarly, "must" and "never" in _PROHIBITION_KEYWORDS match "mustard" and "whenever" for +2. Since importance_min is used in brain_search to filter results, inflated scores cause irrelevant content to appear in filtered searches. The existing _TYPE_RULES in the same file already demonstrates the correct approach using \b word boundaries with re.search.
| rprint(f"[green]Flushed {flushed} item(s).[/]") | ||
| remaining = len(lines) - flushed | ||
| if remaining > 0: | ||
| rprint(f"[yellow]{remaining} item(s) still pending (DB may still be locked).[/]") |
There was a problem hiding this comment.
Flush CLI uses stale count for remaining calculation
Low Severity
The flush command reads pending-stores.jsonl at line 1603 to count items, then _flush_pending_stores reads the same file again independently at its own line 1817. The remaining = len(lines) - flushed calculation at line 1622 mixes counts from two separate reads. The initial read is entirely redundant since _flush_pending_stores handles existence/emptiness checks internally.


Summary
~/.config/brainlayer/scopes.yaml— brain_search auto-scopes whenprojectparam is Noneconfidence_score,outcome,reversibility,files_changedfields in brain_store for type=decisionhooks/post-commit.pyauto-stores git commits into BrainLayerpending-stores.jsonl, auto-flush on next successful storebrainlayer hooks install(symlinks hook),brainlayer flush(manual queue drain)scope-config.py(generate scopes.yaml from git repos),phase-boundaries.py(phase commit report)Test plan
tests/test_phase5.py— all pass🤖 Generated with Claude Code
Note
Medium Risk
Touches core MCP search/store paths and adds new persistence/queueing behavior around SQLite locking; failures are mostly best-effort, but regressions could affect write reliability and project scoping.
Overview
Hardens BrainLayer search/store behavior and adds phase/decision tracking.
brain_searchnow auto-scopesprojectfrom CWD via a newscoping.py+scopes.yamlmapping, whilebrain_storegains keyword-based auto-importance whenimportanceis omitted.Extends stored-memory metadata and improves write reliability.
store_memoryand the MCPbrain_storeschema now accept decision-tracking fields (confidence_score,outcome,reversibility,files_changed), and stores that hit SQLite “locked/busy” errors are buffered topending-stores.jsonland auto-flushed on the next successful write (with a newbrainlayer flushCLI command).Adds commit/phase tooling. Introduces a
phase_commitstable in the DB, ahooks/post-commit.pyhook (installable viabrainlayer hooks install) to store commit summaries into BrainLayer, and helper scripts to generatescopes.yamland report phase boundaries; adds Phase 5 tests covering these behaviors.Written by Cursor Bugbot for commit 5917a1d. This will update automatically on new commits. Configure here.
Summary by CodeRabbit
Release Notes