Skip to content

feat: conditional activation for BrainLayer hooks#114

Merged
EtanHey merged 6 commits into
mainfrom
feat/conditional-hook-activation
Mar 27, 2026
Merged

feat: conditional activation for BrainLayer hooks#114
EtanHey merged 6 commits into
mainfrom
feat/conditional-hook-activation

Conversation

@EtanHey
Copy link
Copy Markdown
Owner

@EtanHey EtanHey commented Mar 27, 2026

Summary

  • Add should_activate() gate to all 3 BrainLayer hook scripts (brainlayer-session-start.py, brainlayer-prompt-search.py, brainbar-stop-index.py)
  • Implements CC 2.1.85 conditional hook pattern — reduces unnecessary Python process spawns and token injection for non-interactive/worker sessions
  • Three env var controls: BRAINLAYER_HOOKS_DISABLED=1 (kill switch), CLAUDE_NON_INTERACTIVE=1 (skip in --print mode), BRAINLAYER_HOOKS_LIGHT=1 (cap results at 2 for workers)
  • 17 new tests covering all activation paths and edge cases

Test plan

  • 17 new tests in tests/test_conditional_hooks.py — all pass
  • 287 existing tests pass (1 pre-existing failure in test_digest_connect.py unrelated)
  • Synced to ~/.claude/hooks/ for live use
  • Manual verification: set BRAINLAYER_HOOKS_DISABLED=1 and confirm hooks skip
  • Manual verification: set BRAINLAYER_HOOKS_LIGHT=1 and confirm reduced injection

🤖 Generated with Claude Code

Note

Add conditional activation and light mode to BrainLayer hooks with V2 duplicate detection in digest pipeline

  • Adds should_activate() to brainbar-stop-index.py, brainlayer-prompt-search.py, and brainlayer-session-start.py; hooks exit early when BRAINLAYER_HOOKS_DISABLED=1 or CLAUDE_NON_INTERACTIVE=1.
  • Introduces a light mode (BRAINLAYER_HOOKS_LIGHT=1) that reduces result limits: prompt-search fetches at most 2 results (vs 3/8), session-start fetches at most 2 rows (vs 5).
  • Adds V2 duplicate detection to digest.py gated by BRAINLAYER_DIGEST_V2 env var; uses length-tiered cosine similarity thresholds to find near-duplicate chunks and returns them in digest_content and digest_connect responses.
  • Adds a chunk_events audit table to vector_store.py with record_event and get_chunk_events methods for persisting digest audit history.
  • Behavioral Change: digest_content and digest_connect response shapes now include additional keys (duplicates, related_chunks, expanded stats) which may affect callers expecting exact response structure.

Macroscope summarized b7922eb.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added conditional hook activation controlled via environment variables for flexible pipeline control
    • Introduced light-mode option for reduced processing overhead during search and session operations
    • Implemented deduplication detection with duplicate reporting in the digest pipeline
    • Added audit trail tracking for chunk operations with timestamped event history
  • Improvements

    • Enhanced result limiting with dynamic thresholds based on processing mode

EtanHey and others added 4 commits March 27, 2026 03:06
…ed reports

Adds chunk_events audit table for tracking all chunk lifecycle actions.
Implements length-tiered cosine dedup (0.95/<50tok, 0.90/50-200, 0.88/200+)
with research-validated thresholds. Enhanced structured reports for both
digest and connect modes. Feature flag BRAINLAYER_DIGEST_V2 (default: true).

- chunk_events table: chunk_id, action, timestamp, by_whom, reason
- record_event() + get_chunk_events() on VectorStore
- find_duplicates() with length-aware thresholds
- digest_content: dedup detection + audit trail + enhanced stats
- digest_connect: related_chunks, duplicates, connections_made in response
- Backward compatible: all existing fields preserved
- 47 new tests (10 audit, 12 dedup, 5 reports, 4 compat, 3 flag, 3 MCP, 4 integration, 2 perf, 4 edge)
- Performance: <2s for 1000-word content verified
- 0 regressions in full suite (1183 passed)

Worker: brainlayer-worker-A-R3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ad, test assertion

- HIGH: find_duplicates() now accepts exclude_ids param; digest_content
  passes {chunk_id} to avoid the just-inserted chunk appearing as a dup
- MEDIUM: get_chunk_events() uses _read_cursor() for thread-safe reads
- LOW: test_digest_then_digest_finds_duplicate now asserts duplicates key
  exists rather than silently passing on empty list
- Added test_find_duplicates_excludes_self (48 total tests)

Worker: brainlayer-worker-A-R3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d, consistency

- H2 (both): removed dead _cosine_similarity function + 3 tests
- H2 (Macroscope): record_event uses conn.last_insert_rowid() (APSW idiom)
- H3 (CodeRabbit): added distance metric comment in find_duplicates
- M1 (CodeRabbit): duplicates field now consistently present when V2 enabled
- M4 (both): digest_connect reuses topic_query embedding for dedup (no redundant call)

Worker: brainlayer-worker-A-R3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add should_activate() gate to all 3 BrainLayer hook scripts, reducing
unnecessary process spawns and token usage in non-interactive/worker sessions.

Env vars:
- BRAINLAYER_HOOKS_DISABLED=1: skip all BrainLayer hooks
- CLAUDE_NON_INTERACTIVE=1: skip in --print mode
- BRAINLAYER_HOOKS_LIGHT=1: reduced results (2 instead of 3-8) for workers

17 new tests covering all activation paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 27, 2026

📝 Walkthrough

Walkthrough

This PR introduces conditional hook activation via should_activate() functions that check environment variables to gate hook execution and enable light mode, adds a Digest V2 feature with deduplication logic and audit event tracking to the digest pipeline, extends the vector store with chunk event recording and retrieval methods, and provides comprehensive test coverage for both features.

Changes

Cohort / File(s) Summary
Hook Activation Gates
hooks/brainbar-stop-index.py, hooks/brainlayer-prompt-search.py, hooks/brainlayer-session-start.py
Added should_activate() functions that check BRAINLAYER_HOOKS_DISABLED and CLAUDE_NON_INTERACTIVE environment variables to conditionally skip hook execution. Introduced light_mode flag based on BRAINLAYER_HOOKS_LIGHT that adjusts result limits (hardcoded values replaced with parameterized limits: 2 in light mode, 5 or 8 otherwise).
Digest V2 Feature
src/brainlayer/pipeline/digest.py
Added DIGEST_V2_ENABLED feature flag and length-tiered deduplication logic with find_duplicates(), _estimate_tokens(), and _get_dedup_threshold(). Enhanced digest_content to record audit events and include duplicate detection in response. Extended digest_connect with optional v2 dedup and renamed connections field to related_chunks (with backward-compatible connections alias). Expanded both functions' response stats with new keys: chunks_created, connections_made, duplicates_found, supersedes_proposed.
Vector Store Audit Trail
src/brainlayer/vector_store.py
Added chunk_events SQLite table with indexes on chunk_id, action, and timestamp. Introduced record_event() method to persist audit entries and get_chunk_events() method to retrieve per-chunk event history ordered by newest-first.
Test Coverage
tests/test_conditional_hooks.py
New test module exercising should_activate() gating behavior across three hooks with environment variable combinations: default activation, disabling via BRAINLAYER_HOOKS_DISABLED, non-interactive mode, and light-mode behavior with precedence rules.
Digest V2 Tests
tests/test_digest_pipeline_v2.py
Comprehensive test suite covering audit logging schema/indexing, length-tiered deduplication logic, report structure compatibility, feature flag toggling, MCP routing validation, end-to-end integration flows, and edge cases including token estimation boundaries and duplicate filtering by cosine similarity.

Sequence Diagram(s)

sequenceDiagram
    participant Handler as MCP Handler
    participant Digest as Digest Pipeline
    participant Store as VectorStore
    participant Search as Hybrid Search
    participant Events as Event Log

    Handler->>Digest: digest_content(content, embedding)
    alt DIGEST_V2_ENABLED
        Digest->>Digest: _estimate_tokens(content)
        Digest->>Digest: _get_dedup_threshold(token_count)
        Digest->>Digest: find_duplicates(content, embedding)
        Digest->>Store: hybrid_search(embedding, ...)
        Store->>Search: Execute FTS/similarity query
        Search-->>Store: Return candidates
        Digest->>Digest: Filter by cosine similarity threshold
        Digest->>Store: record_event(chunk_id, "digest_created")
        Store->>Events: Insert audit entry
        Events-->>Store: Return event_id
    end
    Digest-->>Handler: Return result with duplicates + stats
Loading
sequenceDiagram
    participant Handler as MCP Handler
    participant Digest as Digest Pipeline
    participant Store as VectorStore
    participant Search as Hybrid Search

    Handler->>Digest: digest_connect(topic_query, connections, ...)
    Digest->>Digest: Compute connections/contradictions/supersedes
    alt DIGEST_V2_ENABLED
        Digest->>Digest: Create embedding from topic_query
        Digest->>Digest: find_duplicates(embedding=query_embedding)
        Digest->>Store: hybrid_search(embedding, ...)
        Store->>Search: Execute FTS/similarity query
        Search-->>Store: Return candidates
        Digest->>Digest: Filter by cosine similarity threshold
    end
    Digest-->>Handler: Return proposal with related_chunks (alias: connections) + duplicates
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

Possibly related PRs

Poem

🐰 Hops of logic, gates of care,
Light mode dims the search so fair,
Digests duplicate their way,
Audit trails on every day,
Smart dedup hops to play! 🎯

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 74.07% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: conditional activation for BrainLayer hooks' clearly and concisely summarizes the main change: adding conditional activation gating to BrainLayer hook scripts via the should_activate() function and environment variable controls.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/conditional-hook-activation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@EtanHey
Copy link
Copy Markdown
Owner Author

EtanHey commented Mar 27, 2026

Code Review: Conditional Hook Activation (PR #114)

What was done well

  • Clean, consistent pattern across all three hooks -- the should_activate() gate is easy to understand and follows the cheapest-check-first ordering as documented.
  • Good test coverage for the core activation logic: all 17 tests pass, covering disabled, non-interactive, light mode, precedence, and value strictness ("1" only, not "true" or "0").
  • The clean_env autouse fixture is a solid approach for env var isolation in tests.
  • light_mode correctly overrides deep mode in brainlayer-prompt-search.py (line 290: light check comes before the deep/normal branch), which is the right behavior for cost-conscious worker sessions.
  • The result_limit parameterization in session-start SQL queries (switching from hardcoded LIMIT 5 to LIMIT ? with the variable) is clean.

Issues

Important (should fix)

1. Unused imports in tests/test_conditional_hooks.py

sys (line 16) and patch (line 18) are imported but never used. ruff check confirms:

F401 `sys` imported but unused --> tests/test_conditional_hooks.py:16:8
F401 `unittest.mock.patch` imported but unused --> tests/test_conditional_hooks.py:18:27

Remove both. This will fail if ruff is enforced in CI.

2. prompt_search.should_activate(hook_input) accepts hook_input but never uses it

In brainlayer-prompt-search.py line 27, the function signature is def should_activate(hook_input): but the parameter is never referenced in the function body -- only env vars are checked. Compare with session_start.should_activate(hook_input) which actually reads hook_input.get("session_id") at line 39.

Two options:

  • (a) Remove the parameter and call it as should_activate() (like brainbar-stop-index.py does). This is the simpler choice if there is no planned use.
  • (b) Keep it for API consistency with session-start (all three hooks share the same should_activate(hook_input) -> (bool, bool) signature). If keeping it, add a # noqa or a comment explaining the future-proofing intent.

I'd lean toward (a) for clarity unless you anticipate per-prompt activation logic (e.g., skipping for certain prompt patterns).

3. TestLightModeResultLimits tests don't actually test the hooks' result-limiting code

Both test_session_start_light_limit and test_prompt_search_light_limit duplicate the limit-selection logic inline in the test rather than exercising the actual hook code:

result_limit = 2 if light else 5  # This is test-local logic, not from the hook
assert result_limit == 2

These tests verify that should_activate returns light=True, then manually recompute what the limit should be. They don't verify that the hook's main() actually uses result_limit=2 when light_mode is set. The should_activate return value is already tested by the earlier tests in each class -- these two tests are redundant with test_light_mode and test_light_mode_reduces_results.

To make these genuinely valuable, consider either:

  • Mocking the DB connection and calling main() via subprocess with env vars set, asserting the SQL query receives LIMIT 2.
  • Or extracting the limit-computation into a testable function (e.g., get_result_limit(light_mode, deep)) and testing that.
  • Or simply remove these tests since the behavior they claim to verify is already covered.

Suggestions (nice to have)

4. Session-start silently ignores light mode when session_id is empty

In brainlayer-session-start.py lines 39-41:

session_id = hook_input.get("session_id", "")
if not session_id:
    return True, False  # <-- always False for light, even if BRAINLAYER_HOOKS_LIGHT=1

This means if a session starts without a session_id in the payload, BRAINLAYER_HOOKS_LIGHT=1 is silently ignored -- the hook runs in full mode. This is tested by test_empty_input (which asserts light is False), but the intent isn't documented. Is this deliberate (sessions without IDs are rare/manual and should get full context)? A brief comment would help future readers.

5. Other hooks not gated

The PR covers the 3 primary hooks, but 3 more exist in hooks/:

  • brainbar-postcompact.py (records chapter boundaries -- writes to DB)
  • brainbar-prompt-capture.py (captures prompts to pending dir)
  • session-cleanup.py (kills orphaned MCPs, checkpoints WAL)

These could also benefit from BRAINLAYER_HOOKS_DISABLED. Particularly brainbar-postcompact.py and brainbar-prompt-capture.py which write data -- if hooks are disabled, should these still fire? Not a blocker for this PR, but worth a follow-up ticket.

6. Module docstring terminology overlap

brainlayer-prompt-search.py line 7 uses "Light" for the default 3-result mode vs "Deep" for 8-result mode. The new BRAINLAYER_HOOKS_LIGHT env var introduces a different concept of "light" (2 results for worker sessions). This could confuse future readers. Consider updating the docstring to mention three tiers: worker (2), normal (3), deep (8).

7. post-commit.py is also in hooks/

This is probably not a BrainLayer hook in the same sense (it's a git hook), but flagging for completeness.


Summary

The core implementation is correct and well-structured. The activation gate integrates cleanly into the main() flow of each hook. The two must-fix items are the unused imports (will fail ruff) and the unused hook_input parameter in prompt_search.should_activate. The TestLightModeResultLimits class should either be strengthened to test actual hook behavior or removed as redundant. Everything else is clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@EtanHey
Copy link
Copy Markdown
Owner Author

EtanHey commented Mar 27, 2026

Review 1: CodeRabbit CLI (v0.3.11)

2 findings:

1. Docstring inconsistency (potential_issue) — FIXED in a87525b

  • hooks/brainlayer-prompt-search.py docstring said "Light (default): top 3 results" but light mode is opt-in and returns 2 results
  • Updated to reflect three-mode activation: Normal (3), Deep (8), Light (2)

2. Pre-existing: tests/test_digest_pipeline_v2.py:714 — OUT OF SCOPE

  • Conditional duplicate assertion could hide failures
  • Not in this changeset, deferred

Verdict: All in-scope findings addressed.

Comment thread tests/test_conditional_hooks.py Outdated
Comment on lines +33 to +45
if os.environ.get("BRAINLAYER_HOOKS_DISABLED") == "1":
return False, False

if os.environ.get("CLAUDE_NON_INTERACTIVE") == "1":
return False, False

session_id = hook_input.get("session_id", "")
if not session_id:
return True, False

light = os.environ.get("BRAINLAYER_HOOKS_LIGHT") == "1"

return True, light
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium hooks/brainlayer-session-start.py:33

In should_activate, when session_id is empty, the function returns (True, False) on line 41, bypassing the BRAINLAYER_HOOKS_LIGHT check on line 43. This means if BRAINLAYER_HOOKS_LIGHT=1 is set but session_id is empty/missing, light_mode returns False instead of True, causing the query to use LIMIT 5 instead of the intended LIMIT 2. Consider checking BRAINLAYER_HOOKS_LIGHT before the empty session_id early return.

    if os.environ.get("BRAINLAYER_HOOKS_DISABLED") == "1":
         return False, False
 
     if os.environ.get("CLAUDE_NON_INTERACTIVE") == "1":
         return False, False
 
+    light = os.environ.get("BRAINLAYER_HOOKS_LIGHT") == "1"
+
     session_id = hook_input.get("session_id", "")
     if not session_id:
-        return True, False
-
-    light = os.environ.get("BRAINLAYER_HOOKS_LIGHT") == "1"
+        return True, light
 
     return True, light
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file hooks/brainlayer-session-start.py around lines 33-45:

In `should_activate`, when `session_id` is empty, the function returns `(True, False)` on line 41, bypassing the `BRAINLAYER_HOOKS_LIGHT` check on line 43. This means if `BRAINLAYER_HOOKS_LIGHT=1` is set but `session_id` is empty/missing, `light_mode` returns `False` instead of `True`, causing the query to use `LIMIT 5` instead of the intended `LIMIT 2`. Consider checking `BRAINLAYER_HOOKS_LIGHT` before the empty `session_id` early return.

Evidence trail:
hooks/brainlayer-session-start.py lines 39-45 (REVIEWED_COMMIT): early return at line 41 `return True, False` when `session_id` is empty bypasses the `BRAINLAYER_HOOKS_LIGHT` check at line 43.
hooks/brainlayer-session-start.py line 238 (REVIEWED_COMMIT): `result_limit = 2 if light_mode else 5` confirms the consequence - light_mode=False causes LIMIT 5 instead of LIMIT 2.
hooks/brainlayer-session-start.py lines 28-29 (REVIEWED_COMMIT): docstring states `BRAINLAYER_HOOKS_LIGHT=1 → reduce to 2 results (overnight workers)` confirming the intended behavior.

reason: Optional[str] = None,
) -> int:
"""Record an audit event for a chunk. Returns the event row ID."""
cursor = self.conn.cursor()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Low brainlayer/vector_store.py:809

record_event uses the shared self.conn without synchronization, so concurrent calls from multiple threads can race between INSERT and last_insert_rowid(). If thread A inserts and thread B inserts before A reads the row ID, last_insert_rowid() returns B's ID instead of A's. Consider adding a threading lock around the INSERT + last_insert_rowid() pair, or using the RETURNING clause to fetch the ID atomically.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file src/brainlayer/vector_store.py around line 809:

`record_event` uses the shared `self.conn` without synchronization, so concurrent calls from multiple threads can race between `INSERT` and `last_insert_rowid()`. If thread A inserts and thread B inserts before A reads the row ID, `last_insert_rowid()` returns B's ID instead of A's. Consider adding a threading lock around the INSERT + last_insert_rowid() pair, or using the RETURNING clause to fetch the ID atomically.

Evidence trail:
src/brainlayer/vector_store.py lines 801-815 (record_event method at REVIEWED_COMMIT); src/brainlayer/vector_store.py line 101 (self.conn = apsw.Connection); git_grep for threading locks returned no matches; APSW docs at https://github.com/rogerbinns/apsw/blob/master/src/cursor.c confirm cursors on same connection are not isolated

- Remove unused sys/patch imports from test file (ruff F401)
- Remove unused hook_input param from prompt_search.should_activate()
- Remove redundant TestLightModeResultLimits (covered by earlier tests)
- Add comment explaining session_id empty case intent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
hooks/brainbar-stop-index.py (1)

29-34: 🧹 Nitpick | 🔵 Trivial

Consider using paths.py:get_db_path() for database path resolution.

As per coding guidelines, scripts should use paths.py:get_db_path() instead of hardcoding paths. This hook defines its own get_db_path() function with hardcoded path logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/brainbar-stop-index.py` around lines 29 - 34, This hook defines a local
get_db_path() with hardcoded logic; replace its use with the canonical
paths.py:get_db_path() instead: remove or stop calling the local get_db_path(),
import the get_db_path symbol from paths (e.g., from paths import get_db_path)
and use that function wherever the DB path is needed (ensure any callers in this
module reference the imported get_db_path and not the local implementation).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hooks/brainlayer-prompt-search.py`:
- Around line 27-45: The function should_activate currently declares an unused
parameter hook_input; remove hook_input from should_activate's signature and
update all callers to invoke should_activate() with no arguments, or
alternatively keep the parameter but explicitly document its reserved status
(e.g., add a comment) or use it (e.g., read hook_input.get("session_id"))—pick
one approach and make the corresponding change to the should_activate definition
and every call site that passes hook_input so the signature and usages stay
consistent.

In `@hooks/brainlayer-session-start.py`:
- Around line 39-45: The function currently returns (True, False) when
session_id is missing, skipping the BRAINLAYER_HOOKS_LIGHT check; compute the
light flag from os.environ.get("BRAINLAYER_HOOKS_LIGHT") == "1" unconditionally
(before the session_id early-return) and return (True, light) when session_id is
absent so light mode applies regardless of session presence; update the logic
around the session_id/hook_input handling to reference the session_id variable
and the BRAINLAYER_HOOKS_LIGHT check (no other changes needed).

In `@src/brainlayer/pipeline/digest.py`:
- Around line 714-725: In digest_connect when DIGEST_V2_ENABLED is true you call
find_duplicates(content=content, embedding=embed_fn(topic_query)) which
mismatches the embedding source (topic_query) and the content used to compute
the dedup threshold; update the code so the embedding and token-count threshold
are aligned — either compute embedding_for_dedup = embed_fn(content) and pass
that to find_duplicates, or if you must use embed_fn(topic_query) then compute
tokens/threshold from topic_query instead of content; adjust calls to embed_fn,
find_duplicates, and any token-count logic accordingly (symbols: digest_connect,
find_duplicates, embed_fn, topic_query, content, DIGEST_V2_ENABLED).

In `@tests/test_conditional_hooks.py`:
- Around line 155-173: The tests currently only re-compute limits from the
returned light flag (calling should_activate on session_start and prompt_search)
which verifies the test logic, not the hook behavior; update the tests to assert
the hook's actual limit usage by either accessing the hook's internal
result_limit/base_limit attributes or by mocking the DB call and asserting the
executed query's LIMIT parameter; specifically change
TestLightModeResultLimits.test_session_start_light_limit and
test_prompt_search_light_limit to inspect session_start.result_limit or
session_start.main (or equivalent internal method) and prompt_search.base_limit,
or patch the DB client used by the hook to capture the SQL/parameters and assert
LIMIT equals 2 when should_activate returns light=True.

In `@tests/test_digest_pipeline_v2.py`:
- Around line 622-623: The test defines mock_faceted as a nested function while
mock_embed is a MagicMock; replace the nested def by either assigning
mock_faceted = lambda **kw: <appropriate return> or convert both mocks into
fixtures (e.g., a pytest fixture that returns MagicMock(...) for mock_faceted
and mock_embed) so they are consistent; update any references in the test to use
the new lambda or fixture name (mock_faceted / mock_embed) and ensure the return
shape matches the original behavior.

---

Outside diff comments:
In `@hooks/brainbar-stop-index.py`:
- Around line 29-34: This hook defines a local get_db_path() with hardcoded
logic; replace its use with the canonical paths.py:get_db_path() instead: remove
or stop calling the local get_db_path(), import the get_db_path symbol from
paths (e.g., from paths import get_db_path) and use that function wherever the
DB path is needed (ensure any callers in this module reference the imported
get_db_path and not the local implementation).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d2be79b6-69cc-4c4e-9391-d7bff338e1cf

📥 Commits

Reviewing files that changed from the base of the PR and between 52f9c4a and b31ab21.

📒 Files selected for processing (7)
  • hooks/brainbar-stop-index.py
  • hooks/brainlayer-prompt-search.py
  • hooks/brainlayer-session-start.py
  • src/brainlayer/pipeline/digest.py
  • src/brainlayer/vector_store.py
  • tests/test_conditional_hooks.py
  • tests/test_digest_pipeline_v2.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Macroscope - Correctness Check
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.13)
  • GitHub Check: test (3.12)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

**/*.py: All scripts and CLI must use paths.py:get_db_path() for database path resolution instead of hardcoding paths
Preserve verbatim content for message types: ai_code, stack_trace, user_message during classification and chunking
Skip noise content entirely; summarize build_log; extract structure only from dir_listing during chunking
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
Implement retry logic on SQLITE_BUSY errors; each worker must use its own database connection
Override enrichment backend via BRAINLAYER_ENRICH_BACKEND environment variable (valid values: ollama, mlx, groq); default to Groq
Configure enrichment rate via BRAINLAYER_ENRICH_RATE environment variable (default: 0.2 = 12 RPM)
Checkpoint WAL before and after bulk database operations: PRAGMA wal_checkpoint(FULL)
Drop FTS triggers before bulk deletes from chunks table; recreate after operation to avoid performance degradation
Batch deletes in 5-10K chunk sizes with WAL checkpoint every 3 batches
Default search queries must exclude lifecycle-managed chunks; use include_archived=True parameter to show history
Lint and format code with ruff check src/ && ruff format src/
Session dedup coordination: SessionStart writes injected chunk_ids to /tmp/brainlayer_session_{id}.json; UserPromptSubmit skips already-injected chunks
Skip auto-search for prompts containing 'handoff' or 'session-handoff' keywords

Files:

  • hooks/brainbar-stop-index.py
  • hooks/brainlayer-prompt-search.py
  • hooks/brainlayer-session-start.py
  • src/brainlayer/vector_store.py
  • tests/test_conditional_hooks.py
  • tests/test_digest_pipeline_v2.py
  • src/brainlayer/pipeline/digest.py
**/*test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use pytest for testing

Files:

  • tests/test_conditional_hooks.py
  • tests/test_digest_pipeline_v2.py
🧠 Learnings (1)
📚 Learning: 2026-03-14T02:20:54.656Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-14T02:20:54.656Z
Learning: Applies to **/*.py : Run pytest before claiming behavior changed safely; current test suite has 929 tests

Applied to files:

  • tests/test_conditional_hooks.py
🔇 Additional comments (15)
hooks/brainbar-stop-index.py (1)

37-48: LGTM!

The should_activate() gate correctly implements the conditional activation pattern. The early return in main() prevents unnecessary stdin parsing and DB operations when hooks are disabled or in non-interactive mode.

hooks/brainlayer-prompt-search.py (1)

289-294: LGTM!

Light mode correctly caps results to 2 for worker sessions, reducing token cost while maintaining the existing logic structure for deep vs auto mode.

hooks/brainlayer-session-start.py (1)

238-238: LGTM!

Good change to parameterize the LIMIT clause instead of hardcoding. The result_limit calculation correctly selects 2 for light mode and 5 otherwise, and is properly passed to both the scoped and unscoped queries.

Also applies to: 258-260, 270-272

src/brainlayer/vector_store.py (3)

438-451: LGTM!

The chunk_events audit table schema is well-designed with:

  • Appropriate columns for audit trail
  • Server-side DEFAULT for timestamp
  • Indexes on the three most likely query patterns (chunk_id, action, timestamp)

801-814: LGTM!

record_event() correctly uses the write connection and returns the row ID via APSW's last_insert_rowid(). The implementation relies on APSW's default autocommit behavior for single statements, which is appropriate for audit logging.


816-840: LGTM!

get_chunk_events() correctly uses the per-thread readonly cursor (_read_cursor()), ensuring thread safety for concurrent read operations. The ORDER BY id DESC provides consistent newest-first ordering.

tests/test_conditional_hooks.py (2)

51-65: LGTM!

The clean_env fixture correctly implements save/restore semantics for environment variables, ensuring test isolation. The autouse=True ensures it runs for every test.


68-104: LGTM!

Good coverage of should_activate() behavior for session-start hook, including precedence rules and empty input handling.

src/brainlayer/pipeline/digest.py (3)

29-30: Feature flag defaults to enabled — ensure downstream consumers are ready.

DIGEST_V2_ENABLED defaults to True, meaning all existing callers will immediately get the new duplicates field and stats.duplicates_found in their responses. Per context snippets 2 and 3, the MCP handler and daemon return digest results directly to clients.

Verify that API consumers can handle the new fields without breaking.


749-750: Good backward compatibility pattern.

Including both related_chunks (new canonical name) and connections (backward compat alias) ensures existing consumers won't break while allowing migration to the new field name.


88-106: No action required. The code correctly accesses nested lists from hybrid_search: the function wraps flat lists (ids, documents, distances) in outer lists ([ids], [documents], [distances]), so results["ids"][0] correctly accesses the inner flat list of chunk IDs. The distance-to-score conversion (score = 1 - dist) is also correct for cosine distance ranges (0=identical, 2=opposite).

tests/test_digest_pipeline_v2.py (4)

1-46: LGTM!

Well-structured test fixtures covering real DB (tmp_store), mocks with configurable results (mock_store_with_results), and deterministic embeddings (mock_embed). The mock_store_with_results correctly constructs nested list format matching the expected hybrid_search return structure.


128-226: LGTM!

Thorough coverage of the chunk_events audit table including:

  • Schema validation (table exists, columns, indexes)
  • record_event behavior (returns ID, minimal args)
  • get_chunk_events behavior (ordering, limits, isolation)

521-570: LGTM!

Feature flag tests correctly verify that V2 behavior (dedup, audit events) is gated behind DIGEST_V2_ENABLED, with proper save/restore of the flag value.


693-720: Good real-world integration test.

This test validates the end-to-end duplicate detection by digesting similar content twice and verifying the second digest finds the first as a duplicate. The assertion at line 718-720 correctly handles the case where duplicates may or may not be found depending on similarity thresholds.

Comment thread hooks/brainlayer-prompt-search.py Outdated
Comment on lines +39 to +45
session_id = hook_input.get("session_id", "")
if not session_id:
return True, False

light = os.environ.get("BRAINLAYER_HOOKS_LIGHT") == "1"

return True, light
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Light mode is skipped when session_id is missing.

When hook_input lacks a session_id, the function returns (True, False) at line 41, bypassing the BRAINLAYER_HOOKS_LIGHT check. This means workers without a session ID won't get reduced results even if BRAINLAYER_HOOKS_LIGHT=1.

Is this intentional? If light mode should apply regardless of session presence, consider:

🔧 Suggested fix to check light mode unconditionally
-    session_id = hook_input.get("session_id", "")
-    if not session_id:
-        return True, False
-
     light = os.environ.get("BRAINLAYER_HOOKS_LIGHT") == "1"
-
     return True, light
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/brainlayer-session-start.py` around lines 39 - 45, The function
currently returns (True, False) when session_id is missing, skipping the
BRAINLAYER_HOOKS_LIGHT check; compute the light flag from
os.environ.get("BRAINLAYER_HOOKS_LIGHT") == "1" unconditionally (before the
session_id early-return) and return (True, light) when session_id is absent so
light mode applies regardless of session presence; update the logic around the
session_id/hook_input handling to reference the session_id variable and the
BRAINLAYER_HOOKS_LIGHT check (no other changes needed).

Comment on lines +714 to +725
# Step 6: V2 dedup detection — reuses embeddings from step 3 searches
duplicates: List[Dict[str, Any]] = []
if DIGEST_V2_ENABLED:
# Use topic_query embedding if available (already computed in step 3),
# otherwise compute a fresh one. Avoids redundant embed_fn(content) call.
dedup_embedding = embed_fn(topic_query)
duplicates = find_duplicates(
content=content,
embedding=dedup_embedding,
store=store,
project=project,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

find_duplicates uses topic_query embedding, not content embedding.

In digest_connect, find_duplicates is called with content=content but embedding=embed_fn(topic_query). This means the dedup threshold is computed from content's token count, but the similarity search uses topic_query's embedding.

This mismatch could cause unexpected results — short topic_query with long content would use a strict threshold (0.95) for a potentially different semantic space.

🔧 Consider aligning content and embedding
     if DIGEST_V2_ENABLED:
-        dedup_embedding = embed_fn(topic_query)
+        # Use content embedding for consistency with token-based threshold
+        dedup_embedding = embed_fn(content[:500])  # or reuse existing embedding
         duplicates = find_duplicates(
             content=content,
             embedding=dedup_embedding,

Or compute topic_query's tokens for threshold selection.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Step 6: V2 dedup detection — reuses embeddings from step 3 searches
duplicates: List[Dict[str, Any]] = []
if DIGEST_V2_ENABLED:
# Use topic_query embedding if available (already computed in step 3),
# otherwise compute a fresh one. Avoids redundant embed_fn(content) call.
dedup_embedding = embed_fn(topic_query)
duplicates = find_duplicates(
content=content,
embedding=dedup_embedding,
store=store,
project=project,
)
# Step 6: V2 dedup detection — reuses embeddings from step 3 searches
duplicates: List[Dict[str, Any]] = []
if DIGEST_V2_ENABLED:
# Use content embedding for consistency with token-based threshold
dedup_embedding = embed_fn(content[:500]) # or reuse existing embedding
duplicates = find_duplicates(
content=content,
embedding=dedup_embedding,
store=store,
project=project,
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/pipeline/digest.py` around lines 714 - 725, In digest_connect
when DIGEST_V2_ENABLED is true you call find_duplicates(content=content,
embedding=embed_fn(topic_query)) which mismatches the embedding source
(topic_query) and the content used to compute the dedup threshold; update the
code so the embedding and token-count threshold are aligned — either compute
embedding_for_dedup = embed_fn(content) and pass that to find_duplicates, or if
you must use embed_fn(topic_query) then compute tokens/threshold from
topic_query instead of content; adjust calls to embed_fn, find_duplicates, and
any token-count logic accordingly (symbols: digest_connect, find_duplicates,
embed_fn, topic_query, content, DIGEST_V2_ENABLED).

Comment thread tests/test_conditional_hooks.py Outdated
Comment on lines +155 to +173
class TestLightModeResultLimits:
"""Verify that light mode actually reduces the result limit in queries."""

def test_session_start_light_limit(self, session_start):
"""In light mode, result_limit should be 2 instead of 5."""
os.environ["BRAINLAYER_HOOKS_LIGHT"] = "1"
hook_input = {"session_id": "abc123"}
_, light = session_start.should_activate(hook_input)
assert light is True
result_limit = 2 if light else 5
assert result_limit == 2

def test_prompt_search_light_limit(self, prompt_search):
"""In light mode, base_limit should be 2 instead of 3/8."""
os.environ["BRAINLAYER_HOOKS_LIGHT"] = "1"
hook_input = {"session_id": "abc123"}
_, light = prompt_search.should_activate(hook_input)
assert light is True
base_limit = 2 if light else 3
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Tests verify local logic, not hook implementation.

These tests call should_activate() to get the light flag, then apply their own 2 if light else 5 formula. This verifies the test's own calculation, not whether the hook actually uses these limits.

To truly validate light mode behavior, consider testing the hooks' internal result_limit or base_limit variables, or mocking the DB and verifying the actual SQL LIMIT parameter.

💡 Alternative: Test actual limit logic

For session_start, you could test the result_limit calculation:

def test_session_start_light_limit_actual(self, session_start):
    """Verify light mode actually changes the result_limit logic."""
    os.environ["BRAINLAYER_HOOKS_LIGHT"] = "1"
    hook_input = {"session_id": "abc123"}
    _, light = session_start.should_activate(hook_input)
    
    # Test the actual formula used in main()
    result_limit = 2 if light else 5
    assert result_limit == 2
    
    # Or better: patch the DB and inspect the query parameters
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_conditional_hooks.py` around lines 155 - 173, The tests currently
only re-compute limits from the returned light flag (calling should_activate on
session_start and prompt_search) which verifies the test logic, not the hook
behavior; update the tests to assert the hook's actual limit usage by either
accessing the hook's internal result_limit/base_limit attributes or by mocking
the DB call and asserting the executed query's LIMIT parameter; specifically
change TestLightModeResultLimits.test_session_start_light_limit and
test_prompt_search_light_limit to inspect session_start.result_limit or
session_start.main (or equivalent internal method) and prompt_search.base_limit,
or patch the DB client used by the hook to capture the SQL/parameters and assert
LIMIT equals 2 when should_activate returns light=True.

Comment on lines +622 to +623
mock_embed = MagicMock(return_value=[0.05] * 1024)
def mock_faceted(**kw):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Minor: Function defined inside test method.

mock_faceted is defined as a regular function inside the test. Consider using a lambda or moving to a fixture for consistency with mock_embed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_digest_pipeline_v2.py` around lines 622 - 623, The test defines
mock_faceted as a nested function while mock_embed is a MagicMock; replace the
nested def by either assigning mock_faceted = lambda **kw: <appropriate return>
or convert both mocks into fixtures (e.g., a pytest fixture that returns
MagicMock(...) for mock_faceted and mock_embed) so they are consistent; update
any references in the test to use the new lambda or fixture name (mock_faceted /
mock_embed) and ensure the return shape matches the original behavior.

@EtanHey EtanHey merged commit 4984bbe into main Mar 27, 2026
2 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant