feat: R81b meta-noise filter + enrichment prompt additions by EtanHey · Pull Request #225 · EtanHey/brainlayer

EtanHey · 2026-04-08T09:43:48Z

Summary

- Search-time filter using META_NOISE_PATTERNS to exclude literal MCP/meta-noise chunk content from hybrid_search by default, with explicit opt-out support
- Enrichment meta-research detection for literal brain_search(...) / brain_entity(...) tool invocations, setting low importance and adding the meta-research tag
- Conversational chunk guidance so short/informal chunks extract actionable items and commitments instead of discussion flow

Tests

8 hybrid_search + 38 enrichment = 46 pass

Note

Add meta-noise filter to `hybrid_search` and extend enrichment prompts with meta-research detection

Adds filter_meta_noise: bool = True to SearchMixin.hybrid_search in search_repo.py, excluding chunks matching tool-transcript and QA-table patterns via SQL predicates and a post-filter; callers can opt out with filter_meta_noise=False.
Extends ENRICHMENT_PROMPT in enrichment.py with rubrics for meta-research detection, short/conversational chunks, epistemic_level, debt_impact, and sentiment_label.
Entity persistence in enrichment_controller.py now upserts kg_entities (refreshing updated_at on conflict) and inserts entity-to-chunk links into kg_entity_chunks, ignoring duplicates; failures are logged and do not interrupt enrichment.
Adds one-time per-process stderr signature emission in both build_prompt/build_external_prompt and _emit_enrichment_start (realtime mode); write failures are swallowed and logged at debug.
Behavioral Change: hybrid_search excludes meta-noise chunks by default, which may reduce result counts for callers that previously received tool-transcript content.
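The post-filter half of this change can be sketched as below. This is a minimal illustration, assuming meta-noise is detected by case-insensitive substring matching; the pattern strings and the `post_filter` helper are hypothetical, since the actual `META_NOISE_PATTERNS` values in search_repo.py are not shown in this PR description.

```python
# Hypothetical patterns; the real META_NOISE_PATTERNS in search_repo.py may differ.
META_NOISE_PATTERNS = ("brain_search(", "brain_entity(")
_CASEFOLDED_PATTERNS = tuple(p.casefold() for p in META_NOISE_PATTERNS)


def contains_meta_noise(content: str) -> bool:
    """Return True if the chunk text contains any pattern, ignoring case."""
    folded = content.casefold()
    return any(pattern in folded for pattern in _CASEFOLDED_PATTERNS)


def post_filter(chunks: list[str], filter_meta_noise: bool = True) -> list[str]:
    """Drop meta-noise chunks unless the caller opts out (hypothetical helper)."""
    if not filter_meta_noise:
        return list(chunks)
    return [chunk for chunk in chunks if not contains_meta_noise(chunk)]
```

With the default, a chunk such as `Brain_Search('x') returned 3 rows` would be dropped; passing `filter_meta_noise=False` returns every candidate unchanged, which mirrors the opt-out described above.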

Macroscope summarized bcd32e0.

Summary by CodeRabbit

New Features
- Search now filters out "meta-noise" by default to improve relevance, with an option to disable this behavior.
- Enrichment prompts updated with clearer meta-research detection, guidance for short conversational chunks, and expanded evaluation rubrics (epistemic, debt, sentiment).
Bug Fixes
- Enrichment startup marker emission made robust so telemetry writes won't interrupt realtime processing.
Tests
- New and expanded tests cover meta-noise filtering, prompt content, concurrency, and error-handling.


coderabbitai · 2026-04-08T09:44:01Z

📝 Walkthrough


Adds process-level stderr diagnostic markers for enrichment runtime and prompt loading, strengthens prompt instructions for meta-research and chunking, implements optional meta-noise filtering in hybrid search, and adds tests covering prompt content, concurrency for prompt signature emission, and meta-noise filtering behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Enrichment Diagnostics `src/brainlayer/enrichment_controller.py` | Emit a realtime-only `ENRICHMENT_RUNTIME_LOADED` marker to stderr at enrichment start (swallows OSError and logs debug); KG entity upsert conflict target changed to `ON CONFLICT(id)` while preserving `updated_at` update. |
| Enrichment Prompt & Emission `src/brainlayer/pipeline/enrichment.py` | Add process-wide, thread-safe `_emit_prompt_signature_once()` that writes `ENRICHMENT_PROMPT_LOADED` to stderr once per process (swallows OSError and logs debug). Expand `ENRICHMENT_PROMPT` with meta-research detection, tool-invocation handling, short conversational chunk guidance, and expanded epistemic/debt/sentiment rubrics; call emitter from prompt builders. |
| Search Meta-Noise Filtering `src/brainlayer/search_repo.py` | Add `filter_meta_noise: bool = True` parameter to `hybrid_search()` and incorporate it into cache key. Define `META_NOISE_PATTERNS` and casefolded variants; add FTS5 `NOT LIKE` constraints when enabled and a post-retrieval `_contains_meta_noise()` guard to skip matching candidates. |
| Tests: Enrichment Prompt & Concurrency `tests/test_enrichment_v2.py`, `tests/test_enrichment_controller.py`, `tests/test_enrichment_entity_schema.py` | Add concurrency tests asserting `_emit_prompt_signature_once()` emits exactly once across threads and swallows OSError while logging; add test ensuring `_emit_enrichment_start()` swallows os.write errors and still emits the Axiom start event; expand prompt content assertions for meta-research, chunking, and rubric retention. |
| Tests: Hybrid Search `tests/test_hybrid_search.py` | Add integration tests that seed meta-noise and real chunks, asserting default `hybrid_search()` filters meta-noise case-insensitively and that disabling `filter_meta_noise` returns meta-noise hits. |
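The SQL-level side of the filter can be illustrated with a small sketch. It assumes a plain `chunks` table with a `content` column and two made-up patterns; the PR applies its `NOT LIKE` constraints inside the hybrid search query, which is not reproduced here. Because `_` and `%` are LIKE wildcards, the hypothetical `like_escape` helper escapes them so the patterns match literally.

```python
import sqlite3

# Hypothetical patterns; the real META_NOISE_PATTERNS may differ.
META_NOISE_PATTERNS = ("brain_search(", "brain_entity(")


def like_escape(text: str) -> str:
    """Escape LIKE wildcards so the text is matched literally."""
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def meta_noise_predicates(column: str = "content") -> tuple[str, list[str]]:
    """Build 'LOWER(col) NOT LIKE ?' clauses plus their bound parameters."""
    clauses = [f"LOWER({column}) NOT LIKE ? ESCAPE '\\'" for _ in META_NOISE_PATTERNS]
    params = [f"%{like_escape(p.lower())}%" for p in META_NOISE_PATTERNS]
    return " AND ".join(clauses), params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO chunks (content) VALUES (?)",
    [("Brain_Search('auth') -> 3 hits",), ("Decided to ship on Friday",)],
)

where, params = meta_noise_predicates()
rows = conn.execute(f"SELECT content FROM chunks WHERE {where}", params).fetchall()
```

Lowercasing both sides keeps the SQL predicate consistent with a casefolded Python post-filter, so a mixed-case variant like `Brain_Search(` is excluded at both layers.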

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: enrichment prompt v2 — fix 73.7% meta-description pattern (R81) #224 — Overlaps enrichment controller KG upsert and enrichment runtime/prompt emission changes; likely code-level overlap.
feat: add entities array to Gemini enrichment schema #215 — Modifies ENRICHMENT_PROMPT content and related tests; strong overlap with prompt text and assertions.

Poem

🐇 I hop and nudge the stdout sky,
A tiny marker scrawled nearby,
Prompts grow wiser, noise takes flight,
Threads whisper once — then sleep at night.
Hooray — the search is tidy and spry!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main changes: addition of a meta-noise filter and enrichment prompt enhancements, both of which are central to this PR. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/brainlayer/enrichment_controller.py`:
- Around line 463-467: The direct os.write() of the "ENRICHMENT_RUNTIME_LOADED"
marker in enrich_realtime() must be made best-effort: wrap the os.write(...)
call in a try/except OSError block and catch/log the exception at debug level
(include the exception message and that the stderr write failed) so an OSError
on FD 2 won’t abort enrich_realtime(); apply the same guard pattern around the
analogous os.write(...) call in the pipeline enrichment module (the call
referenced at enrichment.py:409) so both emitters are fault-tolerant.

In `@src/brainlayer/pipeline/enrichment.py`:
- Around line 403-412: Protect the unsynchronized check/set in
_emit_prompt_signature_once by introducing a module-level threading.Lock
(similar to existing _groq_rate_lock), acquire it before checking
_prompt_signature_emitted, perform a double-check inside the lock, set
_prompt_signature_emitted=True and call os.write() while still holding the lock,
and add try/except around os.write() to catch and log/ignore OSError so the call
won’t crash; update the same pattern at the other emission sites referenced by
the comment (the calls around lines where build_prompt/build_external_prompt are
invoked) so all prompt-signature emissions use the new locked, exception-safe
logic.

In `@src/brainlayer/search_repo.py`:
- Around line 107-110: The post-filter _contains_meta_noise currently does a
case-sensitive substring check against META_NOISE_PATTERNS so variants like
"Brain_Search" slip through; update _contains_meta_noise to normalize comparison
(e.g., use content.casefold() and compare against a pre-normalized, casefolded
META_NOISE_PATTERNS or use re.search with re.IGNORECASE) and also make the
SQL-level filter case-insensitive by switching pattern matching to ILIKE or
wrapping fields with LOWER(...) and comparing to lowercased patterns; ensure any
other places that reference META_NOISE_PATTERNS for SQL construction or
filtering (the other post-filter/code paths noted around the same sections) are
updated to use the same case-insensitive approach so both SQL and Python
post-filtering behave consistently.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a840f78c-ac92-4913-9ad4-c59c860ea1a3

📥 Commits

Reviewing files that changed from the base of the PR and between 01f9cb3 and 80f1488.

📒 Files selected for processing (5)

src/brainlayer/enrichment_controller.py
src/brainlayer/pipeline/enrichment.py
src/brainlayer/search_repo.py
tests/test_enrichment_entity_schema.py
tests/test_hybrid_search.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: test (3.11)
GitHub Check: test (3.13)
GitHub Check: test (3.12)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

**/*.py: Use paths.py:get_db_path() for all database path resolution; all scripts and CLI must use this function rather than hardcoding paths
When performing bulk database operations: stop enrichment workers first, checkpoint WAL before and after, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks, and checkpoint every 3 batches

Files:

src/brainlayer/enrichment_controller.py
tests/test_hybrid_search.py
src/brainlayer/search_repo.py
tests/test_enrichment_entity_schema.py
src/brainlayer/pipeline/enrichment.py

src/brainlayer/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/**/*.py: Use retry logic on SQLITE_BUSY errors; each worker must use its own database connection to handle concurrency safely
Classification must preserve ai_code, stack_trace, and user_message verbatim; skip noise entries entirely and summarize build_log and dir_listing entries (structure only)
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via enrichment_controller.py, and Ollama as offline last-resort; allow override via BRAINLAYER_ENRICH_BACKEND env var
Configure enrichment rate via BRAINLAYER_ENRICH_RATE environment variable (default 0.2 = 12 RPM)
Implement chunk lifecycle columns: superseded_by, aggregated_into, archived_at on chunks table; exclude lifecycle-managed chunks from default search; allow include_archived=True to show history
Implement brain_supersede with safety gate for personal data (journals, notes, health/finance); use soft-delete for brain_archive with timestamp
Add supersedes parameter to brain_store for atomic store-and-replace operations
Run linting and formatting with: ruff check src/ && ruff format src/
Run tests with pytest
Use PRAGMA wal_checkpoint(FULL) before and after bulk database operations to prevent WAL bloat

Files:

src/brainlayer/enrichment_controller.py
src/brainlayer/search_repo.py
src/brainlayer/pipeline/enrichment.py
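The SQLITE_BUSY guidance in the path-based instructions above could look roughly like this. It is a sketch only: the helper name, retry count, and backoff values are assumptions, and busy errors are detected by message text because that works across Python versions.

```python
import sqlite3
import time


def execute_with_retry(conn, sql, params=(), retries=5, base_delay=0.05):
    """Run one statement, retrying with exponential backoff on busy/locked errors.

    Hypothetical helper: each worker is assumed to pass in its own connection.
    """
    for attempt in range(retries + 1):
        try:
            return conn.execute(sql, params)
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            is_busy = "locked" in message or "busy" in message
            # Re-raise unrelated errors immediately, and busy errors once retries run out.
            if not is_busy or attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Errors that are not lock-related, such as a missing table, are raised on the first attempt rather than retried.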

🧠 Learnings (8)

📚 Learning: 2026-04-02T23:32:14.543Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T23:32:14.543Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment rate configurable via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)

Applied to files:

src/brainlayer/enrichment_controller.py
src/brainlayer/pipeline/enrichment.py
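The "0.2 = 12 RPM" figure in the learning above follows if the value is read as requests per second (0.2 × 60 = 12). A small sketch of that arithmetic, assuming that interpretation; the function names and the fallback-to-default behavior on invalid input are illustrative, not taken from the repository.

```python
import os


def enrich_rate_per_second(default: float = 0.2) -> float:
    """Read BRAINLAYER_ENRICH_RATE, falling back to the default on bad input (assumed behavior)."""
    raw = os.environ.get("BRAINLAYER_ENRICH_RATE")
    if raw is None:
        return default
    try:
        rate = float(raw)
    except ValueError:
        return default
    return rate if rate > 0 else default


def requests_per_minute(rate_per_second: float) -> float:
    """Convert a per-second rate into requests per minute."""
    return rate_per_second * 60


def min_interval_seconds(rate_per_second: float) -> float:
    """Minimum spacing between requests implied by the rate."""
    return 1 / rate_per_second
```

At the default rate this works out to one request every 5 seconds.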

📚 Learning: 2026-04-06T08:40:13.531Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Configure enrichment rate via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)

Applied to files:

src/brainlayer/enrichment_controller.py

📚 Learning: 2026-03-22T15:55:22.017Z

Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 100
File: src/brainlayer/enrichment_controller.py:175-199
Timestamp: 2026-03-22T15:55:22.017Z
Learning: In `src/brainlayer/enrichment_controller.py`, the `parallel` parameter in `enrich_local()` is intentionally kept in the function signature (currently unused, suppressed with `# noqa: ARG001`) for API stability. Parallel local enrichment via a thread pool or process pool is planned for a future iteration. Do not flag this as dead code requiring removal.

Applied to files:

src/brainlayer/enrichment_controller.py
src/brainlayer/pipeline/enrichment.py

📚 Learning: 2026-04-06T08:40:13.531Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Implement chunk lifecycle columns: `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search; allow `include_archived=True` to show history

Applied to files:

src/brainlayer/search_repo.py

📚 Learning: 2026-04-04T23:24:03.159Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-04T23:24:03.159Z
Learning: Applies to src/brainlayer/{vector_store,search}*.py : Chunk lifecycle: implement columns `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search

Applied to files:

src/brainlayer/search_repo.py

📚 Learning: 2026-04-04T15:22:02.740Z

Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 198
File: hooks/brainlayer-prompt-search.py:241-259
Timestamp: 2026-04-04T15:22:02.740Z
Learning: In `hooks/brainlayer-prompt-search.py` (Python), `record_injection_event()` is explicitly best-effort telemetry: silent `except sqlite3.Error: pass` is intentional — table non-existence or lock failures are acceptable silent failures. `sqlite3.connect(timeout=2)` is the file-open timeout; `PRAGMA busy_timeout` governs per-statement lock-wait. The `DEADLINE_MS` (450ms) guard applies only to the FTS search phase, not to this side-channel write.

Applied to files:

src/brainlayer/search_repo.py

📚 Learning: 2026-04-04T15:21:39.570Z

Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 198
File: hooks/brainlayer-prompt-search.py:169-169
Timestamp: 2026-04-04T15:21:39.570Z
Learning: In EtanHey/brainlayer, `hooks/brainlayer-prompt-search.py` reads `entity_type` directly from existing rows in `kg_entities` (read-only). `contracts/entity-types.yaml` defines the write-side schema only and is not authoritative for what `entity_type` values exist in the DB. The DB already stores `technology` (72 entities), `project` (24), and `tool` (1) as valid `entity_type` values, so `INJECT_TYPES` in the hook should match these DB values, not the contract file.

Applied to files:

tests/test_enrichment_entity_schema.py

📚 Learning: 2026-04-06T08:40:13.531Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Pipeline architecture: Extract → Classify → Chunk → Embed → Index, with post-processing for enrichment, brain graph, and Obsidian export

Applied to files:

src/brainlayer/pipeline/enrichment.py

🔇 Additional comments (4)

src/brainlayer/search_repo.py (1)

683-684: Nice cache-key isolation for filter_meta_noise.

Including filter_meta_noise in the cache key avoids filtered/unfiltered cross-contamination from _hybrid_cache.

Also applies to: 717-717

tests/test_hybrid_search.py (1)

230-263: Strong coverage for default-vs-opt-out meta-noise behavior.

This test captures both expected defaults and explicit override behavior in one place.

src/brainlayer/pipeline/enrichment.py (1)

304-308: Prompt additions are aligned and actionable.

The meta-research detection and short conversational chunk guidance are clear, and rubric wording remains explicit.

Also applies to: 315-316, 396-399

tests/test_enrichment_entity_schema.py (1)

100-119: Good prompt-contract regression checks.

These assertions lock in the newly required meta-research and rubric directives with clear intent.

coderabbitai · 2026-04-08T09:49:56Z

+    if mode == "realtime":
+        os.write(
+            2,
+            b"ENRICHMENT_RUNTIME_LOADED mode=realtime prompt=r81 truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
+        )


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify unguarded stderr marker writes in enrichment paths.
rg -n -C3 'def _emit_enrichment_start|os\.write\(' src/brainlayer/enrichment_controller.py src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 1299

🏁 Script executed:

#!/bin/bash
# Get full context of _emit_enrichment_start and its callers
# First, check the function definition and return behavior
sed -n '462,480p' src/brainlayer/enrichment_controller.py
# Then find where _emit_enrichment_start is called
echo "=== Callers of _emit_enrichment_start ==="
rg -n '_emit_enrichment_start' src/brainlayer/enrichment_controller.py
# Check if there are other unguarded os.write calls in the codebase
echo "=== All os.write calls in brainlayer ==="
rg -n 'os\.write\(' src/brainlayer/ --type py

Repository: EtanHey/brainlayer

Length of output: 1095

🏁 Script executed:

#!/bin/bash
# Check enrich_realtime to see what happens if _emit_enrichment_start fails
# Search for the function definition and how it handles _emit_enrichment_start
rg -B5 -A20 'def enrich_realtime' src/brainlayer/enrichment_controller.py

Repository: EtanHey/brainlayer

Length of output: 894

🏁 Script executed:

#!/bin/bash
# Check if os.write can actually raise OSError and under what conditions
# Also verify if any existing guard patterns are used in the codebase
rg -B2 -A2 'try:' src/brainlayer/enrichment_controller.py | head -50

Repository: EtanHey/brainlayer

Length of output: 1582

🏁 Script executed:

#!/bin/bash
# Check exact current state of lines 462-470
sed -n '462,470p' src/brainlayer/enrichment_controller.py

Repository: EtanHey/brainlayer

Length of output: 418

🏁 Script executed:

#!/bin/bash
# Check if logger is already available in enrichment_controller.py
head -30 src/brainlayer/enrichment_controller.py | rg -n 'import|logger'

Repository: EtanHey/brainlayer

Length of output: 516

🏁 Script executed:

#!/bin/bash
# Check if there's any error handling pattern around _emit_enrichment_start calls
sed -n '515,530p' src/brainlayer/enrichment_controller.py

Repository: EtanHey/brainlayer

Length of output: 705

Guard realtime stderr marker emission to prevent enrichment failure.

Line 464 writes directly to FD 2 without error handling. An OSError from a broken/unavailable stderr—common in containerized or CI environments—will interrupt the entire enrich_realtime() call before any chunks are processed. This diagnostic marker should be best-effort.

Wrap the os.write() call in a try-except block that logs the failure at debug level:

Proposed fix

 def _emit_enrichment_start(mode: str, limit: int) -> bool:
     if mode == "realtime":
-        os.write(
-            2,
-            b"ENRICHMENT_RUNTIME_LOADED mode=realtime prompt=r81 truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
-        )
+        try:
+            os.write(
+                2,
+                b"ENRICHMENT_RUNTIME_LOADED mode=realtime prompt=r81 truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
+            )
+        except OSError:
+            logger.debug("Failed to emit ENRICHMENT_RUNTIME_LOADED marker", exc_info=True)
     return _emit_enrichment_event(

Note: A similar unguarded os.write() call exists at src/brainlayer/pipeline/enrichment.py:409. Apply the same guard pattern there.
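The same guard can be factored into a small reusable helper, which also makes the failure path easy to exercise in a test. This is a sketch under assumptions: `emit_marker` and its `fd` parameter are hypothetical and do not appear in the PR.

```python
import logging
import os

logger = logging.getLogger(__name__)


def emit_marker(line: bytes, fd: int = 2) -> bool:
    """Best-effort write of a diagnostic marker; never raises on a broken descriptor.

    Returns True if the write succeeded and False if an OSError was swallowed.
    """
    try:
        os.write(fd, line)
        return True
    except OSError:
        # Diagnostics only: log at debug level and let the caller continue.
        logger.debug("Failed to emit marker to fd %d", fd, exc_info=True)
        return False
```

Taking the descriptor as a parameter lets a test pass a pipe or an already-closed descriptor instead of patching `os.write`.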

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    if mode == "realtime":
-        os.write(
-            2,
-            b"ENRICHMENT_RUNTIME_LOADED mode=realtime prompt=r81 truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
-        )
+    if mode == "realtime":
+        try:
+            os.write(
+                2,
+                b"ENRICHMENT_RUNTIME_LOADED mode=realtime prompt=r81 truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
+            )
+        except OSError:
+            logger.debug("Failed to emit ENRICHMENT_RUNTIME_LOADED marker", exc_info=True)


coderabbitai · 2026-04-08T09:49:56Z

+def _emit_prompt_signature_once() -> None:
+    """Write a single prompt signature line per process for daemon verification."""
+    global _prompt_signature_emitted
+    if _prompt_signature_emitted:
+        return
+    _prompt_signature_emitted = True
+    os.write(
+        2,
+        b"ENRICHMENT_PROMPT_LOADED truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
+    )


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify call sites and parallel paths for prompt building and signature emission.
rg -n -C3 '_emit_prompt_signature_once|build_external_prompt\(|build_prompt\(|ThreadPoolExecutor|executor\.submit\(' src/brainlayer/pipeline/enrichment.py src/brainlayer/enrichment_controller.py

Repository: EtanHey/brainlayer

Length of output: 10848

🏁 Script executed:

#!/bin/bash
# Check for any existing synchronization around _prompt_signature_emitted
# and verify the actual execution context
# 1. Check the full context of the global variable and function
sed -n '40,50p' src/brainlayer/pipeline/enrichment.py
sed -n '403,420p' src/brainlayer/pipeline/enrichment.py
# 2. Check if there are any threading locks defined in the module
rg -n "threading\.(Lock|RLock|Semaphore|Condition)" src/brainlayer/pipeline/enrichment.py
# 3. Verify the execution flow in _enrich_one to see if it calls build_prompt
rg -n -A20 "^def _enrich_one" src/brainlayer/pipeline/enrichment.py | head -40

Repository: EtanHey/brainlayer

Length of output: 2005

🏁 Script executed:

#!/bin/bash
# Check enrichment_controller to confirm ThreadPoolExecutor calls paths with _emit_prompt_signature_once
sed -n '540,560p' src/brainlayer/enrichment_controller.py
sed -n '340,360p' src/brainlayer/enrichment_controller.py

Repository: EtanHey/brainlayer

Length of output: 1528

Add lock to protect prompt-signature emission from race conditions.

_prompt_signature_emitted uses an unsynchronized check-then-set at lines 405–407, and os.write() at line 409 has no error handling. Multiple threads call build_prompt() and build_external_prompt() via ThreadPoolExecutor in both enrichment.py and enrichment_controller.py, so the race can emit duplicate markers, and an OSError from the unguarded os.write() would propagate to the caller.

The module already uses threading.Lock() for _groq_rate_lock (line 99); apply the same pattern here with a double-check inside the lock and exception handling for os.write().

Proposed fix

 logger = logging.getLogger(__name__)
 _prompt_signature_emitted = False
+_prompt_signature_lock = threading.Lock()
@@
 def _emit_prompt_signature_once() -> None:
     """Write a single prompt signature line per process for daemon verification."""
     global _prompt_signature_emitted
     if _prompt_signature_emitted:
         return
-    _prompt_signature_emitted = True
-    os.write(
-        2,
-        b"ENRICHMENT_PROMPT_LOADED truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
-    )
+    with _prompt_signature_lock:
+        if _prompt_signature_emitted:
+            return
+        try:
+            os.write(
+                2,
+                b"ENRICHMENT_PROMPT_LOADED truncation=8000 split=4800/3200 rubrics=epistemic_level,debt_impact,sentiment_label\n",
+            )
+        except OSError:
+            logger.debug("Failed to emit ENRICHMENT_PROMPT_LOADED marker", exc_info=True)
+        finally:
+            _prompt_signature_emitted = True

Also applies to call sites: enrichment.py:444, 515
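A self-contained version of the double-checked pattern proposed above, written so it can be run under several threads. The names `emit_once`, `_emitted`, and `_lock` are illustrative, and the `fd` parameter is an assumption added so the output can be captured through a pipe.

```python
import logging
import os
import threading

logger = logging.getLogger(__name__)
_emitted = False
_lock = threading.Lock()


def emit_once(line: bytes, fd: int = 2) -> bool:
    """Write the line at most once per process; return True only for the emitting call."""
    global _emitted
    if _emitted:  # fast path: skip the lock once the marker has been handled
        return False
    with _lock:
        if _emitted:  # second check: another thread may have won the race
            return False
        try:
            os.write(fd, line)
        except OSError:
            logger.debug("Failed to emit marker", exc_info=True)
        finally:
            # Mark as handled even on failure so later calls do not retry.
            _emitted = True
        return True
```

Setting the flag in `finally` means a failed write is not retried, matching the best-effort intent of the proposed fix.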


- ON CONFLICT(id): The id is uuid5(etype, name.lower()), which is deterministic. Conflict should be on id, not (entity_type, name), so retries idempotently UPDATE the same row instead of failing with a PRIMARY KEY violation. Fixes UNIQUE constraint errors during enrichment runs.
- ruff format: test_enrichment_v2.py line wrapping per --check requirements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
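The idempotent upsert described in this commit message can be sketched against an in-memory database. Everything about the schema here is an assumption made for illustration: the column set, the composite key on `kg_entity_chunks`, and the way the uuid5 name is built (Python's `uuid.uuid5` needs a UUID namespace, so a standard namespace and a `type:name` string stand in for whatever the project uses).

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Assumed namespace; the project's real id derivation is not shown in the PR.
ENTITY_NAMESPACE = uuid.NAMESPACE_URL


def entity_id(entity_type: str, name: str) -> str:
    """Deterministic id: the same type and case-insensitive name give the same UUID."""
    return str(uuid.uuid5(ENTITY_NAMESPACE, f"{entity_type}:{name.lower()}"))


def upsert_entity(conn: sqlite3.Connection, entity_type: str, name: str, chunk_id: str) -> str:
    """Insert or refresh an entity, then link it to a chunk, ignoring duplicate links."""
    eid = entity_id(entity_type, name)
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO kg_entities (id, entity_type, name, updated_at) VALUES (?, ?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET updated_at = excluded.updated_at",
        (eid, entity_type, name, now),
    )
    conn.execute(
        "INSERT OR IGNORE INTO kg_entity_chunks (entity_id, chunk_id) VALUES (?, ?)",
        (eid, chunk_id),
    )
    return eid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kg_entities (id TEXT PRIMARY KEY, entity_type TEXT, name TEXT, updated_at TEXT)"
)
conn.execute(
    "CREATE TABLE kg_entity_chunks (entity_id TEXT, chunk_id TEXT, PRIMARY KEY (entity_id, chunk_id))"
)
```

Because the conflict target is the primary key itself, replaying the same entity during a retry updates `updated_at` on the existing row rather than raising a constraint error.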

EtanHey added 2 commits April 7, 2026 19:37

fix: restore enrichment rubrics and daemon markers

9857460

feat: R81b meta-noise filter + enrichment prompt additions

80f1488

greptile-apps Bot reviewed Apr 8, 2026


coderabbitai Bot reviewed Apr 8, 2026


EtanHey merged commit c367486 into main Apr 8, 2026
6 checks passed

This was referenced Apr 30, 2026

fix: harden BrainLayer FTS recall across all three layers #263

Merged

fix: filter audit recursion from default search #277

Closed

This was referenced May 20, 2026

feat(enrich): bound enrichment write pressure #303

Merged

fix(writer): single-writer pidfile mutex + retry/timeout/throttle hardening #309

Merged

feat: R81b meta-noise filter + enrichment prompt additions #225
EtanHey merged 3 commits into main from feat/r81b-meta-noise-filter
