
feat: tune BrainLayer read path PRAGMAs and default MMR off#305

Merged: EtanHey merged 2 commits into main from feat/readpath-phase-3-pragma-mmr-default-off on May 21, 2026

@EtanHey EtanHey commented May 21, 2026

Summary

  • Adds production read-path PRAGMA tuning: read connections set mmap_size=30000000000 and cache_size=-64000, with BRAINLAYER_READ_MMAP_BYTES and BRAINLAYER_READ_CACHE_KB overrides.
  • Tunes WAL checkpoint cadence after writer WAL setup with wal_autocheckpoint=10000, with BRAINLAYER_WAL_AUTOCHECKPOINT override.
  • Defaults MMR reranking off via BRAINLAYER_MMR_LAMBDA=1.0; lower values still opt into the existing MMR path. Triple RRF is preserved unchanged.
  • Ships Phase 3 R4 α + γ only. No β. No signal-pause.
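To make the first two bullets concrete, here is a minimal sketch of applying these PRAGMAs with Python's standard-library sqlite3 module. The helper names (env_int, open_writer, open_reader) are illustrative and are not the PR's own functions, and BrainLayer itself appears to use apsw rather than sqlite3, so treat this as an approximation of the idea rather than the actual implementation.

```python
import os
import sqlite3

# Defaults quoted from the PR summary; everything else here is hypothetical.
DEFAULT_MMAP_BYTES = 30_000_000_000
DEFAULT_CACHE_KB = -64_000
DEFAULT_WAL_AUTOCHECKPOINT = 10_000


def env_int(name: str, default: int) -> int:
    """Read an integer env var, falling back to the default on bad input."""
    try:
        return int(os.environ.get(name, default))
    except (TypeError, ValueError):
        return default


def open_writer(path: str) -> sqlite3.Connection:
    """Open a writer in WAL mode, then set the checkpoint cadence."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    pages = env_int("BRAINLAYER_WAL_AUTOCHECKPOINT", DEFAULT_WAL_AUTOCHECKPOINT)
    conn.execute(f"PRAGMA wal_autocheckpoint={int(pages)}")
    return conn


def open_reader(path: str) -> sqlite3.Connection:
    """Open a read-only connection and apply the read-path PRAGMAs."""
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    mmap_bytes = env_int("BRAINLAYER_READ_MMAP_BYTES", DEFAULT_MMAP_BYTES)
    cache_kb = env_int("BRAINLAYER_READ_CACHE_KB", DEFAULT_CACHE_KB)
    conn.execute(f"PRAGMA mmap_size={int(mmap_bytes)}")
    conn.execute(f"PRAGMA cache_size={int(cache_kb)}")
    return conn
```

Reading `PRAGMA mmap_size` back after setting it is worthwhile in practice: SQLite caps the value at its compile-time SQLITE_MAX_MMAP_SIZE limit, so the effective mapping can be smaller than the requested 30 GB.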

R3 Stress-Test Evidence

| Path | median | p95 | max |
|---|---|---|---|
| baseline (Phase 1 only) | 31.6ms | 4501ms | 9831ms |
| α (PRAGMA tuning) | 36.0ms | 3141ms | 8352ms |
| γ (MMR off) | 33.9ms | 2627ms | 4354ms |
| α + γ combined | 33.0ms | 2384ms | 2955ms |
| α on LIVE DB (this PR) | 61.0ms | 2505ms | 8385ms |

Headline from the read-path redesign: brain_search median 16930ms -> 33ms (~510x), p95 11337ms -> 2384ms (~4.6x). Triple RRF remains unchanged.
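For readers interpreting the median, p95, and max columns, the following sketch shows one common way to derive them from raw latency samples. The benchmark script's actual method is not shown in this PR, so the nearest-rank percentile used here is an assumption, as is the function name summarize.

```python
import math
import statistics


def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Return median, nearest-rank p95, and max for a list of latencies."""
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the smallest sample with at least 95% of
    # all samples at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

If `--n 24` in the repro commands means 24 samples per run (an assumption), a nearest-rank p95 is the 23rd of 24 sorted samples, so a single slow query can move both the p95 and max columns.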

Phase 4 Future Work

The remaining 8s tail outliers and sqlite-vec brute-force KNN floor are Phase 4 future work. The likely durable fix is HNSW/vector-backend replacement after TechGym, not broadening this PR.

Repro Commands

.venv/bin/python /tmp/phase3-r3-bench.py bench --mode baseline --db /tmp/readpath-r3-10gb.db --n 24 --writers 18 --rate 15 --duration 28 > /tmp/phase3-r3-baseline.json 2> /tmp/phase3-r3-baseline.stderr.log
.venv/bin/python /tmp/phase3-r3-bench.py bench --mode alpha --db /tmp/readpath-r3-10gb.db --n 24 --writers 18 --rate 15 --duration 28 > /tmp/phase3-r3-alpha.json 2> /tmp/phase3-r3-alpha.stderr.log
.venv/bin/python /tmp/phase3-r3-bench.py bench --mode gamma --db /tmp/readpath-r3-10gb.db --n 24 --writers 18 --rate 15 --duration 28 > /tmp/phase3-r3-gamma.json 2> /tmp/phase3-r3-gamma.stderr.log
.venv/bin/python /tmp/phase3-r3-bench.py bench --mode alpha-gamma --db /tmp/readpath-r3-10gb.db --n 24 --writers 18 --rate 15 --duration 28 > /tmp/phase3-r3-alpha-gamma.json 2> /tmp/phase3-r3-alpha-gamma.stderr.log
.venv/bin/python /tmp/phase3-r3-bench.py bench --mode live-alpha --db /Users/etanheyman/.local/share/brainlayer/brainlayer.db --n 20 --writers 0 --live > /tmp/phase3-r3-live-alpha.json 2> /tmp/phase3-r3-live-alpha.stderr.log

Verification

  • RED first: .venv/bin/python -m pytest tests/test_vector_store_pragma_tuning.py tests/test_mmr_default_off.py -v produced expected failures before implementation (4 failed, 1 passed, 1 skipped).
  • RED invalid-env regressions: targeted invalid-env tests failed before the fallback parser fix.
  • Target tests: .venv/bin/python -m pytest tests/test_vector_store_pragma_tuning.py tests/test_mmr_default_off.py -v -> 8 passed.
  • Wider affected tests: .venv/bin/python -m pytest tests/test_hybrid_search.py tests/test_vector_store_pragma_tuning.py tests/test_mmr_default_off.py -v -> 24 passed.
  • Lint/format: .venv/bin/ruff check src tests passed; .venv/bin/ruff format --check src tests passed.
  • Pre-push gate passed: unit pytest (2054 passed, 9 skipped, 75 deselected, 1 xfailed), MCP registration (3 passed), isolated eval/hook routing (32 passed), bun stale-index suite (1 pass), FTS5 determinism shell regression passed.
  • Live eval on /tmp/readpath-r3-10gb.db copy: BRAINLAYER_DB=/tmp/readpath-r3-10gb.db .venv/bin/python -m pytest tests/test_eval_baselines.py -m live -v -> 32 passed, 2 failed, 5 xfailed, 2 xpassed; the two failures match the accepted pre-existing R4 prompt allowances: no recent-week chunks in the fixture copy and generic Python entity injection.
  • Actual live eval against ~/.local/share/brainlayer/brainlayer.db was blocked by active DB locks from enrichment/drain (apsw.BusyError).
  • Full unmarked pytest also hit live/default DB lock errors and the known ranx/numba order-dependent eval issue; isolated tests/test_eval_framework.py passed (11 passed).

Local Review

  • cr review --plain after fixes reports only .gemini/settings.json, which is untracked and not included in this branch.

Note

Medium Risk
Medium risk because it changes search ranking behavior (MMR reranking now disabled by default) and adjusts SQLite connection PRAGMAs, which can affect performance and operational characteristics across deployments.

Overview
Tunes BrainLayer’s SQLite read/write connection settings and makes MMR reranking opt-in.

Read-only connections now set PRAGMA mmap_size and PRAGMA cache_size, and writer init sets PRAGMA wal_autocheckpoint; all three are configurable via BRAINLAYER_READ_MMAP_BYTES, BRAINLAYER_READ_CACHE_KB, and BRAINLAYER_WAL_AUTOCHECKPOINT with safe parsing/fallbacks.

Hybrid search MMR reranking is now off by default (_MMR_LAMBDA=1.0), configurable via BRAINLAYER_MMR_LAMBDA, and short-circuits to skip embedding loads when disabled; tests were added/updated to cover env overrides, clamping/invalid values, and PRAGMA application.
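The short-circuit can be illustrated with a textbook MMR loop. This is a sketch under stated assumptions, not the code in SearchMixin._mmr_rerank_scored_results: the function name mmr_rerank, the (id, relevance) tuple shape, and the similarity callback are all hypothetical, and candidates are assumed to arrive sorted by relevance.

```python
from typing import Callable, Sequence


def mmr_rerank(
    candidates: Sequence[tuple[str, float]],
    similarity: Callable[[str, str], float],
    lam: float,
    k: int,
) -> list[tuple[str, float]]:
    """Greedy MMR over (doc_id, relevance) pairs sorted by relevance."""
    if lam >= 1.0:
        # With lambda at 1.0 the diversity term has zero weight, so the
        # input order is already final and similarity is never computed.
        return list(candidates[:k])

    remaining = list(candidates)
    selected: list[tuple[str, float]] = []
    while remaining and len(selected) < k:

        def mmr_score(item: tuple[str, float]) -> float:
            doc_id, relevance = item
            redundancy = max(
                (similarity(doc_id, chosen_id) for chosen_id, _ in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        remaining.remove(best)
        selected.append(best)
    return selected
```

The early return is what makes the default cheap: the similarity callback stands in for the embedding loads that the PR description says are skipped when MMR is disabled.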

Reviewed by Cursor Bugbot for commit 50389f5. Bugbot is set up for automated code reviews on this repo.

Note

Tune BrainLayer read path SQLite PRAGMAs and disable MMR reranking by default

  • Read-only connections in VectorStore._get_read_conn now set PRAGMA mmap_size (default 30000000000 bytes, about 30 GB) and PRAGMA cache_size (default -64000; SQLite reads a negative value as a size in KiB, so roughly 64,000 KiB of page cache); writer connections set PRAGMA wal_autocheckpoint (default 10000 pages). All values are overridable via environment variables.
  • _MMR_LAMBDA in search_repo.py now defaults to 1.0 (MMR off), readable from BRAINLAYER_MMR_LAMBDA; invalid or non-finite values fall back to 1.0.
  • SearchMixin._mmr_rerank_scored_results short-circuits and skips embedding loads when _MMR_LAMBDA >= 1.0.
  • Behavioral Change: MMR diversification is now off by default; existing deployments relying on MMR reranking must set BRAINLAYER_MMR_LAMBDA to a value below 1.0.

Macroscope summarized 50389f5.

Summary by CodeRabbit

  • New Features

    • Search result reranking is now disabled by default and can be configured via the BRAINLAYER_MMR_LAMBDA environment variable.
    • Database performance parameters are now configurable via environment variables (BRAINLAYER_WAL_AUTOCHECKPOINT, BRAINLAYER_READ_MMAP_BYTES, BRAINLAYER_READ_CACHE_KB).
  • Tests

    • Added comprehensive test coverage for MMR configuration behavior and database pragma settings.


EtanHey commented May 21, 2026

@coderabbitai review


EtanHey commented May 21, 2026

@greptile-apps review


EtanHey commented May 21, 2026

@codex review


EtanHey commented May 21, 2026

@cursor @BugBot review


coderabbitai Bot commented May 21, 2026

Walkthrough

This PR makes MMR reranking behavior and SQLite connection tuning configurable via environment variables. MMR is disabled by default, and database pragmas are set to conservative values. Both changes include comprehensive tests validating environment-variable parsing, fallback behavior on invalid input, and observable configuration effects.

Changes

Environment-driven configuration for MMR and SQLite PRAGMA tuning

| Layer / File(s) | Summary |
|---|---|
| MMR lambda configuration and early-exit guard (src/brainlayer/search_repo.py) | _MMR_LAMBDA is now loaded from the BRAINLAYER_MMR_LAMBDA environment variable with a safe float parse and 1.0 default. An early-exit guard skips MMR diversification when _MMR_LAMBDA >= 1.0. |
| Update existing MMR tests to monkeypatch lambda (tests/test_hybrid_search.py) | Two existing reranking tests now accept a monkeypatch fixture and explicitly set _MMR_LAMBDA to 0.65 before assertions, making them deterministic with respect to the environment configuration. |
| Test MMR default-off behavior and env overrides (tests/test_mmr_default_off.py) | New test module validates that MMR is disabled by default, does not trigger embedding loading, and correctly applies BRAINLAYER_MMR_LAMBDA overrides while falling back to defaults on invalid values. |
| Vector store SQLite PRAGMA configuration helpers (src/brainlayer/vector_store.py) | Helper functions read integer configuration from BRAINLAYER_WAL_AUTOCHECKPOINT, BRAINLAYER_READ_MMAP_BYTES, and BRAINLAYER_READ_CACHE_KB environment variables with safe clamping. Writer initialization sets PRAGMA wal_autocheckpoint, and readonly connections apply PRAGMA mmap_size and PRAGMA cache_size. |
| Test SQLite PRAGMA configuration via environment variables (tests/test_vector_store_pragma_tuning.py) | New test module validates that VectorStore correctly configures PRAGMA values for reader and writer connections based on environment variables, applies overrides when set, and falls back to defaults on invalid input. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • EtanHey/brainlayer#242: Both PRs modify src/brainlayer/search_repo.py's MMR reranking logic, with this PR making _MMR_LAMBDA configurable via environment variable and adding an early-exit when _MMR_LAMBDA >= 1.0.
  • EtanHey/brainlayer#245: Both PRs modify hybrid search MMR reranking behavior in src/brainlayer/search_repo.py—this PR gates reranking with environment-driven _MMR_LAMBDA logic while the other PR changes how candidates are selected and recombined.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the two main changes: tuning read-path PRAGMAs and defaulting MMR off, matching the PR's primary objectives. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


coderabbitai Bot commented May 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/brainlayer/search_repo.py`:
- Around line 35-38: The current parsing of BRAINLAYER_MMR_LAMBDA assigns any
float (including nan, inf, negatives) to _MMR_LAMBDA; change the try/except
block that sets _MMR_LAMBDA so after converting the env var to float you
validate with math.isfinite(value) and then clamp the value into the [0.0, 1.0]
range (e.g., value = max(0.0, min(1.0, value))); if parsing fails or the value
is not finite, fall back to 1.0; keep the symbol name _MMR_LAMBDA and the env
key "BRAINLAYER_MMR_LAMBDA" so the change is localized to that assignment.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: cf52f1fc-c2d2-4eae-a0d4-46f48758e908

📥 Commits

Reviewing files that changed from the base of the PR and between 6bd3360 and 348cee9.

📒 Files selected for processing (5)
  • src/brainlayer/search_repo.py
  • src/brainlayer/vector_store.py
  • tests/test_hybrid_search.py
  • tests/test_mmr_default_off.py
  • tests/test_vector_store_pragma_tuning.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Cursor Bugbot
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.13)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

**/*.py: Use paths.py:get_db_path() for all database path resolution; all scripts and CLI must use this function rather than hardcoding paths
When performing bulk database operations: stop enrichment workers first, checkpoint WAL before and after, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks, and checkpoint every 3 batches

Files:

  • tests/test_vector_store_pragma_tuning.py
  • tests/test_hybrid_search.py
  • tests/test_mmr_default_off.py
  • src/brainlayer/search_repo.py
  • src/brainlayer/vector_store.py
src/brainlayer/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/**/*.py: Use retry logic on SQLITE_BUSY errors; each worker must use its own database connection to handle concurrency safely
Classification must preserve ai_code, stack_trace, and user_message verbatim; skip noise entries entirely and summarize build_log and dir_listing entries (structure only)
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via enrichment_controller.py, and Ollama as offline last-resort; allow override via BRAINLAYER_ENRICH_BACKEND env var
Configure enrichment rate via BRAINLAYER_ENRICH_RATE environment variable (default 0.2 = 12 RPM)
Implement chunk lifecycle columns: superseded_by, aggregated_into, archived_at on chunks table; exclude lifecycle-managed chunks from default search; allow include_archived=True to show history
Implement brain_supersede with safety gate for personal data (journals, notes, health/finance); use soft-delete for brain_archive with timestamp
Add supersedes parameter to brain_store for atomic store-and-replace operations
Run linting and formatting with: ruff check src/ && ruff format src/
Run tests with pytest
Use PRAGMA wal_checkpoint(FULL) before and after bulk database operations to prevent WAL bloat

Files:

  • src/brainlayer/search_repo.py
  • src/brainlayer/vector_store.py
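The SQLITE_BUSY retry guideline above can be sketched with the standard-library sqlite3 module. The helper name with_busy_retry and its parameters are hypothetical. Since the PR description mentions apsw.BusyError, the project appears to use apsw, in which case the except clause would catch that exception type instead of matching on the message text.

```python
import sqlite3
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_busy_retry(
    op: Callable[[], T], attempts: int = 5, base_delay: float = 0.05
) -> T:
    """Run op, retrying with exponential backoff while the DB is locked.

    attempts must be at least 1.
    """
    for attempt in range(attempts):
        try:
            return op()
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            if "locked" not in message and "busy" not in message:
                raise  # not a lock-contention error
            if attempt == attempts - 1:
                raise  # retries exhausted
            time.sleep(base_delay * (2**attempt))
    raise RuntimeError("attempts must be at least 1")
```

Each worker would wrap calls made on its own connection, for example `with_busy_retry(lambda: conn.execute(sql).fetchall())`.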
🪛 OpenGrep (1.21.0)
tests/test_vector_store_pragma_tuning.py

[ERROR] 7-7: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)

src/brainlayer/vector_store.py

[ERROR] 202-202: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)


[ERROR] 1201-1201: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)


[ERROR] 1202-1202: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)

🔇 Additional comments (6)
src/brainlayer/vector_store.py (2)

202-202: OpenGrep f-string SQL warnings here are false positives.

PRAGMA statements in SQLite do not support bound parameters for their values, so an f-string is required. The interpolated values come from _wal_autocheckpoint_pages(), _read_mmap_bytes(), and _read_cache_size_kb(), all of which return ints derived via int(...) with safe fallbacks — no untrusted string flows into the SQL. No change needed; flagging only to document the rationale for ignoring the static-analysis hits.

Also applies to: 1200-1202
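The rationale above can be sketched as a small helper that interpolates a PRAGMA value only after coercing it to int. The name set_int_pragma and the allow-list are hypothetical and do not appear in the PR; the sketch uses the standard-library sqlite3 module.

```python
import sqlite3

# Hypothetical allow-list so the PRAGMA name is never caller-controlled text.
_ALLOWED_PRAGMAS = {"cache_size", "mmap_size", "wal_autocheckpoint"}


def set_int_pragma(conn: sqlite3.Connection, name: str, value: object) -> int:
    """Set an integer PRAGMA via string formatting and return the read-back value."""
    if name not in _ALLOWED_PRAGMAS:
        raise ValueError(f"unsupported pragma: {name!r}")
    # int() raises ValueError on anything that is not a plain integer, so no
    # arbitrary string can reach the SQL text.
    conn.execute(f"PRAGMA {name}={int(value)}")
    return conn.execute(f"PRAGMA {name}").fetchone()[0]
```

Because `int()` rejects input such as `"1; DROP TABLE t"`, the formatted statement can only ever contain a numeric literal, which is the property the review comment relies on.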


74-91: LGTM!

tests/test_vector_store_pragma_tuning.py (1)

11-84: LGTM!

src/brainlayer/search_repo.py (1)

484-485: LGTM!

tests/test_hybrid_search.py (1)

434-436: LGTM!

Also applies to: 492-494

tests/test_mmr_default_off.py (1)

1-55: LGTM!

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 348cee945a

Comment on lines +35 to +38
try:
    _MMR_LAMBDA = float(os.environ.get("BRAINLAYER_MMR_LAMBDA", "1.0"))
except (TypeError, ValueError):
    _MMR_LAMBDA = 1.0

P2: Sanitize MMR lambda env to finite [0,1] before use

BRAINLAYER_MMR_LAMBDA is parsed with float(...), so non-finite values like nan/inf are accepted and do not hit the fallback. With nan, the reranker enters the MMR path (nan >= 1.0 is false) and computes nan MMR scores, so candidate selection no longer reflects relevance/diversity and retrieval ordering can silently degrade for all queries in that process. Validate the parsed value (finite and bounded) and fall back to the default when invalid.
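The failure mode described here follows directly from IEEE 754 comparison rules, as this short sketch shows. The two helper functions are hypothetical stand-ins: the guard mirrors the `_MMR_LAMBDA >= 1.0` check named in the PR, while the scoring formula is the textbook MMR form and is assumed rather than taken from the project's code.

```python
import math


def mmr_is_skipped(lam: float) -> bool:
    """Mirror of the early-exit guard: MMR is off when lambda >= 1.0."""
    return lam >= 1.0


def mmr_score(lam: float, relevance: float, redundancy: float) -> float:
    """Textbook MMR score (assumed form, not the project's exact code)."""
    return lam * relevance - (1.0 - lam) * redundancy


nan_lambda = float("nan")  # float("nan") parses without raising ValueError

# Every ordered comparison with NaN is False, so the guard does not fire.
guard_fires = mmr_is_skipped(nan_lambda)

# The resulting score is NaN as well, so it cannot be ranked meaningfully.
nan_score = mmr_score(nan_lambda, relevance=0.9, redundancy=0.2)
score_is_nan = math.isnan(nan_score)
```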

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50389f500c

def _wal_autocheckpoint_pages() -> int:
    return max(_int_env("BRAINLAYER_WAL_AUTOCHECKPOINT", 10_000), 0)

P1: Validate WAL autocheckpoint override is positive

_wal_autocheckpoint_pages() currently clamps negative BRAINLAYER_WAL_AUTOCHECKPOINT values to 0, and SQLite treats wal_autocheckpoint <= 0 as disabling auto-checkpointing. That means a common misconfiguration like -1 silently turns checkpointing off instead of falling back to the safe default, which can rapidly grow the WAL file under write load (a known stability risk in this repo). Treat non-positive values as invalid and fall back to 10000 rather than passing 0 through.
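One way to implement the suggested behavior is sketched below. The body of _int_env is not shown in this PR, so the version here is a hypothetical stand-in, and wal_autocheckpoint_pages is named without the leading underscore to mark it as an illustration rather than the merged code.

```python
import os


def _int_env(name: str, default: int) -> int:
    """Hypothetical stand-in for the PR's _int_env helper."""
    try:
        return int(os.environ.get(name, ""))
    except ValueError:
        return default  # unset, empty, or non-numeric


def wal_autocheckpoint_pages(default: int = 10_000) -> int:
    """Return the configured page count, treating non-positive values as invalid."""
    value = _int_env("BRAINLAYER_WAL_AUTOCHECKPOINT", default)
    # SQLite disables auto-checkpointing for values <= 0, so fall back to
    # the default instead of passing them through.
    return value if value > 0 else default
```

Compared with `max(value, 0)`, this maps both `0` and `-1` to 10000, so a misconfiguration cannot silently turn checkpointing off.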

@EtanHey EtanHey merged commit be11d44 into main May 21, 2026
7 checks passed