feat(enrich): long-lived supervisor replaces respawn-init storm #306

Merged
EtanHey merged 2 commits into main from feat/enrich-long-lived-supervisor
May 21, 2026

Conversation

@EtanHey
Owner

@EtanHey EtanHey commented May 21, 2026

Status

DRAFT: PR-alpha (fix/single-writer-arbitration-option-a) is not merged or visible on GitHub yet. This PR intentionally stays draft until it can be rebased on the pidfile mutex base and the real single-instance path can be validated.

Summary

  • Added run_enrich_supervisor() as a long-lived realtime enrichment loop around one writable VectorStore(db_path).
  • Added brainlayer enrich --mode realtime --supervisor with SIGTERM/SIGINT stop-event handling.
  • Replaced the LaunchAgent shell while true respawn loop with a direct supervisor invocation.
  • Added supervisor lifecycle, error survival, idle polling, shutdown, log, CLI, and plist regression tests.

Architecture

launchd KeepAlive
  -> brainlayer enrich --mode realtime --supervisor
      -> VectorStore(db_path) once
          -> PR-alpha pidfile mutex once, after rebase
      -> loop until SIGTERM/SIGINT
          -> enrich_realtime(store, limit=200000, since_hours=87600)
              -> existing bounded Gemini worker pool
              -> existing bounded enrichment write batcher
          -> if queue empty: sleep BRAINLAYER_ENRICH_IDLE_POLL_SECONDS, default 30s
          -> if transient cycle error: log and continue
      -> close VectorStore, releasing PR-alpha pidfile mutex
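
The flow in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in enrichment_controller.py: FakeStore, SupervisorStats, and run_supervisor_sketch are hypothetical names, and the pause after a failed cycle reflects the back-off requested later in review rather than anything the diagram itself shows.

```python
import threading
from dataclasses import dataclass


@dataclass
class SupervisorStats:
    """Hypothetical stand-in for the PR's EnrichmentSupervisorResult."""
    cycles: int = 0
    enriched: int = 0
    failed_cycles: int = 0


class FakeStore:
    """Hypothetical stand-in for VectorStore; counts how often it is opened."""
    opened = 0

    def __init__(self, db_path):
        FakeStore.opened += 1
        self.closed = False

    def close(self):
        self.closed = True


def run_supervisor_sketch(db_path, enrich_fn, stop_event,
                          idle_poll_seconds=30.0, max_cycles=None,
                          store_cls=FakeStore):
    stats = SupervisorStats()
    store = store_cls(db_path)  # opened once, before the loop
    try:
        while not stop_event.is_set():
            if max_cycles is not None and stats.cycles >= max_cycles:
                break
            stats.cycles += 1
            try:
                enriched = enrich_fn(store)
            except Exception:
                # Transient cycle error: record it, pause, keep looping.
                stats.failed_cycles += 1
                stop_event.wait(idle_poll_seconds)
                continue
            stats.enriched += enriched
            if enriched == 0:
                # Queue empty: wait, but wake early if a stop is requested.
                stop_event.wait(idle_poll_seconds)
    finally:
        store.close()  # closed once, on the way out
    return stats, store
```

Using `stop_event.wait(...)` instead of a plain sleep is one way to keep shutdown latency low: a SIGTERM that sets the event ends the wait immediately.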

Single-init proof

Focused test: pytest tests/test_enrich_supervisor.py::test_supervisor_logs_single_vectorstore_initialized_line -q

Relevant assertion: caplog records exactly one "VectorStore initialized for enrich supervisor" line across three supervisor cycles.

Error survival proof

Focused test: pytest tests/test_enrich_supervisor.py::test_supervisor_logs_gemini_error_and_continues -q

This injects RuntimeError("gemini 503") on cycle 1 and asserts cycle 2 still enriches successfully, with failed == 1 and enriched == 1.

SIGTERM graceful-shutdown proof

Focused test: pytest tests/test_cli_enrich.py::test_cli_enrich_supervisor_handles_sigterm_gracefully -q

This sends a real SIGTERM to the process while the CLI handler is installed, then asserts that the supervisor stop event is set and that the CLI exits with code 0.
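
For readers unfamiliar with the pattern, the stop-event handling could look something like the sketch below. It is an assumption about the general shape, not the CLI code from this PR; make_stop_handler and install_stop_handlers are hypothetical names.

```python
import signal
import threading


def make_stop_handler(stop_event: threading.Event):
    """Build a signal handler that only flags the stop event."""
    def _handler(signum, frame):
        # Do as little as possible here; the loop notices the flag itself.
        stop_event.set()
    return _handler


def install_stop_handlers(stop_event: threading.Event):
    """Install the handler for SIGTERM and SIGINT; return the previous handlers.

    Must be called from the main thread, as signal.signal() requires.
    """
    handler = make_stop_handler(stop_event)
    return {
        sig: signal.signal(sig, handler)
        for sig in (signal.SIGTERM, signal.SIGINT)
    }
```

The handler deliberately does nothing except set the event, so the supervisor loop can finish its current cycle and close the store in its own `finally` block.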

Before/after enrich rate

  • Baseline from dispatch prompt: 213 enrichments / 77 min = ~166/hr.
  • Draft PR after-rate: pending. Must be sampled for >=30 min after PR-alpha is merged, this branch is rebased, the LaunchAgent is installed/restarted, and the supervisor is running against the production queue.
  • Target (per the dispatch prompt): >=1500/hr sustained.
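
The rate arithmetic behind these numbers, as a small sketch. per_hour is a hypothetical helper; the counts are the ones quoted above.

```python
def per_hour(count_delta: int, minutes: float) -> float:
    """Convert an enrichment count over a sampling window into a per-hour rate."""
    if minutes <= 0:
        raise ValueError("sampling window must be positive")
    return count_delta * 60.0 / minutes


baseline = per_hour(213, 77)        # about 166/hr, matching the baseline above
speedup_needed = 1500 / baseline    # roughly a 9x improvement to hit the target
```

The same helper applies to the post-merge sample: take the enriched count from two `brainlayer enrich --stats` runs at least 30 minutes apart and pass the difference and the elapsed minutes.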

Reproducible bench commands

# Verify supervisor tests
pytest tests/test_enrich_supervisor.py tests/test_cli_enrich.py -q

# Verify launchd template shape
pytest tests/test_enrichment_controller.py::test_enrichment_plist_invokes_cli_enrich_entrypoint \
  tests/test_enrich_supervisor.py::test_enrichment_launchagent_runs_supervisor_not_shell_respawn_loop -q

# Full repo gate used by pre-push
PYTHONPATH=/Users/etanheyman/Gits/brainlayer-pr-beta/src /Users/etanheyman/Gits/brainlayer-pr-beta/.venv/bin/pytest
ruff check src tests
ruff format --check src tests

# Post-merge rate sample sketch, run for >=30 min after deploying plist
brainlayer enrich --stats
# record enriched count and timestamp, wait >=30m, run again, compute delta/hour
brainlayer enrich --stats

Test plan

  • RED first: pytest tests/test_enrich_supervisor.py -q initially failed 6/6 on missing run_enrich_supervisor.
  • RED CLI: pytest tests/test_cli_enrich.py::test_cli_enrich_supervisor_routes_to_controller -q initially failed on unknown --supervisor.
  • RED plist: pytest tests/test_enrich_supervisor.py::test_enrichment_launchagent_runs_supervisor_not_shell_respawn_loop -q initially failed on /bin/zsh -lc while true args.
  • Focused final: pytest tests/test_enrich_supervisor.py tests/test_cli_enrich.py tests/test_enrichment_controller.py::test_enrichment_plist_invokes_cli_enrich_entrypoint -q -> 18 passed.
  • ruff check src tests -> passed.
  • ruff format --check src tests -> passed.
  • Pre-push gate -> 2067 passed / 9 skipped / 75 deselected / 1 xfailed, MCP registration 3 passed, isolated eval/hook routing 32 passed, bun 1 passed, FTS5 shell regression passed.

Known dependency

The actual pidfile single-instance enforcement remains owned by PR-alpha. Until that merges, this PR proves constructor-error propagation with a WriterInUseError-shaped fake and stays in draft.

Note

Replace shell-loop respawn with a long-lived run_enrich_supervisor loop in the enrichment LaunchAgent

  • Adds run_enrich_supervisor to enrichment_controller.py: a persistent loop that reuses a single VectorStore, sleeps when the queue is idle, continues on per-cycle errors, and returns an EnrichmentSupervisorResult with aggregated stats.
  • Adds --supervisor flag to the enrich CLI command; when used with --mode realtime, the CLI runs the supervisor loop, installs SIGTERM/SIGINT handlers for graceful shutdown, and exits with the supervisor's exit code.
  • Updates the LaunchAgent plist to invoke __BRAINLAYER_BIN__ enrich --mode realtime --supervisor directly, delegating restart-on-crash to launchd instead of a while true shell loop.
  • Idle poll interval is tunable via BRAINLAYER_ENRICH_IDLE_POLL_SECONDS environment variable, defaulting to 30 seconds.
  • Behavioral Change: the enrichment process is now a single long-running binary managed by launchd rather than a zsh subprocess; crashes surface to launchd immediately rather than being silently restarted by the shell loop.
📊 Macroscope summarized 7a89948. 3 files reviewed, 1 issue evaluated, 0 issues filtered, 1 comment posted
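
As an illustration of the BRAINLAYER_ENRICH_IDLE_POLL_SECONDS tuning described in the bullets above, an environment-driven default might be read like this. The fallback to 30 seconds on missing input comes from the PR; falling back on malformed or non-positive values is an assumption of this sketch, and get_idle_poll_seconds is a hypothetical name.

```python
import math
import os

DEFAULT_IDLE_POLL_SECONDS = 30.0  # default stated in the PR


def get_idle_poll_seconds(environ=os.environ) -> float:
    """Read the idle poll interval, falling back to the default when unusable."""
    raw = environ.get("BRAINLAYER_ENRICH_IDLE_POLL_SECONDS")
    if raw is None:
        return DEFAULT_IDLE_POLL_SECONDS
    try:
        value = float(raw)
    except ValueError:
        # Assumption: malformed values are ignored rather than raised.
        return DEFAULT_IDLE_POLL_SECONDS
    if not math.isfinite(value) or value <= 0:
        return DEFAULT_IDLE_POLL_SECONDS
    return value
```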

🗂️ Filtered Issues

Summary by CodeRabbit

  • New Features

    • Added a supervisor mode to run continuous realtime enrichment as a long-lived background service with configurable limits, lookback window, and idle polling.
    • CLI gains a --supervisor option and improved graceful SIGTERM/SIGINT shutdown with propagated exit codes.
    • LaunchAgent updated to directly run the enrichment supervisor as a background process (no shell respawn loop).
  • Tests

    • New and extended tests covering supervisor control flow, signal handling, retries/backoff, observability, and LaunchAgent arguments.

Note

Medium Risk
Changes how realtime enrichment runs under launchd by introducing a long-lived supervisor loop with signal handling and new defaults, which could impact background job stability and throughput if misconfigured. Test coverage is added, but behavior changes are operationally significant.

Overview
Switches realtime enrichment from a launchd-driven shell while true respawn loop to a first-class --supervisor mode that keeps a single VectorStore open and repeatedly runs realtime enrichment, sleeping when idle and continuing past per-cycle errors.

Updates brainlayer enrich to support --supervisor (with SIGTERM/SIGINT graceful shutdown and explicit since_hours defaulting behavior), adds supervisor defaults/tuning (BRAINLAYER_ENRICH_IDLE_POLL_SECONDS) in enrichment_controller, and revises the enrichment LaunchAgent plist to invoke the supervisor directly; adds regression tests covering CLI routing, supervisor lifecycle/error handling, and plist shape.

Reviewed by Cursor Bugbot for commit e96848f. Bugbot is set up for automated code reviews on this repo.

@coderabbitai

coderabbitai Bot commented May 21, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1993333d-89f8-4369-823f-606634c76cf7

📥 Commits

Reviewing files that changed from the base of the PR and between 7a89948 and e96848f.

📒 Files selected for processing (6)
  • scripts/launchd/com.brainlayer.enrichment.plist
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/enrichment_controller.py
  • tests/test_cli_enrich.py
  • tests/test_enrich_supervisor.py
  • tests/test_enrichment_controller.py

Walkthrough

This PR introduces a new enrich --supervisor mode that runs long-lived realtime enrichment within a persistent controller loop, replaces the LaunchAgent's shell-based supervision with direct CLI invocation, and adds comprehensive test coverage for the supervisor lifecycle, signal handling, error recovery, and plist configuration.

Changes

Enrichment Supervisor Mode Implementation

Layer (file): summary

  • Supervisor data structures and environment defaults (src/brainlayer/enrichment_controller.py): Introduces DEFAULT_ENRICH_SUPERVISOR_LIMIT, DEFAULT_ENRICH_SUPERVISOR_SINCE_HOURS, DEFAULT_ENRICH_IDLE_POLL_SECONDS constants; adds EnrichmentSupervisorResult dataclass to track cycles, work counts, errors, and exit codes; implements _get_idle_poll_seconds() and _stop_aware_sleep() helpers for environment-configured polling and graceful shutdown.
  • Supervisor controller implementation (src/brainlayer/enrichment_controller.py): Implements run_enrich_supervisor() to initialize a reusable vector store, repeatedly invoke an enrichment function (default: enrich_realtime), aggregate per-cycle results and errors, continue on exceptions while recording failures, sleep when idle, respect stop events and max-cycle limits, and ensure store closure in a finally block.
  • CLI supervisor mode integration (src/brainlayer/cli/__init__.py): Adds --supervisor flag to enrich command, imports supervisor controller and constants, creates a threading.Event for shutdown signaling, installs SIGTERM/SIGINT handlers, invokes run_enrich_supervisor with computed limit/since_hours defaults, prints aggregate results, and re-raises typer.Exit before generic exception handling.
  • LaunchAgent configuration update (scripts/launchd/com.brainlayer.enrichment.plist): Replaces ProgramArguments from /bin/zsh -lc infinite loop (with hardcoded --limit/--since-hours and sleep) to direct array invocation [__BRAINLAYER_BIN__, enrich, --mode, realtime, --supervisor].
  • CLI supervisor mode tests (tests/test_cli_enrich.py): Adds tests for enrich --supervisor flag: verifies dispatch to run_enrich_supervisor with expected defaults, confirms non-None stop event is passed, and validates SIGTERM handling sets the stop event and emits shutdown message.
  • Supervisor controller behavior tests (tests/test_enrich_supervisor.py): Comprehensive suite validating supervisor lifecycle: single vector store reuse across cycles, error logging and recovery, graceful shutdown on stop event, constructor error propagation, idle polling sleep intervals, single store-init log line, and launchd plist argument validation (mode/supervisor flags, no shell loop, no legacy options).
  • Plist configuration test update (tests/test_enrichment_controller.py): Updates test_enrichment_plist_invokes_cli_enrich_entrypoint to assert ProgramArguments array format directly and exclude while true, --limit, and --since-hours strings.
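
To make the plist shape check concrete, here is a hedged sketch using the standard-library plistlib. The Label value is inferred from the plist filename and KeepAlive from the architecture notes; both are assumptions, as is the helper name check_program_arguments. The real tests read the template in scripts/launchd instead of building one in memory.

```python
import plistlib

# Assumed agent definition; __BRAINLAYER_BIN__ is the template placeholder
# quoted in the PR, not a real path.
agent = {
    "Label": "com.brainlayer.enrichment",
    "KeepAlive": True,
    "ProgramArguments": [
        "__BRAINLAYER_BIN__", "enrich", "--mode", "realtime", "--supervisor",
    ],
}


def check_program_arguments(plist_bytes: bytes) -> list:
    """Assert the direct-invocation shape and the absence of the old shell loop."""
    args = plistlib.loads(plist_bytes)["ProgramArguments"]
    joined = " ".join(args)
    assert args[1:] == ["enrich", "--mode", "realtime", "--supervisor"]
    for legacy in ("while true", "--limit", "--since-hours", "/bin/zsh"):
        assert legacy not in joined
    return args


args = check_program_arguments(plistlib.dumps(agent))
```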

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • EtanHey/brainlayer#306: The main PR and the retrieved PR both refactor LaunchAgent ProgramArguments to run enrich --mode realtime --supervisor, add the same --supervisor CLI dispatch, and implement the same run_enrich_supervisor loop in enrichment_controller.py with corresponding tests around that supervisor behavior.
  • EtanHey/brainlayer#243: Both PRs modify the enrich CLI entrypoint in src/brainlayer/cli/__init__.py (main adds --supervisor realtime dispatch; retrieved PR adds batch --phase/--model routing), so the changes are related at the command-dispatch level.
  • EtanHey/brainlayer#151: Both PRs modify scripts/launchd/com.brainlayer.enrichment.plist’s LaunchAgent ProgramArguments to change how com.brainlayer.enrichment starts (retrieved: Python one-liner calling enrich_realtime; main: __BRAINLAYER_BIN__ enrich --mode realtime --supervisor).

Poem

🐰 A supervisor is born, to loop and loop with care,
Enriching data endlessly through the midnight air,
With signal handlers gentle and vector stores that stay,
It polls when idle, sleeps by the way,
Then rises renewed for another day!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

  • Title check: ✅ Passed. The title accurately reflects the main change: replacing a respawn-init shell loop with a long-lived supervisor pattern for realtime enrichment.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.

@EtanHey
Owner Author

EtanHey commented May 21, 2026

@coderabbitai review

@EtanHey
Owner Author

EtanHey commented May 21, 2026

@codex review

@EtanHey
Owner Author

EtanHey commented May 21, 2026

@cursor @BugBot review

@coderabbitai

coderabbitai Bot commented May 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7a899484cf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +180 to +181
logger.exception("Enrich supervisor cycle failed; continuing")
continue

P1: Back off after failed supervisor cycle

When enrich_fn raises, the supervisor immediately continues without any sleep, so a persistent failure path (for example API outage or DB write error) will spin the loop as fast as possible, flooding logs and burning CPU while repeatedly retrying. This is a regression from the previous launchd shell loop behavior, which had a delay between attempts, and it can destabilize production during incidents instead of degrading gracefully.

Owner Author

Fixed in ed786c4. Supervisor cycle exceptions now sleep via the same idle-poll helper before retrying, with a persistent-failure regression test.
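
A stop-aware sleep helper of the kind mentioned here could be sketched as below. This is an assumption about the technique, not the helper from ed786c4; stop_aware_sleep and slice_seconds are hypothetical names. The injectable sleep_fn mirrors the parameter the supervisor exposes for tests.

```python
import threading
import time


def stop_aware_sleep(stop_event, seconds, sleep_fn=time.sleep, slice_seconds=1.0):
    """Sleep up to `seconds` in short slices; return True if a stop was seen."""
    remaining = float(seconds)
    while remaining > 0:
        if stop_event is not None and stop_event.is_set():
            return True  # wake early so shutdown is not delayed by the pause
        step = min(slice_seconds, remaining)
        sleep_fn(step)
        remaining -= step
    return stop_event is not None and stop_event.is_set()
```

Calling the same helper from both the idle branch and the exception branch means a persistent failure retries at most once per poll interval instead of spinning.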

Comment thread src/brainlayer/cli/__init__.py Outdated
Comment on lines +962 to +964
since_hours=since_hours
if since_hours != DEFAULT_REALTIME_ENRICH_SINCE_HOURS
else DEFAULT_ENRICH_SUPERVISOR_SINCE_HOURS,

P2: Preserve explicit --since-hours in supervisor mode

This remapping treats since_hours == DEFAULT_REALTIME_ENRICH_SINCE_HOURS as "option not provided" and replaces it with DEFAULT_ENRICH_SUPERVISOR_SINCE_HOURS, so users cannot intentionally request the realtime default window (currently 8760h) in supervisor mode. Passing --since-hours 8760 is silently changed to 87600, which can unexpectedly widen scope and alter enrichment workload.

Owner Author

Fixed in ed786c4. --since-hours now defaults to None at the CLI boundary, so omitted supervisor runs use 87600h while explicit values like 8760 are preserved.
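
The None-sentinel approach described in this reply can be illustrated as follows. The two constant names and values are the ones quoted in the thread; resolve_since_hours is a hypothetical helper, not the CLI code.

```python
from typing import Optional

DEFAULT_REALTIME_ENRICH_SINCE_HOURS = 8760     # value quoted in the review
DEFAULT_ENRICH_SUPERVISOR_SINCE_HOURS = 87600  # value quoted in the review


def resolve_since_hours(since_hours: Optional[int], supervisor: bool) -> int:
    """None means "flag omitted"; any explicit value is passed through untouched."""
    if since_hours is not None:
        return since_hours
    if supervisor:
        return DEFAULT_ENRICH_SUPERVISOR_SINCE_HOURS
    return DEFAULT_REALTIME_ENRICH_SINCE_HOURS
```

Because omission is represented by None instead of by a magic default value, an explicit `--since-hours 8760` is no longer indistinguishable from "not provided".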

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/brainlayer/enrichment_controller.py`:
- Around line 137-195: The supervisor tight-loop on exceptions is caused by the
except Exception branch in run_enrich_supervisor skipping any sleep/backoff;
modify that except block in run_enrich_supervisor (the one that currently
increments stats and logger.exception) to call the same idle-poll/backoff helper
_sleep_or_wait_for_stop(stop_event, idle_poll_seconds, sleep_fn) (or a small
exponential backoff using sleep_fn) before continuing so repeated failures yield
CPU/log backoff; keep the existing stats increments and error recording. Also
add a regression test (e.g., similar to
test_supervisor_logs_gemini_error_and_continues) that forces enrich_fn to raise
on every cycle with max_cycles > 1 and asserts the provided sleep_fn was invoked
between failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e959b4d0-118e-43ff-bd47-bbca7008d924

📥 Commits

Reviewing files that changed from the base of the PR and between be11d44 and 7a89948.

📒 Files selected for processing (6)
  • scripts/launchd/com.brainlayer.enrichment.plist
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/enrichment_controller.py
  • tests/test_cli_enrich.py
  • tests/test_enrich_supervisor.py
  • tests/test_enrichment_controller.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.13)
  • GitHub Check: Macroscope - Correctness Check
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

**/*.py: Use paths.py:get_db_path() for all database path resolution; all scripts and CLI must use this function rather than hardcoding paths
When performing bulk database operations: stop enrichment workers first, checkpoint WAL before and after, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks, and checkpoint every 3 batches

Files:

  • tests/test_cli_enrich.py
  • tests/test_enrichment_controller.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/enrichment_controller.py
  • tests/test_enrich_supervisor.py
src/brainlayer/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/**/*.py: Use retry logic on SQLITE_BUSY errors; each worker must use its own database connection to handle concurrency safely
Classification must preserve ai_code, stack_trace, and user_message verbatim; skip noise entries entirely and summarize build_log and dir_listing entries (structure only)
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via enrichment_controller.py, and Ollama as offline last-resort; allow override via BRAINLAYER_ENRICH_BACKEND env var
Configure enrichment rate via BRAINLAYER_ENRICH_RATE environment variable (default 0.2 = 12 RPM)
Implement chunk lifecycle columns: superseded_by, aggregated_into, archived_at on chunks table; exclude lifecycle-managed chunks from default search; allow include_archived=True to show history
Implement brain_supersede with safety gate for personal data (journals, notes, health/finance); use soft-delete for brain_archive with timestamp
Add supersedes parameter to brain_store for atomic store-and-replace operations
Run linting and formatting with: ruff check src/ && ruff format src/
Run tests with pytest
Use PRAGMA wal_checkpoint(FULL) before and after bulk database operations to prevent WAL bloat

Files:

  • src/brainlayer/cli/__init__.py
  • src/brainlayer/enrichment_controller.py
🔇 Additional comments (11)
scripts/launchd/com.brainlayer.enrichment.plist (1)

9-17: LGTM!

tests/test_enrichment_controller.py (1)

1230-1233: LGTM!

src/brainlayer/enrichment_controller.py (1)

36-38: LGTM!

Also applies to: 109-119, 121-125, 128-134

tests/test_enrich_supervisor.py (1)

1-213: LGTM!

src/brainlayer/cli/__init__.py (4)

906-906: LGTM!


911-916: LGTM!


944-976: LGTM!


1020-1021: LGTM!

tests/test_cli_enrich.py (3)

3-4: LGTM!


33-61: LGTM!


63-88: LGTM!

Comment thread src/brainlayer/enrichment_controller.py
Comment on lines +967 to +969
finally:
signal.signal(signal.SIGTERM, previous_sigterm)
signal.signal(signal.SIGINT, previous_sigint)

🟢 Low cli/__init__.py:967

signal.getsignal() returns None if the previous handler wasn't installed from Python, but signal.signal() raises TypeError when passed None. The finally block at lines 967-969 unconditionally restores the previous handlers, so if either previous_sigterm or previous_sigint is None, the cleanup itself crashes.

-                    signal.signal(signal.SIGTERM, previous_sigterm)
-                    signal.signal(signal.SIGINT, previous_sigint)

Evidence trail:
src/brainlayer/cli/__init__.py lines 954-969 (viewed at REVIEWED_COMMIT): `previous_sigterm = signal.getsignal(signal.SIGTERM)` (line 954), `previous_sigint = signal.getsignal(signal.SIGINT)` (line 955), and unconditional restore in `finally` block at lines 967-969. Python docs (https://docs.python.org/3/library/signal.html): getsignal() 'returned value may be a callable Python object, or one of the special values signal.SIG_IGN, signal.SIG_DFL or None'; signal.signal() handler must be 'a callable Python object taking two arguments, or one of the special values signal.SIG_IGN or signal.SIG_DFL'.
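
One way to guard the restore, sketched under the assumption that skipping a None handler is acceptable (there is nothing Python can reinstall in that case). restore_handler is a hypothetical name, and set_handler is injectable only so the sketch can be exercised without touching real signal state.

```python
import signal


def restore_handler(signum, previous, set_handler=signal.signal):
    """Restore a handler captured via signal.getsignal(), skipping None.

    getsignal() may return None for handlers not installed from Python,
    and signal.signal() rejects None, so that case is left alone.
    """
    if previous is None:
        return False
    set_handler(signum, previous)
    return True
```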

@EtanHey
Owner Author

EtanHey commented May 21, 2026

@coderabbitai review

@EtanHey
Owner Author

EtanHey commented May 21, 2026

@codex review

@coderabbitai

coderabbitai Bot commented May 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@chatgpt-codex-connector

Summary

  • Fixed supervisor-mode signal cleanup to avoid restoring handlers when signal.getsignal() returned None, preventing a cleanup-time TypeError in finally.
  • Added a regression test that simulates None prior handlers for SIGTERM/SIGINT and asserts the CLI supervisor path exits successfully without crashing during signal-handler restoration.

Testing

  • pytest tests/test_cli_enrich.py -q

stop_event: threading.Event | None = None,
vector_store_cls=None,
enrich_fn=None,
sleep_fn=time.sleep,

🟡 Medium brainlayer/enrichment_controller.py:148

run_enrich_supervisor returns EnrichmentSupervisorResult with exit_code=0 even when failed > 0 or failed_cycles > 0, causing the CLI to exit with status 0 on failure. Consider setting exit_code=1 when any failures occur so the caller can detect the supervisor encountered errors.

Evidence trail:
src/brainlayer/enrichment_controller.py lines 110-119 (EnrichmentSupervisorResult dataclass with exit_code=0 default), lines 138-198 (run_enrich_supervisor function — exit_code never modified), lines 176-184 (failed_cycles incremented but exit_code untouched), lines 186-191 (failed accumulated but exit_code untouched). src/brainlayer/cli/__init__.py line 973 (`raise typer.Exit(result.exit_code)` — CLI uses exit_code directly).
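
For illustration only, the suggestion could be expressed by deriving the exit code from the failure counters. This is a sketch of the reviewer's idea with a hypothetical class name, not the EnrichmentSupervisorResult in the PR, and the thread does not record whether the suggestion was adopted.

```python
from dataclasses import dataclass


@dataclass
class SupervisorResultSketch:
    """Hypothetical result type whose exit code reflects recorded failures."""
    cycles: int = 0
    enriched: int = 0
    failed: int = 0
    failed_cycles: int = 0

    @property
    def exit_code(self) -> int:
        # Non-zero as soon as any item or any whole cycle failed.
        return 1 if (self.failed > 0 or self.failed_cycles > 0) else 0
```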

@EtanHey EtanHey force-pushed the feat/enrich-long-lived-supervisor branch from ed786c4 to e96848f Compare May 21, 2026 23:19
@EtanHey EtanHey marked this pull request as ready for review May 21, 2026 23:19
@EtanHey EtanHey merged commit cce646a into main May 21, 2026
6 checks passed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e96848f497

db_path = Path(db_path)
stats = EnrichmentSupervisorResult()
store = vector_store_cls(db_path)

P1: Avoid holding writable VectorStore for full supervisor life

run_enrich_supervisor creates one writable VectorStore before the loop and only closes it on shutdown, so the writer pidfile stays owned for the entire LaunchAgent lifetime. In this codebase, writable VectorStore initialization acquires that pidfile and competing writable opens raise WriterInUseError, so this change turns lock contention from temporary (per one-shot cycle) into effectively permanent while supervisor mode is running. That can block other processes that still need writable VectorStore access (e.g., maintenance/bootstrap paths) until enrichment is stopped.
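
The contention described here can be shown with a toy model. Everything below is hypothetical: ToyWritableStore imitates only the "one writer at a time" rule with an in-process flag, whereas the real mechanism is a pidfile owned by PR-alpha, and the WriterInUseError class is a local stand-in for the error the comment names.

```python
class WriterInUseError(RuntimeError):
    """Local stand-in for the error named in the review comment."""


class ToyWritableStore:
    """Toy model: at most one writable store may exist at a time."""
    _owner = None

    def __init__(self, owner: str):
        if ToyWritableStore._owner is not None:
            raise WriterInUseError(f"writer already held by {ToyWritableStore._owner}")
        ToyWritableStore._owner = owner
        self.owner = owner

    def close(self):
        if ToyWritableStore._owner == self.owner:
            ToyWritableStore._owner = None


def try_open(owner: str):
    """Return a store, or None if another writer currently holds the slot."""
    try:
        return ToyWritableStore(owner)
    except WriterInUseError:
        return None
```

With a per-cycle open, a competing writer only has to wait for the current cycle; with a supervisor-lifetime open, it is refused until the supervisor closes its store.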
