feat: mid-run backend fallback for enrichment by EtanHey · Pull Request #38 · EtanHey/brainlayer

EtanHey · 2026-02-26T16:29:39Z

Summary

When MLX crashes mid-batch (Abort trap: 6), pipeline now auto-switches to Ollama after 3 consecutive failures
Prevents losing entire enrichment batches when the primary backend dies
Fallback availability is checked once and cached for the run
State resets on each run_enrichment() call

Context

The nightly enrichment window (11pm-11am) has been failing because MLX crashes ~1 minute into each run. All subsequent chunks fail with "Connection refused". With this change, the pipeline detects the crash pattern and falls back to Ollama automatically.

Test plan

9 new tests in test_enrichment_fallback.py
Full suite: 478 passed, 2 skipped
Manual verification: 3 chunks enriched successfully via MLX

🤖 Generated with Claude Code

Note

Medium Risk
Adds stateful mid-run backend switching logic in the enrichment pipeline; incorrect switching/caching could mask transient failures or route load to an unintended backend, but the change is localized and covered by new tests.

Overview
Prevents enrichment runs from dying when the primary local LLM backend crashes mid-batch by adding mid-run automatic fallback in call_llm after 3 consecutive failures, including a one-time/cached availability probe and a retry of the failed chunk on the fallback backend.

Resets fallback state at the start of each run_enrichment() invocation, and adds a new test suite (test_enrichment_fallback.py) covering failure counting, threshold switching, stickiness of the fallback mode, reverse fallback direction (Ollama→MLX), cached availability checks, and state reset.

^{Written by Cursor Bugbot for commit 19c6c86. This will update automatically on new commits.}

Summary by CodeRabbit

New Features
- Implemented mid-run fallback mechanism for LLM calls that automatically switches to a fallback backend after detecting consecutive failures on the primary backend, with availability checks and state tracking.
Tests
- Added comprehensive test suite for fallback behavior including state management, backend switching transitions, and availability verification.

coderabbitai · 2026-02-26T16:31:39Z

📥 Commits

Reviewing files that changed from the base of the PR and between 076f73e and 19c6c86.

📒 Files selected for processing (2)

src/brainlayer/pipeline/enrichment.py
tests/test_enrichment_fallback.py

📝 Walkthrough

Walkthrough

This pull request introduces a mid-run fallback mechanism for LLM calls in the enrichment pipeline. It tracks consecutive failures on a primary backend, switches to a fallback backend after reaching a threshold, and retries failed chunks on the fallback. Global state variables manage failure counts, activation status, and availability checks. Comprehensive tests validate the fallback behavior across success, failure, and transition scenarios.

Changes

Cohort / File(s)	Summary
Fallback Mechanism Core `src/brainlayer/pipeline/enrichment.py`	Adds mid-run fallback logic with global state tracking (_consecutive_failures, _FALLBACK_THRESHOLD, _fallback_active, _fallback_available). Implements _check_fallback_available() for cached fallback reachability checks. Extends call_llm to route to fallback backend after consecutive failures and resets state at start of run_enrichment.
Fallback Tests `tests/test_enrichment_fallback.py`	Adds 123 lines of unit tests covering fallback state tracking, call_llm behavior under success/failure/activation paths, transition to and persistence of fallback mode, state reset via run_enrichment, bidirectional fallback scenarios, and availability checking with caching.

Sequence Diagram

sequenceDiagram
    actor Client
    participant enrichment as Enrichment Pipeline
    participant primary as Primary Backend
    participant fallback as Fallback Backend
    participant checker as Availability Checker

    Client->>enrichment: call_llm(request)
    enrichment->>primary: attempt LLM call
    
    alt Success
        primary-->>enrichment: response
        enrichment->>enrichment: reset _consecutive_failures
        enrichment-->>Client: return response
    else Failure & Threshold Not Reached
        primary-->>enrichment: error
        enrichment->>enrichment: increment _consecutive_failures
        enrichment-->>Client: propagate error
    else Failure & Threshold Reached
        primary-->>enrichment: error
        enrichment->>enrichment: increment _consecutive_failures
        enrichment->>checker: _check_fallback_available()
        
        alt Fallback Available
            checker-->>enrichment: true (cached)
            enrichment->>enrichment: _fallback_active = true
            enrichment->>enrichment: log fallback activation
            enrichment->>fallback: retry with fallback backend
            fallback-->>enrichment: response/error
            enrichment-->>Client: return result
        else Fallback Unavailable
            checker-->>enrichment: false (cached)
            enrichment->>enrichment: log no-fallback warning
            enrichment-->>Client: propagate error
        end
    end
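
The flow in the diagram can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the code in enrichment.py: `FallbackRouter`, `primary_call`, `fallback_call`, and `probe` are hypothetical names introduced for illustration, and a failed call is modelled as returning `None`. Only the threshold of 3, the cached availability check, the sticky fallback mode, and the per-run reset come from the PR description.

```python
from typing import Callable, Optional

FALLBACK_THRESHOLD = 3  # from the PR: switch after 3 consecutive failures


class FallbackRouter:
    """Hypothetical stand-in for the module-level fallback state."""

    def __init__(
        self,
        primary_call: Callable[[str], Optional[str]],
        fallback_call: Callable[[str], Optional[str]],
        probe: Callable[[], bool],
    ) -> None:
        self.primary_call = primary_call
        self.fallback_call = fallback_call
        self.probe = probe
        self.reset()

    def reset(self) -> None:
        # Mirrors the reset done at the start of each run_enrichment() call.
        self.consecutive_failures = 0
        self.fallback_active = False
        self.fallback_available: Optional[bool] = None  # None = not checked yet

    def _check_fallback_available(self) -> bool:
        # Probe at most once per run and cache the answer.
        if self.fallback_available is None:
            self.fallback_available = bool(self.probe())
        return self.fallback_available

    def call(self, prompt: str) -> Optional[str]:
        # Once active, fallback mode is sticky for the rest of the run.
        if self.fallback_active:
            return self.fallback_call(prompt)

        result = self.primary_call(prompt)
        if result is not None:
            self.consecutive_failures = 0
            return result

        self.consecutive_failures += 1
        if (
            self.consecutive_failures >= FALLBACK_THRESHOLD
            and self._check_fallback_available()
        ):
            self.fallback_active = True
            # Retry the chunk that just failed on the fallback backend.
            return self.fallback_call(prompt)
        return None
```

In this sketch the first two failures return `None` to the caller, and the third triggers the probe and the retry on the fallback backend. Note that the class is not thread-safe as written.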

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

Possibly related PRs

feat: MLX auto-detection, stall detection, current_context fix #14: Modifies enrichment.py with runtime backend fallback/selection logic including call_llm changes and fallback handling.

Poem

🐰 A hoppy harvest of resilience blooms,
When primary paths crumble and gloom,
The fallback whispers: "fear not, I'm near!"
Retrying with grace through each computing sphere,
Two backends dancing—now that's frontier! 🌟

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: mid-run backend fallback for enrichment' directly and concisely describes the main change: implementing automatic fallback to a secondary backend during enrichment processing when the primary fails.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

cursor · 2026-02-26T16:38:37Z

+                f"  FALLBACK: {_consecutive_failures} consecutive failures on {effective}, switching to {fallback}",
+                file=sys.stderr,
+            )
+            _fallback_active = True


Race condition on global fallback state with parallel workers

Medium Severity

The module-level _consecutive_failures, _fallback_active, and _fallback_available are read and written from call_llm without synchronization, but enrich_batch calls _enrich_one (which calls call_llm) from multiple threads via ThreadPoolExecutor when parallel > 1 (recommended value is 3 for MLX). The _consecutive_failures += 1 operation is not atomic — even under the GIL, it decomposes into LOAD/ADD/STORE bytecodes that can interleave. This causes lost increments, delaying or preventing fallback activation in exactly the crash scenario this feature targets.

Additional Locations (1)

src/brainlayer/pipeline/enrichment.py#L433-L437

cursor · 2026-02-26T16:38:37Z

+        _fallback_available = True
+    except Exception:
+        _fallback_available = False
+    return _fallback_available


Fallback availability cache ignores primary backend parameter

Low Severity

_check_fallback_available caches its result in a single _fallback_available boolean, but which backend it checks depends on the primary parameter. If first called with primary="mlx" (checks ollama), the cached True is returned for a subsequent call with primary="ollama" without ever checking if mlx is available. The cache key doesn't account for the input, making the cached result potentially wrong for a different primary.
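
A possible fix is to key the cache by the backend that is actually probed. This is a sketch under assumptions, not the merged code: the `probe` callable stands in for the HTTP checks against Ollama and MLX, and `fallback_for` and `check_fallback_available` are hypothetical names.

```python
from typing import Callable, Dict

# Cache keyed by the fallback backend name, so results for different
# primaries cannot be mixed up.
_fallback_available: Dict[str, bool] = {}


def fallback_for(primary: str) -> str:
    """Same pairing as in the PR: MLX falls back to Ollama and vice versa."""
    return "ollama" if primary == "mlx" else "mlx"


def check_fallback_available(primary: str, probe: Callable[[str], bool]) -> bool:
    """Probe the fallback for `primary` at most once and cache the result."""
    fallback = fallback_for(primary)
    if fallback not in _fallback_available:
        try:
            _fallback_available[fallback] = bool(probe(fallback))
        except Exception:
            # Any probe error is treated as "not reachable".
            _fallback_available[fallback] = False
    return _fallback_available[fallback]
```

Clearing the dictionary at the start of a run would play the role of resetting the single boolean to `None`.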

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/brainlayer/pipeline/enrichment.py`:
- Around line 431-457: The fallback shared state (_consecutive_failures,
_fallback_active, _fallback_available) is accessed concurrently and must be
protected with a lock: add a module-level threading.Lock (or RLock) and wrap all
read/modify/write accesses in call_llm(), _check_fallback_available(), and the
reset logic in run_enrichment() with that lock (acquire before reading the
cached _fallback_available, before incrementing or checking
_consecutive_failures and before setting _fallback_active, and before resetting
these variables) to ensure atomicity and prevent race conditions.

In `@tests/test_enrichment_fallback.py`:
- Around line 93-96: Replace the bare try/except in the test with an explicit
pytest.raises assertion: call enrichment.run_enrichment(max_chunks=0,
batch_size=1) inside a pytest.raises(RuntimeError) context so the test asserts
the specific RuntimeError from the backend check rather than swallowing any
Exception; keep the same parameters and test setup around the call to ensure
behavior is unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3eced96 and 076f73e.

📒 Files selected for processing (2)

src/brainlayer/pipeline/enrichment.py
tests/test_enrichment_fallback.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Cursor Bugbot

🧰 Additional context used

📓 Path-based instructions (3)

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Run tests using pytest from the project root

Files:

tests/test_enrichment_fallback.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use ruff check src/ for linting and ruff format src/ for code formatting

Files:

src/brainlayer/pipeline/enrichment.py

src/brainlayer/pipeline/enrichment.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/pipeline/enrichment.py: Set "think": false in Ollama API calls for GLM-4.7 to avoid unnecessary thinking mode overhead (350+ tokens, ~20s delay)
Provide enrichment backends (Ollama and MLX) with environment variable selection via BRAINLAYER_ENRICH_BACKEND
Enrich chunks with 10 metadata fields: summary, tags, importance (1-10), intent, primary_symbols, resolved_query, epistemic_level, version_scope, debt_impact, and external_deps

Files:

src/brainlayer/pipeline/enrichment.py

🧠 Learnings (4)

📓 Common learnings

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Provide enrichment backends (Ollama and MLX) with environment variable selection via `BRAINLAYER_ENRICH_BACKEND`

📚 Learning: 2026-02-23T16:51:38.317Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Provide enrichment backends (Ollama and MLX) with environment variable selection via `BRAINLAYER_ENRICH_BACKEND`

Applied to files:

src/brainlayer/pipeline/enrichment.py

📚 Learning: 2026-02-23T16:51:38.317Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Enrich chunks with 10 metadata fields: summary, tags, importance (1-10), intent, primary_symbols, resolved_query, epistemic_level, version_scope, debt_impact, and external_deps

Applied to files:

src/brainlayer/pipeline/enrichment.py

📚 Learning: 2026-02-23T16:51:38.317Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Set `"think": false` in Ollama API calls for GLM-4.7 to avoid unnecessary thinking mode overhead (350+ tokens, ~20s delay)

Applied to files:

src/brainlayer/pipeline/enrichment.py

🧬 Code graph analysis (1)

tests/test_enrichment_fallback.py (2)

src/brainlayer/pipeline/enrichment.py (3)

call_llm (459-515)

run_enrichment (770-883)

_check_fallback_available (440-456)

src/brainlayer/vector_store.py (1)

get_enrichment_stats (1385-1419)

🪛 GitHub Actions: CI

tests/test_enrichment_fallback.py

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.

src/brainlayer/pipeline/enrichment.py

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.

🔇 Additional comments (2)

tests/test_enrichment_fallback.py (2)
37-75: Nice coverage of fallback transitions and availability caching paths.

These cases validate threshold activation, reverse fallback direction, sticky fallback mode, and cached availability behavior well.

Also applies to: 103-123

31-33: ⚠️ Potential issue | 🟠 Major

Fix Ruff formatting for multiline with blocks (currently CI-blocking).

The pipeline already reports ruff format check failure; these backslash-continued with blocks are a likely source and should be reformatted before merge.
🧹 Ruff-compatible pattern
-        with patch.object(enrichment, "call_mlx", return_value=None), \
-             patch.object(enrichment, "_check_fallback_available", return_value=False):
+        with (
+            patch.object(enrichment, "call_mlx", return_value=None),
+            patch.object(enrichment, "_check_fallback_available", return_value=False),
+        ):
Also applies to: 40-43, 50-52, 60-62, 69-72
⛔ Skipped due to learnings
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Provide enrichment backends (Ollama and MLX) with environment variable selection via `BRAINLAYER_ENRICH_BACKEND`
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/**/*.py : Use `ruff check src/` for linting and `ruff format src/` for code formatting

coderabbitai · 2026-02-26T16:40:09Z

+# Mid-run fallback state — tracks consecutive failures for automatic backend switching.
+# When the primary backend crashes mid-run (e.g., MLX "Abort trap: 6"), the pipeline
+# automatically retries failed chunks on the fallback backend instead of losing the entire batch.
+_consecutive_failures = 0
+_FALLBACK_THRESHOLD = 3  # Switch after 3 consecutive failures
+_fallback_active = False
+_fallback_available: Optional[bool] = None  # None = not checked yet
+
+
+def _check_fallback_available(primary: str) -> bool:
+    """Check if the fallback backend is reachable. Cached for the run."""
+    global _fallback_available
+    if _fallback_available is not None:
+        return _fallback_available
+
+    fallback = "ollama" if primary == "mlx" else "mlx"
+    try:
+        if fallback == "ollama":
+            resp = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=3)
+        else:
+            resp = requests.get(f"{MLX_BASE_URL}/v1/models", timeout=3)
+        resp.raise_for_status()
+        _fallback_available = True
+    except Exception:
+        _fallback_available = False
+    return _fallback_available
+


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the code structure and threading claims
echo "=== Checking ThreadPoolExecutor and concurrent submission ==="
sed -n '700,710p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Checking call_llm function definition and global state mutations ==="
sed -n '625,635p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Checking global state mutations in call_llm (469-515) ==="
sed -n '469,515p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Checking state reset in run_enrichment (779-783) ==="
sed -n '779,783p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Searching for any existing thread locks ==="
rg -n 'Lock|mutex|RLock|Semaphore|threading' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 3213

Protect fallback state with a lock in parallel execution.

call_llm() mutates shared module state (_consecutive_failures, _fallback_active, _fallback_available) while worker threads execute concurrently via ThreadPoolExecutor (line 703–706). The read-modify-write operations on these globals—especially the increment and threshold check on _consecutive_failures (lines 494–495) and the set operation on _fallback_active (line 498)—are unsynchronized, creating race conditions that cause nondeterministic fallback activation.

🔧 Suggested fix (thread-safe fallback state)

 _consecutive_failures = 0
 _FALLBACK_THRESHOLD = 3  # Switch after 3 consecutive failures
 _fallback_active = False
 _fallback_available: Optional[bool] = None  # None = not checked yet
+_fallback_state_lock = threading.Lock()

Apply lock protection to all read/modify/write operations on these three globals in call_llm() (lines 474, 489, 494–498) and _check_fallback_available(), and to the reset in run_enrichment() (lines 779–783).

🧰 Tools

🪛 GitHub Actions: CI

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/brainlayer/pipeline/enrichment.py` around lines 431 - 457, The fallback shared state (_consecutive_failures, _fallback_active, _fallback_available) is accessed concurrently and must be protected with a lock: add a module-level threading.Lock (or RLock) and wrap all read/modify/write accesses in call_llm(), _check_fallback_available(), and the reset logic in run_enrichment() with that lock (acquire before reading the cached _fallback_available, before incrementing or checking _consecutive_failures and before setting _fallback_active, and before resetting these variables) to ensure atomicity and prevent race conditions.

coderabbitai · 2026-02-26T16:40:09Z

+            try:
+                enrichment.run_enrichment(max_chunks=0, batch_size=1)
+            except Exception:
+                pass  # Will fail on backend check, that's fine


🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

cat -n tests/test_enrichment_fallback.py | head -120 | tail -40

Repository: EtanHey/brainlayer

Length of output: 1730

🏁 Script executed:

head -20 tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 558

🏁 Script executed:

sed -n '85,105p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 798

🏁 Script executed:

rg -n "run_enrichment" tests/test_enrichment_fallback.py | head -20

Repository: EtanHey/brainlayer

Length of output: 264

🏁 Script executed:

rg -n "pytest.raises" tests/ | head -10

Repository: EtanHey/brainlayer

Length of output: 606

🏁 Script executed:

grep -n "import pytest" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44

🏁 Script executed:

sed -n '107,115p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 403

🏁 Script executed:

sed -n '1,50p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 2083

🏁 Script executed:

grep -rn "def run_enrichment" brainlayer/

Repository: EtanHey/brainlayer

Length of output: 106

🏁 Script executed:

python3 << 'EOF'
import subprocess
result = subprocess.run(['grep', '-A', '20', 'def run_enrichment', 'brainlayer/pipeline/enrichment.py'], capture_output=True, text=True)
print(result.stdout)
EOF

Repository: EtanHey/brainlayer

Length of output: 44

🏁 Script executed:

find . -name "enrichment.py" -type f

Repository: EtanHey/brainlayer

Length of output: 101

🏁 Script executed:

fd -t f "enrichment.py"

Repository: EtanHey/brainlayer

Length of output: 178

🏁 Script executed:

git ls-files | grep enrichment

Repository: EtanHey/brainlayer

Length of output: 685

🏁 Script executed:

sed -n '1,100p' src/brainlayer/pipeline/enrichment.py | grep -n "def run_enrichment" -A 30

Repository: EtanHey/brainlayer

Length of output: 44

🏁 Script executed:

grep -n "def run_enrichment" src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 85

🏁 Script executed:

sed -n '100,150p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 2260

🏁 Script executed:

sed -n '768,850p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 3358

🏁 Script executed:

sed -n '768,900p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 5428

🏁 Script executed:

grep -n "import pytest\|from pytest" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44

🏁 Script executed:

cat -n tests/test_enrichment_fallback.py | grep -A 25 "class TestRunEnrichmentResetsState"

Repository: EtanHey/brainlayer

Length of output: 1227

🏁 Script executed:

python3 << 'EOF'
# Check if requests is imported in enrichment module
import subprocess
result = subprocess.run(['grep', '-n', '^import requests\|^from requests', 'src/brainlayer/pipeline/enrichment.py'], capture_output=True, text=True)
print("Requests import:", result.stdout if result.stdout else "Not found at top level")

# Check what exceptions run_enrichment raises
result2 = subprocess.run(['grep', '-n', 'raise RuntimeError', 'src/brainlayer/pipeline/enrichment.py'], capture_output=True, text=True)
print("RuntimeError raises:", result2.stdout)
EOF

Repository: EtanHey/brainlayer

Length of output: 257

🏁 Script executed:

cd tests && python3 -m pytest --co -q test_enrichment_fallback.py 2>&1 | head -20

Repository: EtanHey/brainlayer

Length of output: 1256

🏁 Script executed:

# Check if pytest is available
python3 -c "import pytest; print(pytest.__version__)"

Repository: EtanHey/brainlayer

Length of output: 67

🏁 Script executed:

# Verify the backend check happens before other code
sed -n '768,810p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 1830

🏁 Script executed:

sed -n '1,50p' src/brainlayer/pipeline/enrichment.py | grep -n "import\|from"

Repository: EtanHey/brainlayer

Length of output: 492

🏁 Script executed:

# Check what the actual pattern is in other tests
grep -A 10 "pytest.raises" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44

🏁 Script executed:

# Check all imports in the test file
cat -n tests/test_enrichment_fallback.py | sed -n '1,10p'

Repository: EtanHey/brainlayer

Length of output: 404

🏁 Script executed:

# Check if pytest is imported anywhere
grep -n "^import pytest\|^from pytest" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44

🏁 Script executed:

# Look at the full test method to understand test setup
sed -n '87,115p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 1152

🏁 Script executed:

# Check if there are other patterns in the codebase for handling expected exceptions
grep -B 5 -A 5 "pytest.raises" tests/test_storage.py | head -30

Repository: EtanHey/brainlayer

Length of output: 747

🏁 Script executed:

# Verify requests module is accessible in enrichment
grep -n "requests\." src/brainlayer/pipeline/enrichment.py | head -10

Repository: EtanHey/brainlayer

Length of output: 678

🏁 Script executed:

# Check what error is actually raised when requests.get fails
python3 << 'EOF'
# Simulate what happens when run_enrichment tries to connect to a non-existent backend
import requests
try:
    resp = requests.get("http://localhost:9999/v1/models", timeout=5)
    resp.raise_for_status()
except Exception as e:
    print(f"Exception type: {type(e).__name__}")
    print(f"Exception: {e}")
EOF

Repository: EtanHey/brainlayer

Length of output: 573

🏁 Script executed:

# Check if the setup_method resets state for this test class
sed -n '87,92p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 308

🏁 Script executed:

# Verify the actual exception type raised by run_enrichment when backends fail
python3 << 'EOF'
# Looking at the code path:
# 1. run_enrichment tries requests.get
# 2. If that fails with Exception, it tries fallback
# 3. If fallback also fails with Exception, it raises RuntimeError
# The test doesn't mock requests, so it will get ConnectionError
# Which will trigger the nested except, leading to RuntimeError
print("Expected behavior:")
print("1. requests.get() fails -> ConnectionError (subclass of Exception)")
print("2. First except catches it, tries fallback")
print("3. Fallback also fails -> another ConnectionError")
print("4. Second except catches it, raises RuntimeError")
print("5. Test should catch RuntimeError, not generic Exception")
EOF

Repository: EtanHey/brainlayer

Length of output: 346

🏁 Script executed:

# Check if there's a setup_method in this test class
grep -A 5 "class TestRunEnrichmentResetsState" tests/test_enrichment_fallback.py | grep -E "setup_method|def test"

Repository: EtanHey/brainlayer

Length of output: 103

Use pytest.raises() to explicitly assert the expected exception type.

Catching bare Exception masks potential regressions like assertion typos. The test should explicitly verify that RuntimeError is raised when the backend check fails, using the established pattern from other tests in the suite.

Suggested refactor

+import pytest
 from unittest.mock import patch
 from brainlayer.pipeline import enrichment
@@
         with patch.object(enrichment, "VectorStore") as mock_vs:
             mock_store = mock_vs.return_value
             mock_store.get_enrichment_stats.return_value = {
                 "enriched": 0,
                 "enrichable": 0,
                 "remaining": 0,
                 "skipped": 0,
                 "percent": "0",
                 "total_chunks": 0,
                 "by_intent": {},
             }
-            try:
-                enrichment.run_enrichment(max_chunks=0, batch_size=1)
-            except Exception:
-                pass  # Will fail on backend check, that's fine
+            with patch.object(enrichment.requests, "get", side_effect=ConnectionError("backend down")):
+                with pytest.raises(RuntimeError):
+                    enrichment.run_enrichment(max_chunks=0, batch_size=1)

🧰 Tools

🪛 GitHub Actions: CI

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/test_enrichment_fallback.py` around lines 93 - 96, Replace the bare try/except in the test with an explicit pytest.raises assertion: call enrichment.run_enrichment(max_chunks=0, batch_size=1) inside a pytest.raises(RuntimeError) context so the test asserts the specific RuntimeError from the backend check rather than swallowing any Exception; keep the same parameters and test setup around the call to ensure behavior is unchanged.

When MLX crashes mid-batch (e.g., "Abort trap: 6"), the pipeline now automatically switches to Ollama after 3 consecutive failures instead of failing every remaining chunk in the batch.

- Track consecutive failures in call_llm()
- Auto-detect fallback backend availability
- Cache availability check for the run
- Reset fallback state on each run_enrichment() call
- 9 new tests for fallback logic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor Bot reviewed Feb 26, 2026

coderabbitai Bot reviewed Feb 26, 2026

EtanHey and others added 2 commits February 26, 2026 18:40

style: format enrichment and test files with ruff

19c6c86

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

EtanHey force-pushed the feat/enrichment-mid-run-fallback branch from 79a0402 to 19c6c86 on February 26, 2026 16:40

EtanHey merged commit 8a468da into main Feb 26, 2026
5 checks passed

coderabbitai Bot mentioned this pull request Feb 27, 2026

fix: DB lock resilience + MLX health recovery #44

Merged

3 tasks
