
feat: mid-run backend fallback for enrichment #38

Merged
EtanHey merged 2 commits into main from feat/enrichment-mid-run-fallback on Feb 26, 2026

Conversation

@EtanHey
Owner

@EtanHey EtanHey commented Feb 26, 2026

Summary

  • When MLX crashes mid-batch (Abort trap: 6), pipeline now auto-switches to Ollama after 3 consecutive failures
  • Prevents losing entire enrichment batches when the primary backend dies
  • Fallback availability is checked once and cached for the run
  • State resets on each run_enrichment() call

Context

The nightly enrichment window (11pm-11am) has been failing because MLX crashes ~1 minute into each run. All subsequent chunks fail with "Connection refused". With this change, the pipeline detects the crash pattern and falls back to Ollama automatically.

Test plan

  • 9 new tests in test_enrichment_fallback.py
  • Full suite: 478 passed, 2 skipped
  • Manual verification: 3 chunks enriched successfully via MLX

🤖 Generated with Claude Code


Note

Medium Risk
Adds stateful mid-run backend switching logic in the enrichment pipeline; incorrect switching/caching could mask transient failures or route load to an unintended backend, but the change is localized and covered by new tests.

Overview
Prevents enrichment runs from dying when the primary local LLM backend crashes mid-batch by adding mid-run automatic fallback in call_llm after 3 consecutive failures, including a one-time/cached availability probe and a retry of the failed chunk on the fallback backend.

Resets fallback state at the start of each run_enrichment() invocation, and adds a new test suite (test_enrichment_fallback.py) covering failure counting, threshold switching, stickiness of the fallback mode, reverse fallback direction (Ollama→MLX), cached availability checks, and state reset.

Written by Cursor Bugbot for commit 19c6c86. This will update automatically on new commits. Configure here.

Summary by CodeRabbit

  • New Features

    • Implemented mid-run fallback mechanism for LLM calls that automatically switches to a fallback backend after detecting consecutive failures on the primary backend, with availability checks and state tracking.
  • Tests

    • Added comprehensive test suite for fallback behavior including state management, backend switching transitions, and availability verification.

@coderabbitai

coderabbitai Bot commented Feb 26, 2026

Warning

Rate limit exceeded

@EtanHey has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 37 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 076f73e and 19c6c86.

📒 Files selected for processing (2)
  • src/brainlayer/pipeline/enrichment.py
  • tests/test_enrichment_fallback.py
📝 Walkthrough

Walkthrough

This pull request introduces a mid-run fallback mechanism for LLM calls in the enrichment pipeline. It tracks consecutive failures on a primary backend, switches to a fallback backend after reaching a threshold, and retries failed chunks on the fallback. Global state variables manage failure counts, activation status, and availability checks. Comprehensive tests validate the fallback behavior across success, failure, and transition scenarios.
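The control flow described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `FallbackRouter` class and the stub backends are hypothetical (the PR keeps this state in module-level globals inside `call_llm`), the availability probe is left out, and a failed call is assumed to be signalled by a `None` return, as the `call_mlx` mock with `return_value=None` in the tests suggests.

```python
from typing import Callable, Dict, Optional

FALLBACK_THRESHOLD = 3  # mirrors _FALLBACK_THRESHOLD from the PR

Backend = Callable[[str], Optional[str]]


class FallbackRouter:
    """Hypothetical wrapper around the primary/fallback switching logic."""

    def __init__(self, backends: Dict[str, Backend], primary: str, fallback: str) -> None:
        self.backends = backends
        self.primary = primary
        self.fallback = fallback
        self.consecutive_failures = 0
        self.fallback_active = False

    def call(self, prompt: str) -> Optional[str]:
        # Once fallback mode is active it stays active for the rest of the run.
        if self.fallback_active:
            return self.backends[self.fallback](prompt)

        result = self.backends[self.primary](prompt)
        if result is not None:
            self.consecutive_failures = 0  # any success resets the streak
            return result

        self.consecutive_failures += 1
        if self.consecutive_failures >= FALLBACK_THRESHOLD:
            self.fallback_active = True
            # Retry the chunk that just failed on the fallback backend.
            return self.backends[self.fallback](prompt)
        return None


def dead_mlx(prompt: str) -> Optional[str]:
    return None  # simulates a crashed primary backend


def ollama(prompt: str) -> Optional[str]:
    return f"ollama:{prompt}"


router = FallbackRouter({"mlx": dead_mlx, "ollama": ollama}, primary="mlx", fallback="ollama")
results = [router.call(f"chunk-{i}") for i in range(4)]
```

With a dead primary, the first two chunks are lost, the third is retried on the fallback when the threshold is hit, and the fourth goes straight to the fallback.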

Changes

| Cohort / File(s) | Summary |
|---|---|
| Fallback Mechanism Core (src/brainlayer/pipeline/enrichment.py) | Adds mid-run fallback logic with global state tracking (_consecutive_failures, _FALLBACK_THRESHOLD, _fallback_active, _fallback_available). Implements _check_fallback_available() for cached fallback reachability checks. Extends call_llm to route to fallback backend after consecutive failures and resets state at start of run_enrichment. |
| Fallback Tests (tests/test_enrichment_fallback.py) | Adds 123 lines of unit tests covering fallback state tracking, call_llm behavior under success/failure/activation paths, transition to and persistence of fallback mode, state reset via run_enrichment, bidirectional fallback scenarios, and availability checking with caching. |

Sequence Diagram

sequenceDiagram
    actor Client
    participant enrichment as Enrichment Pipeline
    participant primary as Primary Backend
    participant fallback as Fallback Backend
    participant checker as Availability Checker

    Client->>enrichment: call_llm(request)
    enrichment->>primary: attempt LLM call
    
    alt Success
        primary-->>enrichment: response
        enrichment->>enrichment: reset _consecutive_failures
        enrichment-->>Client: return response
    else Failure & Threshold Not Reached
        primary-->>enrichment: error
        enrichment->>enrichment: increment _consecutive_failures
        enrichment-->>Client: propagate error
    else Failure & Threshold Reached
        primary-->>enrichment: error
        enrichment->>enrichment: increment _consecutive_failures
        enrichment->>checker: _check_fallback_available()
        
        alt Fallback Available
            checker-->>enrichment: true (cached)
            enrichment->>enrichment: _fallback_active = true
            enrichment->>enrichment: log fallback activation
            enrichment->>fallback: retry with fallback backend
            fallback-->>enrichment: response/error
            enrichment-->>Client: return result
        else Fallback Unavailable
            checker-->>enrichment: false (cached)
            enrichment->>enrichment: log no-fallback warning
            enrichment-->>Client: propagate error
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

Poem

🐰 A hoppy harvest of resilience blooms,
When primary paths crumble and gloom,
The fallback whispers: "fear not, I'm near!"
Retrying with grace through each computing sphere,
Two backends dancing—now that's frontier! 🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: mid-run backend fallback for enrichment' directly and concisely describes the main change: implementing automatic fallback to a secondary backend during enrichment processing when the primary fails. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


f" FALLBACK: {_consecutive_failures} consecutive failures on {effective}, switching to {fallback}",
file=sys.stderr,
)
_fallback_active = True

Race condition on global fallback state with parallel workers

Medium Severity

The module-level _consecutive_failures, _fallback_active, and _fallback_available are read and written from call_llm without synchronization, but enrich_batch calls _enrich_one (which calls call_llm) from multiple threads via ThreadPoolExecutor when parallel > 1 (recommended value is 3 for MLX). The _consecutive_failures += 1 operation is not atomic — even under the GIL, it decomposes into LOAD/ADD/STORE bytecodes that can interleave. This causes lost increments, delaying or preventing fallback activation in exactly the crash scenario this feature targets.
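The claim that `+=` on a global is not a single step can be checked with the standard `dis` module. The sketch below uses a hypothetical `counter` global and `bump` function in place of `_consecutive_failures` and `call_llm`; exact opcode names between the load and the store vary across CPython versions, so only the load/store ordering is inspected.

```python
import dis

counter = 0  # stands in for _consecutive_failures


def bump() -> None:
    global counter
    counter += 1


# Collect the opcode names CPython emits for bump().
ops = [ins.opname for ins in dis.get_instructions(bump)]
load_idx = ops.index("LOAD_GLOBAL")    # read the current value
store_idx = ops.index("STORE_GLOBAL")  # write the incremented value back

# Instructions that run between the read and the write; a thread switch
# anywhere in this window can cause another thread's increment to be lost.
window = ops[load_idx + 1 : store_idx]
```

The non-empty `window` is where two workers can both read the same old value before either writes back.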

Additional Locations (1)


        _fallback_available = True
    except Exception:
        _fallback_available = False
    return _fallback_available

Fallback availability cache ignores primary backend parameter

Low Severity

_check_fallback_available caches its result in a single _fallback_available boolean, but which backend it checks depends on the primary parameter. If first called with primary="mlx" (checks ollama), the cached True is returned for a subsequent call with primary="ollama" without ever checking if mlx is available. The cache key doesn't account for the input, making the cached result potentially wrong for a different primary.
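One possible way to address this, sketched under assumptions: key the cache by the backend that is actually probed, so a result for Ollama is never reused for MLX. The `probe` callable is a hypothetical injection point that replaces the `requests.get` health check so the example runs offline; it is not part of the PR.

```python
from typing import Callable, Dict

# Cache keyed by the fallback backend name rather than a single boolean.
_fallback_available: Dict[str, bool] = {}


def fallback_for(primary: str) -> str:
    return "ollama" if primary == "mlx" else "mlx"


def check_fallback_available(primary: str, probe: Callable[[str], bool]) -> bool:
    """Probe the fallback for `primary` once and cache the result per backend."""
    fallback = fallback_for(primary)
    if fallback not in _fallback_available:
        try:
            _fallback_available[fallback] = bool(probe(fallback))
        except Exception:
            _fallback_available[fallback] = False
    return _fallback_available[fallback]


calls = []


def fake_probe(backend: str) -> bool:
    calls.append(backend)
    return backend == "ollama"  # pretend only Ollama is reachable


a = check_fallback_available("mlx", fake_probe)     # probes ollama
b = check_fallback_available("ollama", fake_probe)  # probes mlx separately
c = check_fallback_available("mlx", fake_probe)     # served from the cache
```

Each backend is probed at most once, and the answer for one direction no longer leaks into the other.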


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/brainlayer/pipeline/enrichment.py`:
- Around line 431-457: The fallback shared state (_consecutive_failures,
_fallback_active, _fallback_available) is accessed concurrently and must be
protected with a lock: add a module-level threading.Lock (or RLock) and wrap all
read/modify/write accesses in call_llm(), _check_fallback_available(), and the
reset logic in run_enrichment() with that lock (acquire before reading the
cached _fallback_available, before incrementing or checking
_consecutive_failures and before setting _fallback_active, and before resetting
these variables) to ensure atomicity and prevent race conditions.

In `@tests/test_enrichment_fallback.py`:
- Around line 93-96: Replace the bare try/except in the test with an explicit
pytest.raises assertion: call enrichment.run_enrichment(max_chunks=0,
batch_size=1) inside a pytest.raises(RuntimeError) context so the test asserts
the specific RuntimeError from the backend check rather than swallowing any
Exception; keep the same parameters and test setup around the call to ensure
behavior is unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3eced96 and 076f73e.

📒 Files selected for processing (2)
  • src/brainlayer/pipeline/enrichment.py
  • tests/test_enrichment_fallback.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cursor Bugbot
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Run tests using pytest from the project root

Files:

  • tests/test_enrichment_fallback.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use ruff check src/ for linting and ruff format src/ for code formatting

Files:

  • src/brainlayer/pipeline/enrichment.py
src/brainlayer/pipeline/enrichment.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/pipeline/enrichment.py: Set "think": false in Ollama API calls for GLM-4.7 to avoid unnecessary thinking mode overhead (350+ tokens, ~20s delay)
Provide enrichment backends (Ollama and MLX) with environment variable selection via BRAINLAYER_ENRICH_BACKEND
Enrich chunks with 10 metadata fields: summary, tags, importance (1-10), intent, primary_symbols, resolved_query, epistemic_level, version_scope, debt_impact, and external_deps
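As an illustration of the first two instructions, here is a rough sketch of backend selection via `BRAINLAYER_ENRICH_BACKEND` and of a request body with thinking disabled. The helper names, the `"mlx"` default, the model tag, and the exact payload fields other than `"think"` are assumptions made for the example, not taken from the repository.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional

VALID_BACKENDS = ("ollama", "mlx")


def select_backend(env: Optional[Mapping[str, str]] = None, default: str = "mlx") -> str:
    """Read the backend name from BRAINLAYER_ENRICH_BACKEND (default is an assumption)."""
    source = os.environ if env is None else env
    value = source.get("BRAINLAYER_ENRICH_BACKEND", default).strip().lower()
    if value not in VALID_BACKENDS:
        raise ValueError(f"unsupported enrichment backend: {value!r}")
    return value


def build_ollama_payload(model: str, prompt: str) -> Dict[str, Any]:
    """Hypothetical request body; 'think': False serializes to "think": false."""
    return {"model": model, "prompt": prompt, "stream": False, "think": False}


backend = select_backend({"BRAINLAYER_ENRICH_BACKEND": "Ollama"})
body = json.dumps(build_ollama_payload("glm-4.7", "Summarize this chunk"))
```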

Files:

  • src/brainlayer/pipeline/enrichment.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Provide enrichment backends (Ollama and MLX) with environment variable selection via `BRAINLAYER_ENRICH_BACKEND`
📚 Learning: 2026-02-23T16:51:38.317Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Provide enrichment backends (Ollama and MLX) with environment variable selection via `BRAINLAYER_ENRICH_BACKEND`

Applied to files:

  • src/brainlayer/pipeline/enrichment.py
📚 Learning: 2026-02-23T16:51:38.317Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Enrich chunks with 10 metadata fields: summary, tags, importance (1-10), intent, primary_symbols, resolved_query, epistemic_level, version_scope, debt_impact, and external_deps

Applied to files:

  • src/brainlayer/pipeline/enrichment.py
📚 Learning: 2026-02-23T16:51:38.317Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Set `"think": false` in Ollama API calls for GLM-4.7 to avoid unnecessary thinking mode overhead (350+ tokens, ~20s delay)

Applied to files:

  • src/brainlayer/pipeline/enrichment.py
🧬 Code graph analysis (1)
tests/test_enrichment_fallback.py (2)
src/brainlayer/pipeline/enrichment.py (3)
  • call_llm (459-515)
  • run_enrichment (770-883)
  • _check_fallback_available (440-456)
src/brainlayer/vector_store.py (1)
  • get_enrichment_stats (1385-1419)
🪛 GitHub Actions: CI
tests/test_enrichment_fallback.py

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.

src/brainlayer/pipeline/enrichment.py

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.

🔇 Additional comments (2)
tests/test_enrichment_fallback.py (2)

37-75: Nice coverage of fallback transitions and availability caching paths.

These cases validate threshold activation, reverse fallback direction, sticky fallback mode, and cached availability behavior well.

Also applies to: 103-123


31-33: ⚠️ Potential issue | 🟠 Major

Fix Ruff formatting for multiline with blocks (currently CI-blocking).

The pipeline already reports ruff format check failure; these backslash-continued with blocks are a likely source and should be reformatted before merge.

🧹 Ruff-compatible pattern
-        with patch.object(enrichment, "call_mlx", return_value=None), \
-             patch.object(enrichment, "_check_fallback_available", return_value=False):
+        with (
+            patch.object(enrichment, "call_mlx", return_value=None),
+            patch.object(enrichment, "_check_fallback_available", return_value=False),
+        ):

Also applies to: 40-43, 50-52, 60-62, 69-72
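For reference, a small runnable sketch of the parenthesized form, assuming Python 3.10 or newer. The `enrichment_stub` namespace is a hypothetical stand-in for the real `enrichment` module so the example has no project dependencies.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Stand-in for the enrichment module with the two attributes being patched.
enrichment_stub = SimpleNamespace(
    call_mlx=lambda prompt: "real",
    _check_fallback_available=lambda primary: True,
)

# Parenthesized context managers replace the backslash continuation.
with (
    patch.object(enrichment_stub, "call_mlx", return_value=None),
    patch.object(enrichment_stub, "_check_fallback_available", return_value=False),
):
    inside = (
        enrichment_stub.call_mlx("x"),
        enrichment_stub._check_fallback_available("mlx"),
    )

# Both patches are undone once the block exits.
after = (
    enrichment_stub.call_mlx("x"),
    enrichment_stub._check_fallback_available("mlx"),
)
```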

⛔ Skipped due to learnings
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Provide enrichment backends (Ollama and MLX) with environment variable selection via `BRAINLAYER_ENRICH_BACKEND`
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/**/*.py : Use `ruff check src/` for linting and `ruff format src/` for code formatting

Comment on lines +431 to +457
# Mid-run fallback state — tracks consecutive failures for automatic backend switching.
# When the primary backend crashes mid-run (e.g., MLX "Abort trap: 6"), the pipeline
# automatically retries failed chunks on the fallback backend instead of losing the entire batch.
_consecutive_failures = 0
_FALLBACK_THRESHOLD = 3 # Switch after 3 consecutive failures
_fallback_active = False
_fallback_available: Optional[bool] = None # None = not checked yet


def _check_fallback_available(primary: str) -> bool:
    """Check if the fallback backend is reachable. Cached for the run."""
    global _fallback_available
    if _fallback_available is not None:
        return _fallback_available

    fallback = "ollama" if primary == "mlx" else "mlx"
    try:
        if fallback == "ollama":
            resp = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=3)
        else:
            resp = requests.get(f"{MLX_BASE_URL}/v1/models", timeout=3)
        resp.raise_for_status()
        _fallback_available = True
    except Exception:
        _fallback_available = False
    return _fallback_available


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the code structure and threading claims
echo "=== Checking ThreadPoolExecutor and concurrent submission ==="
sed -n '700,710p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Checking call_llm function definition and global state mutations ==="
sed -n '625,635p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Checking global state mutations in call_llm (469-515) ==="
sed -n '469,515p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Checking state reset in run_enrichment (779-783) ==="
sed -n '779,783p' src/brainlayer/pipeline/enrichment.py

echo -e "\n=== Searching for any existing thread locks ==="
rg -n 'Lock|mutex|RLock|Semaphore|threading' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 3213


Protect fallback state with a lock in parallel execution.

call_llm() mutates shared module state (_consecutive_failures, _fallback_active, _fallback_available) while worker threads execute concurrently via ThreadPoolExecutor (line 703–706). The read-modify-write operations on these globals—especially the increment and threshold check on _consecutive_failures (lines 494–495) and the set operation on _fallback_active (line 498)—are unsynchronized, creating race conditions that cause nondeterministic fallback activation.

🔧 Suggested fix (thread-safe fallback state)
 _consecutive_failures = 0
 _FALLBACK_THRESHOLD = 3  # Switch after 3 consecutive failures
 _fallback_active = False
 _fallback_available: Optional[bool] = None  # None = not checked yet
+_fallback_state_lock = threading.Lock()

Apply lock protection to all read/modify/write operations on these three globals in call_llm() (lines 474, 489, 494–498) and _check_fallback_available(), and to the reset in run_enrichment() (lines 779–783).
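A rough sketch of what lock-protected state could look like. It groups the counters into a hypothetical `FallbackState` class instead of the PR's module-level globals, purely to keep the example compact; the availability cache is omitted.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FallbackState:
    """Hypothetical holder for the fallback counters, guarded by one lock."""

    def __init__(self, threshold: int = 3) -> None:
        self._lock = threading.Lock()
        self.threshold = threshold
        self.consecutive_failures = 0
        self.fallback_active = False

    def record_failure(self) -> bool:
        """Count one failure; return True only for the call that trips the threshold."""
        with self._lock:
            self.consecutive_failures += 1
            if not self.fallback_active and self.consecutive_failures >= self.threshold:
                self.fallback_active = True
                return True
            return False

    def record_success(self) -> None:
        with self._lock:
            self.consecutive_failures = 0

    def reset(self) -> None:
        """Called at the start of each run, like the reset in run_enrichment()."""
        with self._lock:
            self.consecutive_failures = 0
            self.fallback_active = False


state = FallbackState()
# Three workers, matching the parallel value mentioned in the review.
with ThreadPoolExecutor(max_workers=3) as pool:
    trips = list(pool.map(lambda _: state.record_failure(), range(300)))
```

Because the increment and the threshold check happen under the same lock, no increments are lost and exactly one worker observes the switch to fallback mode.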

🧰 Tools
🪛 GitHub Actions: CI

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.


Comment on lines +93 to +96
try:
    enrichment.run_enrichment(max_chunks=0, batch_size=1)
except Exception:
    pass  # Will fail on backend check, that's fine

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

cat -n tests/test_enrichment_fallback.py | head -120 | tail -40

Repository: EtanHey/brainlayer

Length of output: 1730


🏁 Script executed:

head -20 tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 558


🏁 Script executed:

sed -n '85,105p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 798


🏁 Script executed:

rg -n "run_enrichment" tests/test_enrichment_fallback.py | head -20

Repository: EtanHey/brainlayer

Length of output: 264


🏁 Script executed:

rg -n "pytest.raises" tests/ | head -10

Repository: EtanHey/brainlayer

Length of output: 606


🏁 Script executed:

grep -n "import pytest" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44


🏁 Script executed:

sed -n '107,115p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 403


🏁 Script executed:

sed -n '1,50p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 2083


🏁 Script executed:

grep -rn "def run_enrichment" brainlayer/

Repository: EtanHey/brainlayer

Length of output: 106


🏁 Script executed:

python3 << 'EOF'
import subprocess
result = subprocess.run(['grep', '-A', '20', 'def run_enrichment', 'brainlayer/pipeline/enrichment.py'], 
                       capture_output=True, text=True)
print(result.stdout)
EOF

Repository: EtanHey/brainlayer

Length of output: 44


🏁 Script executed:

find . -name "enrichment.py" -type f

Repository: EtanHey/brainlayer

Length of output: 101


🏁 Script executed:

fd -t f "enrichment.py"

Repository: EtanHey/brainlayer

Length of output: 178


🏁 Script executed:

git ls-files | grep enrichment

Repository: EtanHey/brainlayer

Length of output: 685


🏁 Script executed:

sed -n '1,100p' src/brainlayer/pipeline/enrichment.py | grep -n "def run_enrichment" -A 30

Repository: EtanHey/brainlayer

Length of output: 44


🏁 Script executed:

grep -n "def run_enrichment" src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 85


🏁 Script executed:

sed -n '100,150p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 2260


🏁 Script executed:

sed -n '768,850p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 3358


🏁 Script executed:

sed -n '768,900p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 5428


🏁 Script executed:

grep -n "import pytest\|from pytest" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44


🏁 Script executed:

cat -n tests/test_enrichment_fallback.py | grep -A 25 "class TestRunEnrichmentResetsState"

Repository: EtanHey/brainlayer

Length of output: 1227


🏁 Script executed:

python3 << 'EOF'
# Check if requests is imported in enrichment module
import subprocess
result = subprocess.run(['grep', '-n', '^import requests\|^from requests', 'src/brainlayer/pipeline/enrichment.py'], 
                       capture_output=True, text=True)
print("Requests import:", result.stdout if result.stdout else "Not found at top level")

# Check what exceptions run_enrichment raises
result2 = subprocess.run(['grep', '-n', 'raise RuntimeError', 'src/brainlayer/pipeline/enrichment.py'], 
                        capture_output=True, text=True)
print("RuntimeError raises:", result2.stdout)
EOF

Repository: EtanHey/brainlayer

Length of output: 257


🏁 Script executed:

cd tests && python3 -m pytest --co -q test_enrichment_fallback.py 2>&1 | head -20

Repository: EtanHey/brainlayer

Length of output: 1256


🏁 Script executed:

# Check if pytest is available
python3 -c "import pytest; print(pytest.__version__)"

Repository: EtanHey/brainlayer

Length of output: 67


🏁 Script executed:

# Verify the backend check happens before other code
sed -n '768,810p' src/brainlayer/pipeline/enrichment.py

Repository: EtanHey/brainlayer

Length of output: 1830


🏁 Script executed:

sed -n '1,50p' src/brainlayer/pipeline/enrichment.py | grep -n "import\|from"

Repository: EtanHey/brainlayer

Length of output: 492


🏁 Script executed:

# Check what the actual pattern is in other tests
grep -A 10 "pytest.raises" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44


🏁 Script executed:

# Check all imports in the test file
cat -n tests/test_enrichment_fallback.py | sed -n '1,10p'

Repository: EtanHey/brainlayer

Length of output: 404


🏁 Script executed:

# Check if pytest is imported anywhere
grep -n "^import pytest\|^from pytest" tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 44


🏁 Script executed:

# Look at the full test method to understand test setup
sed -n '87,115p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 1152


🏁 Script executed:

# Check if there are other patterns in the codebase for handling expected exceptions
grep -B 5 -A 5 "pytest.raises" tests/test_storage.py | head -30

Repository: EtanHey/brainlayer

Length of output: 747


🏁 Script executed:

# Verify requests module is accessible in enrichment
grep -n "requests\." src/brainlayer/pipeline/enrichment.py | head -10

Repository: EtanHey/brainlayer

Length of output: 678


🏁 Script executed:

# Check what error is actually raised when requests.get fails
python3 << 'EOF'
# Simulate what happens when run_enrichment tries to connect to a non-existent backend
import requests
try:
    resp = requests.get("http://localhost:9999/v1/models", timeout=5)
    resp.raise_for_status()
except Exception as e:
    print(f"Exception type: {type(e).__name__}")
    print(f"Exception: {e}")
EOF

Repository: EtanHey/brainlayer

Length of output: 573


🏁 Script executed:

# Check if the setup_method resets state for this test class
sed -n '87,92p' tests/test_enrichment_fallback.py

Repository: EtanHey/brainlayer

Length of output: 308


🏁 Script executed:

# Verify the actual exception type raised by run_enrichment when backends fail
python3 << 'EOF'
# Looking at the code path:
# 1. run_enrichment tries requests.get
# 2. If that fails with Exception, it tries fallback
# 3. If fallback also fails with Exception, it raises RuntimeError
# The test doesn't mock requests, so it will get ConnectionError
# Which will trigger the nested except, leading to RuntimeError

print("Expected behavior:")
print("1. requests.get() fails -> ConnectionError (subclass of Exception)")
print("2. First except catches it, tries fallback")
print("3. Fallback also fails -> another ConnectionError")
print("4. Second except catches it, raises RuntimeError")
print("5. Test should catch RuntimeError, not generic Exception")
EOF

Repository: EtanHey/brainlayer

Length of output: 346


🏁 Script executed:

# Check if there's a setup_method in this test class
grep -A 5 "class TestRunEnrichmentResetsState" tests/test_enrichment_fallback.py | grep -E "setup_method|def test"

Repository: EtanHey/brainlayer

Length of output: 103


Use pytest.raises() to explicitly assert the expected exception type.

Catching bare Exception masks potential regressions like assertion typos. The test should explicitly verify that RuntimeError is raised when the backend check fails, using the established pattern from other tests in the suite.
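To make the masking problem concrete, here is a small sketch that runs without pytest. `run_enrichment_stub` is a hypothetical stand-in for the backend check in `run_enrichment`, and `unittest`'s `assertRaises` is used as a standard-library substitute for `pytest.raises`; both behave the same way for this purpose.

```python
import unittest


def run_enrichment_stub(backends_up: bool) -> None:
    """Hypothetical stand-in: raises RuntimeError when no backend is reachable."""
    if not backends_up:
        raise RuntimeError("no enrichment backend reachable")


def loose_check() -> str:
    try:
        # Typo in the keyword name: this raises TypeError, not RuntimeError.
        run_enrichment_stub(backend_up=False)
    except Exception:
        pass  # the typo is silently swallowed
    return "passed"


def strict_check() -> str:
    # Only RuntimeError is accepted; anything else propagates and fails the test.
    with unittest.TestCase().assertRaises(RuntimeError):
        run_enrichment_stub(backend_up=False)
    return "passed"


loose_result = loose_check()
try:
    strict_check()
    strict_result = "passed"
except TypeError:
    strict_result = "TypeError surfaced"
```

The loose version reports success even though the call under test never ran as intended, while the strict version exposes the mistake.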

Suggested refactor
+import pytest
 from unittest.mock import patch

 from brainlayer.pipeline import enrichment
@@
         with patch.object(enrichment, "VectorStore") as mock_vs:
             mock_store = mock_vs.return_value
             mock_store.get_enrichment_stats.return_value = {
                 "enriched": 0,
                 "enrichable": 0,
                 "remaining": 0,
                 "skipped": 0,
                 "percent": "0",
                 "total_chunks": 0,
                 "by_intent": {},
             }
-            try:
-                enrichment.run_enrichment(max_chunks=0, batch_size=1)
-            except Exception:
-                pass  # Will fail on backend check, that's fine
+            with patch.object(enrichment.requests, "get", side_effect=ConnectionError("backend down")):
+                with pytest.raises(RuntimeError):
+                    enrichment.run_enrichment(max_chunks=0, batch_size=1)
🧰 Tools
🪛 GitHub Actions: CI

[error] ruff format check detected formatting issues. File would be reformatted. Run 'ruff format' to fix code style.


EtanHey and others added 2 commits February 26, 2026 18:40
When MLX crashes mid-batch (e.g., "Abort trap: 6"), the pipeline now
automatically switches to Ollama after 3 consecutive failures instead
of failing every remaining chunk in the batch.

- Track consecutive failures in call_llm()
- Auto-detect fallback backend availability
- Cache availability check for the run
- Reset fallback state on each run_enrichment() call
- 9 new tests for fallback logic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@EtanHey EtanHey force-pushed the feat/enrichment-mid-run-fallback branch from 79a0402 to 19c6c86 Compare February 26, 2026 16:40
@EtanHey EtanHey merged commit 8a468da into main Feb 26, 2026
5 checks passed