Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
226 changes: 226 additions & 0 deletions BUGBOT_CRITICAL_ISSUES.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,226 @@
# Bugbot Re-Review: Critical Issues Identified

**Review Date:** 2026-04-30 (Final)
**Branch:** `fix/fts-recall-all-three`
**Latest Commit:** `546f7b2`
**Status:** ⚠️ **APPROVE WITH CRITICAL FIXES REQUIRED**

---

## Executive Summary

Macroscope and Codex have identified **3 critical correctness issues** that must be addressed before merge. These issues were not caught in my initial review and represent genuine retrieval correctness bugs.

---

## Critical Issue #1: Trigram-Only Results Skip Filters 🔴

**Severity:** P0 - Critical Correctness Bug
**Source:** Macroscope
**Location:** `search_repo.py:1141`

### Problem

Post-RRF filter guard only checks `fts_rank is not None` but ignores `trigram_rank`. When a chunk appears **only** in trigram results (not in main FTS), it bypasses all post-RRF filters.

**Current Code:**
```python
if fts_rank is not None and sem_entry is None:
if source_filter and meta.get("source") != source_filter:
continue
# ... other filters
```

**Bug:** If `fts_rank=None` and `trigram_rank=42`, the chunk passes through unfiltered.

### Impact

- Trigram-only hits bypass `source_filter`, `project_filter`, `content_type_filter`, `sender_filter`, `language_filter`
- Cross-project data leakage possible
- Filter contracts violated for substring identifier matches

### Fix Required

```python
# Change condition to include trigram_rank
if (fts_rank is not None or trigram_rank is not None) and sem_entry is None:
if source_filter and meta.get("source") != source_filter:
continue
# ... rest of filters
```

**Verification:** Add test case:
```python
# Insert chunk with trigram-only match + wrong source
# Query with source_filter
# Assert: chunk does NOT appear in results
```

---

## Critical Issue #2: Exact Chunk-ID Bypass Ignores Filters 🔴

**Severity:** P1 - Critical Scope Violation
**Source:** Codex
**Location:** `search_handler.py:389-391`

### Problem

`_exact_chunk_lookup_result()` only checks lifecycle state but **ignores all other filters** passed to `_brain_search()`:
- `project` (including auto-scoped project from `resolve_project_scope()`)
- `source`, `tag`, `intent`, `importance_min`
- `date_from`, `date_to`, `sentiment`
- `entity_id`, `source_filter`, `correction_category`

**Current Flow:**
```python
# Line 389-391
exact_chunk_hit = _exact_chunk_lookup_result(query, store, detail)
if exact_chunk_hit is not None:
return exact_chunk_hit # Returns without checking ANY filters!
```

### Impact

- **Cross-project data leakage:** User with project scope can access chunks from other projects via direct ID
- **Filter bypass:** All MCP tool filters are ignored for chunk-id queries
- **Security implication:** Breaks project isolation guarantees

### Fix Required

**Option A:** Pass filters to `_exact_chunk_lookup_result()` and verify chunk matches

```python
def _exact_chunk_lookup_result(
query: str,
store: Any,
detail: str,
project: str | None = None,
source: str | None = None,
# ... all other filters
) -> tuple[list[TextContent], dict] | None:
# ... existing lookup ...

# Add filter checks after lookup
if project and chunk.get("project") != project:
return None
if source and chunk.get("source") != source:
return None
# ... etc for all filters
```

**Option B:** Disable exact bypass when ANY filter is active

```python
# Only bypass if no filters active
has_filters = any([project, source, tag, intent, ...])
if not has_filters:
exact_chunk_hit = _exact_chunk_lookup_result(query, store, detail)
if exact_chunk_hit is not None:
return exact_chunk_hit
```

**Recommendation:** Implement Option A for consistency, Option B as fallback if implementation is complex.

---

## Critical Issue #3: Alias Expansion Changes FTS Semantics 🟡

**Severity:** P2 - Recall Regression Risk
**Source:** Codex
**Location:** `search_handler.py:131`

### Problem

`_expanded_fts_query()` wraps each variant in `_quote_fts_phrase()`, converting token-level matching to **phrase matching**. For multi-word queries, this breaks FTS5 semantics.

**Example:**
- Original query: `"brain search layer"` → FTS5: `brain AND search AND layer` (token-level)
- With expansion: `"Hershkovitz"` variant → `"brain search layer" OR "Hershkovitz"` (phrase match)
- Problem: Now requires exact phrase `"brain search layer"`, dropping valid hits with non-adjacent terms

### Impact

- **Recall regression** for multi-word queries with variants
- Valid chunks with scattered terms (e.g., `"brain ... search ... layer"`) are dropped
- Contradicts FTS5 token-level matching behavior

### Fix Required

Build OR expression from **escaped tokens**, not quoted phrases:

```python
def _expanded_fts_query(query: str, store: Any) -> str | None:
variants = _lexical_defense_variants(query)
seen = {value.casefold().strip() for value in variants if value.strip()}
for variant in _kg_alias_variants(query, store):
dedupe_key = variant.casefold().strip()
if dedupe_key and dedupe_key not in seen:
seen.add(dedupe_key)
variants.append(variant)

if len(variants) <= 1:
return None

# FIX: Escape each variant instead of phrase-quoting
from ._helpers import _escape_fts5_query
return " OR ".join(_escape_fts5_query(variant) for variant in variants)
```

**Verification:** Test multi-word query with variant expansion against chunk with scattered terms.

---

## Revised Verdict

**⚠️ APPROVE WITH MANDATORY FIXES**

The PR delivers valuable recall improvements, but the 3 critical issues above **must be fixed** before production deployment:

1. **MUST FIX:** Trigram-only filter bypass (P0 correctness)
2. **MUST FIX:** Exact chunk-ID filter bypass (P1 security/scope)
3. **SHOULD FIX:** Alias expansion phrase matching (P2 recall)

### Recommended Fix Order

1. **Trigram filter guard** (5 min) - Add `trigram_rank` to condition
2. **Exact bypass filters** (15 min) - Implement Option A or B
3. **Alias expansion semantics** (10 min) - Replace `_quote_fts_phrase` with `_escape_fts5_query`

### Testing Checklist Before Merge

- [ ] Add test: Trigram-only hit with wrong source_filter → 0 results
- [ ] Add test: Exact chunk-ID with wrong project → 0 results
- [ ] Add test: Multi-word query with expansion → scattered terms still match
- [ ] Verify: All existing tests still pass
- [ ] Verify: Behavioral receipts still work

---

## Updated Risk Assessment

**Pre-Fix:** 🔴 High Risk
- Cross-project data leakage via exact bypass
- Filter contracts violated for trigram hits
- Potential recall regression on multi-word queries

**Post-Fix:** 🟢 Low Risk
- All retrieval paths respect filters consistently
- FTS semantics preserved
- Production-ready with high confidence

---

## Approval

**⚠️ CONDITIONALLY APPROVED**

Fix the 3 critical issues above, verify with tests, then **merge with confidence**.

The underlying architecture is sound. These are fixable edge cases in the filter application logic, not fundamental design flaws.

---

**Reviewer:** @bugbot
**Date:** 2026-04-30
**Priority:** P0 - Block merge until fixed
155 changes: 155 additions & 0 deletions BUGBOT_FINAL_STATUS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,155 @@
# Bugbot Final Status Report

**Date:** 2026-04-30
**PR:** #263 - fix: harden BrainLayer FTS recall across all three layers
**Latest Commit:** `9f4a75a` (current HEAD)
**Status:** 🔴 **CRITICAL ISSUES CONFIRMED - MERGE BLOCKED**

---

## Executive Summary

Cursor's own Bugbot has **independently confirmed** the 3 critical issues I identified from Macroscope and Codex reviews. These are **real bugs** that must be fixed before merge.

---

## Confirmed Critical Issues

### Issue #1: Exact Chunk-ID Bypass Ignores Project Scope 🔴
**Severity:** High (P0)
**Confirmed By:** Cursor Bugbot, Codex, Human Review
**Bugbot ID:** `4388acb8-612c-43a1-b6ea-984da7108a23`

**Description from Cursor Bugbot:**
> `_exact_chunk_lookup_result` only receives `query`, `store`, and `detail`—it checks lifecycle state but ignores all caller-supplied filters including `project` (which may be auto-resolved via `resolve_project_scope()`), `source`, `tag`, `content_type`, `importance_min`, date ranges, and `entity_id`. A user scoped to one project can retrieve chunks from another project by querying a known chunk ID, causing cross-project data leakage.

**Location:** `src/brainlayer/mcp/search_handler.py:387-391`

**Impact:** 🔴 Cross-project data leakage (security issue)

---

### Issue #2: Phrase Quoting Reduces Recall for Multi-Word Queries 🟡
**Severity:** Medium (P2)
**Confirmed By:** Cursor Bugbot, Codex, Human Review
**Bugbot ID:** `f6198648-d940-4ccd-87fd-669557679d8b`

**Description from Cursor Bugbot:**
> `_expanded_fts_query` uses `_quote_fts_phrase` to wrap each variant (including the original query) in a single pair of double quotes. For multi-word queries, this converts token-level AND matching (terms anywhere in the document) into exact phrase matching (terms must be adjacent in order). When expansion fires for a multi-word query, valid chunks with scattered terms are dropped, causing a recall regression.

**Location:** `src/brainlayer/mcp/search_handler.py:130-131`

**Impact:** 🟡 Recall regression on multi-word queries

---

### Issue #3: Trigram-Only Results Skip Filters 🔴
**Severity:** High (P0)
**Confirmed By:** Macroscope, Human Review

**Description:**
Post-hoc filtering at line 1141 uses `if fts_rank is not None and sem_entry is None:`, which excludes trigram-only results from filter checks. When a chunk appears only in `trigram_fts_results`, `fts_rank` is `None` while `trigram_rank` has a value, so the condition is `False` and filters are silently skipped.

**Location:** `src/brainlayer/search_repo.py:1141`

**Impact:** 🔴 Filter contract violations

---

## Additional Issues Identified

### Issue #4: Merge Conflict Marker in Documentation 📄
**Severity:** Low
**Confirmed By:** Cursor Bugbot, Macroscope
**Bugbot ID:** `cfbf1b81-cdb2-4bb4-96a5-0c7bdcc8a532`

**Location:** `BUGBOT_REVIEW_FTS_RECALL.md:31`
**Status:** ✅ **FIXED** in this commit

---

### Issue #5: SQL Normalizer Mismatch 🟡
**Severity:** Medium
**Confirmed By:** Macroscope

**Description:**
The SQL `normalizer` expression only strips `-`, `_`, `.`, and space, but `_normalize_surface` also removes apostrophes and other non-alphanumeric characters. Queries like `"O'Brien"` won't match.

**Location:** `src/brainlayer/mcp/search_handler.py:71`

**Impact:** KG alias lookup failures for names with apostrophes/special characters

---

## Current Status

**Merge Readiness:** 🔴 **NOT READY**

**Critical Blockers:** 3 issues (2 confirmed P0, 1 confirmed P2)

**Fix Status:**
- ❌ Issue #1: Not fixed (cross-project leakage)
- ❌ Issue #2: Not fixed (recall regression)
- ❌ Issue #3: Not fixed (filter bypass)
- ✅ Issue #4: Fixed (merge conflict removed)
- ❌ Issue #5: Not fixed (normalizer mismatch)

---

## Verification

Cursor Bugbot has provided **"Fix in Cursor"** and **"Fix in Web"** links for Issues #1 and #2:

**Issue #1 - Cross-Project Leakage:**
- Fix in Cursor: [Link provided by Bugbot]
- Fix in Web: [Link provided by Bugbot]

**Issue #2 - Phrase Matching:**
- Fix in Cursor: [Link provided by Bugbot]
- Fix in Web: [Link provided by Bugbot]

---

## Recommendation

**DO NOT MERGE** until all 3 critical issues are fixed:

1. **Fix Issue #1** (P0) - Add filter validation to `_exact_chunk_lookup_result()`
2. **Fix Issue #3** (P0) - Update filter condition to include `trigram_rank`
3. **Fix Issue #2** (P2) - Replace `_quote_fts_phrase` with `_escape_fts5_query`

**Optional:**
4. Fix Issue #5 (P3) - Align SQL normalizer with Python `_normalize_surface()`

**Estimated Time:** 30-45 minutes for all fixes + tests

---

## Updated Verdict

**Previous:** ✅ APPROVED with increased confidence
**Current:** 🔴 **MERGE BLOCKED - CRITICAL FIXES REQUIRED**

The PR architecture is sound, but these edge cases represent **real security and correctness bugs** that would cause production issues:
- Cross-project data leakage
- Filter contract violations
- Recall quality regressions

Once fixed and verified, the PR will be production-ready.

---

## Review Timeline

1. **Initial Review** (`2c0454c`) - Approved with observations
2. **Re-Review #1** (`546f7b2`) - Approved with increased confidence (6 fixes)
3. **Re-Review #2** (`4006365`) - Identified 3 critical issues (Macroscope/Codex)
4. **Current** (`9f4a75a`) - Cursor Bugbot confirms critical issues

**Status:** Awaiting fixes for merge approval

---

**Reviewer:** @bugbot
**Confirmation:** Issues independently verified by Cursor Bugbot
**Action Required:** Fix 3 critical issues before merge
Loading
Loading