fix: threading crash, duplicate symbols, logging, and embedding insert (4 bugs) #11
Merged
kapillamba4 merged 2 commits on May 20, 2026
Four bugs found while indexing openclaw/openclaw (17,212 source files,
945 doc files) on an RTX 5060 Ti. The repo is a large TypeScript/Swift/
Kotlin monorepo (~17k files across 60+ extensions). All bugs surface only
at scale and were invisible in small test cases.
---
Bug 1: cross-thread SQLite access crashes ~30% of file parses
_parse_file_for_indexing ran inside ThreadPoolExecutor workers and called
db.execute() on the shared main-thread connection. This caused:
sqlite3.InterfaceError: bad parameter or other API misuse
on roughly 30% of files, even though the connection was opened with
check_same_thread=False. Python's sqlite3 binding is not safe for
concurrent access without explicit locking.
Fix: pre-fetch all existing file records into a dict[path → mtime] in the
main thread before launching the pool. Workers receive the dict and do a
dict.get() lookup instead of a DB query. No DB access in any worker thread.
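
A minimal sketch of the pre-fetch pattern described in the fix, not the project's actual code. The `files` table, its `path` and `mtime` columns, and the helper names `load_known_mtimes`, `needs_reindex`, and `plan_reindex` are assumptions made for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def load_known_mtimes(db: sqlite3.Connection) -> dict[str, float]:
    """Runs once in the main thread, before any worker starts."""
    # Hypothetical schema: files(path TEXT PRIMARY KEY, mtime REAL).
    return dict(db.execute("SELECT path, mtime FROM files"))


def needs_reindex(path: str, mtime: float, known: dict[str, float]) -> bool:
    """Worker-side check: a plain dict lookup, no database handle involved."""
    return known.get(path) != mtime


def plan_reindex(
    db: sqlite3.Connection, candidates: dict[str, float], max_workers: int = 4
) -> list[str]:
    """Return the candidate paths whose stored mtime is missing or different."""
    known = load_known_mtimes(db)  # the only DB access, done up front
    paths = list(candidates)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Workers only read the dict; the connection never leaves this thread.
        flags = pool.map(lambda p: needs_reindex(p, candidates[p], known), paths)
        return [p for p, stale in zip(paths, flags) if stale]
```

The workers only read the dict after it has been fully built, so no lock is needed around it.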
---
Bug 2: duplicate symbols from tree-sitter AST crash DB write
tree-sitter can produce multiple symbols with the same (name, kind,
line_start) for a single file. The plain INSERT INTO symbols raised:
sqlite3.IntegrityError: UNIQUE constraint failed:
symbols.file_id, symbols.name, symbols.kind, symbols.line_start
This killed the entire DB write phase after all parsing and GPU embedding
had already completed — wasting the entire indexing run.
Fix: INSERT OR IGNORE INTO symbols. Use cursor.rowcount == 1 to detect
whether the insert actually happened. cursor.lastrowid is NOT reliable
here — after a no-op INSERT OR IGNORE it retains the rowid from the
previous successful insert on the same connection, not 0.
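
The `rowcount` check can be sketched as follows. The unique columns come from the error message above; the `id` primary key column and the `insert_symbol` helper are assumptions for illustration, not the project's schema or API.

```python
import sqlite3


def insert_symbol(
    db: sqlite3.Connection, file_id: int, name: str, kind: str, line_start: int
) -> tuple[int, bool]:
    """Insert a symbol if absent and return (symbol_id, is_new)."""
    key = (file_id, name, kind, line_start)
    cur = db.execute(
        "INSERT OR IGNORE INTO symbols (file_id, name, kind, line_start) "
        "VALUES (?, ?, ?, ?)",
        key,
    )
    # rowcount is 1 for a real insert and 0 when the row was ignored.
    if cur.rowcount == 1:
        return cur.lastrowid, True
    # Duplicate: lastrowid cannot be trusted here, so look the row up.
    row = db.execute(
        "SELECT id FROM symbols "
        "WHERE file_id = ? AND name = ? AND kind = ? AND line_start = ?",
        key,
    ).fetchone()
    return row[0], False
```

Returning `is_new` alongside the id lets the caller decide what else to skip for duplicates.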
---
Bug 3: embedding insert crashes on sqlite-vec virtual table
After the Bug 2 fix, a duplicate symbol falls through to a SELECT that
returns the existing symbol_id. That ID already has an entry in
symbol_embeddings (a sqlite-vec virtual table). Attempting to insert
another embedding for it raised:
sqlite3.OperationalError: UNIQUE constraint failed on
symbol_embeddings primary key
INSERT OR IGNORE does not work on sqlite-vec virtual tables — the
conflict-resolution clause is rejected at the SQL level (OperationalError
instead of the usual IntegrityError).
Fix: guard embedding_pairs.append() with `if is_new` — only freshly
inserted symbols get embeddings queued. Existing symbols already have one.
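
A minimal sketch of the guard, kept free of sqlite-vec so it runs anywhere. Only `embedding_pairs` and `is_new` come from the description above; the `queue_new_embeddings` helper and its argument shapes are hypothetical.

```python
def queue_new_embeddings(
    results: list[tuple[int, bool]], vectors: list[list[float]]
) -> list[tuple[int, list[float]]]:
    """Pair each freshly inserted symbol id with its embedding vector.

    results holds (symbol_id, is_new) tuples; vectors is the parallel
    list of embeddings computed during the GPU phase.
    """
    embedding_pairs = []
    for (symbol_id, is_new), vector in zip(results, vectors):
        if is_new:  # existing symbols already have a row in the virtual table
            embedding_pairs.append((symbol_id, vector))
    return embedding_pairs
```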
---
Bug 4: logger.exception() reports all errors as "NoneType: None"
Exceptions from worker threads are stored as return values:
return (fpath, None, e)
Then in the main thread:
logger.exception("Failed to index %s", fpath)
logger.exception() reads sys.exc_info() — the current thread's exception
context — which is (None, None, None) since the exception occurred in a
different thread. Every failure logged as "NoneType: None" with no
traceback, making Bug 1 completely invisible.
Fix: logger.error("Failed to index %s", fpath, exc_info=error)
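
The pattern can be reproduced in a small sketch. The `parse_file` and `log_failures` helpers are hypothetical stand-ins for the real worker and main-thread loop; the `(fpath, None, e)` return shape and the `logger.error(..., exc_info=error)` call follow the description above.

```python
import logging
from concurrent.futures import ThreadPoolExecutor


def parse_file(fpath: str):
    """Worker: returns (fpath, result, error) instead of raising."""
    try:
        raise ValueError(f"cannot parse {fpath}")  # stand-in for a real failure
    except Exception as e:
        return (fpath, None, e)


def log_failures(logger: logging.Logger, paths: list[str]) -> int:
    """Main thread: log each stored exception with its own traceback."""
    failed = 0
    with ThreadPoolExecutor(max_workers=2) as pool:
        for fpath, _result, error in pool.map(parse_file, paths):
            if error is not None:
                # No exception is active in this thread, so pass the stored
                # exception object explicitly instead of relying on
                # sys.exc_info().
                logger.error("Failed to index %s", fpath, exc_info=error)
                failed += 1
    return failed
```

Passing an exception instance as `exc_info` makes the logging module format that exception's own traceback.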
---
Tested against openclaw/openclaw:
Before: ~30% of files silently skipped; DB write crash on first run
After: 17,212/17,212 code files indexed, 111,000 symbols, 750 MB DB
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
get_index_stats was returning `last_file_indexed` (a raw float Unix timestamp) in the freshness dict, but api_types.py defines the field as `last_code_indexed: str | None`. This caused a Pydantic validation error in MCP clients that validate tool output against the schema.

Two changes in get_index_stats():
- Rename key from `last_file_indexed` to `last_code_indexed`
- Convert float timestamp to ISO-8601 string via datetime.fromtimestamp().isoformat()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
kapillamba4 approved these changes on May 20, 2026
kapillamba4 added a commit that referenced this pull request on May 20, 2026
PR #11 left an unsorted import block in db.py (`from datetime import datetime` placed among the plain `import` statements), which fails `ruff check` (I001) and broke CI on main. Move it into the sorted from-import group.

Bump version 1.0.32 -> 1.0.33 in pyproject.toml, server.json (x2), and uv.lock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Four bugs found while indexing openclaw/openclaw, a large TypeScript/Swift/Kotlin monorepo with 17,212 source files, 945 doc files, and 60+ extensions. All four bugs surface only at scale and were invisible in small test cases. A fifth bug (schema mismatch in `get_index_stats`) was found via MCP client validation.

Bug 1: Cross-thread SQLite access crashes ~30% of file parses

Symptom: `sqlite3.InterfaceError: bad parameter or other API misuse` on roughly 30% of files during the parallel parse phase. Appeared as `NoneType: None` in logs due to Bug 4 below.

Root cause: `_parse_file_for_indexing` runs inside `ThreadPoolExecutor` workers but calls `db.execute()` on the shared main-thread connection. Python's `sqlite3` binding is not safe for concurrent multi-thread access even with `check_same_thread=False`.

Fix: Pre-fetch all existing file records into a `dict[path → mtime]` in the main thread before launching the pool. Workers receive the dict and do a `dict.get()` lookup instead of a DB query. No DB access occurs in any worker thread.

Bug 2: Duplicate symbols from tree-sitter AST crash DB write

Symptom: `sqlite3.IntegrityError: UNIQUE constraint failed: symbols.file_id, symbols.name, symbols.kind, symbols.line_start` in Phase 3 (DB write), after all parsing and GPU embedding had already completed.

Root cause: tree-sitter can produce multiple symbols with identical `(name, kind, line_start)` for a single file. The plain `INSERT INTO symbols` raised on the second occurrence and killed the entire run.

Fix: `INSERT OR IGNORE INTO symbols`. Use `cursor.rowcount == 1` to detect a real insert. Important: `cursor.lastrowid` is not reliable here. After a no-op `INSERT OR IGNORE` it retains the rowid from the previous successful insert, not `0`.

Bug 3: Embedding insert crashes on sqlite-vec virtual table

Symptom: `sqlite3.OperationalError: UNIQUE constraint failed on symbol_embeddings primary key` immediately after the Bug 2 fix was applied.

Root cause: `symbol_embeddings` is a `sqlite-vec` virtual table. It rejects `INSERT OR IGNORE` at the SQL level (`OperationalError`, not `IntegrityError`). When Bug 2's fallback `SELECT` returned an existing `symbol_id`, the code still queued an embedding for it, and the existing row already had one.

Fix: Guard `embedding_pairs.append()` with `if is_new` so that only freshly inserted symbols get embeddings queued. Existing symbols already have an embedding in the virtual table.

Bug 4: `logger.exception()` reports all errors as `NoneType: None`

Symptom: Every parse failure logged as `NoneType: None` with no traceback, making Bug 1 completely invisible.

Root cause: Exceptions from worker threads are stored as return values. `logger.exception("Failed to index %s", fpath)` reads `sys.exc_info()`, the current thread's exception context, which is `(None, None, None)` since the exception happened in a different thread.

Fix: `logger.error("Failed to index %s", fpath, exc_info=error)` passes the stored exception explicitly.

Bug 5: `get_index_stats` freshness fields fail MCP schema validation

Symptom: MCP clients that validate tool output against `api_types.py` receive a Pydantic `ValidationError` on every `get_index_stats` call. The tool appears to fail even though the server responds successfully.

Root cause: `db.py` builds the freshness dict with key `last_file_indexed` (a raw `float` Unix timestamp from `MAX(last_modified)`), but `api_types.IndexFreshness` declares `last_code_indexed: str | None`. Two mismatches: wrong field name and wrong type.

Fix: Rename key to `last_code_indexed` and convert the float via `datetime.fromtimestamp(...).isoformat()`.

Test results on openclaw/openclaw

`get_index_stats` MCP validation

Verification
🤖 Generated with Claude Code