fix: TUI indexing status + SCIP LMDB MDB_MAP_FULL fix (v1.0.128)
* [worker] cleanup: AGENTS.md - 73% reduction, removed stale test report and duplicate bug details
* docs: update CHANGELOG - v1.0.132 consolidated release notes (v1.0.97...v1.0.132)
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* fix(mcp): ignore project/group params in local/stdio mode instead of erroring
When running MCP in local mode (no serve_state), project/group routing is meaningless because only one DB is available. Log a warning and fall back to local DB instead of returning an error.
* fix(qc): define YELLOW color in bash QC script
…al duplicate (v1.0.137) (#76)
* fix: create DB directory before acquiring writer lock (serve auto-register)
When `serve` is running and `codesearch index` is run for a repo not yet known to it, auto-register (POST /repos) failed with a misleading "Database is locked by another process" 500: SharedStores::new() acquired the writer lock before the .codesearch.db directory existed, so opening .writer.lock failed with "path not found". This rolled back the repos.json registration and made the CLI fall back to a local duplicate index instead of delegating to serve.
- acquire_writer_lock / SharedStores::new now create the DB directory first; genuine I/O errors surface distinctly instead of as a lock conflict.
- Serve config writes route through ServeState::persist_config() (honors the config path override); production behavior unchanged, register/remove path now hermetically testable.
- Regression guards exercise the brand-new-repo create/register path with the DB directory genuinely absent (verified to fail against the pre-fix code).
- CHANGELOG: 1.0.136.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: never silently create a local duplicate index when serve is busy
The CLI probes serve's /health before delegating `index`/`index add`. Any health failure, including a *timeout* while serve is warming up its repos at startup, was classified as "serve not running", so the CLI silently created a local index. That local index is a duplicate serve does not manage and can cause LMDB file-lock conflicts (and the repo never gets registered with serve).
New behavior via probe_serve_health():
- Responsive -> delegate as before.
- Connection refused / cannot connect -> serve not running; index locally. Detected immediately (no timeout elapses, no retries), so the common "no serve -> local" path is NOT slowed down.
- Listening but unresponsive (timeout, retried briefly) -> serve is up but busy. The CLI now REFUSES to create a local duplicate and tells the user to retry shortly or stop serve first. The fallback is never silent anymore.
Delegation errors are now typed (DelegateError: ServeDown / ServeUnresponsive / Failed) instead of string-matched. Applies to `index` and `index add` (the index-creating paths); `index rm` is unchanged.
Tests: probe classification guards (responsive -> Up; listening-but-slow -> Unresponsive). Rolls into the 1.0.137 release together with the writer-lock fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
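The three-way health classification described in this commit can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: the `ServeHealth` enum, its `Down` variant, the hand-written HTTP request, and the single-attempt timeout handling are all hypothetical, and the real `probe_serve_health()` may use a different HTTP client and retry policy.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical result type; only `Up` and `Unresponsive` are named in the commit text.
#[derive(Debug, PartialEq, Eq)]
pub enum ServeHealth {
    /// /health answered: safe to delegate.
    Up,
    /// Nothing is listening: index locally.
    Down,
    /// Something is listening but did not answer in time: refuse the local fallback.
    Unresponsive,
}

/// Sketch of a probe that separates "not running" from "running but busy".
pub fn probe_serve_health(addr: SocketAddr, timeout: Duration) -> ServeHealth {
    // A refused connection fails immediately, so the "no serve" path is not slowed down.
    let mut stream = match TcpStream::connect_timeout(&addr, timeout) {
        Ok(stream) => stream,
        Err(_) => return ServeHealth::Down,
    };
    // From here on the port is open, so any failure means "up but not answering".
    if stream.set_read_timeout(Some(timeout)).is_err() {
        return ServeHealth::Unresponsive;
    }
    let request = b"GET /health HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n";
    if stream.write_all(request).is_err() {
        return ServeHealth::Unresponsive;
    }
    let mut buf = [0u8; 64];
    match stream.read(&mut buf) {
        Ok(n) if n > 0 => ServeHealth::Up,
        _ => ServeHealth::Unresponsive,
    }
}
```

The point of the split is that only `Down` justifies a local index; `Unresponsive` means a server owns (or will own) the database, so creating a second index would risk the LMDB lock conflicts described above.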
Brings the two files that drifted on master during the v1.0.137 release back to develop: the updated protect-master.yml (allows release/* branches) and the CHANGELOG [1.0.135] entry. After this, develop and master trees are identical.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…8) (#79)
Two Windows path-handling bugs that caused spurious "Database not found" errors and local duplicate indexes:
1. register()/register_with_alias() stored the raw canonicalize() result in repos.json. On Windows, canonicalize() returns \\?\C:\... (extended-length UNC prefix). Downstream .join(".codesearch.db") and Path::exists() calls then fail inconsistently (\\?\C:\foo\.codesearch.db not found even when C:\foo\.codesearch.db exists). 7 repos were affected. Fix: strip_unc() removes the prefix before storage. Existing repos.json patched in-place. Regression test: register_strips_unc_prefix_from_stored_path.
2. 500 "Database not found" from reindex (alias registered but DB gone) was treated as a generic failure -> local fallback -> duplicate index. Fix: triggers the same auto-register POST /repos path as 404 (DB recreated by serve, no local fallback).
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
… across codebase (#82)
ROOT CAUSE OF RECURRING BUG CLASS
Path::canonicalize() on Windows returns \\?\C:\... (extended-length UNC prefix). Any downstream .join(), .exists(), or HashMap key built from that path behaves inconsistently: the sub-path \\?\C:\foo\.codesearch.db may return false from exists() even when C:\foo\.codesearch.db is present. This class of bug has silently broken registrations multiple times.
FIX
Introduce safe_canonicalize(path: &Path) -> io::Result<PathBuf> and strip_unc_prefix(path: PathBuf) -> PathBuf in src/cache/file_meta.rs. These are the ONLY approved way to canonicalize paths in this codebase. Exported via crate::cache.
CALL SITES UPDATED (all raw .canonicalize() removed)
- src/cache/file_meta.rs: central definition + 5 new regression tests
- src/db_discovery/repos.rs: register, register_with_alias, unregister_path, alias_for_path; local strip_unc() removed
- src/db_discovery/mod.rs: find_best_database, get_db_path_for_cwd
- src/index/mod.rs: find_git_root, get_global_db_path, add_to_index, remove_from_index, try_delegate_reindex_to_serve (x2), try_delegate_rm_to_serve
- src/lmdb_registry.rs: TrackedEnv registry key (eliminates double-open risk when same dir accessed with and without \\?\ prefix)
- src/serve/mod.rs: add_repo_handler, run_serve --register path
POLICY DOCUMENTED
AGENTS.md: "⚠️ Canonical Path Policy - MANDATORY" section with rule, code example, and pointer to regression tests.
REGRESSION TESTS (6 new in cache/file_meta.rs + 1 existing in repos.rs)
- strip_unc_prefix_removes_windows_unc
- strip_unc_prefix_is_idempotent_on_{plain_path,unix_path}
- safe_canonicalize_on_existing_dir_returns_plain_path
- safe_canonicalize_on_nonexistent_path_returns_error
- register_strips_unc_prefix_from_stored_path (repos.rs: verifies fallback path also strips UNC when canonicalize() fails)
407 lib tests pass. clippy -D warnings clean.
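The two helpers named in this commit could look roughly like the following. The signatures come from the commit message; the bodies are a sketch and an assumption, including the string-based prefix check (chosen so the example behaves the same on every OS) and the handling of the `\\?\UNC\` form, which the commit does not mention.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Remove the Windows extended-length prefix (`\\?\`) if present.
/// Paths without the prefix come back unchanged, so the call is idempotent.
pub fn strip_unc_prefix(path: PathBuf) -> PathBuf {
    let stripped = match path.to_str() {
        Some(s) => {
            if let Some(rest) = s.strip_prefix(r"\\?\UNC\") {
                // \\?\UNC\server\share -> \\server\share (assumed handling)
                Some(PathBuf::from(format!(r"\\{rest}")))
            } else {
                // \\?\C:\foo -> C:\foo
                s.strip_prefix(r"\\?\").map(PathBuf::from)
            }
        }
        // Non-UTF-8 paths are left untouched in this sketch.
        None => None,
    };
    stripped.unwrap_or(path)
}

/// Canonicalize, then strip the prefix, so callers never see `\\?\` paths.
pub fn safe_canonicalize(path: &Path) -> io::Result<PathBuf> {
    path.canonicalize().map(strip_unc_prefix)
}
```

Routing every call site through one wrapper means a stored path, a joined sub-path, and a registry key all use the same spelling of a directory, which is what the regression tests listed above guard.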
Co-authored-by: flupkede <flupkede@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…b_path_smart (#84)
The old normalize_path(&p.canonicalize()...) pattern in get_db_path_smart was missed in the central safe_canonicalize refactor (v1.0.139). It worked correctly (normalize_path also strips UNC) but was inconsistent with the policy. Now all .canonicalize() calls outside safe_canonicalize's own definition are eliminated.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#86)
PROBLEM 1: ServeUnresponsive aborted with error instead of waiting
When serve is warming up (opening LMDB for 15+ repos blocks the tokio runtime, causing /health to time out), the CLI refused with an error. The user had to retry manually.
FIX: serve_delegate_with_warmup_wait() wraps both try_delegate_reindex_to_serve and try_delegate_add_to_serve. On ServeUnresponsive it prints "⏳ serve is starting up, waiting..." and retries every 8s up to 6 times (~2 min budget). On success it prints "✅ serve is ready, delegating...". Only exhausting the full budget returns an error.
PROBLEM 2: 409 Conflict from POST /repos on "Database not found" path
When a registered repo's DB was missing, the CLI tried POST /repos to recreate it. Serve correctly returned 409 (alias already registered). The CLI treated 409 as a failure and fell back to local indexing.
FIX: when auto-add returns 409, retry as POST /repos/{alias}/reindex?force=true. Force reindex uses allow_create=true and creates the DB via serve without local fallback.
AGENTS.md: document the root cause (tokio blocking during warmup) as a remaining work item with diagnosis and fix guidance.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
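The warmup wait from PROBLEM 1 amounts to a bounded retry loop that reacts only to the "unresponsive" error. The sketch below is hypothetical: the function name `delegate_with_warmup_wait`, the `String` payload on `Failed`, and the injected `sleep` callback (used so the loop can be exercised without real delays) are assumptions, not the signature of `serve_delegate_with_warmup_wait()`.

```rust
use std::time::Duration;

/// Variant names follow the commit text; the `String` payload is assumed.
#[derive(Debug, PartialEq, Eq)]
pub enum DelegateError {
    ServeDown,
    ServeUnresponsive,
    Failed(String),
}

/// Retry `attempt` while serve reports as unresponsive, up to `max_retries`
/// extra attempts spaced `interval` apart. Any other outcome returns at once.
pub fn delegate_with_warmup_wait<T>(
    mut attempt: impl FnMut() -> Result<T, DelegateError>,
    mut sleep: impl FnMut(Duration),
    max_retries: u32,
    interval: Duration,
) -> Result<T, DelegateError> {
    let mut retries = 0;
    loop {
        match attempt() {
            // Serve is up but still warming: wait and try again.
            Err(DelegateError::ServeUnresponsive) if retries < max_retries => {
                retries += 1;
                sleep(interval);
            }
            // Success, a different error, or an exhausted budget.
            other => return other,
        }
    }
}
```

With the values quoted in the commit, a caller would pass `6` and `Duration::from_secs(8)`, and `std::thread::sleep` as the sleep callback.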
…n_blocking (#88)
PROBLEM
codesearch serve became unresponsive during startup warmup: FileWalker::walk, VectorStore::build_index (HNSW), and fastembed/ONNX embedding (saturates all cores) ran synchronously on tokio worker threads. This starved the async runtime, /health timed out (>3s), and `codesearch index` reported "serve did not respond in time". The server already returns 202 + spawns background indexing (accept-and-defer); it just couldn't respond while warming.
FIX
Offload the heavy synchronous warmup work to tokio::task::spawn_blocking, so the async executor stays responsive (answers /health and accepts POST /repos immediately, runs the job in the background).
- serve/mod.rs warmup_repo: read stats under .read(); build_index via spawn_blocking + Arc clone + blocking_write. Build failure only warns.
- manager.rs perform_incremental_refresh_with_stores: walk, read+chunk+embed, and build_index all offloaded.
- manager.rs refresh_index_with_stores: walk + both build_index calls offloaded.
LOCK SAFETY (verified by review)
Every async RwLock guard scope CLOSES before the spawn_blocking that calls .blocking_write() on the same store, so there is no lock-over-await deadlock. blocking_write is only ever called inside spawn_blocking (never on an async worker).
Test: test_incremental_refresh_up_to_date_is_noop exercises the refactored walk path. 408 lib tests pass, clippy -D warnings clean.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… commands (#91)
* chore: add /merge and /release Claude Code slash commands
Codify the project release workflow as two committed slash commands under .claude/commands/ (force-added past .gitignore, like .claude/CLAUDE.md):
- /merge: README/CHANGELOG freshness checks -> commit -> validate -> push -> PR to develop -> auto-merge after CI. No tag.
- /release: /merge, then promote develop -> master via a "Release vX.Y.Z" PR (protect-master allows develop), then push the vX.Y.Z tag that triggers release.yml. Includes optional post-release develop sync.
Commands document the repo's real conventions: feature->develop->master flow, master branch protection, and the pre-commit version-bump-on-feature-branches rule that fixes the release version at the feature commit.
Tooling-only change on a chore/ branch: no version bump, no CHANGELOG entry (CHANGELOG tracks the shipped binary's behavior).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore: address review remarks on /merge and /release commands
- /merge: abort unless on feature/*|features/*|fix/* (the only branches the pre-commit hook version-bumps). This closes the gap where running from a non-bumping branch silently broke the version/CHANGELOG premise.
- Clarify CHANGELOG heading version math for multi-commit landings (hook bumps +1 per commit; verify heading matches Cargo.toml after the final commit).
- Capture PR numbers explicitly (gh pr view --json number) before merge/poll.
- /release: fetch --tags and guard against a double release (stop if the tag already exists locally or on origin).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: document /merge and /release workflow in AGENTS.md
Add a Release workflow section describing the two slash commands, the branch-protection rule, the tag-triggers-release.yml pipeline, and the feature-branch-only version-bump rule that fixes the release version.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(chunker): semantic Markdown chunking via tree-sitter-md
Markdown and .txt files were indexed as a single whole-file block (the fallback chunker has no char budget), so a search hit returned an entire page; real Aprimo docs reached 80 KB in one chunk.
Add the tree-sitter-md *block* grammar and chunk Markdown by heading section instead: each chunk is one heading plus its own prose/code, excluding nested subsections (which become their own chunks). The heading path is carried in the breadcrumb context (File > Title > Subsection) so embeddings capture each section's place in the document.
Also add split_oversized, a char- and line-aware splitter for the unstructured paths (Markdown + the generic fallback): a single physical line longer than the char budget is hard-split on UTF-8 boundaries, so scraped one-line HTML/markdown can no longer produce an enormous chunk. The structured code path keeps using split_if_needed unchanged, so code chunking is unaffected.
- Cargo.toml: add tree-sitter-md 0.5.3
- grammar.rs/language.rs: register Markdown as tree-sitter-supported
- semantic.rs: chunk_markdown + emit_md_section + split_oversized
- tests: section split, nested breadcrumbs, oversized + long-line splits
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] final review: fix chunk_markdown doc comment
Reference the actual splitter used by the markdown path (split_oversized, char-aware) instead of split_if_needed (the code path's line-based splitter).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: document semantic Markdown chunking + correct language table
- CHANGELOG: add [1.0.145] entry for tree-sitter-md block-grammar Markdown chunking (sections/headings/code fences).
- README: expand the Supported Languages table to all 15 tree-sitter languages and bump the "9 languages" count to 15, correcting pre-existing drift that omitted Shell, Ruby, PHP, YAML, JSON, and (new) Markdown.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(test): sanitize customer ref in markdown chunking fixtures
The pre-push customer-ref guard flagged "aprimo" in two semantic.rs test fixtures (a frontmatter URL and a comment). Replaced with generic example.com / "real-world scraped docs"; the test assertions never reference either, so behavior is unchanged. Realign CHANGELOG heading to the post-bump version (1.0.146).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
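The "hard-split on UTF-8 boundaries" step of the chunker commit can be illustrated in isolation. This is a sketch only: `hard_split_line` is a hypothetical helper, not the project's `split_oversized`, and it assumes the budget is counted in characters rather than bytes.

```rust
/// Split one over-long line into pieces of at most `max_chars` characters.
/// Cuts always fall on char boundaries, so multi-byte UTF-8 sequences stay intact.
pub fn hard_split_line(line: &str, max_chars: usize) -> Vec<&str> {
    assert!(max_chars > 0, "the char budget must be positive");
    let mut pieces = Vec::new();
    let mut rest = line;
    while !rest.is_empty() {
        // Byte offset of the first char past the budget, or the end of the string.
        let cut = rest
            .char_indices()
            .nth(max_chars)
            .map_or(rest.len(), |(idx, _)| idx);
        let (head, tail) = rest.split_at(cut);
        pieces.push(head);
        rest = tail;
    }
    pieces
}
```

Slicing at a raw byte offset would panic in Rust when the offset lands inside a multi-byte character, which is why the cut point is derived from `char_indices()`.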
* [worker] stage 1/5: capture git remote identity per repo
Add RepoMeta.git_remote (serde default, backward compatible) and a best-effort git_remote_url() helper. Populate it in register() and register_with_alias() so every registered repo records its remote.origin.url for later relocation of moved/renamed folders.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] stage 2/5: relocate moved repos + reconcile pass + index prune
- Best-effort git relocation: try_relocate() walks to nearest existing ancestor and bounded-depth scans for a git root with matching remote.origin.url; unambiguous single match rewrites repos.json.
- ServeState::reconcile_all_paths() runs at startup before phase 1/2/3; relocates or warns+skips missing paths (never crashes).
- Existence guards added to phase-2 SCIP and phase-3 prewarm consumers.
- New `codesearch index prune` command: relocate-first, else unregister stale aliases, with summary output.
- CODESEARCH_RELOCATE_MAX_DEPTH env (default 3).
- Unit tests for capture-on-register and try_relocate (renamed leaf, path-exists, no-remote, ambiguous).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] stage 3/5: remove user-settable --alias, always derive
- Drop `--alias`/`-a` from `index add` subcommand and the legacy `index --add` flag path. Alias is always derived from the directory name via ReposConfig::register().
- add_to_index() loses its `alias` parameter; legacy current-dir local DBs are now auto-registered with a derived alias.
- Serve delegation always sends None so serve derives the alias too.
- Replace test_cli_index_add_accepts_alias_flag with test_cli_index_add_rejects_alias_flag + parses_without_alias.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] stage 4/5: tolerate hand-edited repos.json via reconcile()
- ReposConfig::reconcile() runs from load_from() on both new and legacy parse paths (in-memory only, no disk write):
  1. drop entries with empty/blank alias keys
  2. drop orphan repos_meta entries with no matching repo
  3. prune group members referencing unknown aliases; drop empty groups
- Never renames existing alias keys (would break group refs); a non-standard hand-edited alias is tolerated as-is. Never crashes.
- Unit tests for empty-key, group-pruning/empty-group, and orphan-meta.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] stage 5/5: docs + tighten reconcile() visibility
- Document stale-path relocation, `index prune`, derived-alias policy, and repos.json reconcile() in AGENTS.md and .claude/CLAUDE.md.
- reconcile() is now pub(crate) (only used internally + same-module tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] final review: use DB_DIR_NAME constant in relocation scan skip-list
Replace hardcoded ".codesearch.db" literal with crate::constants::DB_DIR_NAME in is_skippable_scan_dir (no-hardcoded-config-strings rule).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] tests: extract testable prune_stale/relocate_missing + expand coverage
Refactor for testability (no behavior change):
- Add pure ReposConfig::relocate_missing() -> (relocated, unresolved) and prune_stale() -> (relocated, removed); no disk I/O, no logging.
- prune_index() and ServeState::reconcile_all_paths() now delegate to these, removing duplicated relocate-loop logic.
New unit tests (8):
- register_derives_alias_from_directory_name
- try_relocate_finds_renamed_parent (parent-level rename within depth)
- try_relocate_none_beyond_max_depth (depth bound enforced)
- relocate_missing_rewrites_only_moved_repos
- prune_stale_removes_unrelocatable_entries (+ group cleanup)
- prune_stale_relocates_then_keeps_relocatable_entries
- load_from_applies_reconcile_to_hand_edited_file (load-path reconcile)
24 repos lib tests pass; clippy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: README + CHANGELOG for relocation, index prune, derived alias
- README: document `codesearch index prune`, automatic relocation of moved/renamed repos (CODESEARCH_RELOCATE_MAX_DEPTH), the alias-always-derived policy (no --alias flag), and hand-edited repos.json tolerance.
- CHANGELOG: consolidated 1.0.149 entry (Added/Changed/Fixed).
- README language table + alias example updates (pre-existing).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [worker] address review remarks: align CHANGELOG version + restore log path
- CHANGELOG entry retitled to 1.0.151 to match the shipped Cargo.toml version (pre-commit bumps patch by 1 on this commit).
- reconcile warn for unresolved repos again includes the missing path for diagnostics (lost during the relocate_missing extraction).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
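The three reconcile rules from stage 4/5 can be expressed as in-memory `retain` passes. The struct below is a simplified stand-in and an assumption: the field names `repos`, `repos_meta`, and `groups` appear in these commit messages, but the value types (plain strings here) and the exact shape of the real `ReposConfig` are not shown in the source.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical model of the repos.json contents.
#[derive(Debug, Default)]
pub struct ReposConfig {
    pub repos: HashMap<String, String>,       // alias -> path
    pub repos_meta: HashMap<String, String>,  // alias -> metadata (simplified)
    pub groups: HashMap<String, Vec<String>>, // group name -> member aliases
}

impl ReposConfig {
    /// In-memory cleanup only: nothing is written to disk and no alias is renamed.
    pub fn reconcile(&mut self) {
        // 1. Drop entries whose alias key is empty or blank.
        self.repos.retain(|alias, _| !alias.trim().is_empty());

        // 2. Drop metadata that no longer has a matching repo.
        let repos = &self.repos;
        self.repos_meta.retain(|alias, _| repos.contains_key(alias));

        // 3. Prune unknown members, then drop groups left empty.
        self.groups.retain(|_, members| {
            members.retain(|member| repos.contains_key(member));
            !members.is_empty()
        });
    }
}
```

The order matters: blank aliases are removed first so that the later passes treat them as unknown when checking metadata and group members.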
* feat: auto-prune stale repos during Phase 1 warmup
When a repo's database or path no longer exists (e.g. folder moved), Phase 1 now automatically unregisters the alias from repos.json instead of logging a warning and leaving the stale entry forever.
Prune conditions (safe: only missing-db / path-gone, not transient errors):
- .codesearch.db directory does not exist at registered path
- Registered path itself no longer exists
- Alias resolves to nothing in config
Side effects per pruned alias:
- stop_fsw + evict from DashMap + remove last_access timer
- unregister_alias (removes from repos, repos_meta, groups)
- persist via config.save()
Closes: stale repos.json entries after folder reorganization
* fix: add missing YELLOW color variable in qc.sh
* bump version to 1.0.153 - align with CHANGELOG
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-up on PR #42 + #43 audit. Two gaps identified:
- No automated tests for new Warm/Write state semantics, zombie-proof reaper, or /status endpoint
- No HTTP timeouts in standalone TUI reqwest calls
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: flupkede <flupkede@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ELOG (#98)
Squash merge fix/windows-8dot3-path-relocation → develop
* Fix formatting of codesearch index command
* Create codeql.yml
* feat: add tree-sitter grammars for Bash, Ruby, PHP, YAML, JSON
* fix: CI test resilience + protect-master workflow (#58) (#59)
* release: v1.0.132 - tree-sitter expansion, LMDB stability, CI hardening (#64)
* fix: CI test resilience + protect-master workflow (#58)
* Sync master → develop (tree-sitter) (#60)
* Fix formatting of codesearch index command
* Create codeql.yml
* feat: add tree-sitter grammars for Bash, Ruby, PHP, YAML, JSON
* fix: CI test resilience + protect-master workflow (#58) (#59)
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* docs: update CHANGELOG - v1.0.132 consolidated release notes (#61)
* [worker] cleanup: AGENTS.md - 73% reduction, removed stale test report and duplicate bug details
* docs: update CHANGELOG - v1.0.132 consolidated release notes (v1.0.97...v1.0.132)
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* sync: align develop with master - AGENTS.md, Cargo.toml, Cargo.lock (#63)
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* Release v1.0.135: MCP local mode fix + QC fix + release branch support (#72)
* Release v1.0.134: MCP local mode fix + QC script fix (closes #65)
* ci: allow release/* branches to target master PRs
* docs: add CHANGELOG entry for v1.0.135 (MCP local mode fix) (#73)
* Release v1.0.137 - serve-aware indexing fixes (#77)
* fix: MCP local mode project/group fallback + QC script fix (#66)
* fix(mcp): ignore project/group params in local/stdio mode instead of erroring
When running MCP in local mode (no serve_state), project/group routing is meaningless because only one DB is available.
Log a warning and fall back to local DB instead of returning an error.
* fix(qc): define YELLOW color in bash QC script
* chore: simplify release workflow - feature-only version bump (#74)
* fix: serve-aware indexing - create DB dir before lock + no silent local duplicate (v1.0.137) (#76)
* fix: create DB directory before acquiring writer lock (serve auto-register)
When `serve` is running and `codesearch index` is run for a repo not yet known
to it, auto-register (POST /repos) failed with a misleading "Database is locked
by another process" 500: SharedStores::new() acquired the writer lock before
the .codesearch.db directory existed, so opening .writer.lock failed with
"path not found". This rolled back the repos.json registration and made the CLI
fall back to a local duplicate index instead of delegating to serve.
- acquire_writer_lock / SharedStores::new now create the DB directory first;
genuine I/O errors surface distinctly instead of as a lock conflict.
- Serve config writes route through ServeState::persist_config() (honors the
config path override); production behavior unchanged, register/remove path
now hermetically testable.
- Regression guards exercise the brand-new-repo create/register path with the
DB directory genuinely absent (verified to fail against the pre-fix code).
- CHANGELOG: 1.0.136.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: never silently create a local duplicate index when serve is busy
The CLI probes serve's /health before delegating `index`/`index add`. Any
health failure, including a *timeout* while serve is warming up its repos at
startup, was classified as "serve not running", so the CLI silently created a
local index. That local index is a duplicate serve does not manage and can
cause LMDB file-lock conflicts (and the repo never gets registered with serve).
New behavior via probe_serve_health():
- Responsive -> delegate as before.
- Connection refused / cannot connect -> serve not running; index locally.
Detected immediately (no timeout elapses, no retries), so the common
"no serve -> local" path is NOT slowed down.
- Listening but unresponsive (timeout, retried briefly) -> serve is up but
busy. The CLI now REFUSES to create a local duplicate and tells the user to
retry shortly or stop serve first. The fallback is never silent anymore.
Delegation errors are now typed (DelegateError: ServeDown / ServeUnresponsive /
Failed) instead of string-matched. Applies to `index` and `index add` (the
index-creating paths); `index rm` is unchanged.
Tests: probe classification guards (responsive -> Up; listening-but-slow ->
Unresponsive). Rolls into the 1.0.137 release together with the writer-lock fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Release v1.0.138 - strip UNC paths + auto-add on missing DB (#80)
* sync: align develop with master (post-v1.0.137 release) (#78)
Brings the two files that drifted on master during the v1.0.137 release back
to develop: the updated protect-master.yml (allows release/* branches) and the
CHANGELOG [1.0.135] entry. After this, develop and master trees are identical.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: strip UNC prefix in repos.json + auto-add on missing DB (v1.0.138) (#79)
Two Windows path-handling bugs that caused spurious "Database not found"
errors and local duplicate indexes:
1. register()/register_with_alias() stored the raw canonicalize() result in
repos.json. On Windows, canonicalize() returns \\?\C:\... (extended-length
UNC prefix). Downstream .join(".codesearch.db") and Path::exists() calls
then fail inconsistently (\\?\C:\foo\.codesearch.db not found even when
C:\foo\.codesearch.db exists). 7 repos were affected. Fix: strip_unc()
removes the prefix before storage. Existing repos.json patched in-place.
Regression test: register_strips_unc_prefix_from_stored_path.
2. 500 "Database not found" from reindex (alias registered but DB gone) was
treated as a generic failure -> local fallback -> duplicate index. Fix:
triggers the same auto-register POST /repos path as 404 (DB recreated by
serve, no local fallback).
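The second fix can be read as a small decision table over the reindex response. The sketch below assumes the CLI sees an HTTP status code plus a body containing the "Database not found" text; `NextStep` and `next_step_after_reindex` are hypothetical names, and how the real code detects the missing-DB case is not stated in the commit.

```rust
/// What the CLI does after a delegated reindex request (illustrative).
#[derive(Debug, PartialEq, Eq)]
enum NextStep {
    Done,
    AutoRegister,
    GenericFailure,
}

/// Route 404 and the 500 "Database not found" case to the same
/// auto-register (POST /repos) path; everything else stays generic.
fn next_step_after_reindex(status: u16, body: &str) -> NextStep {
    match status {
        200..=299 => NextStep::Done,
        404 => NextStep::AutoRegister,
        // Assumption: the missing-DB case is recognizable from the body text.
        500 if body.contains("Database not found") => NextStep::AutoRegister,
        _ => NextStep::GenericFailure,
    }
}
```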
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Release v1.0.139 - central safe_canonicalize() for all path ops (#83)
* sync: align develop with master (post-v1.0.138) (#81)
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* refactor: central safe_canonicalize() - eliminate raw .canonicalize() across codebase (#82)
ROOT CAUSE OF RECURRING BUG CLASS
Path::canonicalize() on Windows returns \\?\C:\... (extended-length UNC
prefix). Any downstream .join(), .exists(), or HashMap key built from that
path behaves inconsistently: the sub-path \\?\C:\foo\.codesearch.db may
return false from exists() even when C:\foo\.codesearch.db is present.
This class of bug has silently broken registrations multiple times.
FIX
Introduce safe_canonicalize(path: &Path) -> io::Result<PathBuf> and
strip_unc_prefix(path: PathBuf) -> PathBuf in src/cache/file_meta.rs.
These are the ONLY approved way to canonicalize paths in this codebase.
Exported via crate::cache.
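A minimal sketch of what the two helpers could look like. The signatures are the ones quoted above; the bodies are an assumption, not the code in src/cache/file_meta.rs. In particular, leaving `\\?\UNC\server\share` paths untouched is a choice made here for illustration.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Remove a leading `\\?\` from a path, if present.
/// Assumption: `\\?\UNC\...` network paths are left as-is, because
/// dropping only the first four characters would not yield a valid path.
fn strip_unc_prefix(path: PathBuf) -> PathBuf {
    let stripped = path
        .to_str()
        .and_then(|s| s.strip_prefix(r"\\?\"))
        .filter(|rest| !rest.starts_with(r"UNC\"))
        .map(PathBuf::from);
    // Non-UTF-8 or unprefixed paths are returned unchanged.
    stripped.unwrap_or(path)
}

/// Canonicalize, then normalize away the Windows extended-length prefix.
fn safe_canonicalize(path: &Path) -> io::Result<PathBuf> {
    path.canonicalize().map(strip_unc_prefix)
}
```

Because the stripping is a plain string operation, it is a no-op on Unix paths and on already-plain Windows paths, which matches the idempotence tests listed in this commit.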
CALL SITES UPDATED (all raw .canonicalize() removed)
- src/cache/file_meta.rs: central definition + 5 new regression tests
- src/db_discovery/repos.rs: register, register_with_alias, unregister_path,
  alias_for_path; local strip_unc() removed
- src/db_discovery/mod.rs: find_best_database, get_db_path_for_cwd
- src/index/mod.rs: find_git_root, get_global_db_path, add_to_index,
  remove_from_index, try_delegate_reindex_to_serve (x2),
  try_delegate_rm_to_serve
- src/lmdb_registry.rs: TrackedEnv registry key (eliminates double-open risk
  when same dir accessed with and without \\?\ prefix)
- src/serve/mod.rs: add_repo_handler, run_serve --register path
POLICY DOCUMENTED
AGENTS.md: "⚠️ Canonical Path Policy - MANDATORY" section with rule,
code example, and pointer to regression tests.
REGRESSION TESTS (5 new in cache/file_meta.rs + 1 existing in repos.rs)
- strip_unc_prefix_removes_windows_unc
- strip_unc_prefix_is_idempotent_on_{plain_path,unix_path}
- safe_canonicalize_on_existing_dir_returns_plain_path
- safe_canonicalize_on_nonexistent_path_returns_error
- register_strips_unc_prefix_from_stored_path (repos.rs: verifies
  fallback path also strips UNC when canonicalize() fails)
407 lib tests pass. clippy -D warnings clean.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: remove stale strip_unc() from repos.rs (merged from master, superseded by central safe_canonicalize)
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Release v1.0.140 - last raw .canonicalize() eliminated (#85)
* fix: replace last raw .canonicalize() with safe_canonicalize in get_db_path_smart (#84)
The old normalize_path(&p.canonicalize()...) pattern in get_db_path_smart
was missed in the central safe_canonicalize refactor (v1.0.139). It worked
correctly (normalize_path also strips UNC) but was inconsistent with the
policy. Now all .canonicalize() calls outside safe_canonicalize's own
definition are eliminated.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Release v1.0.141 - serve warmup wait + 409 DB-recreate fix (#87)
* fix: wait for serve warmup instead of refusing; fix 409 on DB-recreate (#86)
PROBLEM 1: ServeUnresponsive aborted with error instead of waiting
When serve is warming up (opening LMDB for 15+ repos blocks the tokio
runtime, causing /health to time out), the CLI refused with an error.
The user had to retry manually.
FIX: serve_delegate_with_warmup_wait() wraps both try_delegate_reindex_to_serve
and try_delegate_add_to_serve. On ServeUnresponsive it prints
"⏳ serve is starting up, waiting..." and retries every 8s up to 6 times
(~2 min budget). On success it prints "✅ serve is ready, delegating...".
Only exhausting the full budget returns an error.
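The retry wrapper could be sketched as below. This is an illustration under stated assumptions, not the body of serve_delegate_with_warmup_wait(): the function name, the injected `sleep` callback (used so the loop can be exercised without real delays), and the reading of "up to 6 times" as six retries after the first attempt are all assumptions. The sleeps alone total 48 s; the sketch does not model per-attempt probe timeouts.

```rust
use std::time::Duration;

/// Typed delegation error; variant names follow the commit message.
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
enum DelegateError {
    ServeDown,
    ServeUnresponsive,
    Failed(String),
}

/// Retry `attempt` while serve reports itself unresponsive, sleeping
/// `interval` between tries, for at most `max_retries` retries.
/// Any other outcome (success or a different error) returns immediately.
fn delegate_with_warmup_wait<T>(
    mut attempt: impl FnMut() -> Result<T, DelegateError>,
    mut sleep: impl FnMut(Duration),
    interval: Duration,
    max_retries: u32,
) -> Result<T, DelegateError> {
    let mut retries = 0;
    loop {
        match attempt() {
            Err(DelegateError::ServeUnresponsive) if retries < max_retries => {
                if retries == 0 {
                    println!("serve is starting up, waiting...");
                }
                retries += 1;
                sleep(interval);
            }
            Ok(value) => {
                if retries > 0 {
                    println!("serve is ready, delegating...");
                }
                return Ok(value);
            }
            // Budget exhausted, or an error that waiting cannot fix.
            other => return other,
        }
    }
}
```

In production the `sleep` argument would be `std::thread::sleep`, with `Duration::from_secs(8)` and `6` as the other two arguments.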
PROBLEM 2: 409 Conflict from POST /repos on "Database not found" path
When a registered repo's DB was missing, the CLI tried POST /repos to
recreate it. Serve correctly returned 409 (alias already registered).
The CLI treated 409 as a failure and fell back to local indexing.
FIX: when auto-add returns 409, retry as POST /repos/{alias}/reindex?force=true.
Force reindex uses allow_create=true and creates the DB via serve without
local fallback.
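A sketch of the 409 follow-up, assuming the CLI branches on the status code of the auto-add POST /repos response. `AfterAutoAdd` and `after_auto_add` are hypothetical names; the request path is the one quoted above. The alias is interpolated without URL-encoding, which is a simplification.

```rust
/// What to do after the auto-add request returns (illustrative).
#[derive(Debug, PartialEq, Eq)]
enum AfterAutoAdd {
    Done,
    /// Alias already registered: ask serve to recreate the DB instead.
    ForceReindex { path: String },
    Fail(u16),
}

/// 409 means the alias exists, so retry as a forced reindex rather than
/// falling back to a local index.
fn after_auto_add(status: u16, alias: &str) -> AfterAutoAdd {
    match status {
        200..=299 => AfterAutoAdd::Done,
        409 => AfterAutoAdd::ForceReindex {
            path: format!("/repos/{alias}/reindex?force=true"),
        },
        other => AfterAutoAdd::Fail(other),
    }
}
```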
AGENTS.md: document the root cause (tokio blocking during warmup) as a
remaining work item with diagnosis and fix guidance.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Release v1.0.142 - serve responsive during warmup (spawn_blocking) (#89)
* refactor: central safe_canonicalize() - eliminate raw .canonicalize() across codebase (#82)
ROOT CAUSE OF RECURRING BUG CLASS
Path::canonicalize() on Windows returns \\?\C:\... (extended-length UNC
prefix). Any downstream .join(), .exists(), or HashMap key built from that
path behaves inconsistently: the sub-path \\?\C:\foo\.codesearch.db may
return false from exists() even when C:\foo\.codesearch.db is present.
This class of bug has silently broken registrations multiple times.
FIX
Introduce safe_canonicalize(path: &Path) -> io::Result<PathBuf> and
strip_unc_prefix(path: PathBuf) -> PathBuf in src/cache/file_meta.rs.
These are the ONLY approved way to canonicalize paths in this codebase.
Exported via crate::cache.
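A minimal sketch of what the two helpers might look like, using only the standard library. The signatures come from the commit message; the bodies are assumptions, including the handling of the `\\?\UNC\server\share` form and the decision to leave non-UTF-8 paths untouched.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Remove a leading `\\?\` from a path; paths without it are returned unchanged.
fn strip_unc_prefix(path: PathBuf) -> PathBuf {
    let stripped = path.to_str().and_then(|text| {
        if let Some(rest) = text.strip_prefix(r"\\?\UNC\") {
            // \\?\UNC\server\share -> \\server\share (assumed behavior)
            Some(PathBuf::from(format!(r"\\{}", rest)))
        } else {
            // \\?\C:\foo -> C:\foo
            text.strip_prefix(r"\\?\").map(PathBuf::from)
        }
    });
    // No prefix (or a non-UTF-8 path): keep the original value.
    stripped.unwrap_or(path)
}

/// Canonicalize, then strip the prefix so stored paths and map keys are plain.
fn safe_canonicalize(path: &Path) -> io::Result<PathBuf> {
    path.canonicalize().map(strip_unc_prefix)
}
```

Because the stripping works on the textual form, the same function is a no-op on Unix paths, which matches the idempotence tests listed in the commit.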
CALL SITES UPDATED (all raw .canonicalize() removed)
- src/cache/file_meta.rs: central definition + 5 new regression tests
- src/db_discovery/repos.rs: register, register_with_alias, unregister_path, alias_for_path; local strip_unc() removed
- src/db_discovery/mod.rs: find_best_database, get_db_path_for_cwd
- src/index/mod.rs: find_git_root, get_global_db_path, add_to_index, remove_from_index, try_delegate_reindex_to_serve (x2), try_delegate_rm_to_serve
- src/lmdb_registry.rs: TrackedEnv registry key (eliminates double-open risk when same dir accessed with and without \\?\ prefix)
- src/serve/mod.rs: add_repo_handler, run_serve --register path
POLICY DOCUMENTED
AGENTS.md: "⚠️ Canonical Path Policy - MANDATORY" section with rule,
code example, and pointer to regression tests.
REGRESSION TESTS (6 new in cache/file_meta.rs + 1 existing in repos.rs)
- strip_unc_prefix_removes_windows_unc
- strip_unc_prefix_is_idempotent_on_{plain_path,unix_path}
- safe_canonicalize_on_existing_dir_returns_plain_path
- safe_canonicalize_on_nonexistent_path_returns_error
- register_strips_unc_prefix_from_stored_path (repos.rs: verifies
fallback path also strips UNC when canonicalize() fails)
407 lib tests pass. clippy -D warnings clean.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: replace last raw .canonicalize() with safe_canonicalize in get_db_path_smart (#84)
The old normalize_path(&p.canonicalize()...) pattern in get_db_path_smart
was missed in the central safe_canonicalize refactor (v1.0.139). It worked
correctly (normalize_path also strips UNC) but was inconsistent with the
policy. Now all .canonicalize() calls outside safe_canonicalize's own
definition are eliminated.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: wait for serve warmup instead of refusing; fix 409 on DB-recreate (#86)
PROBLEM 1: ServeUnresponsive aborted with error instead of waiting
When serve is warming up (opening LMDB for 15+ repos blocks the tokio
runtime, causing /health to time out), the CLI refused with an error.
The user had to retry manually.
FIX: serve_delegate_with_warmup_wait() wraps both try_delegate_reindex_to_serve
and try_delegate_add_to_serve. On ServeUnresponsive it prints
"⏳ serve is starting up, waiting..." and retries every 8s up to 6 times
(~2 min budget). On success it prints "✅ serve is ready, delegating...".
Only exhausting the full budget returns an error.
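The wait-and-retry wrapper can be sketched with a generic closure. This is an illustration under assumptions, not the real `serve_delegate_with_warmup_wait()`: the function name, its parameters, and the `String` payload on `Failed` are hypothetical, and the real wrapper also prints progress messages.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Variant names are taken from the commit message; the payload is assumed.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum DelegateError {
    ServeDown,
    ServeUnresponsive,
    Failed(String),
}

/// Call `attempt` once, then retry only while serve reports "unresponsive",
/// sleeping `interval` between tries. Any other outcome returns immediately.
fn delegate_with_warmup_wait<T, F>(
    mut attempt: F,
    max_retries: u32,
    interval: Duration,
) -> Result<T, DelegateError>
where
    F: FnMut() -> Result<T, DelegateError>,
{
    let mut retries = 0;
    loop {
        match attempt() {
            Err(DelegateError::ServeUnresponsive) if retries < max_retries => {
                retries += 1;
                sleep(interval);
            }
            // Success, a hard failure, or an exhausted budget.
            other => return other,
        }
    }
}
```

With the values from the commit message the call would pass `6` and `Duration::from_secs(8)`; tests can pass `Duration::ZERO` to avoid real waiting.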
PROBLEM 2: 409 Conflict from POST /repos on "Database not found" path
When a registered repo's DB was missing, the CLI tried POST /repos to
recreate it. Serve correctly returned 409 (alias already registered).
The CLI treated 409 as a failure and fell back to local indexing.
FIX: when auto-add returns 409, retry as POST /repos/{alias}/reindex?force=true.
Force reindex uses allow_create=true and creates the DB via serve without
local fallback.
AGENTS.md: document the root cause (tokio blocking during warmup) as a
remaining work item with diagnosis and fix guidance.
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: keep serve responsive during warmup - offload heavy work to spawn_blocking (#88)
PROBLEM
codesearch serve became unresponsive during startup warmup: FileWalker::walk,
VectorStore::build_index (HNSW), and fastembed/ONNX embedding (saturates all
cores) ran synchronously on tokio worker threads. This starved the async
runtime, /health timed out (>3s), and `codesearch index` reported "serve did
not respond in time". The server already returns 202 + spawns background
indexing (accept-and-defer); it just couldn't respond while warming.
FIX
Offload the heavy synchronous warmup work to tokio::task::spawn_blocking, so
the async executor stays responsive (answers /health and accepts POST /repos
immediately, runs the job in the background).
- serve/mod.rs warmup_repo: read stats under .read(); build_index via
spawn_blocking + Arc clone + blocking_write. Build failure only warns.
- manager.rs perform_incremental_refresh_with_stores: walk, read+chunk+embed,
and build_index all offloaded.
- manager.rs refresh_index_with_stores: walk + both build_index calls offloaded.
LOCK SAFETY (verified by review)
Every async RwLock guard scope CLOSES before the spawn_blocking that calls
.blocking_write() on the same store, so there is no lock-over-await deadlock.
blocking_write is only ever called inside spawn_blocking (never on an async worker).
Test: test_incremental_refresh_up_to_date_is_noop exercises the refactored walk
path. 408 lib tests pass, clippy -D warnings clean.
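The guard-scoping rule can be illustrated with a standard-library analogue. This sketch deliberately swaps in `std::sync::RwLock` and `std::thread::spawn` as stand-ins for tokio's `RwLock`, `spawn_blocking`, and `blocking_write`, so it shows the ordering only; the `Store` type and `warmup` function are hypothetical.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a vector store.
struct Store {
    chunks: usize,
    index_built: bool,
}

/// Returns true when an index build was performed.
fn warmup(store: Arc<RwLock<Store>>) -> bool {
    // 1. Read stats inside a narrow scope: the read guard is dropped at the
    //    closing brace, before any write lock is requested.
    let needs_build = {
        let guard = store.read().unwrap();
        guard.chunks > 0 && !guard.index_built
    };
    if !needs_build {
        return false;
    }

    // 2. Heavy work runs on a separate thread (stand-in for spawn_blocking),
    //    which takes the write lock itself (stand-in for blocking_write).
    let worker_store = Arc::clone(&store);
    let handle = thread::spawn(move || {
        let mut guard = worker_store.write().unwrap();
        guard.index_built = true; // placeholder for the real index build
    });

    // In async code this would be `.await` on the join handle rather than a
    // blocking join.
    handle.join().unwrap();
    true
}
```

If the read guard from step 1 were still alive when the worker requested the write lock, the worker would block until the caller released it, which is the hazard the commit's review checked for.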
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Release v1.0.154 - markdown chunking, stale-path relocation, auto-prune, 15 languages (#99)
* fix: CI test resilience + protect-master workflow (#58)
* Sync master → develop (tree-sitter) (#60)
* Fix formatting of codesearch index command
* Create codeql.yml
* feat: add tree-sitter grammars for Bash, Ruby, PHP, YAML, JSON
* fix: CI test resilience + protect-master workflow (#58) (#59)
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* docs: update CHANGELOG β v1.0.132 consolidated release notes (#61)
* [worker] cleanup: AGENTS.md - 73% reduction, removed stale test report and duplicate bug details
* docs: update CHANGELOG - v1.0.132 consolidated release notes (v1.0.97...v1.0.132)
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* sync: align develop with master - AGENTS.md, Cargo.toml, Cargo.lock (#63)
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* fix: MCP local mode project/group fallback + QC script fix (#66)
* fix(mcp): ignore project/group params in local/stdio mode instead of erroring
When running MCP in local mode (no serve_state), project/group routing is meaningless because only one DB is available.
Log a warning and fall back to local DB instead of returning an error.
* fix(qc): define YELLOW color in bash QC script
* chore: simplify release workflow - feature-only version bump (#74)
* sync: backfill CHANGELOG 1.0.139-1.0.142 from master (post-release) (#90)
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
* feat: semantic Markdown chunking (tree-sitter-md) + /merge & /release commands (#91)
* chore: add /merge and /release Claude Code slash commands
Codify the project release workflow as two committed slash commands under
.claude/commands/ (force-added past .gitignore, like .claude/CLAUDE.md):
- /merge: README/CHANGELOG freshness checks -> commit -> validate -> push ->
PR to develop -> auto-merge after CI. No tag.
- /release: /merge, then promote develop -> master via a "Release vX.Y.Z" PR
(protect-master allows develop), then push the vX.Y.Z tag that triggers
release.yml. Includes optional post-release develop sync.
Commands document the repo's real conventions: feature->develop->master flow,
master branch protection, and the pre-commit version-bump-on-feature-branches
rule that fixes the release version at the feature commit.
Tooling-only change on a chore/ branch: no version bump, no CHANGELOG entry
(CHANGELOG tracks the shipped binary's behavior).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore: address review remarks on /merge and /release commands
- /merge: abort unless on feature/*|features/*|fix/* (the only branches the
pre-commit hook version-bumps); this closes the gap where running from a
non-bumping branch silently broke the version/CHANGELOG premise.
- Clarify CHANGELOG heading version math for multi-commit landings (hook bumps
+1 per commit; verify heading matches Cargo.toml after the final commit).
- Capture PR numbers explicitly (gh pr view --json number) before merge/poll.
- /release: fetch --tags and guard against a double release (stop if the tag
already exists locally or on origin).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: document /merge and /release workflow in AGENTS.md
Add a Release workflow section describing the two slash commands, the
branch-protection rule, the tag-triggers-release.yml pipeline, and the
feature-branch-only version-bump rule that fixes the release version.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(chunker): semantic Markdown chunking via tree-sitter-md
Markdown and .txt files were indexed as a single whole-file block (the
fallback chunker has no char budget), so a search hit returned an entire
page: real Aprimo docs reached 80 KB in one chunk.
Add the tree-sitter-md *block* grammar and chunk Markdown by heading
section instead: each chunk is one heading plus its own prose/code,
excluding nested subsections (which become their own chunks). The
heading path is carried in the breadcrumb context (File > Title >
Subsection) so embeddings capture each section's place in the document.
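The heading-section idea can be sketched with a simplified line-based splitter. Note the substitution: the commit uses the tree-sitter-md block grammar, whereas this example only scans for ATX headings (`#` to `######`) and skips fenced code, so it is an approximation of the behavior, not the implementation. `Chunk` and `chunk_markdown` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct Chunk {
    /// "File > Title > Subsection" context for the embedding.
    breadcrumb: String,
    /// The section's own prose/code, without nested subsections.
    body: String,
}

fn chunk_markdown(file: &str, text: &str) -> Vec<Chunk> {
    let mut chunks: Vec<Chunk> = Vec::new();
    // Stack of (heading level, title) from the document root to the current section.
    let mut stack: Vec<(usize, String)> = Vec::new();
    let mut in_fence = false;

    for line in text.lines() {
        if line.trim_start().starts_with("```") {
            in_fence = !in_fence;
        }
        let level = line.chars().take_while(|c| *c == '#').count();
        let is_heading =
            !in_fence && (1..=6).contains(&level) && line[level..].starts_with(' ');

        if is_heading {
            // Leave sections at the same or a deeper level, then enter the new one.
            while stack.last().map_or(false, |(l, _)| *l >= level) {
                stack.pop();
            }
            stack.push((level, line[level..].trim().to_string()));
            let mut crumb = vec![file.to_string()];
            crumb.extend(stack.iter().map(|(_, title)| title.clone()));
            chunks.push(Chunk { breadcrumb: crumb.join(" > "), body: String::new() });
        } else if let Some(current) = chunks.last_mut() {
            // Body lines belong to the most recently opened section only.
            current.body.push_str(line);
            current.body.push('\n');
        }
        // Text before the first heading is ignored in this sketch.
    }
    chunks
}
```

Each subsection becomes its own chunk, so a parent section's body stops at its first nested heading.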
Also add split_oversized, a char- and li…
#101)
* fix: reconcile_all_paths in spawn_blocking + use persist_config in prune
Two correctness fixes flagged in post-release review:
1. reconcile_all_paths() was called synchronously inside tokio::spawn, blocking a Tokio worker thread while spawning git subprocesses and holding the config RwLock write-guard. Moved to spawn_blocking so the async runtime stays responsive during startup reconciliation.
2. Phase 1 auto-prune wrote repos.json via config.save() instead of self.persist_config(&config). All other ServeState save sites use persist_config to honour config_path_override (e.g. in tests). Now consistent.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: add CHANGELOG entry for v1.0.156 (reconcile spawn_blocking + persist_config prune fix)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…doc-comments (#102)
* fix: doc-comments accuracy + safe_canonicalize in reload_if_changed
- repos.rs: correct "Pure (no disk I/O)" doc-comments on relocate_missing and prune_stale; both transitively call scan_for_remote, which does read_dir and spawns a git subprocess, so callers must use spawn_blocking in async contexts.
- serve/mod.rs: replace raw std::fs::canonicalize with safe_canonicalize in reload_if_changed so the Windows UNC prefix (\\?\) is stripped before comparison, consistent with the project-wide safe_canonicalize rule.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: index/mod.rs quality - extract ensure_hnsw_index_if_needed + tests + metadata consistency + cancel best-effort
- Extract the safety-net HNSW rebuild into ensure_hnsw_index_if_needed() so the logic is unit-testable; add 3 tests (unindexed with chunks rebuilds, already-indexed is idempotent, empty DB skips rebuild).
- metadata.json schema consistency: add "partial": false to the normal (non-cancelled) path so readers always see the field regardless of how indexing ended.
- Cancellation finalisation path: change non-critical ? propagations to log-and-continue (metadata.json write, FileMetaStore update/save, stats read); this keeps partial chunks searchable even if any recovery step fails. store.build_index() still propagates errors as before.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: concurrency - evaluate_csharp_rebuild outside write lock + build_index in spawn_blocking
Three related concurrency fixes in serve/mod.rs, consistent with the reconcile_all_paths fix from PR #101:
1. evaluate_csharp_rebuild: bootstrap_last_changed (git subprocess + ≤10k file fs-walk) was running while holding config.write(), blocking every concurrent config.read() for the scan duration. Fixed by checking whether bootstrap is needed under a read lock, running the slow I/O with no lock held, then taking the write lock only for the brief config update.
2. evaluate_csharp_rebuild call site in run_phase_2_csharp_scip: even with the above fix the function still ran synchronously on a Tokio worker thread. Wrapped in spawn_blocking so the async runtime stays responsive while scanning all C# candidates at startup.
3. warmup_repo and add_repo_handler background task: two build_index() calls were running directly on async threads while holding a tokio RwLock write guard. build_index() is CPU-heavy (HNSW construction). Both are now offloaded via spawn_blocking + blocking_write(), matching the established pattern at serve/mod.rs:1249.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: CHANGELOG entry for v1.0.160 (full review fixes)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: rename_retry helper for Windows flaky tests in repos.rs
On Windows, git subprocesses spawned by init_git_remote keep file handles open briefly after the process exits, causing std::fs::rename and std::fs::remove_dir_all to fail with "Access is denied" under parallel test load.
Fixes:
- Add rename_retry() test helper that retries with exponential back-off (up to 10 attempts, 20-200ms delays)
- Replace all 7 std::fs::rename(...).unwrap() calls in the repos test module with rename_retry()
- Change remove_dir_all(...).unwrap() in try_relocate_none_when_ambiguous to let _ = ... (the assertion holds either way)
Verified stable: 3 consecutive full-suite runs with 432 passed, 0 failed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: CHANGELOG entry for v1.0.162 (Windows flaky test fix)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
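A retry helper of the kind described could look like the sketch below. The attempt count and delay range follow the commit message; the signature, the doubling schedule, and returning the last error (rather than panicking as a test helper might) are assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread::sleep;
use std::time::Duration;

/// Retry `fs::rename` with exponential back-off: up to 10 attempts,
/// delays starting at 20 ms and capped at 200 ms.
fn rename_retry(from: &Path, to: &Path) -> io::Result<()> {
    const MAX_ATTEMPTS: u32 = 10;
    let mut delay = Duration::from_millis(20);
    let mut attempt = 1;
    loop {
        match fs::rename(from, to) {
            Ok(()) => return Ok(()),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt == MAX_ATTEMPTS => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = (delay * 2).min(Duration::from_millis(200));
                attempt += 1;
            }
        }
    }
}
```

In a test module the call site would typically be `rename_retry(&old, &new).unwrap()`, replacing a bare `std::fs::rename(...).unwrap()`.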
… revert unrelated Cargo.toml change
…en metadata reports 0 chunks
…FileWatcher missing .codesearchignore bug
…ter .ipynb support
…uage::Jupyter, and FileWatcher precedence
…ist (M6)
C1 (CRITICAL): perform_incremental_refresh_with_stores hardcoded ModelType::default() for embedding while force_reindex wrote the --model override only into metadata.json. Choosing a non-384d model (bge-base/bge-large/mxbai-large) recorded the new dimension in metadata but still produced 384d vectors -> dimension mismatch / corrupt index.
Fix: resolve the embedding model from the short name recorded in metadata.json, fail fast on an unknown model or a dims/model mismatch, and thread the resolved ModelType into the embedding closure instead of the hardcoded default.
M6: replace the two hand-maintained "valid models" lists (CLI add --model and serve POST /repos), which both omitted bge-large, with a single ModelType::valid_short_names() derived from all().
Tests: short_name round-trips through parse() for every model; valid_short_names() lists every model incl. bge-large.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
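The "single list derived from all()" idea and the fail-fast dimension check can be sketched as follows. Everything here is illustrative: the enum variants, the `minilm` short name for the 384d default, the dimension values, and the `resolve_model` helper are assumptions, since the real `ModelType` definition is not shown in this log.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModelType {
    MiniLm, // hypothetical 384d default
    BgeBase,
    BgeLarge,
    MxbaiLarge,
}

impl ModelType {
    /// The one authoritative list; every other list is derived from it.
    fn all() -> &'static [ModelType] {
        &[ModelType::MiniLm, ModelType::BgeBase, ModelType::BgeLarge, ModelType::MxbaiLarge]
    }

    fn short_name(self) -> &'static str {
        match self {
            ModelType::MiniLm => "minilm",
            ModelType::BgeBase => "bge-base",
            ModelType::BgeLarge => "bge-large",
            ModelType::MxbaiLarge => "mxbai-large",
        }
    }

    /// Assumed embedding dimensions, for illustration only.
    fn dimensions(self) -> usize {
        match self {
            ModelType::MiniLm => 384,
            ModelType::BgeBase => 768,
            ModelType::BgeLarge | ModelType::MxbaiLarge => 1024,
        }
    }

    fn parse(name: &str) -> Option<ModelType> {
        Self::all().iter().copied().find(|m| m.short_name() == name)
    }

    fn valid_short_names() -> Vec<&'static str> {
        Self::all().iter().map(|m| m.short_name()).collect()
    }
}

/// Fail fast on an unknown model or on dimensions that disagree with it.
fn resolve_model(name: &str, recorded_dims: usize) -> Result<ModelType, String> {
    let model = ModelType::parse(name).ok_or_else(|| {
        format!("unknown model '{}', valid: {:?}", name, ModelType::valid_short_names())
    })?;
    if model.dimensions() != recorded_dims {
        return Err(format!(
            "model '{}' produces {}d vectors but metadata records {}d",
            name,
            model.dimensions(),
            recorded_dims
        ));
    }
    Ok(model)
}
```

Because `parse` and `valid_short_names` both iterate `all()`, adding a variant there is enough to make it accepted everywhere, which is the property the M6 fix is after.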
…er path
index_single_file (reached from the file-watcher loop and git branch-change refresh) still hardcoded ModelType::default() for embedding, re-introducing the exact dimension-mismatch corruption C1 set out to fix: a repo indexed with a non-default model would re-embed changed files at 384d on the next edit.
Extract resolve_embed_model(db_path) -> (ModelType, usize) as the single source for model resolution (reads metadata, fail-fast on unknown model / dims mismatch) and use it in BOTH perform_incremental_refresh_with_stores and index_single_file. Removes the duplicated inline resolution block and the redundant closure rebind.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tant-time auth (M2)
M1: the generated post-checkout hook embedded $(pwd) directly into the JSON request body. A repo path containing a double quote or backslash broke out of the JSON string literal (malformed body / injection). Now JSON-escape the path in pure bash (backslashes then quotes) before embedding; no jq dependency, handles Windows paths.
M2: API key was compared with raw string `==`, which short-circuits on the first mismatched byte, a timing side-channel on the network-exposed require_auth_for_network path (whose "localhost-only, timing impractical" justification was actually copied from the admin middleware). Add api_key_matches() using SHA-256 digest + non-short-circuiting byte compare, and a shared request_has_valid_api_key() helper so both middlewares use the constant-time path from one place. Fix the misleading doc comment.
Test: api_key_matches covers match/mismatch/length/empty/case.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
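The non-short-circuiting compare from M2 can be sketched on fixed-size digests. This example covers only the comparison step: the standard library has no SHA-256, so the inputs are assumed to be 32-byte digests computed elsewhere, and `digests_equal` is a hypothetical name rather than the project's `api_key_matches()`.

```rust
/// Compare two 32-byte digests without returning early on the first mismatch.
/// Every byte pair is XOR-ed and the differences are OR-ed together, so the
/// loop always runs to the end regardless of where the inputs differ.
fn digests_equal(a: &[u8; 32], b: &[u8; 32]) -> bool {
    let mut diff = 0u8;
    for i in 0..32 {
        diff |= a[i] ^ b[i];
    }
    diff == 0
}
```

Hashing first gives both sides the same length, which removes the length leak of comparing raw keys. An optimizing compiler is not obliged to preserve timing behavior, so a vetted crate is the usual choice when stronger guarantees are needed.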
… msys bash
The previous M1 fix used bare-backslash parameter expansion
(${REPO_PATH//\/\\}); empirically this does NOT double backslashes in
Git Bash / msys 5.2 (the surrounding double-quotes halve them and a bare
\ pattern fails to match a literal backslash), so a path with a backslash
still produced invalid JSON.
Use the variable-based idiom instead β quoted variables as the search and
replace operands force LITERAL matching:
BS='\'; DQ='"'
REPO_PATH=${REPO_PATH//"$BS"/"$BS$BS"}
REPO_PATH=${REPO_PATH//"$DQ"/"$BS$DQ"}
Verified with node JSON.parse round-trip for paths containing both " and \
(e.g. /c/Users/a"b\c -> {"path":"/c/Users/a\"b\\c"} -> parses back exactly).
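The same two-step escape, backslashes first and then double quotes, can be written in Rust to show why the order matters. The hook itself is bash; `json_escape_path` and `request_body` are hypothetical helpers for illustration, and like the hook they do not handle control characters.

```rust
/// Escape a path for embedding in a JSON string literal.
/// Backslashes must be doubled first; otherwise the backslash added in
/// front of each quote would itself be doubled by the second pass.
fn json_escape_path(path: &str) -> String {
    path.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Build the request body with the escaped path.
fn request_body(path: &str) -> String {
    format!("{{\"path\":\"{}\"}}", json_escape_path(path))
}
```

Reversing the two `replace` calls would turn `"` into `\\"`, which closes the JSON string early.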
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…5) + fsync (minor)
M5: force_reindex_with_stores step 4 still wrote metadata.json via a plain std::fs::write(to_string_pretty(..)), bypassing the atomic writer. A crash there could truncate metadata.json, the exact failure the atomic RMW was introduced to prevent. Route it through crate::vectordb::merge_metadata_atomic, overlaying the preserved keys + a fresh indexed_at onto any existing content.
Minor: atomic_write_json now fsyncs the temp file (File::create + write_all + sync_all) BEFORE the rename, so a power-loss cannot leave a zero-length/garbage file in place of the old metadata. Doc comment updated to match the actual guarantee instead of overstating it.
vectordb store tests pass (incl. test_persistence).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
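The write, fsync, rename sequence can be sketched with the standard library. This is a simplified stand-in for `atomic_write_json`: the function name, the raw-bytes signature, and the `.tmp` sibling naming are assumptions, and the merge step of `merge_metadata_atomic` is not shown.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Replace `path` with `bytes` so that readers see either the old file or
/// the complete new one.
fn atomic_write(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Hypothetical temp-file naming: same directory, ".tmp" extension.
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush data and metadata to disk BEFORE the rename, so the renamed
        // file cannot be empty or partial after a power loss.
        file.sync_all()?;
    }
    // The rename swaps the finished temp file into place.
    fs::rename(&tmp, path)
}
```

Keeping the temp file in the same directory matters because a rename across filesystems is not atomic.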
…M3) + csharp_error (M4)
M3: the remote TUI's `i` and `d` keys called GET /repos/{alias}/info and
POST /repos/{alias}/doctor, but neither route existed server-side, so they
always failed with "endpoint not available" / HTTP error. Add:
- info_handler (GET /repos/:alias/info): mirrors tui::build_info_overlay,
returns the exact InfoResponse shape (chunks/files/max_chunk_id/
db_size_human/model/dims/lock/index_age). 404 on unknown alias.
- doctor_handler (POST /repos/:alias/doctor): runs diagnose_with_store when
the repo's stores are open (reuses the LMDB handle, which avoids a double-open),
else diagnose(); returns {"results": [...]} from DoctorReport::render_tui.
Both registered in the same auth-layered Router as reindex; like /status
they're open on localhost and key-protected on network binds.
format_age/dir_size_human in tui.rs made pub(crate) for reuse.
M4: status JSON now includes "csharp_error" so the remote detail panel shows
the real C# error instead of the literal "Unknown error".
cargo check + clippy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rt classify + tests
Jupyter (src/chunker/jupyter.rs):
- Normalize RawCell.line_count to >=1 in extract_cell so the two passes in merge_adjacent_cells (line numbering vs merge accumulation) agree; empty cells previously stored 0 which the passes counted differently.
- Add a loud module-header CAVEAT that chunk start/end lines are synthetic cell-relative positions, NOT real .ipynb JSON offsets; future jump-to-line / re-extraction features must not trust them. Note kernel language is not read (generic code labels).
Dart (src/chunker/extractor.rs):
- Remove the dead/misleading `mixin_application` branch from classify(). Empirically verified: Dart mixin members parent to `class_body` (already Method); `mixin_application` is the `with A, B` clause and never parents a function_declaration, so the branch never fired.
Tests (src/chunker/semantic.rs):
- test_dart_semantic_chunking: top-level fn → Function, class & mixin members → Method (regression guard for the misclassification).
- test_dart_unparseable_still_chunks: malformed .dart still yields fallback chunks (grammar-failure resilience).
64 chunker tests pass; clippy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DRY (src/serve/tui_common.rs, tui.rs, tui_remote.rs):
- Extract shared format_uptime_secs(u64); format_uptime delegates to it, and
  remote's format_uptime_from_secs is removed.
- Move byte-identical restore_terminal into tui_common (pub); both TUIs call it.
- render_centered_modal now delegates to the _with_border_color variant
  (Color::Cyan), removing ~45 duplicated lines.
Hygiene:
- Fix stale comments: tui.rs "'s' pressed" -> "'l' pressed"; tui_remote module
  doc "Actions (i/d/f/s)" -> accurate i=info/d=doctor/n=reindex/r=remove/l=reload.
- Remove dead `let _ = tx;` suppression in tui_remote.
- Align C# sentinel: tui.rs map_repo_rows emits "none" (was "") to match
  status_handler; shared consumer handles it via default branch (no display
  change).
- Reuse a single reqwest::Client across the remote TUI poll/actions instead of
  constructing one per request.
- search/mod.rs: warn! on query-side model parse-failure/override fallback
  (previously a silent unwrap_or_default); this keeps the default and just
  makes a model mismatch visible in logs.
Test (src/serve/mod.rs): info_doctor_routes_registered asserts GET /info and
POST /doctor on an unknown alias return 404 (mirrors reindex route test).
cargo check + clippy clean; 29 serve tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…regression)
The Stage 1 fail-fast resolved the embedding model at the top of
perform_incremental_refresh_with_stores, which made an up-to-date (no-op)
refresh error out on any index whose metadata model can't be parsed. That broke
test_incremental_refresh_up_to_date_is_noop and, more importantly, failed no-op
refreshes that never embed anything.
Move the strict resolve_embed_model() call inside the `if !changed_files` block
so it runs only when there are files to embed; keep a lenient
model_name/dimensions read at the top for the FileMetaStore. Embedding still
fails fast with the same guarantee; no-op refreshes no longer require a
resolvable model.
Full lib+bins suite green (482 passed; the unrelated Windows relocation test is
pre-existing flakiness and passes in isolation).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…auth notes
Holistic-review findings on the combined branch:
IMPORTANT - partial-deletion window: in perform_incremental_refresh_with_stores
the lazy resolve_embed_model ran AFTER stale-chunk deletions were committed, so
on a corrupt index (unknown model / dims mismatch) it could delete chunks then
error before re-embedding, leaving the index with data removed. Move the
fail-fast resolve to right after the no-op early-return, before any destructive
store mutation, so a corrupt index errors with the index still intact.
embed_model is still consumed in the changed-files block.
MINOR - doctor_handler ran synchronous diagnose() (tree walk + LMDB scan) while
holding a tokio read guard on an async worker. Wrap it in spawn_blocking using
blocking_read on the cloned Arc<RwLock> (mirrors the reindex path); still reuses
the open LMDB handle to avoid double-open.
MINOR - document at the route declaration why POST /doctor is intentionally
outside require_admin_auth's management set (read-only diagnostics).
cargo check + clippy clean; target tests pass (noop refresh, info_doctor
routes). The unrelated Windows relocation test remains pre-existing flakiness
(passes in isolation).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
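The ordering described in the IMPORTANT finding can be sketched as follows: return early on a no-op, resolve the model strictly, and only then mutate the store. The `Index` type, the `refresh` function, and the "known-model" name are hypothetical; only the ordering is taken from the commit message.

```rust
#[derive(Debug, PartialEq)]
enum RefreshOutcome {
    UpToDate,
    Refreshed { deleted: usize, embedded: usize },
}

/// Hypothetical in-memory index: a recorded model name plus chunk ids.
struct Index {
    model_name: String,
    chunks: Vec<String>,
}

/// Strict resolution: an unknown model is an error (assumed behaviour).
fn resolve_embed_model(name: &str) -> Result<&'static str, String> {
    match name {
        "known-model" => Ok("known-model"),
        other => Err(format!("unknown embedding model: {other}")),
    }
}

fn refresh(index: &mut Index, changed_files: &[&str]) -> Result<RefreshOutcome, String> {
    // 1. No-op early return: nothing to embed, so no model is required.
    if changed_files.is_empty() {
        return Ok(RefreshOutcome::UpToDate);
    }
    // 2. Fail fast BEFORE any destructive mutation.
    let model = resolve_embed_model(&index.model_name)?;
    // 3. Destructive step: delete stale chunks of the changed files.
    let before = index.chunks.len();
    index
        .chunks
        .retain(|c| !changed_files.iter().any(|f| c.starts_with(*f)));
    let deleted = before - index.chunks.len();
    // 4. Re-embed (represented here by pushing a placeholder chunk id).
    for f in changed_files {
        index.chunks.push(format!("{f}#{model}"));
    }
    Ok(RefreshOutcome::Refreshed {
        deleted,
        embedded: changed_files.len(),
    })
}
```

In this arrangement an index with an unparseable model still passes a no-op refresh, and a refresh with changes fails with its chunks untouched.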
The db_discovery::repos relocation tests flaked under full-suite parallel load
(intermittent "Access is denied" on rename, and occasional missing git remote).
Root cause: these tests `git init` a directory then rename it; on Windows the
indexer/antivirus scans each fresh .git tree and holds handles on it, blocking
the rename. When many such tests run concurrently the scanner is overwhelmed
and handles linger for >7s, long enough to exhaust the old 10-attempt rename
retry. A transient msys fork failure (EAGAIN) spawning git could also silently
drop a repo's remote.
Fixes:
- Serialize the 9 git-spawning / renaming relocation tests behind a shared,
  poison-tolerant Mutex so only one .git tree is created/renamed at a time,
  keeping each Defender scan window short. This is the decisive fix.
- Harden production git_remote_url(): retry on transient spawn failure (fork
  exhaustion) instead of treating it as "no remote", which would wrongly strip
  a repo's git identity and break relocation / cause prune. NotFound (git
  absent) still returns immediately; an Ok-but-nonzero status is a real answer
  and is not retried.
- init_git_remote test helper: same spawn-retry instead of .expect() panic.
- rename_retry: raise budget to 40 attempts with a 250ms-capped backoff.
Validation: full lib suite run 6x consecutively under parallel load: 482
passed, 0 failed every time (incl. a 15s heavy-contention run that previously
failed). clippy -D warnings clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
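A rough sketch of a retry loop with capped backoff in the spirit of the fixes above. The 40-attempt budget and 250ms cap come from the commit message; the doubling schedule, the initial delay, and the `retry_with_backoff` name and signature are assumptions, not the repository's rename_retry.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay up to `cap`.
/// `is_transient` decides which errors are worth retrying; any other error
/// (for example NotFound when git is absent) is returned immediately.
fn retry_with_backoff<T>(
    max_attempts: u32,
    initial: Duration,
    cap: Duration,
    is_transient: impl Fn(&io::Error) -> bool,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut delay = initial;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt < max_attempts && is_transient(&e) => {
                thread::sleep(delay);
                // Exponential growth, never exceeding the cap.
                delay = (delay * 2).min(cap);
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}
```

A caller could pass `40`, a small initial delay, and `Duration::from_millis(250)` as the cap, with a predicate such as `|e| e.kind() != io::ErrorKind::NotFound` so that a missing binary is reported at once.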
…ions" error
The reported 500 ("an environment is already opened with different options"
opening BAYR.Aprimo) is heed 0.20.5's Error::BadOpenOptions from heed's own
process-global OPENED_ENV registry β NOT codesearch's TrackedEnv guard (which
would say "double-open prevented"). It fires when a path resolves to a
still-live heed env whose recorded map_size differs from the reopen's resolved
size (e.g. after an MDB_MAP_FULL resize).
Root cause: TrackedEnv::drop called unregister() in the drop body, but the
`inner: heed::Env` field is dropped only AFTER the body returns (Rust drop
order). That leaves a window where codesearch's LMDB_REGISTRY slot is free but
heed's env is still alive. A concurrent TrackedEnv::open (idle reaper dropping a
repo while a reindex/query reopens it) passes register() and falls through to
opts.open(), hitting heed's raw error.
Note: the previously applied stop_fsw fix (drop Readonly/Conflicted) does not
cover this incident β the repo was idle-evicted, so stop_fsw returns None before
reaching that branch.
Fix: wrap inner in ManuallyDrop and drop the heed::Env BEFORE unregister(),
enforcing "our slot free => heed's slot free". A concurrent open now either sees
our slot occupied (clear double-open message + retry) or both free (clean
reopen), never the inconsistent state. Adds a multi-threaded regression guard
that asserts the forbidden heed string never surfaces (non-flaky: only fails on
real regression).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
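The drop-order point can be illustrated without heed. In the sketch below, `FakeEnv`, `Tracked`, and `Naive` are hypothetical stand-ins (not the real TrackedEnv or heed::Env), and a shared log records the order of events. It relies only on two facts about Rust: a struct's fields are dropped after its `Drop::drop` body returns, and `ManuallyDrop` lets the body drop a field earlier.

```rust
use std::mem::ManuallyDrop;
use std::sync::{Arc, Mutex};

type Log = Arc<Mutex<Vec<&'static str>>>;

/// Stand-in for the underlying environment: records when it actually closes.
struct FakeEnv {
    log: Log,
}

impl Drop for FakeEnv {
    fn drop(&mut self) {
        self.log.lock().unwrap().push("env closed");
    }
}

/// Fixed variant: the env is closed BEFORE the registry slot is released.
struct Tracked {
    inner: ManuallyDrop<FakeEnv>,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        // SAFETY: `inner` is dropped exactly once, here, and not used again.
        unsafe { ManuallyDrop::drop(&mut self.inner) };
        // Stand-in for unregister(): runs only after the env is gone.
        self.log.lock().unwrap().push("slot unregistered");
    }
}

/// Pre-fix variant: the slot is released in the drop body, and the env field
/// is dropped only afterwards, leaving a window between the two events.
struct Naive {
    #[allow(dead_code)]
    inner: FakeEnv,
    log: Log,
}

impl Drop for Naive {
    fn drop(&mut self) {
        self.log.lock().unwrap().push("slot unregistered");
    }
}
```

Dropping a `Tracked` logs "env closed" then "slot unregistered"; dropping a `Naive` logs them in the opposite order, which is the window the commit describes.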
…ons" 500
Strengthening the TrackedEnv concurrency test to 8 threads x 4000 iters revealed
the drop-order reorder alone does NOT close the window: with DIFFERENT map sizes
on the same path, heed still raised "an environment is already opened with
different options". Probing with a CONSISTENT size proved the error is specific
to size mismatch: heed defers env close, so a reopen can briefly observe the
prior live env, and only disagreeing options trigger the error.
Root fix: a process-global per-path map_size pin (MAP_SIZE_PINS). The first
resolve_map_size() for a path fixes its size for the process lifetime
(monotonically non-decreasing, capped at MAX_LMDB_MAP_SIZE_MB);
resize_environment raises the pin so post-resize reopens (e.g. after idle
eviction) match the still-live env. This makes every open of a path use a
consistent size regardless of metadata-persistence state. The user's BAYR
metadata.json had no lmdb_map_size_mb, which is exactly how a resized env could
be reopened with a mismatched size. The Drop reorder remains as complementary
hardening.
Tests:
- lmdb_registry: concurrent open/drop/reopen with consistent size (8x4000,
  barrier-synced) asserts the heed string never surfaces (non-flaky guard).
- store: pin is stable+monotonic per path (a later smaller persisted size does
  not shrink the live pin) and capped at MAX.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
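A minimal sketch of a per-path, process-global size pin along the lines described above. The names MAP_SIZE_PINS, resolve_map_size, and MAX_LMDB_MAP_SIZE_MB appear in the commit message, but the signatures, the cap value, and the `raise_pin` helper (standing in for what resize_environment does to the pin) are assumptions.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Mutex, OnceLock};

/// Illustrative cap; the real constant's value is not given in the source.
const MAX_LMDB_MAP_SIZE_MB: u64 = 16_384;

static MAP_SIZE_PINS: OnceLock<Mutex<HashMap<PathBuf, u64>>> = OnceLock::new();

fn pins() -> &'static Mutex<HashMap<PathBuf, u64>> {
    MAP_SIZE_PINS.get_or_init(|| Mutex::new(HashMap::new()))
}

/// The first call for a path pins its (capped) size; later calls return the
/// pin, ignoring whatever size the caller read from metadata.
fn resolve_map_size(path: &Path, requested_mb: u64) -> u64 {
    let mut map = pins().lock().unwrap();
    *map.entry(path.to_path_buf())
        .or_insert(requested_mb.min(MAX_LMDB_MAP_SIZE_MB))
}

/// Hypothetical helper for the resize path: raise the pin, never lower it.
fn raise_pin(path: &Path, new_mb: u64) -> u64 {
    let mut map = pins().lock().unwrap();
    let capped = new_mb.min(MAX_LMDB_MAP_SIZE_MB);
    let pin = map.entry(path.to_path_buf()).or_insert(capped);
    *pin = (*pin).max(capped);
    *pin
}
```

Because every open consults the pin first, a reopen after idle eviction asks for the same size as the still-live environment even when the persisted metadata is missing or stale.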
Pressing 'n' (reindex) previously gave no instant confirmation: the status
column only flipped to "Indexing…" on the next 500ms redraw, which was easy to
miss on a fast reindex. Now a transient footer flash confirms the action
immediately and honestly reflects the launch outcome:
- Started -> "⏳ Reindex started for '<alias>' …"
- AlreadyRunning -> "⏳ Reindex already running for '<alias>'"
- Failed -> "Cannot reindex '<alias>' - see logs"
spawn_force_reindex now returns a ReindexLaunch enum so the flash maps to the
real synchronous launch result (guard hit / unresolved alias / read-only / open
error). The flash auto-clears after 4s. The existing pulsing "⏳ idx…" status
label is unchanged.
Scope: local in-process TUI only (per request); render_footer gains an
Option<&str> flash arg, remote TUI passes None (behavior unchanged).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
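One way the launch result could map to a self-expiring footer flash is sketched below. ReindexLaunch and its three outcomes are named in the commit message; the `Flash` type, `flash_message`, and the exact strings (status glyphs omitted) are assumptions made for illustration.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReindexLaunch {
    Started,
    AlreadyRunning,
    Failed,
}

/// Map the synchronous launch result to the footer text.
fn flash_message(launch: ReindexLaunch, alias: &str) -> String {
    match launch {
        ReindexLaunch::Started => format!("Reindex started for '{alias}'"),
        ReindexLaunch::AlreadyRunning => format!("Reindex already running for '{alias}'"),
        ReindexLaunch::Failed => format!("Cannot reindex '{alias}' - see logs"),
    }
}

/// The flash auto-clears after 4s, per the commit message.
const FLASH_TTL: Duration = Duration::from_secs(4);

/// Hypothetical transient message with its creation time.
struct Flash {
    text: String,
    shown_at: Instant,
}

impl Flash {
    fn new(text: String, now: Instant) -> Self {
        Flash { text, shown_at: now }
    }

    /// Returns the text while the flash is live, else None. The Option<&str>
    /// shape matches what the footer renderer is described as accepting.
    fn visible(&self, now: Instant) -> Option<&str> {
        if now.duration_since(self.shown_at) < FLASH_TTL {
            Some(&self.text)
        } else {
            None
        }
    }
}
```

Passing the current time in explicitly keeps the expiry logic testable without sleeping; a render loop would call `visible(Instant::now())` on each redraw.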
fix: review corrections - LMDB reopen 500, indexing & TUI hardening
…& TUI reindex feedback
- README: add Dart row to Supported Languages table (the 16th tree-sitter
  grammar was already counted at line 18; only the table row was missing).
- CHANGELOG [Unreleased]: document the heed "different options" reopen-500 fix
  (TrackedEnv drop-order + per-path map_size pin) under Fixed, and the
  immediate TUI reindex feedback under Added.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- CHANGELOG: move [Unreleased] items under [1.0.207] - 2026-06-12.
- Add the `serve --host` / non-localhost bind feature (#114) to the release
  notes (env CODESEARCH_SERVE_HOST, 0.0.0.0 for containers, pair with
  CODESEARCH_SERVE_API_KEY).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ion directly
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs: add Dart to README + changelog for reopen-500 fix & TUI reindex feedback
Promote develop to master for the v1.0.207 release. Includes: Dart support, Jupyter notebooks, global .codesearchignore, --host/#114, embedding-model fix (#118), and the LMDB reopen-500 fix.