Skip to content

fix: offload long-running MCP tools to a thread (#46, #136)#231

Merged
tirth8205 merged 1 commit intomainfrom
fix/windows-async-wrappers
Apr 11, 2026
Merged

fix: offload long-running MCP tools to a thread (#46, #136)#231
tirth8205 merged 1 commit intomainfrom
fix/windows-async-wrappers

Conversation

@tirth8205
Copy link
Copy Markdown
Owner

Summary

Follow-up to the v2.2.4 Windows fix based on Limucc's test on #136.

Limucc's test result on v2.2.4

Test Result
CLI uvx code-review-graph build ✅ 153 files, 1637 nodes, 7752 edges
MCP list_graph_stats_tool (read-only) ✅ returns instantly
MCP build_or_update_graph_tool(full_rebuild=True) hangs indefinitely
MCP embed_graph_tool (sentence-transformers) hangs indefinitely

Environment: Python 3.14.0 / Windows-11-10.0.26200. So the WindowsSelectorEventLoopPolicy fix from v2.2.4 was necessary (read-only tools work now) but not sufficient — long-running sync handlers still block the event loop.

Root cause

FastMCP 2.x dispatches sync handlers inline on the only event-loop thread. When a handler runs for more than a few seconds — especially one that spawns child processes (full_build uses ProcessPoolExecutor) or does CPU-bound inference (sentence-transformers) — the loop stops pumping stdin/stdout, Claude Code's request never gets a response, and the MCP client shows "Synthesizing…" forever.

The WindowsSelectorEventLoopPolicy switch from v2.2.4 prevents the ProactorEventLoop / ProcessPoolExecutor deadlock at the loop level, but the handler itself still needs to return control to the loop so other work (including stdin reads) can make progress.

Fix

Make the five heavy tool handlers async def and offload their blocking work with asyncio.to_thread. The event loop stays responsive and the stdio transport keeps pumping. Zero config, no-op for short tools, works on every platform.

Tools converted:

  • build_or_update_graph_toolfull_build / incremental_update
  • run_postprocess_tool — community detection can take 20s+ on large graphs
  • embed_graph_tool — sentence-transformers inference
  • detect_changes_toolgit diff subprocess + BFS traversal
  • generate_wiki_tool — many SQLite reads + file writes

The other 19 tools are fast SQLite-read paths and stay sync.

Lock-in tests

tests/test_main.py::TestLongRunningToolsAreAsync (2 new tests):

  1. test_heavy_tools_are_coroutines — uses mcp.get_tools() introspection to assert all 5 heavy tools register as coroutine functions.
  2. test_heavy_tool_source_uses_to_thread — defense in depth: greps each tool's source for a literal asyncio.to_thread call, so we don't accidentally make a tool async def without actually offloading the work.

These will fail at collection time if someone in the future converts one of the heavy tools back to sync without realizing the Windows implication.

Test plan

Verified locally on Python 3.11 / macOS:

  • uv run ruff check code_review_graph/ → clean
  • uv run mypy code_review_graph/ --ignore-missing-imports --no-strict-optional → clean
  • uv run bandit -r code_review_graph/ -c pyproject.toml → 0 H/M/L
  • uv run pytest --cov-fail-under=65737 passed, 1 skipped, 2 xpassed, coverage 74.63% (+2 new lock-in tests vs v2.3.0)
  • mcp.get_tools() returns 24 tools with build_or_update_graph_tool, run_postprocess_tool, embed_graph_tool, detect_changes_tool, generate_wiki_tool as coroutines
  • CI matrix (3.10 / 3.11 / 3.12 / 3.13)
  • Windows verification — will ping @dev-limucc on embed_graph_tool hangs indefinitely on Windows with both sentence-transformers and Gemini provider #136 once v2.3.1 ships to PyPI

After merge

Cut v2.3.1 (patch-level bump — this is a bug fix, not a new feature), publish to PyPI, then ask Limucc to re-test with uvx code-review-graph@2.3.1.

Closes (pending Windows verification)

🤖 Generated with Claude Code

Limucc's v2.2.4 test on Windows 11 / Python 3.14 confirmed that the
WindowsSelectorEventLoopPolicy fix from v2.2.4 was *necessary but
not sufficient* — read-only tools work, but long-running tools still
hang indefinitely over stdio MCP:

    build_or_update_graph_tool(full_rebuild=True)   → hangs
    embed_graph_tool (sentence-transformers)         → hangs

Root cause: FastMCP 2.x dispatches sync handlers inline on the only
event-loop thread. When a handler runs for more than a few seconds —
especially one that spawns child processes (full_build uses
ProcessPoolExecutor) or does CPU-bound inference (sentence-transformers)
— the loop stops pumping stdin/stdout, Claude Code's request never
gets a response, and the MCP client shows "Synthesizing…" forever.

Fix: make the five heavy tool handlers ``async def`` and offload the
blocking work with ``asyncio.to_thread``. The event loop stays
responsive and the stdio transport keeps pumping. This is a no-op
for short tools, zero-config, and works on every platform.

Tools converted to async + asyncio.to_thread:
- build_or_update_graph_tool  (full_build / incremental_update)
- run_postprocess_tool        (community detection can be slow)
- embed_graph_tool            (sentence-transformers inference)
- detect_changes_tool         (git diff + BFS traversal)
- generate_wiki_tool          (many SQLite reads + file writes)

The other 19 tools are fast SQLite-read paths and stay sync.

Tests: tests/test_main.py::TestLongRunningToolsAreAsync
- Asserts all 5 tools are registered as coroutines via FastMCP's
  get_tools() introspection
- Defense-in-depth: asserts each tool's source literally contains
  "asyncio.to_thread" so we don't accidentally make a tool async
  without offloading the blocking work

Verified locally on Python 3.11 / macOS:
- ruff clean, mypy clean, bandit clean
- 737 tests pass (+2 new async lock-in tests), coverage 74.63%
- mcp.get_tools() returns 24 tools with the 5 above as coroutines

Windows verification requested from @dev-limucc on #136 once v2.3.1
ships.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tirth8205 tirth8205 merged commit 7093e56 into main Apr 11, 2026
9 checks passed
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 11, 2026
Three distinct, deterministic bugs that all reproduce on clean
origin/main on Windows.  They are NOT flakiness — each has a clear root
cause and a targeted regression test that fails before the fix and
passes after.  All three have been masking real product bugs behind
"CI is just flaky on Windows".

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows
============================================================
File: code_review_graph/incremental.py:128-138

``inner_gitignore.write_text(...)`` was called without an encoding
argument.  The string literal contains an em-dash (U+2014).  On Windows,
Path.write_text uses the system default codepage (cp1252 in most
locales), which encodes U+2014 as the single byte 0x97.  Anything that
later reads the file as UTF-8 — including get_data_dir's own test
`test_default_uses_repo_subdir` — raises
`UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 ...`.

Fix: add encoding="utf-8" to write_text.  Sibling function
_ensure_repo_gitignore at line 170 already uses encoding="utf-8"; this
one was missed when the inner .gitignore writer was added.

Regression guard: TestDataDir::test_auto_gitignore_is_valid_utf8
- reads the generated file as raw bytes
- asserts the em-dash is stored as UTF-8 bytes 0xE2 0x80 0x94
- asserts cp1252 byte 0x97 does NOT appear
- round-trips through strict UTF-8 decoding

Bug 2: Databricks notebook auto-detection fails on CRLF line endings
====================================================================
File: code_review_graph/parser.py:390-394 (in parse_bytes)

The check was hard-coded to LF:
    source.startswith(b"# Databricks notebook source\n")

On Windows, `git config core.autocrlf=true` (the default) rewrites text
files to CRLF on checkout.  The Databricks fixture
`tests/fixtures/sample_databricks_export.py` starts with
`b"# Databricks notebook source\r\n"` on any Windows checkout, so the
check returns False, the file is parsed as regular Python, and ALL
Databricks-specific handling is bypassed: notebook_format metadata is
never tagged, # MAGIC %sql cells are not parsed as SQL, # MAGIC %md
cells are not skipped, and per-cell cell_index metadata is missing.
This silently breaks four tests in TestDatabricksPyNotebook on Windows:
- test_detects_databricks_header
- test_extracts_sql_tables
- test_skips_magic_md_cells
- test_cell_index_tracking

Fix: parse the first line robustly by finding the first b"\n" and
stripping any trailing b"\r" before comparing to the exact header.  This
matches both LF and CRLF endings AND rejects prefix false positives
(e.g. a file whose first line is "# Databricks notebook source code
examples" — only the exact header now triggers Databricks parsing).

Regression guards in TestDatabricksPyNotebook:
- test_databricks_header_crlf_line_endings (CRLF path, the actual bug)
- test_databricks_header_lf_line_endings_still_work (LF path unchanged)
- test_databricks_header_prefix_false_positive_rejected (guard against
  naive "just use startswith" fix)

Bug 3: Stale FastMCP API in test_heavy_tools_are_coroutines
===========================================================
File: tests/test_main.py:68-71

``tools = await crg_main.mcp.get_tools()`` — `FastMCP.get_tools()` does
not exist in fastmcp>=2.14.0 (which pyproject.toml pins via
`fastmcp>=2.14.0,<3`).  The current API is `list_tools()` and it
returns MCP protocol Tool pydantic objects that do NOT expose the
underlying Python function at all, so the old lookup pattern cannot be
directly ported.

The test dies at runtime with AttributeError on EVERY platform, which
means the async regression guard promised by PR tirth8205#231 ("There are
regression tests that will fail at CI collection time if anyone
converts one of the 5 tools back to sync") has been silently inert
since the test was merged.  A future refactor converting any of the
5 heavy tools back to sync would NOT be caught by CI because the
guard is already red.

Fix: mirror the sibling test `test_heavy_tool_source_uses_to_thread`,
which resolves each heavy tool by ``getattr(crg_main, name)`` — a
resilient approach that does not depend on any FastMCP internal
surface.  Also drop @pytest.mark.asyncio on both guards since they no
longer need an event loop.

Regression guard:
test_regression_guard_does_not_depend_on_fastmcp_internals
- AST-walks the two guard functions' source
- fails if they reference mcp.get_tools / mcp._tools / mcp.tool_manager
  / mcp._tool_manager on actual Attribute nodes (docstrings ignored)
- fails if any heavy tool cannot be resolved via getattr(crg_main, name)

Test results
============
Stage 1 (new targeted regression tests):      7/7 passed.
Stage 2 (tests/test_incremental.py + test_notebook.py + test_main.py):
  99 passed + 2 xpassed, 2 pre-existing failures in TestFindRepoRoot /
  TestFindProjectRoot that are environmental (user's home dir
  contains .git; walk finds it).  Intentionally NOT fixed by this PR —
  worth a separate discussion, see the tracking issue.
Stage 3 (tests/test_parser.py adjacent — parser.py touched): 67/67.
Stage 4 (full suite): 743 passed, 2 unrelated find_repo_root failures,
  165 pre-existing Windows file-lock teardown errors.  That's +10 net
  tests and -6 pre-existing failures resolved compared to baseline on
  main.
Stage 5 (ruff check on all 5 changed files): clean.

Why this fix is safe
====================
- Fix 1 (encoding="utf-8") matches the established pattern in the
  sibling function in the same file.  No API change.
- Fix 2 uses a more restrictive match than the original (exact first
  line, not startswith), so it CANNOT loosen the prior behavior — only
  the CRLF false negative is newly accepted.  Prefix false positives
  are explicitly rejected by a test.
- Fix 3 is test-only.  The underlying product code (the 5 heavy tools)
  is unchanged.  The regression guard is now actually enforced.

See the umbrella tracking issue for the full root-cause analysis.
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 14, 2026
Three distinct, deterministic bugs that all reproduce on clean
origin/main on Windows.  They are NOT flakiness — each has a clear root
cause and a targeted regression test that fails before the fix and
passes after.  All three have been masking real product bugs behind
"CI is just flaky on Windows".

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows
============================================================
File: code_review_graph/incremental.py:128-138

``inner_gitignore.write_text(...)`` was called without an encoding
argument.  The string literal contains an em-dash (U+2014).  On Windows,
Path.write_text uses the system default codepage (cp1252 in most
locales), which encodes U+2014 as the single byte 0x97.  Anything that
later reads the file as UTF-8 — including get_data_dir's own test
`test_default_uses_repo_subdir` — raises
`UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 ...`.

Fix: add encoding="utf-8" to write_text.  Sibling function
_ensure_repo_gitignore at line 170 already uses encoding="utf-8"; this
one was missed when the inner .gitignore writer was added.

Regression guard: TestDataDir::test_auto_gitignore_is_valid_utf8
- reads the generated file as raw bytes
- asserts the em-dash is stored as UTF-8 bytes 0xE2 0x80 0x94
- asserts cp1252 byte 0x97 does NOT appear
- round-trips through strict UTF-8 decoding

Bug 2: Databricks notebook auto-detection fails on CRLF line endings
====================================================================
File: code_review_graph/parser.py:390-394 (in parse_bytes)

The check was hard-coded to LF:
    source.startswith(b"# Databricks notebook source\n")

On Windows, `git config core.autocrlf=true` (the default) rewrites text
files to CRLF on checkout.  The Databricks fixture
`tests/fixtures/sample_databricks_export.py` starts with
`b"# Databricks notebook source\r\n"` on any Windows checkout, so the
check returns False, the file is parsed as regular Python, and ALL
Databricks-specific handling is bypassed: notebook_format metadata is
never tagged, # MAGIC %sql cells are not parsed as SQL, # MAGIC %md
cells are not skipped, and per-cell cell_index metadata is missing.
This silently breaks four tests in TestDatabricksPyNotebook on Windows:
- test_detects_databricks_header
- test_extracts_sql_tables
- test_skips_magic_md_cells
- test_cell_index_tracking

Fix: parse the first line robustly by finding the first b"\n" and
stripping any trailing b"\r" before comparing to the exact header.  This
matches both LF and CRLF endings AND rejects prefix false positives
(e.g. a file whose first line is "# Databricks notebook source code
examples" — only the exact header now triggers Databricks parsing).

Regression guards in TestDatabricksPyNotebook:
- test_databricks_header_crlf_line_endings (CRLF path, the actual bug)
- test_databricks_header_lf_line_endings_still_work (LF path unchanged)
- test_databricks_header_prefix_false_positive_rejected (guard against
  naive "just use startswith" fix)

Bug 3: Stale FastMCP API in test_heavy_tools_are_coroutines
===========================================================
File: tests/test_main.py:68-71

``tools = await crg_main.mcp.get_tools()`` — `FastMCP.get_tools()` does
not exist in fastmcp>=2.14.0 (which pyproject.toml pins via
`fastmcp>=2.14.0,<3`).  The current API is `list_tools()` and it
returns MCP protocol Tool pydantic objects that do NOT expose the
underlying Python function at all, so the old lookup pattern cannot be
directly ported.

The test dies at runtime with AttributeError on EVERY platform, which
means the async regression guard promised by PR tirth8205#231 ("There are
regression tests that will fail at CI collection time if anyone
converts one of the 5 tools back to sync") has been silently inert
since the test was merged.  A future refactor converting any of the
5 heavy tools back to sync would NOT be caught by CI because the
guard is already red.

Fix: mirror the sibling test `test_heavy_tool_source_uses_to_thread`,
which resolves each heavy tool by ``getattr(crg_main, name)`` — a
resilient approach that does not depend on any FastMCP internal
surface.  Also drop @pytest.mark.asyncio on both guards since they no
longer need an event loop.

Regression guard:
test_regression_guard_does_not_depend_on_fastmcp_internals
- AST-walks the two guard functions' source
- fails if they reference mcp.get_tools / mcp._tools / mcp.tool_manager
  / mcp._tool_manager on actual Attribute nodes (docstrings ignored)
- fails if any heavy tool cannot be resolved via getattr(crg_main, name)

Test results
============
Stage 1 (new targeted regression tests):      7/7 passed.
Stage 2 (tests/test_incremental.py + test_notebook.py + test_main.py):
  99 passed + 2 xpassed, 2 pre-existing failures in TestFindRepoRoot /
  TestFindProjectRoot that are environmental (user's home dir
  contains .git; walk finds it).  Intentionally NOT fixed by this PR —
  worth a separate discussion, see the tracking issue.
Stage 3 (tests/test_parser.py adjacent — parser.py touched): 67/67.
Stage 4 (full suite): 743 passed, 2 unrelated find_repo_root failures,
  165 pre-existing Windows file-lock teardown errors.  That's +10 net
  tests and -6 pre-existing failures resolved compared to baseline on
  main.
Stage 5 (ruff check on all 5 changed files): clean.

Why this fix is safe
====================
- Fix 1 (encoding="utf-8") matches the established pattern in the
  sibling function in the same file.  No API change.
- Fix 2 uses a more restrictive match than the original (exact first
  line, not startswith), so it CANNOT loosen the prior behavior — only
  the CRLF false negative is newly accepted.  Prefix false positives
  are explicitly rejected by a test.
- Fix 3 is test-only.  The underlying product code (the 5 heavy tools)
  is unchanged.  The regression guard is now actually enforced.

See the umbrella tracking issue for the full root-cause analysis.
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 14, 2026
…stop_at (tirth8205#239, tirth8205#241)

Five root-cause fixes that together resolve every pre-existing Windows
test failure on main.  Each fix has targeted regression tests; the net
effect is full green CI on Windows (8 failures -> 0).

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows (tirth8205#239)
--------------------------------------------------------------------
write_text() called without encoding="utf-8".  The em-dash in the header
is U+2014 which Python encodes as cp1252 byte 0x97 on Windows.  Any later
UTF-8 read fails with UnicodeDecodeError.

Fix: add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence,
rejects cp1252 byte.

Bug 2: Databricks notebook detection fails on CRLF line endings (tirth8205#239)
----------------------------------------------------------------------
source.startswith(b"# Databricks notebook source\n") hard-codes LF.
Windows git checkout (core.autocrlf=true) produces CRLF.  All Databricks
handling silently bypassed — 4 tests fail.

Fix: parse first line robustly, strip trailing \r before exact match.
Tests: test_databricks_header_crlf_line_endings,
test_databricks_header_lf_line_endings_still_work,
test_databricks_header_prefix_false_positive_rejected.

Bug 3: Stale FastMCP API in async regression guard (tirth8205#239)
---------------------------------------------------------
test_heavy_tools_are_coroutines called mcp.get_tools() which does not
exist in fastmcp>=2.14.0 (pinned in pyproject.toml).  The guard has
been silently broken since it was written — the protection promised by
PR tirth8205#231 for tirth8205#46/tirth8205#136 was never actually enforced.

Fix: resolve tools via getattr(crg_main, name) like the sibling test.
Drop @pytest.mark.asyncio since no event loop is needed.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals —
AST-walks the guard source to ensure no mcp internal API references.

Bug 4-5: find_repo_root walks above test sandbox (tirth8205#241)
-------------------------------------------------------
test_returns_none_without_git and test_falls_back_to_start fail on any
machine where tmp_path has a git-initialized ancestor (dotfiles repo at
~/.git — very common on developer machines).

Fix: add optional stop_at parameter to find_repo_root() and
find_project_root().  When set, the walk examines stop_at for .git and
then stops.  Default is None (existing walk-to-root behavior).  Fully
backward-compatible — all 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git,
test_stop_at_finds_git_at_boundary,
test_stop_at_forwarded_to_find_repo_root.

Files changed
-------------
- code_review_graph/incremental.py — encoding fix + stop_at API
- code_review_graph/parser.py — CRLF-tolerant Databricks detection
- tests/test_incremental.py — gitignore UTF-8 guard + stop_at tests
- tests/test_main.py — fixed async guard + meta-guard
- tests/test_notebook.py — CRLF + LF + false-positive guards
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 14, 2026
…stop_at (tirth8205#239, tirth8205#241)

Five root-cause fixes that together resolve every pre-existing Windows
test failure on main.  Each fix has targeted regression tests; the net
effect is full green CI on Windows (8 failures -> 0).

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows (tirth8205#239)
--------------------------------------------------------------------
write_text() called without encoding="utf-8".  The em-dash in the header
is U+2014 which Python encodes as cp1252 byte 0x97 on Windows.  Any later
UTF-8 read fails with UnicodeDecodeError.

Fix: add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence,
rejects cp1252 byte.

Bug 2: Databricks notebook detection fails on CRLF line endings (tirth8205#239)
----------------------------------------------------------------------
source.startswith(b"# Databricks notebook source\n") hard-codes LF.
Windows git checkout (core.autocrlf=true) produces CRLF.  All Databricks
handling silently bypassed — 4 tests fail.

Fix: parse first line robustly, strip trailing \r before exact match.
Tests: test_databricks_header_crlf_line_endings,
test_databricks_header_lf_line_endings_still_work,
test_databricks_header_prefix_false_positive_rejected.

Bug 3: Stale FastMCP API in async regression guard (tirth8205#239)
---------------------------------------------------------
test_heavy_tools_are_coroutines called mcp.get_tools() which does not
exist in fastmcp>=2.14.0 (pinned in pyproject.toml).  The guard has
been silently broken since it was written — the protection promised by
PR tirth8205#231 for tirth8205#46/tirth8205#136 was never actually enforced.

Fix: resolve tools via getattr(crg_main, name) like the sibling test.
Drop @pytest.mark.asyncio since no event loop is needed.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals —
AST-walks the guard source to ensure no mcp internal API references.

Bug 4-5: find_repo_root walks above test sandbox (tirth8205#241)
-------------------------------------------------------
test_returns_none_without_git and test_falls_back_to_start fail on any
machine where tmp_path has a git-initialized ancestor (dotfiles repo at
~/.git — very common on developer machines).

Fix: add optional stop_at parameter to find_repo_root() and
find_project_root().  When set, the walk examines stop_at for .git and
then stops.  Default is None (existing walk-to-root behavior).  Fully
backward-compatible — all 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git,
test_stop_at_finds_git_at_boundary,
test_stop_at_forwarded_to_find_repo_root.

Files changed
-------------
- code_review_graph/incremental.py — encoding fix + stop_at API
- code_review_graph/parser.py — CRLF-tolerant Databricks detection
- tests/test_incremental.py — gitignore UTF-8 guard + stop_at tests
- tests/test_main.py — fixed async guard + meta-guard
- tests/test_notebook.py — CRLF + LF + false-positive guards
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 14, 2026
…stop_at (tirth8205#239, tirth8205#241)

Five root-cause fixes that together resolve every pre-existing Windows
test failure on main.  Each fix has targeted regression tests; the net
effect is full green CI on Windows (8 failures -> 0).

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows (tirth8205#239)
--------------------------------------------------------------------
write_text() called without encoding="utf-8".  The em-dash in the header
is U+2014 which Python encodes as cp1252 byte 0x97 on Windows.  Any later
UTF-8 read fails with UnicodeDecodeError.

Fix: add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence,
rejects cp1252 byte.

Bug 2: Databricks notebook detection fails on CRLF line endings (tirth8205#239)
----------------------------------------------------------------------
source.startswith(b"# Databricks notebook source\n") hard-codes LF.
Windows git checkout (core.autocrlf=true) produces CRLF.  All Databricks
handling silently bypassed — 4 tests fail.

Fix: parse first line robustly, strip trailing \r before exact match.
Tests: test_databricks_header_crlf_line_endings,
test_databricks_header_lf_line_endings_still_work,
test_databricks_header_prefix_false_positive_rejected.

Bug 3: Stale FastMCP API in async regression guard (tirth8205#239)
---------------------------------------------------------
test_heavy_tools_are_coroutines called mcp.get_tools() which does not
exist in fastmcp>=2.14.0 (pinned in pyproject.toml).  The guard has
been silently broken since it was written — the protection promised by
PR tirth8205#231 for tirth8205#46/tirth8205#136 was never actually enforced.

Fix: resolve tools via getattr(crg_main, name) like the sibling test.
Drop @pytest.mark.asyncio since no event loop is needed.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals —
AST-walks the guard source to ensure no mcp internal API references.

Bug 4-5: find_repo_root walks above test sandbox (tirth8205#241)
-------------------------------------------------------
test_returns_none_without_git and test_falls_back_to_start fail on any
machine where tmp_path has a git-initialized ancestor (dotfiles repo at
~/.git — very common on developer machines).

Fix: add optional stop_at parameter to find_repo_root() and
find_project_root().  When set, the walk examines stop_at for .git and
then stops.  Default is None (existing walk-to-root behavior).  Fully
backward-compatible — all 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git,
test_stop_at_finds_git_at_boundary,
test_stop_at_forwarded_to_find_repo_root.

Files changed
-------------
- code_review_graph/incremental.py — encoding fix + stop_at API
- code_review_graph/parser.py — CRLF-tolerant Databricks detection
- tests/test_incremental.py — gitignore UTF-8 guard + stop_at tests
- tests/test_main.py — fixed async guard + meta-guard
- tests/test_notebook.py — CRLF + LF + false-positive guards
GrimoireScribe pushed a commit to GrimoireScribe/code-review-graph that referenced this pull request Apr 17, 2026
CLAUDE.md:
- Update graph DB baseline to schema v7 / 1914 nodes / 20966 edges
- Flip "Git subprocess / MCP stdio" defect from OPEN to RESOLVED,
  reference the DEVNULL fix commit f0f8c35
- Add Upstream Sync Policy section documenting what landed (tirth8205#184,
  tirth8205#216, tirth8205#127), what's hard-skipped (tirth8205#231, tirth8205#253), and what's gated
  (tirth8205#249 on synthetic-node bypass confirmation)
- Inline the v7-slot occupation hazard so future upstream pulls
  see the landmine before renumbering goes wrong
- Note that MCP scope is user, not project

.mcp.json:
- Remove the project-scope code-review-graph entry that was
  shadowing the user-scope fork binary with upstream PyPI.
  User scope now wins unambiguously.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tirth8205 pushed a commit that referenced this pull request Apr 18, 2026
)

Five root-cause fixes that together resolve every pre-existing Windows
test failure on main.  Each fix has targeted regression tests; the net
effect is full green CI on Windows (8 failures -> 0).

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows (#239)
--------------------------------------------------------------------
write_text() called without encoding="utf-8".  The em-dash in the header
is U+2014 which Python encodes as cp1252 byte 0x97 on Windows.  Any later
UTF-8 read fails with UnicodeDecodeError.

Fix: add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence,
rejects cp1252 byte.

Bug 2: Databricks notebook detection fails on CRLF line endings (#239)
----------------------------------------------------------------------
source.startswith(b"# Databricks notebook source\n") hard-codes LF.
Windows git checkout (core.autocrlf=true) produces CRLF.  All Databricks
handling silently bypassed — 4 tests fail.

Fix: parse first line robustly, strip trailing \r before exact match.
Tests: test_databricks_header_crlf_line_endings,
test_databricks_header_lf_line_endings_still_work,
test_databricks_header_prefix_false_positive_rejected.

Bug 3: Stale FastMCP API in async regression guard (#239)
---------------------------------------------------------
test_heavy_tools_are_coroutines called mcp.get_tools() which does not
exist in fastmcp>=2.14.0 (pinned in pyproject.toml).  The guard has
been silently broken since it was written — the protection promised by
PR #231 for #46/#136 was never actually enforced.

Fix: resolve tools via getattr(crg_main, name) like the sibling test.
Drop @pytest.mark.asyncio since no event loop is needed.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals —
AST-walks the guard source to ensure no mcp internal API references.

Bug 4-5: find_repo_root walks above test sandbox (#241)
-------------------------------------------------------
test_returns_none_without_git and test_falls_back_to_start fail on any
machine where tmp_path has a git-initialized ancestor (dotfiles repo at
~/.git — very common on developer machines).

Fix: add optional stop_at parameter to find_repo_root() and
find_project_root().  When set, the walk examines stop_at for .git and
then stops.  Default is None (existing walk-to-root behavior).  Fully
backward-compatible — all 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git,
test_stop_at_finds_git_at_boundary,
test_stop_at_forwarded_to_find_repo_root.

Files changed
-------------
- code_review_graph/incremental.py — encoding fix + stop_at API
- code_review_graph/parser.py — CRLF-tolerant Databricks detection
- tests/test_incremental.py — gitignore UTF-8 guard + stop_at tests
- tests/test_main.py — fixed async guard + meta-guard
- tests/test_notebook.py — CRLF + LF + false-positive guards
npkriami18 pushed a commit to npkriami18/code-review-graph that referenced this pull request Apr 21, 2026
…irth8205#274)

Five root-cause fixes that together resolve every pre-existing Windows
test failure on main.  Each fix has targeted regression tests; the net
effect is full green CI on Windows (8 failures -> 0).

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows (tirth8205#239)
--------------------------------------------------------------------
write_text() called without encoding="utf-8".  The em-dash in the header
is U+2014 which Python encodes as cp1252 byte 0x97 on Windows.  Any later
UTF-8 read fails with UnicodeDecodeError.

Fix: add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence,
rejects cp1252 byte.

Bug 2: Databricks notebook detection fails on CRLF line endings (tirth8205#239)
----------------------------------------------------------------------
source.startswith(b"# Databricks notebook source\n") hard-codes LF.
Windows git checkout (core.autocrlf=true) produces CRLF.  All Databricks
handling silently bypassed — 4 tests fail.

Fix: parse first line robustly, strip trailing \r before exact match.
Tests: test_databricks_header_crlf_line_endings,
test_databricks_header_lf_line_endings_still_work,
test_databricks_header_prefix_false_positive_rejected.

Bug 3: Stale FastMCP API in async regression guard (tirth8205#239)
---------------------------------------------------------
test_heavy_tools_are_coroutines called mcp.get_tools() which does not
exist in fastmcp>=2.14.0 (pinned in pyproject.toml).  The guard has
been silently broken since it was written — the protection promised by
PR tirth8205#231 for tirth8205#46/tirth8205#136 was never actually enforced.

Fix: resolve tools via getattr(crg_main, name) like the sibling test.
Drop @pytest.mark.asyncio since no event loop is needed.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals —
AST-walks the guard source to ensure no mcp internal API references.

Bug 4-5: find_repo_root walks above test sandbox (tirth8205#241)
-------------------------------------------------------
test_returns_none_without_git and test_falls_back_to_start fail on any
machine where tmp_path has a git-initialized ancestor (dotfiles repo at
~/.git — very common on developer machines).

Fix: add optional stop_at parameter to find_repo_root() and
find_project_root().  When set, the walk examines stop_at for .git and
then stops.  Default is None (existing walk-to-root behavior).  Fully
backward-compatible — all 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git,
test_stop_at_finds_git_at_boundary,
test_stop_at_forwarded_to_find_repo_root.

Files changed
-------------
- code_review_graph/incremental.py — encoding fix + stop_at API
- code_review_graph/parser.py — CRLF-tolerant Databricks detection
- tests/test_incremental.py — gitignore UTF-8 guard + stop_at tests
- tests/test_main.py — fixed async guard + meta-guard
- tests/test_notebook.py — CRLF + LF + false-positive guards

(cherry picked from commit aa627fb)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant