Skip to content

ENH: Add comprehensive GIL release safety tests#6029

Merged
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:mainfrom
hjmjohnson:gil-release-tests
Apr 9, 2026
Merged

ENH: Add comprehensive GIL release safety tests#6029
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:mainfrom
hjmjohnson:gil-release-tests

Conversation

@hjmjohnson
Copy link
Copy Markdown
Member

Summary

Add 8 safety tests for the SWIG `-threads` GIL release feature
(`ITK_PYTHON_RELEASE_GIL`), covering known gotchas from the SWIG
issue tracker and Python C API documentation.

Companion to PR #6022. Tests pass with both `GIL=ON` and `GIL=OFF`.

Test results

Setting Python 3.10 3.11 3.12 3.13 3.14
GIL=ON 15/15 15/15 15/15 15/15 15/15
GIL=OFF -- -- -- 15/15 --

Tests and references

# Test Reference
1 Callback safety during filter execution swig/swig#3091, swig/swig#2670
2 Concurrent callbacks from 4 threads Python C API: non-Python threads
3 Exception propagation across GIL boundary SWIG-devel: exception + threads segfault
4 Object destruction under concurrent access Python refcounting
5 Concurrent image processing (data races) ITK MultiThreaderBase thread safety
6 Signal handling while GIL released Python signals and threads
7 GIL reacquisition deadlock detection PyGILState_Ensure
8 Stress test (200 create/destroy, 4 threads) swig/swig#2396, swig/swig#3121

🤖 Generated with Claude Code

Add test_gil_release_safety.py covering 8 known SWIG -threads gotchas:

  1. Callback safety (swig/swig#3091, swig/swig#2670)
  2. Concurrent callbacks from 4 threads
  3. Exception propagation across GIL boundary
  4. Object destruction under concurrent access
  5. Concurrent image processing (data races)
  6. Signal handling while GIL is released
  7. GIL reacquisition deadlock detection
  8. Threading stress test (200 create/destroy cycles)

Test results (15 assertions, 0 failures):

  ITK_PYTHON_RELEASE_GIL=ON:
    Python 3.10.20: 15 passed, 0 failed, 0 skipped
    Python 3.11.15: 15 passed, 0 failed, 0 skipped
    Python 3.12.13: 15 passed, 0 failed, 0 skipped
    Python 3.13.12: 15 passed, 0 failed, 0 skipped
    Python 3.14.3:  15 passed, 0 failed, 0 skipped

  ITK_PYTHON_RELEASE_GIL=OFF:
    Python 3.13.12: 15 passed, 0 failed, 0 skipped

All tests pass regardless of the GIL release setting because they
test safety invariants (no crashes, no deadlocks, no data corruption)
rather than concurrency speedup (which is tested by the existing
test_gil_release.py).

Each test includes references to the SWIG issue or Python doc that
motivated it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Enhancement Improvement of existing methods or implementation area:Python wrapping Python bindings for a class type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct labels Apr 9, 2026
@hjmjohnson hjmjohnson requested a review from thewtex April 9, 2026 18:22
@hjmjohnson hjmjohnson self-assigned this Apr 9, 2026
Copy link
Copy Markdown
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hjmjohnson thank you! 💯

@hjmjohnson hjmjohnson marked this pull request as ready for review April 9, 2026 23:14
@hjmjohnson hjmjohnson merged commit 36bf35c into InsightSoftwareConsortium:main Apr 9, 2026
18 checks passed
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 9, 2026

Greptile Summary

This PR adds test_gil_release_safety.py, a new 489-line Python test covering 8 GIL-release safety scenarios (callbacks, concurrent execution, exception propagation, object destruction, data races, signal delivery, deadlock detection, and a stress test), plus the corresponding CMake registration.

  • P1 — deadlock thread blocks interpreter exit (Test 7, line 419): the deadlock-detection thread is non-daemon. If a real deadlock is found, sys.exit(1) hangs forever because Python waits for all non-daemon threads; the CI runner never receives the failure exit code. Fix: create the thread with daemon=True.

Confidence Score: 4/5

Safe to merge after fixing the daemon-thread issue in Test 7, which would cause CI to hang indefinitely if a real deadlock were ever detected.

One P1 finding: the deadlock-detection thread is non-daemon, so detecting a deadlock would hang the interpreter at sys.exit() and prevent CI from reporting the failure. All other findings are P2 style issues.

Wrapping/Generators/Python/Tests/test_gil_release_safety.py — specifically the thread creation at line 419 in Test 7.

Important Files Changed

Filename Overview
Wrapping/Generators/Python/Tests/test_gil_release_safety.py New 489-line test file covering 8 GIL-release safety scenarios; one P1 bug (non-daemon deadlock thread blocks interpreter exit) and two minor P2 style issues (unused import, unused variable).
Wrapping/Generators/Python/Tests/CMakeLists.txt Adds PythonGILReleaseSafetyTest entry following the same pattern as the existing PythonGILReleaseTest; no issues.

Sequence Diagram

sequenceDiagram
    participant Main as Main Thread
    participant ITK as ITK C++ (GIL released)
    participant CB as Python Callback
    participant Worker as Worker Thread

    Note over Main,ITK: Test 1 & 7 — Callback safety / deadlock
    Main->>ITK: filter.Update() [GIL released via SWIG -threads]
    ITK->>CB: progress event → PyGILState_Ensure
    CB-->>ITK: return → PyGILState_Release
    ITK-->>Main: return [GIL re-acquired]

    Note over Main,Worker: Tests 2, 4, 5, 8 — Concurrent execution
    Main->>Worker: Thread.start() (×4)
    Worker->>ITK: filter.Update() [GIL released]
    ITK-->>Worker: result
    Worker-->>Main: Thread.join(timeout)

    Note over Main,ITK: Test 3 — Exception propagation
    Main->>ITK: itk.imread(bad_path) [GIL released]
    ITK-->>Main: C++ exception → SWIG restores thread state → RuntimeError

    Note over Main: Test 6 — Signal delivery
    Main->>ITK: filter.Update() [GIL released]
    Worker->>Main: os.kill(SIGUSR1) after 10 ms
    ITK-->>Main: return [GIL re-acquired → signal handler fires]
Loading

Reviews (1): Last reviewed commit: "ENH: Add comprehensive GIL release safet..." | Re-trigger Greptile

Comment on lines +419 to +421
t = threading.Thread(target=test_deadlock)
t.start()
t.join(timeout=DEADLOCK_TIMEOUT)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Deadlocked thread blocks interpreter shutdown

If the deadlock scenario actually occurs, t.is_alive() is True, the check(..., False, ...) path records a failure, and the script eventually reaches sys.exit(1). Python's shutdown sequence waits for all non-daemon threads to finish before exiting. Because t is genuinely deadlocked, sys.exit(1) will block indefinitely — the CI runner never receives the failure exit code and the test job hangs until it times out externally.

Mark the thread as a daemon so Python can exit cleanly even if the deadlock is real:

Suggested change
t = threading.Thread(target=test_deadlock)
t.start()
t.join(timeout=DEADLOCK_TIMEOUT)
t = threading.Thread(target=test_deadlock, daemon=True)
t.start()
t.join(timeout=DEADLOCK_TIMEOUT)

import sys
import threading
import time
import weakref
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unused import

weakref is imported but never referenced anywhere in the file. Remove it to keep the imports clean.

Suggested change
import weakref

with concurrent_lock:
concurrent_callback_calls.append(threading.current_thread().name)
# Verify Python objects are accessible under GIL
result = sum(range(50))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unused local variable

result is assigned but never read. The intent (exercising Python runtime to confirm GIL is held) is clear from the comment, so use the conventional throwaway name.

Suggested change
result = sum(range(50))
_ = sum(range(50))

Comment on lines +186 to +193
threads = [
threading.Thread(target=run_filter_with_callback, name=f"cb-{i}") for i in range(4)
]
for t in threads:
t.start()
for t in threads:
t.join(timeout=30)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Non-daemon threads may delay interpreter exit on timeout

Tests 2, 4, 5, and 8 all join threads with a finite timeout. If a thread is still alive after the timeout, the script continues and may eventually call sys.exit(), but Python's shutdown will wait for all non-daemon threads to finish. Making stress/concurrent threads daemons keeps shutdown predictable:

Suggested change
threads = [
threading.Thread(target=run_filter_with_callback, name=f"cb-{i}") for i in range(4)
]
for t in threads:
t.start()
for t in threads:
t.join(timeout=30)
threads = [
threading.Thread(target=run_filter_with_callback, name=f"cb-{i}", daemon=True) for i in range(4)
]

The same pattern applies to the thread lists in Tests 4, 5, and 8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Python wrapping Python bindings for a class type:Enhancement Improvement of existing methods or implementation type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants