Context
When using claude-agent-sdk via sdk_query(), the underlying SubprocessCLITransport spawns a claude CLI child process. If the calling code needs to cancel (timeout, SIGTERM, etc.), the async generator can be aclose()'d, which triggers SubprocessCLITransport.close() → process.terminate().
However, close() has no bounded wait after terminate() and no SIGKILL fallback. If the CLI process ignores SIGTERM (e.g. stuck in a network call), close() hangs and the process becomes orphaned.
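The failure mode can be reproduced without the SDK. The following is a minimal, POSIX-only sketch using only the standard library: the child script and the `terminate_survives` helper are hypothetical stand-ins for a stuck CLI process, not SDK code.

```python
import subprocess
import sys

# Hypothetical stand-in for a stuck CLI process: ignores SIGTERM, then sleeps.
CHILD_SOURCE = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def terminate_survives(wait_seconds: float = 1.0) -> bool:
    """Return True if the child is still alive after terminate() plus a bounded wait."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE], stdout=subprocess.PIPE
    )
    try:
        proc.stdout.readline()  # block until the child has installed SIG_IGN
        proc.terminate()        # sends SIGTERM on POSIX
        try:
            proc.wait(timeout=wait_seconds)
            return False
        except subprocess.TimeoutExpired:
            # An unbounded wait() here would block until the child's sleep ends.
            return True
    finally:
        proc.kill()             # SIGKILL cannot be ignored
        proc.wait()             # reap the child so no zombie is left behind
        proc.stdout.close()
```

Calling `terminate_survives(0.5)` should report that the child outlived SIGTERM, which is the situation where a wait without a timeout never returns.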
Current workaround
We extract the PID by walking async generator frame internals:
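def _extract_sdk_pid(gen):
    # Walk our generator's frame to find the SDK's inner async generator.
    frame = getattr(gen, "ag_frame", None)
    if frame is None:
        return None
    for val in frame.f_locals.values():
        inner_frame = getattr(val, "ag_frame", None)
        if inner_frame is None:
            continue
        transport = inner_frame.f_locals.get("chosen_transport")
        proc = getattr(transport, "_process", None)
        if proc is not None:
            return proc.pid
    return None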
This is fragile and depends on SDK internals.
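Once a PID is in hand, the caller-side cleanup looks roughly like the sketch below. It is an illustration under assumptions, not what the SDK does: `pid_exists`, `kill_pid_with_fallback`, and `demo` are hypothetical names, the code is POSIX-only, and the liveness check is best-effort.

```python
import os
import signal
import subprocess
import sys
import time


def pid_exists(pid: int) -> bool:
    """Best-effort liveness check; also True for an unreaped zombie child."""
    try:
        os.kill(pid, 0)  # signal 0 only performs existence/permission checks
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def kill_pid_with_fallback(pid: int, grace: float = 5.0, poll: float = 0.1) -> bool:
    """SIGTERM the PID, then SIGKILL it if it is still there after `grace` seconds.

    Returns True if the SIGKILL escalation was needed.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return False  # already gone
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        if not pid_exists(pid):
            return False
        time.sleep(poll)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return False
    return True


def demo():
    """Clean up a child that ignores SIGTERM; returns (escalated, exit_code)."""
    child = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen([sys.executable, "-c", child], stdout=subprocess.PIPE)
    proc.stdout.readline()  # wait until SIG_IGN is installed
    escalated = kill_pid_with_fallback(proc.pid, grace=0.3)
    code = proc.wait(timeout=5)
    proc.stdout.close()
    return escalated, code
```

Working from a bare PID has its own hazards: the PID can be reused after the process is reaped, and an exited but unreaped child still looks alive to `os.kill(pid, 0)`. That is part of why a supported handle on the process would be preferable.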
Feature request
- Expose transport.pid or transport.process as a public property on SubprocessCLITransport so callers can safely extract the PID for cleanup.
- Add a bounded wait + SIGKILL fallback to SubprocessCLITransport.close() so callers don't need to implement their own cleanup. Something like:
def close(self, timeout=10):
    self._process.terminate()
    try:
        self._process.wait(timeout)
    except subprocess.TimeoutExpired:
        self._process.kill()
        self._process.wait()  # reap the killed process
- Consider start_new_session=True in Popen() so os.killpg() can kill the entire process group (CLI spawns node.js children).
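The second and third requests combine naturally. The sketch below shows one way a bounded wait with a process-group SIGKILL fallback could look. It uses `subprocess.Popen` from the standard library purely for illustration; how the SDK actually spawns the CLI is not shown in this issue, and `terminate_group` and `demo` are hypothetical names. POSIX-only.

```python
import os
import signal
import subprocess
import sys


def terminate_group(proc: subprocess.Popen, timeout: float = 10.0) -> int:
    """SIGTERM the child's process group, wait up to `timeout`, then SIGKILL it.

    Assumes `proc` was started with start_new_session=True, so the child leads
    its own process group. Returns the child's exit code.
    """
    if proc.poll() is not None:
        return proc.returncode  # already exited and reaped
    pgid = os.getpgid(proc.pid)
    if pgid == os.getpgid(0):
        # Guard: without a separate session we would be signalling ourselves.
        raise ValueError("child shares our process group; refusing to killpg")
    os.killpg(pgid, signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)  # takes any children in the group too
        return proc.wait()


def demo() -> int:
    """Run terminate_group against a child that ignores SIGTERM."""
    child = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child],
        stdout=subprocess.PIPE,
        start_new_session=True,  # child becomes leader of a new session and group
    )
    proc.stdout.readline()  # wait until SIG_IGN is installed
    code = terminate_group(proc, timeout=0.5)
    proc.stdout.close()
    return code
```

In this sketch the group is only SIGKILLed when the leader fails to exit in time. If the leader exits promptly but leaves children behind, a final `os.killpg(pgid, signal.SIGKILL)` with `ProcessLookupError` suppressed would be needed to catch them.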
Environment
- claude-agent-sdk 0.1.58
- Python 3.12
- Docker (Linux containers)
- Use case: FastAPI sidecar service wrapping SDK for concurrent LLM completions