fix(voice): skip commit_user_turn on close when no turn is pending #5818
chenghao-mou wants to merge 1 commit into
- Add `AudioRecognition.has_pending_user_turn` and gate the session-close commit on it, avoiding spurious empty-turn commits and the resulting timeout warnings during teardown.
- Make the close-path commit best-effort: catch timeout / API / runtime errors so STT or EOU failures don't block a clean shutdown.
- Wrap `activity.drain()` in a non-cancelling watchdog that logs a snapshot of running tasks if drain stalls past 300s, for shutdown observability.
- Mark inference STT error 2007 (idle session closed by server) as non-retryable; reconnecting just produces the same error.
- Honor `APIError.retryable` in STT (`stt/stt.py`) and TTS (`tts/tts.py` chunked stream) retry loops, matching the existing pattern in `llm/llm.py` and `inference/interruption.py`.

Co-authored-by: Cursor <cursoragent@cursor.com>
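The "non-cancelling watchdog" item in the description could look roughly like the sketch below. This is not the code from this PR: `drain_with_watchdog`, its `warn_after` parameter, and the logger name are hypothetical, and the sketch only assumes that the drain is an awaitable that should keep running while a stall is reported.

```python
import asyncio
import logging

logger = logging.getLogger("shutdown-watchdog")  # hypothetical logger name


async def drain_with_watchdog(drain_coro, *, warn_after: float = 300.0):
    """Await drain_coro to completion; log running tasks once if it stalls.

    Hypothetical helper: the watchdog only reports, it never cancels the drain.
    """
    drain_task = asyncio.ensure_future(drain_coro)
    # asyncio.wait() leaves the task running when the timeout expires,
    # unlike asyncio.wait_for(), which would cancel it.
    done, _ = await asyncio.wait({drain_task}, timeout=warn_after)
    if not done:
        running = [t.get_name() for t in asyncio.all_tasks() if not t.done()]
        logger.warning(
            "drain still running after %.0fs; running tasks: %s", warn_after, running
        )
    # Keep waiting for the drain and propagate its result or exception.
    return await drain_task
```

The key design choice is `asyncio.wait` with a timeout rather than `asyncio.wait_for`: the former returns control for logging while the drain continues, which is what "non-cancelling" suggests here.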
    try:
        await audio_recognition.commit_user_turn(
            audio_detached=True,
            transcript_timeout=self._opts.session_close_transcript_timeout,
        )
    except (asyncio.TimeoutError, APIError, RuntimeError) as e:
        logger.warning("commit_user_turn during close failed", exc_info=e)
🔴 Caught exception from `commit_user_turn` re-raised by `aclose()`, breaking cleanup
When `commit_user_turn` fails with a caught exception (e.g. `APIError`), the exception is correctly handled in the try/except block at lines 1017-1023. However, the underlying `_commit_user_turn_atask` still holds the exception. When `await activity.aclose()` is called on line 1025, it calls `audio_recognition.aclose()` (audio_recognition.py:561-567), which re-awaits the same completed task:

    if self._commit_user_turn_atask is not None:
        try:
            await self._commit_user_turn_atask
        except asyncio.CancelledError:
            pass

Since `aclose()` only catches `CancelledError`, the original exception (e.g. `APIError`, `RuntimeError`) is re-raised. This propagates through `activity.aclose()` → `_close_session()` at agent_activity.py:1004, skipping the remaining cleanup (background speech interruption, scheduling task cancellation, event listener detachment, etc.).
Why this is new behavior
In the old fire-and-forget code, the task was typically still running when aclose() was called (e.g. waiting on transcript_timeout), so aclose() would wait for it to finish normally. The new await pattern guarantees the task has already completed (possibly with an error) before aclose() runs, making the re-raise path reliably reachable.
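The re-raise described above follows from general asyncio behavior: a finished task keeps its exception and raises it on every subsequent `await`. A minimal standalone reproduction, using a hypothetical `FakeAPIError` and `failing_commit` in place of the real `APIError` and commit task:

```python
import asyncio


class FakeAPIError(Exception):
    """Hypothetical stand-in for the APIError mentioned in the review."""


async def failing_commit() -> None:
    # Stands in for a commit task that fails, e.g. on an STT error.
    raise FakeAPIError("stt failed")


async def demo() -> list:
    events = []
    task = asyncio.ensure_future(failing_commit())

    # First await: the caller catches the error, as the close path does.
    try:
        await task
    except FakeAPIError:
        events.append("caller handled")

    # Second await: mirrors a later aclose() that only catches CancelledError.
    try:
        await task
    except asyncio.CancelledError:
        events.append("cancelled")
    except FakeAPIError:
        events.append("re-raised on second await")
    return events
```

Running `asyncio.run(demo())` records both events, which shows that handling the exception at the first `await` does not clear it from the task.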
Prompt for agents
The problem is that after catching an exception from await commit_user_turn(), the underlying _commit_user_turn_atask still holds the exception. When activity.aclose() later calls audio_recognition.aclose(), it re-awaits that same task, which re-raises the exception (only CancelledError is caught there).
Two possible approaches:
1. In agent_session.py _aclose_impl, after the except block that catches the commit_user_turn error, set audio_recognition._commit_user_turn_atask = None so that aclose() skips the re-await. This is the simplest fix but accesses a private attribute.
2. In audio_recognition.py aclose() (around line 563-567), broaden the except clause to catch Exception (not just CancelledError) for the _commit_user_turn_atask await, since by the time aclose() runs any exception from that task should have already been handled by the caller. For example:
        except (asyncio.CancelledError, Exception):
            pass
Approach 2 is more robust because it fixes the issue at the source (aclose not being resilient to task failures) rather than relying on the caller to clean up internal state.
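A rough sketch of what approach 2 might look like, under the assumption that the commit task is stored on the object and awaited once more during close. `RecognitionSketch` and `start_commit` are hypothetical and are not the real `AudioRecognition` API. As a small variation on the `pass` suggested above, the sketch logs the swallowed error at debug level.

```python
import asyncio
import logging
from typing import Optional

logger = logging.getLogger("recognition-sketch")  # hypothetical logger name


class RecognitionSketch:
    """Hypothetical stand-in; not the real AudioRecognition class."""

    def __init__(self) -> None:
        self._commit_user_turn_atask: Optional[asyncio.Future] = None
        self.closed = False

    def start_commit(self, coro) -> asyncio.Future:
        # Hypothetical helper that schedules the commit as a task.
        self._commit_user_turn_atask = asyncio.ensure_future(coro)
        return self._commit_user_turn_atask

    async def aclose(self) -> None:
        task = self._commit_user_turn_atask
        if task is not None:
            try:
                await task
            except asyncio.CancelledError:
                pass
            except Exception:
                # The caller is assumed to have handled this failure already;
                # swallowing it lets the rest of the cleanup run.
                logger.debug("commit task had already failed", exc_info=True)
        # Remaining cleanup would go here.
        self.closed = True
```

With this shape, a commit task that already failed with `RuntimeError` no longer stops `aclose()` from reaching the remaining cleanup steps.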