fix(anthropic): keep audio flowing when <thinking> tags split across stream deltas#5794
Conversation
β¦deltas When tools are attached, Claude can wrap chain-of-thought in <thinking> tags. The previous stripping logic only checked each streamed delta in isolation, so a closing </thinking> tag split across deltas (e.g. "</" then "thinking>") left the parser stuck ignoring all remaining text. The assistant's actual reply was dropped, so TTS received no text and no audio was published whenever a function_tool was present. Replace the per-delta check with a stateful filter that scans across deltas, strips complete <thinking> spans, never gets stuck, and never drops text that only resembles a partial tag.
|
|
| def flush(self) -> str: | ||
| """Return any buffered text that is not part of a thinking span.""" | ||
| if self._inside: | ||
| self._buf = "" | ||
| return "" | ||
|
|
||
| # a dangling partial opening tag never completed: it was real text | ||
| out = self._buf | ||
| self._buf = "" | ||
| return out |
There was a problem hiding this comment.
π‘ _ThinkingTagFilter.flush() does not reset _inside flag, causing silent text loss across content blocks
When flush() is called at content_block_stop while _inside is True (i.e., a <thinking> tag was opened but never closed within that block), the method clears _buf but leaves _inside = True. Any subsequent text content blocks in the same stream will have ALL their text silently dropped because push() still treats incoming text as part of a thinking span.
Reproduction trace showing text permanently lost
f = _ThinkingTagFilter()
f.push('<thinking>reasoning...') # _inside becomes True
f.flush() # content_block_stop: clears _buf but _inside stays True
# Subsequent text block
f.push('The actual answer') # returns '' β silently dropped!
f.flush() # returns '' β answer permanently lostWhile this requires the model to emit an unclosed <thinking> tag (uncommon in practice), the consequence is severe when triggered: the assistant's spoken answer is entirely suppressed, producing silence from TTS. The fix should reset self._inside = False inside flush() (or at least when called from the content_block_stop handler at llm.py:467).
| def flush(self) -> str: | |
| """Return any buffered text that is not part of a thinking span.""" | |
| if self._inside: | |
| self._buf = "" | |
| return "" | |
| # a dangling partial opening tag never completed: it was real text | |
| out = self._buf | |
| self._buf = "" | |
| return out | |
| def flush(self) -> str: | |
| """Return any buffered text that is not part of a thinking span.""" | |
| if self._inside: | |
| self._buf = "" | |
| self._inside = False | |
| return "" | |
| # a dangling partial opening tag never completed: it was real text | |
| out = self._buf | |
| self._buf = "" | |
| return out | |
Was this helpful? React with π or π to provide feedback.
Summary
Agent audio stopped reaching the room whenever any
function_toolwas attached to an Agent using the Anthropic LLM, even a trivial no-arg tool. Withtools=[]audio worked fine.The Anthropic plugin strips Claude's
<thinking>β¦</thinking>chain-of-thought, which the model only emits when tools are supplied. The stripping logic inspected each streamed text delta on its own: it began ignoring text on a delta starting with<thinking>and only stopped if a single delta contained</thinking>. Since tokens stream piecemeal, the closing tag usually arrives split across deltas (e.g."</"then"thinking>"), so the parser stayed stuck and dropped every remaining text delta. The assistant's actual reply never reached TTS, so no audio was synthesized/published β matching the report (LLM fires, TTS shows audio duration, but nothing plays).This replaces the fragile per-delta check with a small stateful filter that scans across deltas, strips complete
<thinking>spans, can never get permanently stuck, and never drops text that merely looks like a partial tag.Test plan
LLMStream._parse_eventend-to-end proves the answer is emitted when the closing tag is split (fails before, passes after)pytest tests/test_plugin_anthropic.pyβ 14 passedpytest tests/test_agent_session.pyβ 27 passed (no regressions)ruff check,ruff format --check, andmypycleanFixes #5617