
Thinking Parts and Text Parts get merged Into Single Part during Streaming Mode #770

@a9a4k

Description

Describe the bug
When using StreamingMode.SSE with an agent that produces thinking output, thinking parts and regular text parts get merged into a single text part in the final saved event. This happens because the streaming text accumulation logic in google_llm.py combines all text chunks into a single part without preserving their original classification as thinking or regular text.

This issue makes it impossible to properly display thinking parts separately from the model's final output when retrieving events from the session.
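For context, here is a minimal sketch of the accumulation behavior I would expect, using a simplified stand-in Part dataclass (not the actual google_llm.py types): text is only merged into the previous part when the thought flag matches, so thinking and answer content end up in separate parts.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """Simplified stand-in for a streamed content part."""
    text: str
    thought: bool = False


def accumulate_parts(stream_parts):
    """Merge consecutive streamed text parts, starting a new part whenever
    the thought flag flips, instead of collapsing everything into one."""
    merged = []
    for part in stream_parts:
        if merged and merged[-1].thought == part.thought:
            # Same classification as the previous part: extend it.
            merged[-1].text += part.text
        else:
            # Classification changed (or first chunk): start a new part.
            merged.append(Part(part.text, part.thought))
    return merged
```

With this scheme, a stream of two thought chunks followed by two answer chunks would collapse to exactly two parts, one with thought=True and one without.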

To Reproduce
Steps to reproduce the behavior:

Run the agent in streaming mode (imports and an async entry point added so the snippet is self-contained):

    import asyncio

    from google.adk.agents import Agent
    from google.adk.agents.run_config import RunConfig, StreamingMode
    from google.adk.planners import BuiltInPlanner
    from google.adk.runners import InMemoryRunner
    from google.adk.tools import google_search
    from google.genai import types

    agent = Agent(
        name="thinking_search_agent",
        model="gemini-2.5-pro-preview-03-25",
        description="Agent to answer questions using Google Search.",
        instruction="You are an expert researcher. You always stick to the facts.",
        planner=BuiltInPlanner(
            thinking_config=types.ThinkingConfig(
                include_thoughts=True,
            ),
        ),
        tools=[google_search],
    )

    runner = InMemoryRunner(agent, app_name="test-app")

    # Note: on newer ADK versions the session service methods are async
    # and must be awaited.
    session = runner.session_service.create_session(
        app_name="test-app",
        user_id="test-user",
    )

    content = types.Content(
        role="user",
        parts=[
            types.Part.from_text(
                text="""Conduct a comprehensive analysis of remote-work productivity trends since 2020. 1. Search for and compile peer-reviewed studies and industry reports (2020–2025) on key productivity metrics (e.g., output per hour, project completion rates). 2. Identify and summarize at least three major factors (technology tools, management practices, employee well-being) driving changes in those metrics. 3. Compare findings across at least two different sectors (e.g., software vs. finance). 4. Assess any counter-evidence or limitations in the data. 5. Based on your sequential analysis with citations, conclude whether fully remote, hybrid, or in-office models currently yield the highest productivity—and why."""
            )
        ],
    )

    async def main():
        async for event in runner.run_async(
            user_id="test-user",
            session_id=session.id,
            new_message=content,
            run_config=RunConfig(streaming_mode=StreamingMode.SSE),
        ):
            print(f"event: {event}")

        print(
            f"Session events: {runner.session_service.get_session(app_name='test-app', user_id='test-user', session_id=session.id).events}"
        )

    asyncio.run(main())

Examine the final event saved in the session: it contains a single part with both the thinking content and the regular text combined.

Expected behavior
The final event should contain two separate parts:

  • A thinking part with thought=True containing all planning/reasoning content
  • A regular text part with the final answer or response

Each streaming chunk should maintain its proper classification as "thinking" or "regular text," and the final accumulated response should preserve this distinction with separate parts.
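Sketched as plain dictionaries (illustrative only; the real thought flag lives on the google.genai part types), the final event's parts would look like:

```python
# Hypothetical expected shape of the final event's content parts:
# one thought-flagged part, one plain text part, never a single merged blob.
expected_parts = [
    {"text": "<all planning/reasoning content>", "thought": True},
    {"text": "<final answer or response>", "thought": False},
]
```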

Screenshots
N/A

Desktop (please complete the following information):

  • OS: macOS
  • Python version (python -V): Python 3.12.7
  • ADK version (pip show google-adk): 0.2.0

Additional context
The issue may not be limited to streaming mode, but I have not tested normal mode or bidirectional (Bidi) mode, so I cannot say for certain.

The root cause seems to be in the streaming accumulation logic in google_llm.py.

Labels

core [Component] This issue is related to the core interface and implementation