Contributor

Copilot AI commented Dec 18, 2025

Description

Changes GetResponseAsync in PersistentAgentsChatClient to use the non-streaming API (CreateRunAsync with stream: false) instead of delegating to GetStreamingResponseAsync. This provides a proper non-streaming implementation that follows the SDK's standard polling pattern.

Changes

  • New GetResponseAsync implementation: Uses CreateRunAsync(stream: false) with polling via GetRunAsync (500ms interval, matching SDK samples)
  • BuildChatResponseFromRunAsync helper: Handles run completion states:
    • Extracts usage tokens from run
    • Returns ErrorContent or throws on RunStatus.Failed
    • Returns FunctionCallContent/McpServerToolApprovalRequestContent on RunStatus.RequiresAction
    • Fetches run steps via GetRunStepsAsync to extract CodeInterpreter and MCP results
    • Fetches messages via GetMessagesAsync on completion
  • CodeInterpreter result handling: Processes RunStepCodeInterpreterToolCall to produce:
    • CodeInterpreterToolCallContent for Python code input
    • CodeInterpreterToolResultContent for outputs (logs and images)
  • MCP result handling: Processes RunStepMcpToolCall to produce McpServerToolResultContent with tool call output
  • ConvertThreadMessageToChatMessage helper: Converts PersistentThreadMessage to ChatMessage, with annotation handling for file citations and paths
  • Refactored shared initialization logic between GetResponseAsync and GetStreamingResponseAsync:
    • PreparedRunContext class: Holds prepared run state (RequestContext, RunOptions, ThreadId, ActiveRun, ToolOutputs, ToolApprovals)
    • PrepareRunContextAsync method: Creates RequestContext, calls CreateRunOptionsAsync, gets thread ID, finds active runs
    • EnsureThreadAsync method: Creates new thread if needed or cancels active run
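
As a rough illustration of the polling pattern listed above, here is a minimal, self-contained C# sketch. It does not use the real SDK: SketchRunStatus, SketchRun, RunPolling, and the getRunAsync delegate are hypothetical stand-ins for the run status, the run object, and the GetRunAsync call. The only detail taken from this PR is the idea of re-fetching the run on a fixed interval (500ms in the description) until it leaves the queued or in-progress states.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for the SDK's run status and run types (not the real API).
public enum SketchRunStatus { Queued, InProgress, RequiresAction, Completed, Failed }

public sealed record SketchRun(string Id, SketchRunStatus Status);

public static class RunPolling
{
    // Re-fetches the run until it is no longer Queued or InProgress,
    // waiting pollInterval between checks.
    public static async Task<SketchRun> WaitForRunAsync(
        SketchRun run,
        Func<string, CancellationToken, Task<SketchRun>> getRunAsync, // stand-in for GetRunAsync
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        while (run.Status is SketchRunStatus.Queued or SketchRunStatus.InProgress)
        {
            // Task.Delay observes the token, so cancellation ends the loop promptly.
            await Task.Delay(pollInterval, cancellationToken).ConfigureAwait(false);
            run = await getRunAsync(run.Id, cancellationToken).ConfigureAwait(false);
        }

        // The caller then branches on the terminal or action-required status.
        return run;
    }
}
```

A caller would pass TimeSpan.FromMilliseconds(500) to match the interval named above, then inspect the returned status (Completed, Failed, or RequiresAction) to decide how to build the response.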

Note

Recorded test sessions will need updates since requests now contain stream: false instead of stream: true.


This checklist is used to make sure that common guidelines for a pull request are followed.

General Guidelines

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

SDK Generation Guidelines

  • If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.
  • The generate.cmd file for the SDK has been updated with the version of AutoRest, as well as the commitid of your swagger spec or link to the swagger spec, used to generate the code.
  • The *.csproj and AssemblyInfo.cs files have been updated with the new version of the SDK.
Original prompt

Currently, https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/ai/Azure.AI.Agents.Persistent/src/Custom/PersistentAgentsChatClient.cs implements GetResponseAsync via the PersistentChatClient streaming API. I need to change this internal implementation to actually use the non-streaming API from the innerClient.

This might require some further changes in the internals of the client. Please consider and use the samples as inspiration for how to consume the non-streaming API and do the same abstract conversion necessary for the Microsoft.Extensions.AI types.



This change modifies the GetResponseAsync method to use the non-streaming
API from the innerClient instead of delegating to GetStreamingResponseAsync.

Key changes:
- GetResponseAsync now uses CreateRunAsync with stream: false
- Polling loop to wait for run completion using GetRunAsync
- Fetches messages using GetMessagesAsync after run completes
- Handles RequiresAction status for function calls
- Converts PersistentThreadMessage to ChatMessage format

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor PersistentAgentsChatClient to use non-streaming API" to "Implement non-streaming API for PersistentAgentsChatClient.GetResponseAsync" Dec 18, 2025
Copilot AI requested a review from rogerbarreto December 18, 2025 12:10
Copilot AI and others added 2 commits December 18, 2025 13:13
…ResponseAsync

This change adds handling for CodeInterpreter and MCP tool call results in the
non-streaming GetResponseAsync implementation:

Key changes:
- Fetch run steps after run completion using GetRunStepsAsync
- For RunStepCodeInterpreterToolCall:
  - Extract Input and convert to CodeInterpreterToolCallContent
  - Extract Outputs (logs/images) and convert to CodeInterpreterToolResultContent
- For RunStepMcpToolCall:
  - Extract Output and convert to McpServerToolResultContent

This matches the behavior of the streaming implementation that processes
RunStepDetailsUpdate for CodeInterpreter content.

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
Contributor Author

Copilot AI commented Dec 18, 2025

@copilot I don't see an implementation that handles MCP results or CodeInterpreter results in the content, similar to how it is done in the streaming implementation.

Please have a look at the samples below:...

Addressed in commit 0324512. Added handling for:

  • Code Interpreter: Fetches run steps via GetRunStepsAsync and processes RunStepCodeInterpreterToolCall to produce:

    • CodeInterpreterToolCallContent (for the Python code input)
    • CodeInterpreterToolResultContent (for outputs - both image files and log outputs)
  • MCP: Processes RunStepMcpToolCall to produce McpServerToolResultContent with the tool call output

This mirrors the behavior of the streaming implementation that handles RunStepDetailsUpdate.CodeInterpreterInput and RunStepDetailsUpdate.CodeInterpreterOutputs.
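
To make the mapping above concrete, here is a minimal C# sketch of turning run-step tool calls into content items. Every type in it is a simplified, hypothetical stand-in: SketchCodeInterpreterCall and SketchMcpCall play the role of RunStepCodeInterpreterToolCall and RunStepMcpToolCall, and the Sketch*Content records play the role of the Microsoft.Extensions.AI content types. Image outputs and the real property names are deliberately left out, since they are not spelled out in this thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for run-step tool calls (not the real SDK types).
public abstract record SketchToolCall(string Id);
public sealed record SketchCodeInterpreterCall(string Id, string Input, IReadOnlyList<string> LogOutputs)
    : SketchToolCall(Id);
public sealed record SketchMcpCall(string Id, string Output) : SketchToolCall(Id);

// Hypothetical stand-ins for the resulting content items.
public abstract record SketchContent(string CallId);
public sealed record SketchCodeCallContent(string CallId, string Code) : SketchContent(CallId);
public sealed record SketchCodeResultContent(string CallId, IReadOnlyList<string> Logs) : SketchContent(CallId);
public sealed record SketchMcpResultContent(string CallId, string Output) : SketchContent(CallId);

public static class RunStepConversion
{
    // Maps each tool call to one or more content items, preserving step order.
    public static List<SketchContent> ToContents(IEnumerable<SketchToolCall> toolCalls)
    {
        var contents = new List<SketchContent>();
        foreach (var call in toolCalls)
        {
            switch (call)
            {
                case SketchCodeInterpreterCall code:
                    // The code input becomes a "call" item; outputs become a "result" item.
                    contents.Add(new SketchCodeCallContent(code.Id, code.Input));
                    if (code.LogOutputs.Count > 0)
                    {
                        contents.Add(new SketchCodeResultContent(code.Id, code.LogOutputs));
                    }
                    break;

                case SketchMcpCall mcp:
                    // An MCP tool call contributes only its output as a result item.
                    contents.Add(new SketchMcpResultContent(mcp.Id, mcp.Output));
                    break;
            }
        }
        return contents;
    }
}
```

Keeping the call and result items tied to the same call ID is one way for a consumer to pair the Python input with its outputs; whether the actual implementation does this is an assumption of the sketch.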

@rogerbarreto
Contributor

@copilot Both the GetResponse and GetResponseStreaming methods now share the same initial logic for thread.GetRuns and some other lines. Can we improve the reusability of this implementation as much as possible to avoid duplicated code?

…on initialization logic

Extract shared run initialization code into:
- PreparedRunContext class: Holds prepared run state
- PrepareRunContextAsync method: Gets thread ID, active run, and tool results
- EnsureThreadAsync method: Creates thread if needed or cancels active run

This reduces code duplication between streaming and non-streaming implementations
while maintaining the same behavior.

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
Contributor Author

Copilot AI commented Dec 18, 2025

@copilot Both the GetResponse and GetResponseStreaming methods now share the same initial logic for thread.GetRuns and some other lines. Can we improve the reusability of this implementation as much as poss...

Addressed in commit 968be26. Extracted the shared initialization logic into:

  • PreparedRunContext class: Holds the prepared run state (RequestContext, RunOptions, ThreadId, ActiveRun, ToolOutputs, ToolApprovals)
  • PrepareRunContextAsync method: Handles creating RequestContext, calling CreateRunOptionsAsync, getting thread ID, and finding active runs
  • EnsureThreadAsync method: Creates a new thread if needed or cancels an active run before creating a new one

Both GetResponseAsync and GetStreamingResponseAsync now call PrepareRunContextAsync() to get the prepared context. They then check prepared.HasToolResultsForActiveRun to decide whether to submit tool results to the active run or to call prepared.EnsureThreadAsync() before creating a new run.
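
The branching described here can be sketched in C# as follows. This is an illustration under stated assumptions, not the code from the commit: SketchPreparedRunContext, SketchNextStep, and RunDispatch are hypothetical names, the context holds only a subset of the listed fields, and the rule used for HasToolResultsForActiveRun (an active run exists and at least one tool output is pending) is a guess at its semantics.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical outcome of the shared preparation step.
public enum SketchNextStep { SubmitToolResults, EnsureThreadThenCreateRun }

// Simplified stand-in for the PreparedRunContext class described above.
public sealed class SketchPreparedRunContext
{
    public string? ThreadId { get; init; }
    public string? ActiveRunId { get; init; }
    public IReadOnlyList<string> ToolOutputs { get; init; } = Array.Empty<string>();

    // Assumed semantics: true only when a run is active and outputs exist for it.
    public bool HasToolResultsForActiveRun => ActiveRunId is not null && ToolOutputs.Count > 0;
}

public static class RunDispatch
{
    // Both the streaming and non-streaming paths would make this same decision
    // before diverging in how they submit or create the run.
    public static SketchNextStep Decide(SketchPreparedRunContext prepared) =>
        prepared.HasToolResultsForActiveRun
            ? SketchNextStep.SubmitToolResults
            : SketchNextStep.EnsureThreadThenCreateRun;
}
```

Centralizing the decision in the prepared context means the two response methods differ only in the final call they make, which is the duplication the review comment asked to remove.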
