
Fix agent tool call execution diagnostics #353

Merged: ModerRAS merged 1 commit into master from codex/fix-agent-tool-calls on May 13, 2026

ModerRAS (Owner) commented May 13, 2026

Summary by CodeRabbit

  • New Features

    • Added detection for MiniMax-compatible endpoints to improve compatibility with alternative LLM providers.
    • Enhanced proxy tool support in tool-call parsing to handle additional tool types.
  • Tests

    • Added unit tests for proxy tool XML parsing validation.
    • Added unit tests for MiniMax endpoint compatibility detection.
  • Improvements

    • Enhanced logging and telemetry across the system for improved debugging and monitoring.
    • Improved snapshot handling with better null value normalization.

coderabbitai (Bot) commented May 13, 2026

📝 Walkthrough

This PR enhances observability and tool resolution across the LLM service pipeline. It unifies tool-name lookup across built-in, external MCP, and proxy registries; adds MiniMax endpoint detection; and introduces structured logging with rich metadata throughout tool-call parsing, execution, and tracking workflows.
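
As an illustration of the unified tool-name lookup described above, here is a minimal sketch of a case-insensitive resolver that checks several registries in order. It is not the code from `McpToolHelper.cs`: the class name, the `params` registry shape, and the trimming step are assumptions made for the example; only the method name `ResolveRegisteredToolName` comes from the walkthrough.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch: registry shapes and this class are assumptions,
// not the actual McpToolHelper implementation.
public static class ToolNameResolverSketch
{
    // Returns the canonical registered name that matches requestedName
    // case-insensitively, or null when no registry contains it.
    // Registries are searched in the order given (for example built-in,
    // external MCP, proxy), so the first match wins.
    public static string? ResolveRegisteredToolName(
        string? requestedName,
        params IEnumerable<string>[] registries)
    {
        if (string.IsNullOrWhiteSpace(requestedName))
        {
            return null;
        }

        var trimmed = requestedName.Trim();
        foreach (var registry in registries)
        {
            foreach (var registered in registry)
            {
                if (string.Equals(registered, trimmed, StringComparison.OrdinalIgnoreCase))
                {
                    // Return the registered spelling, not the caller's spelling.
                    return registered;
                }
            }
        }

        return null;
    }
}
```

Returning the registered spelling rather than the requested one means later dictionary lookups can stay case-sensitive, which is one plausible reason to centralize resolution in a single helper.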

Changes

Tool Observability and Resolution Pipeline

| Layer | File(s) | Summary |
|---|---|---|
| Unified tool-name resolution and validation | TelegramSearchBot.LLM/Service/AI/LLM/McpToolHelper.cs | Introduces ResolveRegisteredToolName helper for case-insensitive lookup across built-in, external, and proxy registries. Updates TryParseToolCalls to use unified registration checks, refactors ParseToolElement to delegate tool resolution, and enriches logging with available tool counts when resolution fails. |
| Proxy tool XML parsing test and collection wiring | TelegramSearchBot.LLM.Test/Service/AI/LLM/McpToolHelperProxyTests.cs | Adds test collection attribute for isolation and introduces a test case verifying XML parsing of registered proxy tools with multiple required and optional parameters. |
| MiniMax endpoint compatibility detection and tests | TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs, TelegramSearchBot.LLM.Test/Service/AI/LLM/OpenAIToolCallNormalizationTests.cs | Introduces IsMiniMaxCompatibleEndpoint helper to detect MiniMax-compatible endpoints by provider, gateway, or model name. Adds test imports and three test cases covering MiniMax provider, MiniMax gateway with OpenAI provider, and OpenAI fallback scenarios. |
| Native tool-calling cycle instrumentation | TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs | Adds structured logs at setup (with MiniMax flag), cycle iteration (debug), post-cycle completion (finish reason, argument metadata), and tool execution (parsed arguments with context). Logs now capture tool names, call ids, chat/user/message identifiers, and serialized argument previews. |
| XML fallback and resume-from-snapshot tool execution logging | TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs | Logs tool execution before parsing in both XML fallback and resume-from-snapshot paths, capturing tool name, context, and argument payload. Reorders tool-indicator generation in resume path to occur after execution. |
| OpenAIResponsesService tool-call structured logging and ordering | TelegramSearchBot.LLM/Service/AI/LLM/OpenAIResponsesService.cs | Updates main and resume-from-snapshot tool execution logs with detailed structured metadata (tool name, call id, chat/user/message ids, arguments JSON). Moves tool-indicator generation after tool execution in resume path. |
| AgentLoopService content normalization and telemetry | TelegramSearchBot.LLMAgent/Service/AgentLoopService.cs | Normalizes recoveredContent to non-null empty strings throughout execution; adds telemetry logs on task start (recovery metadata, content length) and completion (final sequence, snapshot length). Extracts primary exceptions once for consistent error metadata propagation in retry and failure paths. |
| ChunkPollingService polling and snapshot delivery observability | TelegramSearchBot/Service/AI/LLM/ChunkPollingService.cs | Adds structured logs for task tracking, terminal chunk discovery/processing, snapshot delivery, and final snapshot handling. Introduces payload truncation helper for safe warning previews. Validates task status explicitly, warning on missing or unknown values during completion. |
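
For readers unfamiliar with the MiniMax row above, the following is a rough sketch of what detection "by provider, gateway, or model name" could look like. The parameter names and the substring rule are assumptions for illustration; the actual `IsMiniMaxCompatibleEndpoint` in `OpenAIService.cs` may take different inputs and apply different matching.

```csharp
#nullable enable
using System;

// Hypothetical sketch: parameter names and the "contains minimax" rule are
// assumptions, not the actual OpenAIService implementation.
public static class MiniMaxDetectionSketch
{
    // Treats the endpoint as MiniMax-compatible when any of the three
    // inputs mentions "minimax", ignoring case.
    public static bool IsMiniMaxCompatibleEndpoint(
        string? provider,
        string? gateway,
        string? modelName)
    {
        return MentionsMiniMax(provider)
            || MentionsMiniMax(gateway)
            || MentionsMiniMax(modelName);
    }

    private static bool MentionsMiniMax(string? value)
    {
        return !string.IsNullOrEmpty(value)
            && value.IndexOf("minimax", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

Under these assumptions, the three test scenarios listed in the table map to: a MiniMax provider (true), an OpenAI provider behind a gateway whose address mentions MiniMax (true), and a plain OpenAI configuration (false).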

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • ModerRAS/TelegramSearchBot#268: Both PRs modify McpToolHelper tool registration behavior—main PR refactors registered-tool resolution used by TryParseToolCalls, while PR #268 expands available tools through assembly registration.
  • ModerRAS/TelegramSearchBot#269: Main PR's unified tool-name resolution across built-in, external, and proxy registries directly builds on PR #269's proxy-tool registration and export changes in McpToolHelper.
  • ModerRAS/TelegramSearchBot#337: Both PRs modify OpenAIService.cs native tool-call handling and add to OpenAIToolCallNormalizationTests.cs; main PR adds logging and metadata normalization while PR #337 focuses on tool-call construction and argument fallbacks.

Poem

🐰 Tools now resolve where they belong,
Logging shows the journey strong,
MiniMax gates and native flows,
Every snapshot's secrets glows.
Observability made right,
From parsing through delivery's flight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.93% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the primary focus: improving diagnostic logging and handling for agent tool call execution across multiple services. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai (Bot) left a comment

🧹 Nitpick comments (2)
TelegramSearchBot.LLM/Service/AI/LLM/OpenAIResponsesService.cs (1)

549-552: ⚡ Quick win

Consider aligning tool indicator display behavior with the main execution path.

The resume path now yields each tool indicator immediately after that tool executes (incremental feedback), while the main path batches all tool indicators and yields them together after all tools complete (lines 338-339). This creates a user-visible inconsistency between the two execution paths.

The resume path's incremental feedback approach provides better progress visibility. Consider refactoring the main path to match this behavior for consistency.

♻️ Align main path to yield tool indicators incrementally

In the main execution path (around lines 254-339), move the tool indicator generation and yielding inside the loop to match the resume path behavior:

 foreach (var funcCall in completedFuncCalls) {
     // ... normalization and history updates ...
     
     var argsDict = OpenAIService.DeserializeToolArgumentsForDisplay(argsJson);
-    toolIndicators.Append(McpToolHelper.FormatToolCallDisplay(name, argsDict));
     
     // Execute tool
     _logger.LogInformation(...);
     
     string toolResultString;
     try {
         // ... tool execution ...
     } catch (Exception ex) {
         // ... error handling ...
     }
     
+    var toolIndicator = McpToolHelper.FormatToolCallDisplay(name, argsDict);
+    currentMessageContentBuilder.Append(toolIndicator);
+    yield return currentMessageContentBuilder.ToString();
+    
     // Add function call output to history
     inputItems.Add(ResponseItem.CreateFunctionCallOutputItem(callId, toolResultString));
 }
 
-currentMessageContentBuilder.Append(toolIndicators.ToString());
-yield return currentMessageContentBuilder.ToString();

This would provide consistent incremental feedback in both execution paths.
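
To make the difference between the two behaviors concrete, here is a small self-contained sketch using synchronous iterators. The class, method names, and indicator format are hypothetical, and the real services presumably stream asynchronously and execute tools between steps; the sketch only contrasts where `yield return` sits relative to the loop.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch: names and the "[tool: ...]" format are assumptions,
// not the actual OpenAIResponsesService code.
public static class ToolIndicatorSketch
{
    // Batched: collect every indicator, then yield one snapshot at the end.
    public static IEnumerable<string> Batched(IEnumerable<string> toolNames)
    {
        var content = new StringBuilder();
        var indicators = new StringBuilder();
        foreach (var name in toolNames)
        {
            // The tool would execute here.
            indicators.Append("[tool: ").Append(name).Append("]\n");
        }

        content.Append(indicators.ToString());
        yield return content.ToString();
    }

    // Incremental: yield a growing snapshot right after each tool finishes.
    public static IEnumerable<string> Incremental(IEnumerable<string> toolNames)
    {
        var content = new StringBuilder();
        foreach (var name in toolNames)
        {
            // The tool would execute here.
            content.Append("[tool: ").Append(name).Append("]\n");
            yield return content.ToString();
        }
    }
}
```

Both variants end on the same final snapshot; the incremental one simply lets the consumer see intermediate states, which is the progress visibility the comment refers to.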

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIResponsesService.cs` around lines
549 - 552, The main execution path currently batches tool-indicator output,
causing inconsistency with the resume path which appends and yields each
indicator immediately; update the main tool-processing loop that builds tool
indicators (the code that currently calls McpToolHelper.FormatToolCallDisplay
and appends to newContentBuilder/fullContentBuilder in batch) to instead create
the toolIndicator, append it to newContentBuilder and fullContentBuilder and
yield newContentBuilder.ToString() inside the per-tool loop (the same pattern
used in the resume path) so both paths produce incremental feedback
consistently.

TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs (1)

1061-1066: ⚡ Quick win

Consider truncating the final string instead of taking 80 tool names.

The current code uses .Take(80), which caps the number of tool names at 80 before joining them with commas but does not cap the length of the result. If tool names are long, the log line could still run to thousands of characters. Consider truncating the final joined string to a character limit instead.

♻️ Proposed fix to limit log line length
             _logger.LogInformation(
                 "{ServiceName}: Starting native tool call cycle. Model={Model}, ToolCount={ToolCount}, ToolNames={ToolNames}",
                 ServiceName,
                 modelName,
                 nativeTools.Count,
-                string.Join(",", nativeTools.Select(t => t.FunctionName).Take(80)));
+                SanitizeAndTruncateArguments(string.Join(",", nativeTools.Select(t => t.FunctionName)), 200));

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs` around lines 1061 -
1066, The log builds ToolNames by joining nativeTools.Select(t =>
t.FunctionName).Take(80) which limits count but not character length; change the
code around the _logger.LogInformation call in OpenAIService to first build the
joinedNames string (e.g., string.Join(",", nativeTools.Select(t =>
t.FunctionName))) and then truncate it to a safe max character length (e.g.,
200–500 chars) with an ellipsis when longer before passing it as the ToolNames
argument; ensure you still use ServiceName, modelName and nativeTools.Count as
before and only replace the ToolNames expression with the truncated string.
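
A minimal sketch of the join-then-truncate approach described in this prompt follows. The helper name `JoinAndTruncate` and the three-dot ellipsis are assumptions for illustration; the proposed fix above instead calls `SanitizeAndTruncateArguments`, whose exact behavior is not shown in this review.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: the name and ellipsis style are assumptions,
// not an existing method in OpenAIService.
public static class LogTruncationSketch
{
    // Joins the names with commas, then caps the result at maxChars
    // characters, appending "..." when anything was cut off. A truncated
    // result is therefore up to maxChars + 3 characters long.
    public static string JoinAndTruncate(IEnumerable<string> names, int maxChars)
    {
        if (names == null)
        {
            throw new ArgumentNullException(nameof(names));
        }

        if (maxChars <= 0)
        {
            return string.Empty;
        }

        var joined = string.Join(",", names);
        if (joined.Length <= maxChars)
        {
            return joined;
        }

        return joined.Substring(0, maxChars) + "...";
    }
}
```

With such a helper, the `ToolNames` argument of the log call could become something like `JoinAndTruncate(nativeTools.Select(t => t.FunctionName), 200)`, which bounds the log line regardless of how long individual tool names are.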

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0822b8bc-2884-4434-9edf-2dccd09f517f

📥 Commits

Reviewing files that changed from the base of the PR and between 27617c0 and b204041.

📒 Files selected for processing (7)
  • TelegramSearchBot.LLM.Test/Service/AI/LLM/McpToolHelperProxyTests.cs
  • TelegramSearchBot.LLM.Test/Service/AI/LLM/OpenAIToolCallNormalizationTests.cs
  • TelegramSearchBot.LLM/Service/AI/LLM/McpToolHelper.cs
  • TelegramSearchBot.LLM/Service/AI/LLM/OpenAIResponsesService.cs
  • TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs
  • TelegramSearchBot.LLMAgent/Service/AgentLoopService.cs
  • TelegramSearchBot/Service/AI/LLM/ChunkPollingService.cs

github-actions (Contributor)

PR Check Report

Summary

Test Results

| Platform | Status | Details |
|---|---|---|
| Ubuntu | Passed | Tests passed, artifacts uploaded |
| Windows | Passed | Tests passed, artifacts uploaded |

Code Quality

  • Code formatting check
  • Security vulnerability scan
  • Dependency analysis
  • Code coverage collection

Test Artifacts

  • Test results artifacts count: 2
  • Code coverage uploaded to Codecov

This report is auto-generated by GitHub Actions

@ModerRAS ModerRAS merged commit c347dd1 into master May 13, 2026
6 checks passed