[Feature] Improve plugin tools execution and multimodal support #758

Merged
dingyi222666 merged 4 commits into v1-dev from fix/plugin-tools on Mar 5, 2026

@dingyi222666
Member

This PR improves tool execution with case-insensitive matching, native shell spawning, and enhanced multimodal content support for tool call events.

New Features

  • Case-insensitive tool name matching in agent executor for consistent tool resolution
  • Multimodal content support for tool call events with AgentAction['content'] parameter
  • Windows shell resolution with Git Bash and PowerShell detection
  • Native child process spawning with proper timeout handling and signal detection
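
A minimal sketch of the case-insensitive lookup described above, assuming tools are registered in a map keyed by lowercased name. The `NamedTool` interface and the `buildToolMap` and `findTool` helpers are hypothetical names for illustration, not the executor's actual code.

```typescript
// Hypothetical minimal tool shape; the real project uses its own tool classes.
interface NamedTool {
    name: string
}

// Build a lookup table keyed by the lowercased tool name.
function buildToolMap<T extends NamedTool>(tools: T[]): Map<string, T> {
    const map = new Map<string, T>()
    for (const tool of tools) {
        map.set(tool.name.toLowerCase(), tool)
    }
    return map
}

// Resolve a tool regardless of the casing the model emitted.
function findTool<T extends NamedTool>(
    map: Map<string, T>,
    requested: string
): T | undefined {
    return map.get(requested.toLowerCase())
}

const toolMap = buildToolMap([{ name: 'Bash' }, { name: 'read_file' }])
console.log(findTool(toolMap, 'bash')?.name) // prints "Bash"
console.log(findTool(toolMap, 'READ_FILE')?.name) // prints "read_file"
```

One consequence of this approach is that two tools whose names differ only by case would collide in the map, with the later registration winning.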

Bug fixes

  • Fixed config patching for callback propagation in tool invocation
  • Improved error handling for tool input parsing with dedicated function
  • Fixed tool runtime error handling with dedicated error handler support

Other Changes

  • Replaced shelljs dependency with native child_process.spawn() for better control
  • Added UTF-8 encoding handling for cross-platform command execution
  • Refactored tool call handler to support multimodal message content
  • Inlined message content validation for better code clarity
  • Improved signal-based termination detection alongside exit codes
  • Removed unnecessary validation constraints on offset/limit/timeout parameters
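
The move from shelljs to `child_process.spawn()` with timeout, UTF-8, and signal handling can be sketched roughly as follows. `runCommand` and `CommandResult` are hypothetical names, and this is an illustration under those assumptions rather than the code in `fs.ts`.

```typescript
import { spawn } from 'node:child_process'

interface CommandResult {
    stdout: string
    stderr: string
    exitCode: number | null
    signal: NodeJS.Signals | null
    timedOut: boolean
}

// Hypothetical helper: run a program with a timeout and report how it ended.
function runCommand(
    file: string,
    args: string[],
    timeoutMs: number
): Promise<CommandResult> {
    return new Promise((resolve, reject) => {
        const child = spawn(file, args, { stdio: ['ignore', 'pipe', 'pipe'] })
        let stdout = ''
        let stderr = ''
        let timedOut = false

        // setEncoding decodes UTF-8 correctly across chunk boundaries.
        child.stdout.setEncoding('utf8')
        child.stderr.setEncoding('utf8')
        child.stdout.on('data', (chunk: string) => {
            stdout += chunk
        })
        child.stderr.on('data', (chunk: string) => {
            stderr += chunk
        })

        // Ask the child to terminate once the timeout elapses.
        const timer = setTimeout(() => {
            timedOut = true
            child.kill('SIGTERM')
        }, timeoutMs)

        // Emitted when the process could not be spawned at all.
        child.on('error', (err) => {
            clearTimeout(timer)
            reject(err)
        })

        // exitCode is null when the process was terminated by a signal.
        child.on('close', (exitCode, signal) => {
            clearTimeout(timer)
            resolve({ stdout, stderr, exitCode, signal, timedOut })
        })
    })
}
```

Reporting `signal` next to `exitCode` lets the caller tell a command that failed on its own from one that was killed, for example by the timeout.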

…ive matching and native shell spawning

- Normalize tool names to lowercase in agent executor for consistent matching
- Refactor error handling for tool input parsing into dedicated function
- Add runtime error handler support for tool execution failures
- Fix config patching for callback propagation in tool invocation
- Replace shelljs with native child_process spawn for better control
- Add Windows shell resolution (Git Bash, PowerShell) detection
- Improve command execution with proper UTF-8 encoding handling
- Add signal-based termination detection alongside exit codes
- Update timeout handling with cleaner child process lifecycle management
- Remove unnecessary validation constraints on offset/limit/timeout parameters
- Extend llm-call-tool event signature with AgentAction content parameter
- Update plugin_chat_chain to pass content from agent actions to events
- Refactor tool call handler to support multimodal message content
- Add hasMessageContent() helper to validate message content
- Extract sendRenderedMessage() for consistent message rendering
- Improve tool call event handling with content-aware message delivery
- Remove hasMessageContent() helper function
- Inline content validation logic directly into createToolCallHandler
- Simplify single-use helper abstraction for better code clarity
@coderabbitai
Contributor

coderabbitai Bot commented Mar 5, 2026

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between ae87c93 and 6fb2e1b.

📒 Files selected for processing (1)
  • packages/core/src/llm-core/agent/executor.ts

Walkthrough

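This change normalizes tool names to lowercase for map lookups; centralizes handling of tool input parsing errors and wraps them as observations; propagates AgentAction.content through the event chain and middleware; and gives BashTool cross-platform command execution (including Windows PowerShell / Git Bash) along with looser schema validation.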

Changes

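| Cohort / File(s) | Summary |
| --- | --- |
| Agent executor core: packages/core/src/llm-core/agent/executor.ts | Unifies the tool map and lookups on lowercase keys; adds toToolInputErrorObservation() to standardize observations for tool input parsing errors; replaces several scattered error branches and uses the patched callback config and unified error handling (including ToolInputParsingException and handleToolRuntimeErrors) in the invocation path. |
| Events and types: packages/core/src/llm-core/chain/plugin_chat_chain.ts, packages/core/src/services/types.ts | Adds an AgentAction type annotation to handleAgentAction; extends the llm-call-tool event signature in ChatEvents to (tool, args, content, log) so that AgentAction.content is propagated; the emitted event includes action.content. |
| Request middleware and rendering: packages/core/src/middlewares/model/request_model.ts | Adds a handling path for AgentAction['content']: if content is present, it is rendered and sent; refactors the log/tool call sending flow and adds sendRenderedMessage so that messages are rendered and censored in one place before sending. |
| File system and Bash tools: packages/extension-tools/src/plugins/fs.ts | Replaces shelljs with child_process.spawn; adds Windows shell resolution (_resolveShellCommand, _buildWindowsPowerShellCommand, _resolveWindowsGitBashPath), Git Bash path lookup, streaming stdout/stderr decoding, and signal and timeout handling; relaxes numeric schema validation in ReadFileTool/BashTool (drops .int()/.positive()). |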

Sequence Diagram

sequenceDiagram
    participant Agent as AgentExecutor
    participant Chain as PluginChatChain
    participant Middleware as RequestMiddleware
    participant Events as ChatEvents
    Agent->>Agent: Look up tool (name.toLowerCase())
    activate Agent
    Agent->>Agent: Invoke tool or catch parsing error
    Agent-->>Events: Produce/return AgentAction (with content)
    deactivate Agent

    Chain->>Events: emit llm-call-tool(tool,args,content,log)
    activate Chain
    deactivate Chain

    Events->>Middleware: Receive event and handle sending
    activate Middleware
    alt content exists
        Middleware->>Middleware: render(content) -> sendRenderedMessage
    else showThoughtMessage false
        Middleware->>Middleware: Do not send
    else log is not a tool call
        Middleware->>Middleware: render(log) -> sendRenderedMessage
    else
        Middleware->>Middleware: render(formatted tool call) -> sendRenderedMessage
    end
    end
    deactivate Middleware
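
The branch order in the diagram's middleware step can be sketched as a pure function. The `MessageContent` type, the `ToolCallEvent` shape, the `logIsToolCall` predicate, and the fallback message format are all assumptions made for illustration; the real handler in `request_model.ts` renders and censors the message before sending it.

```typescript
// Hypothetical, simplified stand-in for AgentAction['content'].
type MessageContent = string | { type: string; [key: string]: unknown }[]

interface ToolCallEvent {
    tool: string
    args: unknown
    content?: MessageContent
    log: string
}

// Decide what, if anything, should be rendered and sent for a tool call event.
function pickToolCallMessage(
    event: ToolCallEvent,
    showThoughtMessage: boolean,
    logIsToolCall: (log: string) => boolean
): MessageContent | undefined {
    const { content } = event
    const hasContent =
        typeof content === 'string'
            ? content.trim().length > 0
            : (content?.length ?? 0) > 0

    // 1. Content attached to the action is considered first.
    if (hasContent) return content
    // 2. Without content, nothing is sent unless thought messages are enabled.
    if (!showThoughtMessage) return undefined
    // 3. A log that is not a tool call is sent as-is.
    if (!logIsToolCall(event.log)) return event.log
    // 4. Otherwise fall back to a formatted description of the call.
    return `Calling ${event.tool} with ${JSON.stringify(event.args)}`
}
```

Checking `content` first means that multimodal output attached to an action is delivered even when thought messages are turned off, which matches the order of the `alt` branches above.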

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

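🐰 I'm a bouncing little rabbit,

Names are all lowercase now, and I run more steadily;
Errors have one unified eye watching over them,
Content travels all the way to every link in the chain,
And Windows can run commands gracefully too!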

🚥 Pre-merge checks | ✅ 2 | ❌ 1

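❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately summarizes the main improvements: tool execution improvements (including case-insensitive matching and native shell spawning) and multimodal support, consistent with the core changes in the changeset. |
| Description check | ✅ Passed | The PR description details the scope of the changes, covering new features (case-insensitive matching, multimodal content, Windows shell detection, child process spawning), bug fixes, and other changes, and is highly relevant to the changeset. |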

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the agent's tool execution capabilities and multimodal content handling. It refines how tools are identified and invoked, introduces more robust error management, and modernizes the underlying command execution mechanism for enhanced cross-platform compatibility and reliability. These changes aim to make tool usage more flexible, resilient, and capable of processing diverse content types.

Highlights

  • Case-Insensitive Tool Matching: Implemented case-insensitive matching for tool names within the agent executor, ensuring more consistent tool resolution regardless of casing.
  • Multimodal Content Support: Enhanced tool call events and their handlers to support multimodal content, allowing richer interactions and information exchange during tool execution.
  • Native Shell Execution: Replaced the shelljs dependency with native node:child_process.spawn() for command execution, providing better control, timeout handling, and signal detection.
  • Windows Shell Resolution: Added robust resolution for Windows shells, including detection of Git Bash and PowerShell, along with proper UTF-8 encoding for cross-platform compatibility.
  • Improved Error Handling: Refactored tool input parsing error handling into a dedicated function and introduced support for handling tool runtime errors more gracefully.
  • Relaxed Validation Constraints: Removed strict int and positive validation constraints for offset, limit, and timeout parameters in tool schemas, offering more flexibility.
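
As a rough illustration of the Windows shell resolution highlight, the sketch below picks a shell and argument list for a command string. The `resolveShell` function, the preference for Git Bash over PowerShell, and the exact PowerShell flags are assumptions for this example; the PR's own helpers may resolve shells differently.

```typescript
interface ShellInvocation {
    file: string
    args: string[]
}

// Hypothetical resolver: choose a shell and arguments for a command string.
// gitBashPath is assumed to be the already-detected path to bash.exe, if any.
function resolveShell(
    command: string,
    platform: string,
    gitBashPath?: string
): ShellInvocation {
    // Non-Windows platforms: run the command through the POSIX shell.
    if (platform !== 'win32') {
        return { file: '/bin/sh', args: ['-c', command] }
    }
    // Assumed preference: use Git Bash when it was found.
    if (gitBashPath) {
        return { file: gitBashPath, args: ['-c', command] }
    }
    // Fall back to PowerShell and ask it to emit UTF-8 output.
    const prefix = '[Console]::OutputEncoding = [System.Text.Encoding]::UTF8; '
    return {
        file: 'powershell.exe',
        args: ['-NoProfile', '-NonInteractive', '-Command', prefix + command]
    }
}

const invocation = resolveShell('echo hello', process.platform)
console.log(invocation.file, invocation.args)
```

The returned `file` and `args` can be passed directly to `child_process.spawn()` without the `shell` option, so the command string reaches the chosen shell as a single argument.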

Changelog
  • packages/core/src/llm-core/agent/executor.ts
    • Enabled case-insensitive matching for tool names.
    • Refactored tool input parsing error handling into a dedicated utility function.
    • Fixed callback propagation during tool invocation.
    • Added support for handling tool runtime errors.
  • packages/core/src/llm-core/chain/plugin_chat_chain.ts
    • Updated the llm-call-tool event to pass multimodal content from agent actions.
  • packages/core/src/middlewares/model/request_model.ts
    • Refactored the tool call handler to process and display multimodal content.
    • Introduced a new sendRenderedMessage function for consistent message rendering.
  • packages/core/src/services/types.ts
    • Modified the ChatEvents['llm-call-tool'] interface to include a content parameter for multimodal support.
  • packages/extension-tools/package.json
    • Replaced the shelljs dependency with the which package for shell path resolution.
  • packages/extension-tools/src/plugins/fs.ts
    • Migrated command execution from shelljs to node:child_process.spawn.
    • Implemented logic to resolve native shells on Windows, including Git Bash and PowerShell, with proper UTF-8 encoding.
    • Removed int and positive validation constraints for offset, limit, and timeout parameters in the command schema.
    • Enhanced command execution to detect and report termination signals.
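
With the int and positive constraints removed from the schema, values such as `1.5` or `-3` can now reach the tool. One way a tool could defend against that at runtime is sketched below; `toPositiveInt` is a hypothetical helper and is not part of this PR.

```typescript
// Hypothetical runtime guard: coerce a loosely validated number into a
// positive integer, falling back to a default when that is not possible.
function toPositiveInt(value: unknown, fallback: number): number {
    if (typeof value !== 'number' || !Number.isFinite(value)) return fallback
    const truncated = Math.trunc(value)
    return truncated > 0 ? truncated : fallback
}

console.log(toPositiveInt(1.9, 10)) // prints 1
console.log(toPositiveInt(-3, 10)) // prints 10
```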
Activity
  • No human activity has been recorded for this pull request.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces significant improvements to tool execution, including case-insensitive tool matching, native shell command execution via child_process.spawn (replacing shelljs), and enhanced multimodal content support. It also improves robustness and cross-platform compatibility through refactored error handling and a Windows shell resolver for Git Bash and PowerShell. However, it introduces significant security vulnerabilities related to path traversal and scope bypass in the FileStore class and the BashTool, as paths are not properly normalized before scope checks, potentially allowing unauthorized file access. Additionally, the removal of some input validation in the tool schemas could lead to unexpected behavior. Addressing these path normalization issues and re-instating the necessary input validations are critical for both security and robustness.
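
The path normalization concern raised in this review can be illustrated with a scope check that resolves the target before comparing it to the allowed root. `isInsideScope` is a hypothetical helper, not the `FileStore` or `BashTool` code, and it does not resolve symlinks, which would need an extra `fs.realpath` step.

```typescript
import * as path from 'node:path'

// Hypothetical scope check: true only when target resolves inside root.
function isInsideScope(root: string, target: string): boolean {
    const resolvedRoot = path.resolve(root)
    // Resolving against the root collapses any "." and ".." segments.
    const resolvedTarget = path.resolve(resolvedRoot, target)
    const relative = path.relative(resolvedRoot, resolvedTarget)

    if (relative === '') return true // the root itself
    if (relative === '..' || relative.startsWith('..' + path.sep)) return false
    // On Windows, a target on another drive yields an absolute "relative" path.
    return !path.isAbsolute(relative)
}

console.log(isInsideScope('/data', 'notes/a.txt')) // prints true
console.log(isInsideScope('/data', '../etc/passwd')) // prints false
```

Comparing the relative path rather than using a plain string prefix check avoids treating a sibling such as `/data-backup` as being inside `/data`.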

Comment thread packages/extension-tools/src/plugins/fs.ts
Comment thread packages/extension-tools/src/plugins/fs.ts
Comment thread packages/extension-tools/src/plugins/fs.ts
- Use patchConfig to pass runManager callbacks in ExceptionTool.invoke() call
- Ensure consistent callback propagation across all tool invocations
- Fix callback chain for error handling in agent executor
@dingyi222666 dingyi222666 merged commit 268f9ea into v1-dev Mar 5, 2026
2 checks passed
@dingyi222666 dingyi222666 deleted the fix/plugin-tools branch March 5, 2026 22:06

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ae87c93e90

Comment on lines +392 to +393
if (handleParsingErrors === true || handleParsingErrors === false) {
return 'Invalid or incomplete tool input. Please try again.'

P1: Re-throw tool input parse errors when handling is disabled

This helper now treats handleParsingErrors === false the same as true, so ToolInputParsingException is converted into a generic observation instead of being thrown. Because handleParsingErrors defaults to false, the default runtime behavior changed from fail-fast to silently continuing in both _call and _takeNextStep, which can hide malformed tool inputs and trap the agent in retry loops rather than surfacing the actual error.
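
The fail-fast behavior this comment asks for could look roughly like the sketch below. `ToolInputParsingException` here is a local stand-in class, `toObservation` is a hypothetical name, and the `ParsingErrorHandler` union is an assumption about the option's shape rather than the executor's actual types.

```typescript
// Hypothetical stand-in for the exception named in the review.
class ToolInputParsingException extends Error {}

// Assumed shape of the handleParsingErrors option.
type ParsingErrorHandler =
    | boolean
    | string
    | ((error: ToolInputParsingException) => string)

// Convert a parsing error into an observation, or re-throw when handling is off.
function toObservation(
    error: ToolInputParsingException,
    handleParsingErrors: ParsingErrorHandler
): string {
    // Disabled: surface the original error instead of hiding it.
    if (handleParsingErrors === false) throw error
    // Enabled without a custom message: use the generic observation.
    if (handleParsingErrors === true) {
        return 'Invalid or incomplete tool input. Please try again.'
    }
    // A string is used verbatim as the observation.
    if (typeof handleParsingErrors === 'string') return handleParsingErrors
    // A function builds the observation from the error.
    return handleParsingErrors(error)
}
```

Under these assumptions, leaving the option at `false` keeps malformed tool input visible to the caller, while `true`, a string, or a function opts in to letting the agent retry.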
