
fix: improve agent prompts, extract path autofix, and Gemini v3 upgrade #492

Merged: buger merged 5 commits into main from fix/improve-search-tool-prompts on Mar 6, 2026


buger (Collaborator) commented on Mar 6, 2026

Summary

  • Anti-bash guidance: Add explicit instructions to agent system prompt preventing use of bash (grep, cat, find) for code exploration — use search/extract tools instead
  • Extract path auto-fix: When search returns relative paths (e.g., gateway/file.go) and the model constructs a wrong absolute path by prepending the workspace root instead of the project subdirectory, extract now auto-corrects the path by checking allowedFolders
  • Search delegate prompt improvements: Jaeger trace analysis showed delegates making 29 searches for one term with case/path variations. The prompt now includes exact=true guidance for symbol lookups and BAD examples taken from real traces (case variations, path hopping, full signature searches)
  • @ai-sdk/google v3 upgrade: Fixes Gemini 3.x thought_signature round-tripping error (Function call is missing a thought_signature in functionCall parts)
  • Support promptType + customPrompt together: Allow both parameters to work in combination

Evidence from trace analysis

Before fixes, explore-code session #2 had 9 bash cat calls because extract failed with wrong paths. Delegate #8 made 29 redundant searches for "ThrottleRetryLimit" with case/path variations.

| Problem | Fix |
| --- | --- |
| Extract fails: /workspace/gateway/file.go instead of /workspace/tyk/gateway/file.go | Path auto-fix tries each allowedFolder |
| Model falls back to bash cat after extract fails | Anti-bash prompt guidance (0 bash in explore-code #1) |
| 29 searches for one symbol with path/case variations | BAD examples + exact=true guidance |
| limitDRL → LimitDRL case variations | Explicit BAD example in prompt |
| Full signature searches "func (k *Type) Method()" | Prompt: use exact=true "Method" instead |
| Gemini 3.x thought_signature error | Upgrade @ai-sdk/google to v3 |

Test plan

  • Extract path auto-fix unit tests (8 tests covering: missing subdirectory, line/symbol suffixes, nested paths, multiple targets, multiple allowedFolders)
  • Existing test suite passes

🤖 Generated with Claude Code

buger and others added 5 commits March 6, 2026 16:12
…pe+customPrompt together

- Update bash tool description to explicitly forbid code exploration (grep, cat, find, etc.)
- Add anti-bash instruction #7 to commonInstructions in getSystemMessage()
- Add anti-bash guidance to Claude and Codex native system prompts
- Support both promptType and customPrompt simultaneously: predefined prompt as base + custom wrapped in <custom-instructions> tag
- When only customPrompt is set (no promptType), don't append commonInstructions since custom prompts are self-contained

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Gemini 3.x models require thought_signature to be round-tripped in
function call conversation history. @ai-sdk/google v2 stripped these
out, causing 400 errors. v3.0.0+ preserves them correctly.

Fixes: "Function call is missing a thought_signature in functionCall parts"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When search returns relative paths (e.g., "gateway/file.go") and the model
constructs wrong absolute paths by prepending the workspace root instead of
the project subdirectory (e.g., /workspace/gateway/file.go instead of
/workspace/tyk/gateway/file.go), the extract tool now auto-corrects by
trying each allowedFolder as an alternative base path.

This was causing extract to fail 3x in a row, making models fall back to
bash cat for file reading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

8 tests covering the workspace subdirectory path correction:
- Missing project subdirectory fix
- Line number and symbol suffix preservation
- Nested paths, multiple targets, multiple allowedFolders
- No-op for correct paths and unfixable paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Based on real trace analysis showing delegates making 29 searches for one
term with case/path variations. Key additions:
- "When to use exact=true" section for known symbol lookups
- BAD examples from real traces: case variations (limitDRL/LimitDRL),
  snake_case variations, same query on different paths, full function
  signature searches
- GOOD examples: exact=true for symbol names like ForwardMessage
- Clarify that changing path does NOT bypass dedup
- "Do NOT search full signatures, just use exact=true with method name"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
buger merged commit e0f5b23 into main on Mar 6, 2026
13 of 14 checks passed
probelabs bot (Contributor) commented on Mar 6, 2026

Summary

This PR addresses multiple issues discovered through Jaeger trace analysis of agent behavior, focusing on five main areas:

1. Anti-Bash Guidance for Code Exploration

  • Added explicit instructions in agent system prompts preventing use of bash commands (grep, cat, find, head, tail, awk, sed) for code exploration
  • Guides agents to use search and extract tools instead, which are faster and more accurate
  • Updated in ProbeAgent.js (3 locations) and common.js (bash tool description)

2. Extract Path Auto-Fix

  • When search returns relative paths (e.g., gateway/file.go) and the model constructs wrong absolute paths by prepending workspace root instead of project subdirectory, extract now auto-corrects by checking allowedFolders
  • Example: /workspace/gateway/file.go → /workspace/tyk/gateway/file.go
  • Preserves line number and symbol suffixes during path correction
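A minimal sketch of how such a path auto-fix could work, assuming POSIX-style paths. The names `splitSuffix`, `autofixTarget`, and the `workspaceRoot` parameter are hypothetical and chosen for illustration; the PR's actual logic lives in npm/src/tools/vercel.js and may differ, including in the suffix formats it accepts.

```javascript
const fs = require('node:fs');
const path = require('node:path').posix; // assumption: POSIX-style paths

// Hypothetical helper: split "file.go:42", "file.go:10-20" or "file.go#Symbol"
// into the file path and its suffix, so the suffix survives the correction.
function splitSuffix(target) {
  const m = /^(.+?)(:\d+(?:-\d+)?|#[\w.]+)?$/.exec(target);
  if (!m) return { filePath: target, suffix: '' };
  return { filePath: m[1], suffix: m[2] ?? '' };
}

// Try each allowed folder as an alternative base for a path that was wrongly
// built from the workspace root. `exists` is injectable so the logic can be
// exercised without touching the real file system.
function autofixTarget(target, workspaceRoot, allowedFolders, exists = fs.existsSync) {
  const { filePath, suffix } = splitSuffix(target);
  if (exists(filePath)) return target; // already correct: no-op

  const rel = path.relative(workspaceRoot, filePath);
  if (rel.startsWith('..')) return target; // outside the workspace: leave as-is

  for (const folder of allowedFolders) {
    const candidate = path.join(folder, rel);
    if (exists(candidate)) return candidate + suffix;
  }
  return target; // unfixable: return the original so the caller reports the error
}
```

Returning the original target when nothing matches keeps the correction conservative: a path is rewritten only when a file is confirmed to exist at the corrected location.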

3. Search Delegate Prompt Improvements

  • Added exact=true guidance for known symbol lookups
  • Added concrete BAD examples from real trace analysis:
    • Case variations: limitDRL → LimitDRL
    • Path hopping: same query on different paths
    • Full signature searches: func (k *Type) Method()
  • Evidence: Delegate #8 made 29 redundant searches for "ThrottleRetryLimit" before this fix
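The redundancy pattern described above (one term searched repeatedly with case, snake_case, and path variations) can be spotted in trace data with a small grouping pass. The sketch below is illustrative only and is not part of this PR: the record shape `{ query, path }` and the function name `findRedundantSearches` are assumptions.

```javascript
// Group search calls by a normalized form of the query so that case and
// snake_case variants of the same symbol collapse into one bucket.
// Each call is assumed to look like { query: 'limitDRL', path: 'gateway' }.
function findRedundantSearches(calls, threshold = 2) {
  const groups = new Map();
  for (const { query, path } of calls) {
    // Lowercase and drop underscores, hyphens and spaces:
    // limitDRL, LimitDRL and limit_drl all become "limitdrl".
    const key = query.toLowerCase().replace(/[_\-\s]/g, '');
    const group = groups.get(key) ?? { key, count: 0, queries: new Set(), paths: new Set() };
    group.count += 1;
    group.queries.add(query);
    group.paths.add(path);
    groups.set(key, group);
  }
  // Keep only terms that were searched at least `threshold` times.
  return [...groups.values()].filter((g) => g.count >= threshold);
}
```

A group with many distinct `queries` points at case hopping, while a group with many distinct `paths` points at path hopping; in both cases a single exact=true lookup of the symbol name would likely have been enough.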

4. @ai-sdk/google v3 Upgrade

  • Upgraded from ^2.0.14 to ^3.0.37
  • Fixes Gemini 3.x thought_signature round-tripping error

5. promptType + customPrompt Combination

  • Previously mutually exclusive, now both can work together
  • Predefined prompt serves as base, custom prompt appended in <custom-instructions> tag
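The combination rules can be sketched as follows, based on the commit message: a predefined prompt acts as the base, the custom prompt is wrapped in a <custom-instructions> tag, and a custom prompt on its own is treated as self-contained. The prompt texts, the `code-explorer` and `default` keys, the function name `buildSystemPrompt`, and the ordering of the parts are assumptions made for illustration; the real logic is in getSystemMessage() in ProbeAgent.js.

```javascript
// Hypothetical stand-ins for the predefined prompts and common instructions.
const PREDEFINED_PROMPTS = new Map([
  ['default', 'You are a helpful coding assistant.'],
  ['code-explorer', 'You are a code exploration assistant.'],
]);
const COMMON_INSTRUCTIONS =
  'Use the search and extract tools for code exploration, not bash.';

function buildSystemPrompt({ promptType, customPrompt } = {}) {
  // Only customPrompt: self-contained, so common instructions are not appended.
  if (customPrompt && !promptType) return customPrompt;

  const base = PREDEFINED_PROMPTS.get(promptType ?? 'default');
  if (base === undefined) throw new Error(`Unknown promptType: ${promptType}`);

  const parts = [base];
  // Both set: the predefined prompt is the base, the custom prompt is wrapped.
  if (customPrompt) {
    parts.push(`<custom-instructions>\n${customPrompt}\n</custom-instructions>`);
  }
  parts.push(COMMON_INSTRUCTIONS);
  return parts.join('\n\n');
}
```

Wrapping the custom text in a tag keeps it visually separate from the predefined base, so the model can tell which instructions came from the caller.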

Files Changed Analysis

| File | Changes | Purpose |
| --- | --- | --- |
| npm/package.json | +1/-1 | Dependency upgrade |
| npm/src/agent/ProbeAgent.js | +18/-9 | Prompt improvements, promptType+customPrompt combo |
| npm/src/tools/common.js | +1/-1 | Bash tool description update |
| npm/src/tools/vercel.js | +67/-8 | Extract path autofix, delegate prompt improvements |
| npm/tests/unit/extract-path-autofix.test.js | +175 (new) | 8 unit tests for path autofix |

Total: +262 additions, -19 deletions across 5 files

Architecture & Impact Assessment

flowchart TD
    subgraph "Agent System"
        PA[ProbeAgent.js]
        PA --> SP[System Prompts]
        PA --> DP[Delegate Prompts]
    end
    
    subgraph "Tools Layer"
        CJ[common.js]
        VJ[vercel.js]
        VJ --> ET[extractTool]
        VJ --> SD[Search Delegate]
    end
    
    subgraph "External"
        GEM[Gemini API]
        PKG[package.json]
    end
    
    SP --> |Anti-bash guidance| CJ
    DP --> |exact=true + BAD examples| SD
    ET --> |Path autofix| AF[allowedFolders check]
    PKG --> |v3 upgrade| GEM

Key Technical Changes

  1. Prompt Engineering: Multiple system prompt templates updated with anti-bash guidance
  2. Path Resolution: New autofix logic in extractTool that iterates through allowedFolders to find correct file locations
  3. Delegate Instructions: Enhanced search strategy guidance with concrete examples
  4. SDK Compatibility: @ai-sdk/google upgraded to v3 so that Gemini 3.x thought signatures are round-tripped in function call history

Affected Components

  • Agent orchestration (ProbeAgent)
  • Tool descriptions (common.js)
  • Extract tool path resolution (vercel.js)
  • Search delegate task generation (vercel.js)
  • Gemini provider integration

Scope Discovery & Context Expansion

Related Files to Consider

  • npm/src/tools/vercel.js - Contains splitTargetSuffix function used by autofix
  • Other tool implementations that may need similar path handling
  • Integration tests for agent behavior
  • Configuration files for allowedFolders setup

Testing Coverage

  • 8 new unit tests covering:
    • Missing project subdirectory
    • Line number suffix preservation
    • Symbol suffix preservation
    • Nested paths
    • Correct paths (no modification)
    • Mixed correct/wrong paths
    • Non-existent files
    • Multiple allowedFolders

Labels

  • tags.review-effort: 2 (Focused changes with good test coverage, straightforward prompt and path logic updates)
  • tags.label: bug (Fixes specific issues: path resolution failures, redundant searches, Gemini compatibility)
Metadata
  • Review Effort: 2 / 5
  • Primary Label: bug

Powered by Visor from Probelabs

Last updated: 2026-03-06T19:18:47.683Z | Triggered by: pr_opened | Commit: c6c08bb

💡 TIP: You can chat with Visor using /visor ask <your question>

probelabs bot (Contributor) commented on Mar 6, 2026

Security Issues (1)

| Severity | Location | Issue |
| --- | --- | --- |
| 🟠 Error | system:0 | ProbeAgent execution failed: Error: Failed to get response from AI model. No output generated. Check the stream for errors. |

Architecture Issues (1)

| Severity | Location | Issue |
| --- | --- | --- |
| 🟠 Error | system:0 | ProbeAgent execution failed: Error: Failed to get response from AI model. No output generated. Check the stream for errors. |

Performance Issues (1)

| Severity | Location | Issue |
| --- | --- | --- |
| 🟠 Error | system:0 | ProbeAgent execution failed: Error: Failed to get response from AI model. No output generated. Check the stream for errors. |


Last updated: 2026-03-06T18:54:14.104Z | Triggered by: pr_opened | Commit: c6c08bb

