Optimize LLM tool call execution and multi-repo searches #114
… asyncio.gather Replaced sequential loops over `ai_response.tool_calls` in `CodeRetrievalPipeline` and `RetrievalPipeline` with concurrent execution using `asyncio.gather`. Also optimized multi-repository searches in `_search_symbols` and `_search_files` by gathering tasks concurrently. Added an entry to the journal documenting the performance pattern.
💡 What: Optimized the execution of LLM tool calls and multi-repository searches in the `CodeRetrievalPipeline` and `RetrievalPipeline`. Replaced sequential `for` loops with concurrent execution via `asyncio.gather`, while carefully preserving the required sequential processing of the gathered results to avoid race conditions on shared lists (`sources`, `tool_messages`).
🎯 Why: When the LLM decided to invoke multiple tools (e.g., retrieving different files simultaneously) or when a query targeted multiple repositories, the pipeline previously awaited each I/O-bound operation sequentially. This created a significant bottleneck where independent network/database requests were waiting on each other.
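A minimal sketch of the pattern described under What, not the actual pipeline code: the tool calls are awaited together with `asyncio.gather`, and only afterwards are the results appended to the shared lists in a plain sequential loop. `ToolCall`, `execute_tool`, and `run_tool_calls` are hypothetical names used for illustration, and the use of `return_exceptions=True` is an assumption rather than something this PR is known to do.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToolCall:
    # Hypothetical stand-in for one entry of ai_response.tool_calls.
    id: str
    name: str
    args: dict


async def execute_tool(call: ToolCall) -> str:
    # Hypothetical I/O-bound tool; the sleep stands in for a network or DB request.
    await asyncio.sleep(0.01)
    return f"{call.name}({call.args})"


async def run_tool_calls(tool_calls: list[ToolCall]) -> tuple[list[str], list[dict]]:
    sources: list[str] = []
    tool_messages: list[dict] = []

    # Run all tool calls concurrently. gather returns results in the same
    # order as the awaitables passed in, regardless of completion order.
    # Assumption: failures are collected instead of aborting the whole batch.
    results = await asyncio.gather(
        *(execute_tool(call) for call in tool_calls),
        return_exceptions=True,
    )

    # Process results sequentially: the shared lists are mutated only here,
    # after every coroutine has finished, so ordering stays deterministic.
    for call, result in zip(tool_calls, results):
        if isinstance(result, BaseException):
            tool_messages.append({"tool_call_id": call.id, "content": f"error: {result}"})
            continue
        sources.append(result)
        tool_messages.append({"tool_call_id": call.id, "content": result})

    return sources, tool_messages
```

Keeping the append logic out of the coroutines means the order of `tool_messages` matches the order of the tool calls, which is typically what a chat API expects when tool results are sent back.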
📊 Impact: Reduces latency significantly during multi-tool execution and global searches. For example, 5 independent tool calls that take 200ms each should now complete in ~200ms in total instead of ~1000ms, since the total is bounded by the slowest call rather than the sum of all calls.
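The arithmetic in the example can be checked with a small standalone script. This is a sketch under stated assumptions: `fake_tool_call` is a hypothetical stand-in that only sleeps, so it models perfectly independent I/O with no shared connection pool or rate limit, and real timings will differ.

```python
import asyncio
import time


async def fake_tool_call(delay: float) -> float:
    # Hypothetical tool call: the sleep simulates an I/O-bound request.
    await asyncio.sleep(delay)
    return delay


async def run_sequential(n: int, delay: float) -> float:
    # Await each call in turn; total time is roughly n * delay.
    start = time.perf_counter()
    for _ in range(n):
        await fake_tool_call(delay)
    return time.perf_counter() - start


async def run_concurrent(n: int, delay: float) -> float:
    # Await all calls together; total time is roughly one delay.
    start = time.perf_counter()
    await asyncio.gather(*(fake_tool_call(delay) for _ in range(n)))
    return time.perf_counter() - start


async def compare(n: int = 5, delay: float = 0.2) -> tuple[float, float]:
    # Returns (sequential_seconds, concurrent_seconds).
    return await run_sequential(n, delay), await run_concurrent(n, delay)


if __name__ == "__main__":
    seq, conc = asyncio.run(compare())
    print(f"sequential: {seq:.2f}s, concurrent: {conc:.2f}s")
```

The same `time.perf_counter()` bracketing can be placed around a pipeline's `run` call to compare timings before and after the change.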
🔬 Measurement: Measure the total execution time of the `run` method in `RetrievalPipeline` and `CodeRetrievalPipeline` when the LLM makes multiple tool calls in a single turn, or when searching across multiple repositories without specifying a repo. Observe the timing logs to verify concurrent execution.
PR created automatically by Jules for task 16276008160634690662 started by @ishaanxgupta