
Concurrent batched vector embeddings#110

Merged
ishaanxgupta merged 2 commits into main from
bolt/optimize-weaver-batched-vectors-9188823495244099766
Apr 13, 2026

Conversation

@ishaanxgupta (Member)

💡 What:
Optimized `Weaver._execute_batched_vector` to compute batch embeddings concurrently, and offloaded the synchronous `vector_store` operations (`add`, `delete`) to the thread pool via `run_in_executor` so they no longer block the event loop.

🎯 Why:
Previously, `flush_add_batch` generated embeddings and executed additions/deletions entirely synchronously inside the `async def` function. This created a performance bottleneck by forcing the single-threaded event loop to wait on blocking operations (often I/O-bound if `embed_fn` makes HTTP requests, or CPU-bound if embedding runs locally).

📊 Impact:

  • Significantly reduces the latency of executing a batch of updates. In simulated network latency tests (0.2s API wait per embed, 0.5s network I/O per vector operation), a sequence of 5 operations went from ~2.0s sequentially down to ~1.2s when run concurrently.
  • Protects the FastAPI event loop from being blocked by synchronous calls, reducing the system's P99 latency and increasing overall throughput across other endpoints.
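The latency win comes from overlapping the blocking waits rather than paying for them one after another. A minimal, self-contained illustration of the effect (the 0.2s delay is a made-up stand-in for a blocking embed/store call, not the project's actual numbers):

```python
import asyncio
import time

async def blocking_op(loop: asyncio.AbstractEventLoop, delay: float) -> None:
    # time.sleep stands in for a blocking embed_fn / vector_store call;
    # run_in_executor moves it off the event loop onto a pool thread.
    await loop.run_in_executor(None, time.sleep, delay)

async def main() -> float:
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    # 5 x 0.2s blocking ops: ~1.0s sequentially, ~0.2s when overlapped.
    await asyncio.gather(*(blocking_op(loop, 0.2) for _ in range(5)))
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = asyncio.run(main())
    print(f"5 concurrent 0.2s ops took {elapsed:.2f}s")
```

With the default thread pool (at least 5 workers on typical machines), all five sleeps overlap, so total wall time is close to a single delay rather than their sum.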

🔬 Measurement:
A new script, `verify_weaver_batch.py`, was added at the repository root to measure execution latency programmatically with mocked delays. You can run it via:

PINECONE_API_KEY=mock NEO4J_PASSWORD=mock OPENAI_API_KEY=mock python3 verify_weaver_batch.py

PR created automatically by Jules for task 9188823495244099766 started by @ishaanxgupta

- Refactored `Weaver._execute_batched_vector` to offload synchronous embedding generation (`self.embed_fn`) and vector store I/O (`store.add`, `store.delete`) to an executor thread pool (`loop.run_in_executor`).
- Replaced sequential processing in `flush_add_batch` with `asyncio.gather(*tasks, return_exceptions=True)` to execute all vector operations for a given batch concurrently.
- Re-architected error handling around metadata extraction and embedding execution so that failures are localized to single operations instead of failing the entire batch or skipping subsequently enqueued `DELETE` operations.
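The pattern described in the bullets above can be sketched as follows. This is a hypothetical stand-in, not the project's actual code: `execute_batched_vector`, `embed_fn`, the `store` interface, and the op-dict shape are all assumptions made for illustration.

```python
import asyncio

async def execute_batched_vector(embed_fn, store, ops):
    """Run a batch of ADD/DELETE vector operations concurrently.

    Hypothetical sketch: embed_fn(text) -> vector is a blocking call,
    store.add / store.delete are blocking vector-store operations.
    """
    loop = asyncio.get_running_loop()

    async def run_one(op):
        if op["kind"] == "ADD":
            # The blocking embedding call (HTTP or CPU-bound) is offloaded
            # to the default thread pool so the event loop stays free.
            vector = await loop.run_in_executor(None, embed_fn, op["text"])
            await loop.run_in_executor(None, store.add, op["id"], vector)
        else:  # DELETE
            await loop.run_in_executor(None, store.delete, op["id"])

    # return_exceptions=True localizes a failure to its own operation:
    # one bad op surfaces as an exception object in the results instead
    # of cancelling the batch or skipping later DELETEs.
    return await asyncio.gather(*(run_one(op) for op in ops),
                                return_exceptions=True)
```

Callers can then inspect the returned list, logging any entries that are `Exception` instances while treating the rest of the batch as committed.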
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@ishaanxgupta marked this pull request as ready for review April 13, 2026 04:29
@ishaanxgupta changed the title from "⚡ Bolt: [performance improvement] Concurrent batched vector embeddings" to "Concurrent batched vector embeddings" Apr 13, 2026
@ishaanxgupta merged commit 8ce26c9 into main Apr 13, 2026
