fix: optimize stream/generate performance #287
Conversation
🦋 Changeset detected. Latest commit: ba53977. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Deploying voltagent

| Latest commit: | 3cab1ea |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://13c13e28.voltagent.pages.dev |
| Branch Preview URL: | https://fix-optimize-memory-timeline.voltagent.pages.dev |
This is awesome to see!!!!!
A major item that concerns me:
Is this safe? Moving to a queueing pattern means we need to consume all the queued events before we exit/close out a request. That means we need to execute that drain step in streamText and the other calls; I didn't see that added (it might already be there, but I don't believe so since this is new).
I would not merge until we have that, as it could cause inconsistent memory data.
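To make the concern concrete, here is a minimal sketch (the names are hypothetical, not taken from this PR) of a background event queue whose pending work must be awaited before a request closes, so no timeline events are silently dropped:

```typescript
// Hypothetical sketch of the queueing pattern under discussion: events are
// processed off the hot path, but callers must await drain() before the
// request finishes (e.g. in streamText's completion callback).
class EventQueue<T> {
  private queue: T[] = [];
  private processing: Promise<void> = Promise.resolve();

  enqueue(event: T, handler: (e: T) => Promise<void>): void {
    this.queue.push(event);
    // Chain onto the previous promise so events are handled in order,
    // without blocking the caller.
    this.processing = this.processing.then(async () => {
      const next = this.queue.shift();
      if (next !== undefined) await handler(next);
    });
  }

  // Resolves once every event enqueued so far has been handled.
  drain(): Promise<void> {
    return this.processing;
  }
}
```

If the request path never awaits `drain()`, events queued near the end of a stream can be lost when the process or request context tears down, which is exactly the inconsistent-memory risk raised above.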
A couple pieces of additional feedback:
- Could we add benchmark tests with vitest to help verify we don't lose startup speed? (https://vitest.dev/guide/features.html#benchmarking)
- For anything that I know or believe we should fix in v1 or the next major/minor bump, I'm adding a `TODO:` comment so we can easily find them.
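As a starting point for the benchmarking suggestion, a minimal vitest bench file might look like the sketch below (`createAgent` is a placeholder, not the real factory from this repo; run with `vitest bench`):

```typescript
// Hypothetical benchmark sketch (e.g. agent.bench.ts) using vitest's
// benchmarking support linked above.
import { bench, describe } from 'vitest';

function createAgent() {
  // stand-in for the real, more expensive agent construction
  return { ready: true };
}

describe('agent startup', () => {
  bench('createAgent', () => {
    createAgent();
  }, { time: 500 }); // sample for ~500ms so CI stays fast
});
```

Tracking these numbers in CI would catch regressions in startup speed before they ship.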
Yep, definitely. I’m going to wrap it with a queue 🚀
zrosenbauer
left a comment
Just to confirm: the only time we process the queue is when we drain it, correct? Otherwise LGTM.
fix: optimize streamText/generateText/generateObject/streamObject performance
PR Checklist
Please check if your PR fulfills the following requirements:
Bugs / Features
What is the current behavior?
Agent response times are slow due to blocking operations during stream initialization. Timeline events are published synchronously, causing delays, and memory operations for context loading block the stream start process.
What is the new behavior?
Significantly improved agent response times by making the blocking operations during stream initialization asynchronous.
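One common way to remove this kind of startup blocking is to run independent initialization steps concurrently instead of sequentially. A minimal sketch (the function names are illustrative, not from this PR):

```typescript
// Hypothetical sketch: load conversation messages and history in parallel
// rather than one after the other, so stream startup waits only as long as
// the slowest step instead of the sum of both.
async function prepareContext(
  loadMessages: () => Promise<string[]>,
  loadHistory: () => Promise<string[]>,
): Promise<{ messages: string[]; history: string[] }> {
  const [messages, history] = await Promise.all([
    loadMessages(),
    loadHistory(),
  ]);
  return { messages, history };
}
```

This preserves the loaded context exactly; only the ordering of the awaits changes.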
fixes (issue)
Notes for reviewers
This optimization focuses on making blocking operations asynchronous without affecting the quality of conversation context or history tracking. The performance improvements are particularly noticeable in production environments where fast AI interactions are critical.