perf: parallelize suspension handler for high-concurrency #544
🦋 Changeset detected
Latest commit: 886548c
The changes in this PR will be included in the next version bump. This PR includes changesets to release 12 packages.
🧪 E2E Test Results: ❌ Some tests failed

❌ Failed Tests: 🌍 Community Worlds (11 failed)
- mongodb (1 failed)
- redis (1 failed)
- starter (8 failed)
- turso (1 failed)

Details by Category
- ✅ ▲ Vercel Production
- ✅ 💻 Local Development
- ✅ 📦 Local Production
- ✅ 🐘 Local Postgres
- ✅ 🪟 Windows
- ❌ 🌍 Community Worlds
61edd56 to bdefce0
c4a92e0 to 4d36770
4210576 to 30e3e2c
30e3e2c to a9f53cb
Pull request overview
This PR refactors the workflow suspension handler to improve performance through parallelization and better code organization. The main runtime.ts file is split into modular components (suspension-handler.ts, step-handler.ts, helpers.ts) to improve maintainability, and the suspension processing is optimized to handle hooks, steps, and waits more efficiently.
Key Changes
- Parallelization: Hooks are now processed sequentially first to prevent race conditions, followed by parallel processing of steps and waits for better performance
- Code organization: Large runtime.ts file is refactored into focused, modular files that separate concerns
- Enhanced telemetry: New OpenTelemetry attributes track hooks created (workflow.hooks.created) and waits created (workflow.waits.created) for better observability
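The ordering described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from suspension-handler.ts: the `QueueItem` shape, the `handleSuspension` name, and the `processItem` callback are all assumptions made for the example.

```typescript
// Hypothetical item shape; the real invocation queue types are not shown in this PR page.
type QueueItem = {
  type: "hook" | "step" | "wait";
  correlationId: string;
};

// Hypothetical callback that creates one hook, step, or wait.
type ProcessItem = (item: QueueItem) => Promise<void>;

async function handleSuspension(
  queue: QueueItem[],
  processItem: ProcessItem
): Promise<{ hooksCreated: number; stepsCreated: number; waitsCreated: number }> {
  const hooks = queue.filter((item) => item.type === "hook");
  const steps = queue.filter((item) => item.type === "step");
  const waits = queue.filter((item) => item.type === "wait");

  // Hooks run one at a time so they all exist before any step or wait starts.
  for (const hook of hooks) {
    await processItem(hook);
  }

  // Steps and waits are started together and awaited as a group.
  await Promise.all([...steps, ...waits].map((item) => processItem(item)));

  return {
    hooksCreated: hooks.length,
    stepsCreated: steps.length,
    waitsCreated: waits.length,
  };
}
```

Note that `Promise.all` rejects as soon as any item fails, so in this sketch a single failing step surfaces immediately while the remaining calls keep running in the background. Whether the actual handler uses `Promise.all` or a different strategy is not stated here.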
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/core/src/telemetry/semantic-conventions.ts | Adds new telemetry attributes for hooks and waits, updates suspension status naming from pending_steps to workflow_suspended |
| packages/core/src/runtime/suspension-handler.ts | New module handling workflow suspensions with parallel processing of hooks, steps, and waits |
| packages/core/src/runtime/step-handler.ts | Extracted step handler logic from runtime.ts with no functional changes |
| packages/core/src/runtime/helpers.ts | New helper module containing shared utility functions (event loading, health checks, queue operations) |
| packages/core/src/runtime.ts | Refactored to import and use modular components, removes large inline implementations |
| .changeset/fast-owls-flow.md | Documents the performance improvement and refactoring changes |
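As a rough illustration of the telemetry row above, the new attributes could be assembled like this. The two attribute names and the `workflow_suspended` value come from this PR; the `workflow.suspension.status` key, the `SuspensionCounts` type, and the `suspensionAttributes` helper are hypothetical, since the PR page does not show which attribute carries the status.

```typescript
// Attribute names quoted in this PR.
const WORKFLOW_HOOKS_CREATED = "workflow.hooks.created";
const WORKFLOW_WAITS_CREATED = "workflow.waits.created";
// Hypothetical key: the PR names the status values but not the attribute that holds them.
const WORKFLOW_SUSPENSION_STATUS = "workflow.suspension.status";

// Hypothetical summary of what one suspension created.
type SuspensionCounts = { hooks: number; waits: number };

function suspensionAttributes(
  counts: SuspensionCounts
): Record<string, string | number> {
  return {
    [WORKFLOW_HOOKS_CREATED]: counts.hooks,
    [WORKFLOW_WAITS_CREATED]: counts.waits,
    // Renamed from "pending_steps" according to the summary table.
    [WORKFLOW_SUSPENSION_STATUS]: "workflow_suspended",
  };
}
```

A record like this could then be handed to an OpenTelemetry span via `span.setAttributes(...)`; how the real code attaches the values is not shown here.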
    } catch (err) {
      if (WorkflowAPIError.is(err) && err.status === 409) {
        // Step already exists, so we can skip it
        console.warn(
          `Step "${queueItem.stepName}" with correlation ID "${queueItem.correlationId}" already exists, skipping: ${err.message}`
        );
        return;
      }
      throw err;
The error handling in processStep only checks for 409 (conflict) errors but not 410 (workflow completed) errors. This is inconsistent with processHook which handles both 409 and 410 status codes. When a workflow has already completed, attempting to create a step should be handled gracefully like hooks do, otherwise the error will bubble up and potentially cause issues.
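One way to act on this suggestion is sketched below, assuming the step path should treat 410 the same way the comment says processHook does. The `WorkflowAPIError` class here is a stand-in written for the example (the real class is not shown in this review), and `createStepSafely` and `StepOutcome` are hypothetical names.

```typescript
// Stand-in for the WorkflowAPIError used in the snippet above; only `status` and `is` are modeled.
class WorkflowAPIError extends Error {
  readonly status: number;

  constructor(message: string, status: number) {
    super(message);
    this.status = status;
  }

  static is(err: unknown): err is WorkflowAPIError {
    return (
      err instanceof Error &&
      typeof (err as { status?: unknown }).status === "number"
    );
  }
}

type StepOutcome = "created" | "already_exists" | "workflow_completed";

// Hypothetical wrapper: `create` stands for whatever call creates the step.
async function createStepSafely(
  create: () => Promise<void>
): Promise<StepOutcome> {
  try {
    await create();
    return "created";
  } catch (err) {
    if (WorkflowAPIError.is(err)) {
      // 409: the step already exists, so it can be skipped.
      if (err.status === 409) return "already_exists";
      // 410: the workflow has already completed, so there is nothing left to create.
      if (err.status === 410) return "workflow_completed";
    }
    // Anything else is unexpected and is rethrown to the caller.
    throw err;
  }
}
```

Returning an outcome instead of logging and returning `void` is a design choice of this sketch; it lets the caller decide whether to warn, count, or ignore skipped steps.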
- Process hooks first, then steps and waits in parallel to prevent race conditions
- Refactor runtime.ts into modular files: suspension-handler.ts, step-handler.ts, helpers.ts
- Add otel attributes for hooks created (workflow.hooks.created) and waits created (workflow.waits.created)
- Update suspension status from pending_steps to workflow_suspended
- Add retry logic to benchmark for unexpected content type errors
- Refactor stress test benchmarks to use shared config (10, 25 steps enabled; 100+ skipped)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The current suspension handler processes the invocation queue sequentially, which is slow. Now we process it in parallel:
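- Process hooks first, then steps and waits in parallel to prevent race conditions
- Refactor runtime.ts into modular files: suspension-handler.ts, step-handler.ts, helpers.ts
- Add otel attributes for hooks created (workflow.hooks.created) and waits created (workflow.waits.created)
- Update suspension status from pending_steps to workflow_suspended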