[core] V2: skip inline step execution when suspension also has a wait#1924
Conversation
Inline `await executeStep(...)` blocks the V2 handler for the full step duration, but `wait_completed` events are only created on the *next* loop iteration's "complete elapsed waits" pass. This breaks `Promise.race(step, sleep)` semantics whenever the sleep is shorter than the step — replay still picks the step because the sleep's wait_completed event hasn't been written yet. Reproducer (`sleepWinsRaceWorkflow`): a 1s sleep raced against a 10s step. Expected `'sleep'` to win; the runtime returned `'step'`. On the failing run, `wait_completed` was created at t=11.86s — right after `step_completed` — instead of at t≈2.74s when its `resumeAt` elapsed. Fix: in `runtime.ts`, gate the inline-step pick on the absence of a wait timeout. When `suspensionResult.timeoutSeconds !== undefined`, queue every pending step instead and return with the wait timeout. This lets the wait timer drive a continuation in parallel, matching V1's behavior where each step ran in a separate function invocation. Pure step suspensions (without waits) still benefit from inline execution. Test plan: - New e2e tests `sleepWinsRaceWorkflow` and `stepWinsRaceWorkflow` exercising `Promise.race` between a step function and `sleep()`, in both directions. - Verified locally against `nextjs-turbopack` workbench: both pass. Event log confirms `wait_completed` is now created on time (t≈1s after `wait_created`) rather than after the inline step. Eager-processing changelog updated with a "Mixed Suspensions" section describing the carve-out and its rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🦋 Changeset detectedLatest commit: ee02092 The changes in this PR will be included in the next version bump. This PR includes changesets to release 18 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
📊 Benchmark Results
workflow with no steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 1 step💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 10 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 25 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 50 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) Promise.all with 10 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) Promise.all with 25 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) Promise.all with 50 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) Promise.race with 10 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) Promise.race with 25 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) Promise.race with 50 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 10 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 25 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 50 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 10 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 25 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) workflow with 50 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) Stream Benchmarks (includes TTFB metrics)workflow with stream💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) stream pipeline with 5 transform steps (1MB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) 10 parallel streams (1MB each)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) fan-out fan-in 10 streams (1MB each)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) SummaryFastest Framework by WorldWinner determined by most benchmark wins
Fastest World by FrameworkWinner determined by most benchmark wins
Column Definitions
Worlds:
❌ Some benchmark jobs failed:
Check the workflow run for details. |
🧪 E2E Test Results✅ All tests passed Summary
Details by Category✅ ▲ Vercel Production
✅ 💻 Local Development
✅ 📦 Local Production
✅ 🐘 Local Postgres
✅ 🪟 Windows
✅ 📋 Other
|
Signed-off-by: Peter Wielander <mittgfu@gmail.com>
* origin/main: [core] Skip inline step execution when suspension also has a wait (#1924) [errors] Replace chalk import in @workfow/errors with inline ANSI shim (#1915) Fix compatibility with Zod 4.4.x (#1902) Serialize `run_failed`/`step_failed` errors through serialization pipeline (#1851) tarballs: redesign preview tarballs index page (#1911) Remove extra changeset (#1922) Add stable Next.js eager and lazy test coverage (#1747) Enforce per-(run, correlation) uniqueness for entity-creating events in world-postgres (#1878) fix(world-vercel): add default request timeout to workflow-server HTTP calls (#1807)
PR #1924 added the same sleep/step race workflows directly to main while this branch was open. The textual concat from `git merge` left both copies in 99_e2e.ts; this drops the duplicate set so the file matches origin/main verbatim and the e2e tests pick up the upstream definitions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ignal * origin/main: [core] Skip inline step execution when suspension also has a wait (#1924) [errors] Replace chalk import in @workfow/errors with inline ANSI shim (#1915) Fix compatibility with Zod 4.4.x (#1902) Serialize `run_failed`/`step_failed` errors through serialization pipeline (#1851) tarballs: redesign preview tarballs index page (#1911) Remove extra changeset (#1922) Add stable Next.js eager and lazy test coverage (#1747) Enforce per-(run, correlation) uniqueness for entity-creating events in world-postgres (#1878) fix(world-vercel): add default request timeout to workflow-server HTTP calls (#1807) Allow disabling step sourcemap with new `sourcemap` option in builders (#1842) [ci] Enable Vercel-prod e2e for tanstack-start (#1904) web: configure vercelPreset() for Vercel deployments (#1815) [core] Combine flow+step bundle and process steps eagerly (#1338) [world-vercel] Revert stream close control framing (#1891) [tarballs] Use turbo to build workspace deps before packing (#1908) # Conflicts: # packages/core/src/runtime/step-handler.test.ts # packages/core/src/runtime/step-handler.ts # packages/core/src/runtime/suspension-handler.ts # packages/core/src/step.test.ts # packages/world-local/src/storage/events-storage.ts # packages/world-postgres/src/drizzle/migrations/meta/_journal.json
* [e2e] Add step-vs-sleep race tests + dev-tmux skill Adds two race workflows (sleepWinsRaceWorkflow, stepWinsRaceWorkflow) that exercise Promise.race between a step function and a sleep call. The current `sleepWinsRaceWorkflow` test fails — surfacing how the replay engine resolves a previously-completed step instantly while sleep still has to elapse. Also adds a `dev-tmux` skill that documents the 3-pane tmux + portless setup for testing workflows interactively in a worktree alongside the observability UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [workbench/nextjs-turbopack] Allow *.turbopack.localhost in dev Adds allowedDevOrigins entries so portless-style worktree-prefixed .localhost URLs (e.g. https://<branch>.turbopack.localhost) can hit HMR and dev-only endpoints without Next's cross-origin protection flooding the logs with warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Drop duplicate race workflows after merging main PR #1924 added the same sleep/step race workflows directly to main while this branch was open. The textual concat from `git merge` left both copies in 99_e2e.ts; this drops the duplicate set so the file matches origin/main verbatim and the e2e tests pick up the upstream definitions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [skills/dev-tmux] Narrow activation, robust pane IDs, statusline helper - Tighten activation phrases so the skill only fires for the specific portless+tmux setup it documents, not the generic "start the dev server" task. Addresses #1916 review. - Capture pane IDs at split time (-P -F '#{pane_id}') so the snippet works under both pane-base-index 0 and 1. Addresses Copilot review. - Add `statusline.sh` that filters `portless list` to the current worktree's routes and emits a one-line summary, plus instructions for wiring it into Claude Code's `statusLine.command`. - Bump version to 1.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [skills/dev-tmux] Recommend primary-checkout path for statusline Worktrees get deleted, so wiring the statusline to a worktree path breaks the moment the worktree is removed. Update the skill and the script header to recommend pointing `statusLine.command` at the primary checkout (`$HOME/github/vercel/workflow/...`). The script itself is already worktree-aware via Claude's `workspace.current_dir` stdin JSON, so the same invocation surfaces routes for whichever worktree the session is in. Bump version to 1.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [skills/dev-tmux] OSC 8 link statusline + worktree-named tmux session - Statusline overlay now renders `[dev] · [obs] · tmux:<prefix>`, with the bracketed labels emitted as OSC 8 hyperlinks (clickable in any modern terminal) styled cyan + underline so they stand out. Replaces the old long-URL form that was hard to scan and click. - Add a tmux-session indicator: shown when a session named exactly the worktree prefix exists (uses `tmux has-session -t =<prefix>` for exact matching). - Change the skill's tmux session naming convention from the fixed `workflow-dev` to `<worktree-prefix>` (basename of the branch — same string portless uses as the subdomain prefix). This lets the statusline locate the session deterministically and lets multiple worktrees run dev sessions concurrently without manual disambiguation. - Bump skill to v1.3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [skills/dev-tmux] Statusline: print full `tmux attach -t <name>` command Replaces the abbreviated `tmux:<prefix>` indicator with the full copy-paste-ready `tmux attach -t <prefix>` invocation. Saves a step when grabbing the session from another shell. Bump skill to v1.4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [skills/dev-tmux] Brighter statusline + Nerd Font icons - Drop the dim styling that made the overlay hard to read; use bold bright cyan + underline for links and bold bright green for the tmux command. - Add Nerd Font glyphs: for dev, for obs, for the tmux copy-paste hint. Falls back to box-drawing if the font lacks Nerd Font ranges; layout is unaffected. - Visual differentiation: cyan + underline = clickable hyperlink; green = copy this command. Bump skill to v1.5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [skills/dev-tmux] Restore Nerd Font icons via Unicode escapes The copy glyph in `emit_tmux` was a literal Nerd Font byte embedded in the printf string and got stripped during a prior rewrite. Promote all three icons (rocket / graph / copy) to top-level shell variables that use \uHHHH-equivalent UTF-8 escapes, so the source survives editor round-trips that don't preserve Private Use Area code points. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [skills/internal-dev-workbench] Rename from dev-tmux, set author + reset version - Rename `skills/dev-tmux/` → `skills/internal-dev-workbench/` to make the name self-explanatory about the skill's scope (an internal contributor's local dev workbench, not a generic tmux helper). - Author: Pranay Prakash. Version: 0.1 (first release of the skill). - Update internal references in SKILL.md and statusline.sh accordingly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Fix
Promise.race(step, sleep)semantics in V2 mixed suspensions: when a workflow suspension contains both pending steps and at least one wait (sleep), the runtime now queues every step instead of executing one inline.Inline `await executeStep(...)` blocks the V2 handler for the full step duration, but `wait_completed` events are only created on the next loop iteration's "complete elapsed waits" pass. So if the sleep is shorter than the step, the wait timer never fires on time — replay just sees `step_completed` first and `Promise.race` resolves with the step. A 1s sleep racing a 10s step would silently resolve to the step.
This restores V1's behavior, where each step ran in a separate function invocation and the wait timer drove a parallel queue continuation.
Reproducer (failing test from #1916)
`sleepWinsRaceWorkflow`: a 1s sleep raced against a 10s step. Expected `'sleep'` to win.
Event log on the failing run, before the fix:
After the fix, `wait_completed` lands at t≈1s after `wait_created` and the run completes with `'sleep'` in ~1s.
Change
One-liner in `packages/core/src/runtime.ts`:
```ts
const inlineStep =
suspensionResult.timeoutSeconds === undefined
? ownedPendingSteps[0]
: undefined;
```
The existing "queue every pending step except the inline one" loop and the `if (!inlineStep)` early-return already handle the `undefined` case correctly.
What about hooks?
Audited every flavor of suspension. Step + wait is the only fully-broken combination.
Step + hook works on all first-party worlds today. A consistent follow-up would extend the same carve-out to `hasPendingHooks`, but it would require a new field in `SuspensionHandlerResult` and isn't strictly necessary — happy to do that as a follow-up if desired.
Test plan
Docs
`docs/content/docs/changelog/eager-processing.mdx` "Mixed Suspensions" section rewritten to describe the carve-out and explicitly call out the `Promise.race(step, sleep)` race-semantics motivation.
🤖 Generated with Claude Code