e2e: replace fixed-sleep hook waits with event-driven waitForHook helper#1879
Conversation
🦋 Changeset detectedLatest commit: 72f0617 The changes in this PR will be included in the next version bump. This PR includes changesets to release 0 packagesWhen changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
🧪 E2E Test Results✅ All tests passed Summary
Details by Category✅ ▲ Vercel Production
✅ 💻 Local Development
✅ 📦 Local Production
✅ 🐘 Local Postgres
✅ 🪟 Windows
✅ 📋 Other
|
📊 Benchmark Results
workflow with no steps💻 Local Development
workflow with 1 step💻 Local Development
workflow with 10 sequential steps💻 Local Development
workflow with 25 sequential steps💻 Local Development
workflow with 50 sequential steps💻 Local Development
Promise.all with 10 concurrent steps💻 Local Development
Promise.all with 25 concurrent steps💻 Local Development
Promise.all with 50 concurrent steps💻 Local Development
Promise.race with 10 concurrent steps💻 Local Development
Promise.race with 25 concurrent steps💻 Local Development
Promise.race with 50 concurrent steps💻 Local Development
workflow with 10 sequential data payload steps (10KB)💻 Local Development
workflow with 25 sequential data payload steps (10KB)💻 Local Development
workflow with 50 sequential data payload steps (10KB)💻 Local Development
workflow with 10 concurrent data payload steps (10KB)💻 Local Development
workflow with 25 concurrent data payload steps (10KB)💻 Local Development
workflow with 50 concurrent data payload steps (10KB)💻 Local Development
Stream Benchmarks (includes TTFB metrics)workflow with stream💻 Local Development
stream pipeline with 5 transform steps (1MB)💻 Local Development
10 parallel streams (1MB each)💻 Local Development
fan-out fan-in 10 streams (1MB each)💻 Local Development
SummaryFastest Framework by WorldWinner determined by most benchmark wins
Fastest World by FrameworkWinner determined by most benchmark wins
Column Definitions
Worlds:
|
There was a problem hiding this comment.
Pull request overview
Updates @workflow/core hook-related e2e tests to avoid fixed sleeps by introducing a shared waitForHook() polling helper, improving reliability on slow backends/runtimes and reducing unnecessary waiting on fast ones.
Changes:
- Added
waitForHook(token, { timeoutMs, intervalMs, runId })helper that pollsgetHookByToken()until the hook is available (optionally filtering byrunId). - Replaced multiple
setTimeout(5_000)-based waits withwaitForHook()across hook registration/resume e2e tests. - Standardized one remaining direct
setTimeoutusage tosleep(3_000).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
3d9d5e3 to
8c293c9
Compare
Hook-related e2e tests (hookWorkflow, hookCleanupTestWorkflow,
hookDisposeTestWorkflow, hookWithSleepWorkflow, distributedAbortController)
previously slept a fixed 5 seconds before calling getHookByToken to wait
for the hook to be registered. On slower runtimes — notably the snapshot
runtime on Vercel where each workflow round-trip is several seconds longer
than replay — that fixed budget is too tight and the test fails with
HookNotFoundError. On faster runtimes it's unnecessarily slow.
Adds a waitForHook(token, { timeoutMs, intervalMs, runId }) helper that
polls until the hook resolves or the timeout (default 30s) expires, with
an optional runId filter for token-reuse tests where eventually-consistent
backends may briefly still report a stale hook. Each hook-wait site now
uses this helper. Non-hook fixed sleeps (workflow-progress polling for
sleepingWorkflow cancel tests, payload-processing waits in
hookWithSleepWorkflow) are unchanged.
8c293c9 to
72f0617
Compare
…ignal * origin/main: [workbench] Add TanStack Start workbench and tests (#1875) Atomically dedupe duplicate step_created/wait_created events in world-local (#1877) Split tarball hosting out of docs into its own project (#1893) Replace fixed-sleep hook waits with event-driven waitForHook helper (#1879)
…lier-errors-followups * origin-https/main: [workbench] Add TanStack Start workbench and tests (#1875) Atomically dedupe duplicate step_created/wait_created events in world-local (#1877) Split tarball hosting out of docs into its own project (#1893) Replace fixed-sleep hook waits with event-driven waitForHook helper (#1879) # Conflicts: # pnpm-lock.yaml
Conflicts resolved: * `packages/core/e2e/e2e.test.ts` (PR #1879 vs. V2 hookDispose helper) Drop the locally-defined `waitForHook(expectedRunId)` shadow in the hookDisposeTestWorkflow test in favor of the new top-level `waitForHook(token, { runId })` API merged from main. Keep V2's event-driven `waitForHookDisposal()` helper, which is stricter than main's `await sleep(3_000)` (it polls `getHookByToken` for the HookNotFoundError signal rather than waiting a fixed interval). * `packages/world-local/src/storage/events-storage.ts` (PR #1877 vs. V2 per-step async mutex) Both branches were closing race conditions in the local world's step lifecycle. PR #1877 adds filesystem-level O_CREAT|O_EXCL locks for `step_created` and `wait_created`; the V2 branch already wraps step lifecycle events in an in-process `withStepLock` mutex. They compose: the mutex serializes within a single Node process; the filesystem lock additionally protects against cross-process races (multiple pnpm workers, redelivered queue messages). Resolution takes V2's mutex wrap as the base and patches in main's two `writeExclusive` claims at the start of the `step_created` and `wait_created` handlers, with comments noting the dual-layer guarantee. `packages/core/e2e/utils.ts` auto-merged: V2's nextjs-webpack + all-Vercel source-map carve-outs preserved alongside main's `tanstack-start` addition to `hasWorkflowSourceMaps`. Verified locally: `@workflow/core` and `@workflow/world-local` typecheck clean; all 343 world-local unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ercel#1879) Hook-related e2e tests (hookWorkflow, hookCleanupTestWorkflow, hookDisposeTestWorkflow, hookWithSleepWorkflow, distributedAbortController) previously slept a fixed 5 seconds before calling getHookByToken to wait for the hook to be registered. On slower runtimes — notably the snapshot runtime on Vercel where each workflow round-trip is several seconds longer than replay — that fixed budget is too tight and the test fails with HookNotFoundError. On faster runtimes it's unnecessarily slow. Adds a waitForHook(token, { timeoutMs, intervalMs, runId }) helper that polls until the hook resolves or the timeout (default 30s) expires, with an optional runId filter for token-reuse tests where eventually-consistent backends may briefly still report a stale hook. Each hook-wait site now uses this helper. Non-hook fixed sleeps (workflow-progress polling for sleepingWorkflow cancel tests, payload-processing waits in hookWithSleepWorkflow) are unchanged.
Resolve conflicts: - packages/core/src/serialization/* (workflow.ts, step.ts, client.ts, codec-devalue.ts, errors.ts, common.ts): take main's version (post-#1849 SerializationError + post-#1851 first-class Error subclass reducers). - packages/core/src/serialization/types.ts: take main's per-Error-subclass payload shapes; re-add GZIP/ZSTD format prefixes from snapshot-runtime. - packages/core/src/serialization.ts: take main's V2 helpers (dehydrateStepError, hydrateStepError, dehydrateRunError, hydrateRunError, getWorldLazy import). - packages/core/src/runtime.ts: take main's V2 inline-replay loop + step-executor + memoizeEncryptionKey + dehydrateRunError patterns; layer back snapshot dispatch (useSnapshotRuntime + runWorkflowWithSnapshots) before the V2 main replay loop, after run_started setup. - packages/core/src/runtime/start.ts: take main's getWorldLazy; keep snapshot's getWorkflowRuntimeFromEnv usage. - packages/world-local/src/storage/index.ts: take main's local-var refactor + LocalStorage type; layer back snapshots storage entry. - packages/world-local/src/storage/events-storage.ts: take main's version (already includes #1877 dedup atomicity and #1851 Uint8Array passthrough). - packages/world-postgres: take main's tightened EntityConflictError gate (constraint name match) and waitCreated test assertion. Renumber snapshot's 0010_add_snapshots_table.sql to 0012; drop branch's duplicate 0011_add_events_entity_creation_unique_index.sql in favor of main's 0010 with dedup CTE. - packages/core/e2e/e2e.test.ts: take main's #1879 waitForHookDisposal. - .github/workflows/tests.yml: take main's runLabel/artifactSuffix naming scheme; keep snapshot's WORKFLOW_RUNTIME env var; take main's Windows job structure. - scripts/create-test-matrix.mjs: extend the runtime cross-product to fold runtime into runLabel and artifactSuffix so artifacts/job names remain unique. Snapshot dispatch is now layered on top of V2: when a workflow message arrives and the run's runtime mode is 'snapshot', runtime.ts delegates to runWorkflowWithSnapshots and returns. The V2 inline-replay loop and inline executeStep path remain in place for replay-mode runs and for inline step execution from background-step deliveries (snapshot mode will also re-route step queueing to the unified workflow queue in a follow-up commit so steps hit V2's executeStep instead of stepEntrypoint).
Summary
Replaces 8 fixed
setTimeout(5_000)waits in hook-related e2e tests with awaitForHook(token, { runId, timeoutMs, intervalMs })helper that pollsgetHookByTokenuntil it resolves or the timeout fires.Background
Hook-related e2e tests (
hookWorkflow,hookCleanupTestWorkflow,hookDisposeTestWorkflow,hookWithSleepWorkflow,distributedAbortController×3) all sleep a fixed 5 seconds before callinggetHookByToken, then assume the hook is registered. That budget is:createHookcall within 5s, and the test fails withHookNotFoundError.Fix
Adds a
waitForHookhelper at the top ofpackages/core/e2e/e2e.test.tsthat polls (default 250ms interval, 30s timeout) and exits early on success. The optionalrunIdfilter handles eventually-consistent backends where a stale lookup may still resolve to a previous run's hook for the same token (used byhookCleanupTestWorkflow/hookDisposeTestWorkflowtoken-reuse cases).Each affected call site replaces the
setTimeout(5_000) → getHookByToken(token)pair withwaitForHook(token, { runId: run.runId }). Non-hook fixed sleeps (thesleepingWorkflowcancel tests, payload-processing waits inhookWithSleepWorkflow) are untouched.Verification
The e2e file is exercised by the dedicated e2e CI pipeline against deployed worlds. Single-file change, 72 insertions / 49 deletions.
Extracted from PR #1300 (snapshot-runtime). The slowness issue was most visible on Vercel under the snapshot runtime where each round-trip is several seconds longer, but the fix makes the tests faster and more reliable on every runtime.