feat(runtime): invocation dry-run/simulation with Cloud-compatible run records (#186 P1) by kjgbot · Pull Request #187 · AgentWorkforce/workforce

kjgbot · 2026-06-03T17:40:42Z

User description

Summary

P1 of #186 — a TRUE invocation dry-run/simulation, distinct from the deploy-preflight --dry-run (which validates persona/config and exits without ever invoking the handler).

createSimulationSubsystems() (packages/runtime/src/simulate/subsystems.ts): recording, no-side-effect implementations of every ctx channel a handler can reach — harness.run, llm.complete, sandbox.exec/readFile/writeFile, files.read/write, memory.save/recall, workflow.run/status, schedule.at/cancel, log. Nothing leaves the process; every call is recorded ({kind, at, args, simulatedResult}) with the inert result the handler received instead. File writes land in an in-memory VFS (seedable via files, persists across envelopes); reads of unseeded paths fail with a message naming the seeding option.
simulateInvocation() (simulate.ts): replays fixture RawGatewayEnvelopes through the same shimEnvelope the runner uses, with runner-equivalent dispatch semantics (unsupported envelopes skipped + reported, per-envelope error isolation — one throwing handler never aborts the replay).
Cloud-compatible run records: one per dispatched envelope, mirroring Cloud's hosted compact run shape field-for-field (cloud#1788 compactBase + detail logs) with origin: "local_dry_run" — the origin value Cloud reserved for exactly this ingestion. summary is populated from the handler's returned value (string or {summary}), falling back to the error message on failure. failureClass uses Cloud's exact 7-value union (locally only success/runner_error occur, but the type stays identical). Simulation-only data (recorded side effects, captured logs) nests under an additive simulation key so the core shape stays byte-compatible.
Exported from the @agentworkforce/runtime root and a new ./simulate subpath.

exitCode: 0 when every dispatched envelope succeeded, 1 when any failed; unsupported envelopes are reported but are not failures (mirrors the runner's warn-and-continue).

Why

Cloud#1783 shipped the hosted run-observability side; per its division of labor the simulation half belongs here. This gives developers "if this agent received this event, what would happen?" with captured ctx.log(...), summary, and predicted (recorded-not-executed) side effects — and a record Cloud can later ingest alongside hosted runs without redesign.

Tests

11 new node:test cases: success-path golden record shape (field-for-field), handler-throw isolation + failure-class invariant (an error can never read as success), all 13 side-effect channel recordings, unseeded-read failure hint, VFS seeding + cross-envelope persistence, unsupported-envelope reporting, workspaceId fallback, explicit triggerKind override, raw-handler acceptance, per-run sink attribution.
Full workspace gate green: pnpm run check (lint + typecheck incl. examples + tests across all packages; runtime suite 49/49, CLI 211/211).

Out of scope (P2, after this review)

workforce invoke --persona <path> --fixture <event.json|.ndjsonl> CLI surface + docs/README distinguishing the two dry-run kinds.

Refs #186, AgentWorkforce/cloud#1783, AgentWorkforce/cloud#1788

🤖 Generated with Claude Code

CodeAnt-AI Description

Add a true invocation simulation that replays fixture events and returns Cloud-style run records

What Changed

Added a new simulation mode that runs a persona handler against fixture envelopes without executing external side effects
Each dispatched envelope now produces a run record with status, timing, trigger, logs, and a simulation summary
Unsupported fixture envelopes are reported and skipped instead of stopping the replay; handler failures are isolated to the envelope that caused them
File reads now require seeded content or a prior write, and simulation keeps written files available for later envelopes
Exported the simulation API from the runtime package root and a new ./simulate entry point

Impact

✅ Safer invocation dry-runs
✅ Clearer replay results for fixture runs
✅ Fewer surprises from unsupported events

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

…n records (#186) P1 of workforce#186: a TRUE invocation dry-run, distinct from the deploy-preflight --dry-run (which validates persona/config and exits without invoking the handler). - simulate/subsystems.ts: createSimulationSubsystems() — recording, no-side-effect implementations of every ctx channel a handler can reach (harness.run, llm.complete, sandbox exec/read/write, files read/write, memory save/recall, workflow run/status, schedule at/cancel, log). Nothing leaves the process; every call is recorded with the inert simulated result the handler received. File writes go to an in-memory VFS (seedable, persists across envelopes); reads of unseeded paths fail with a seeding hint. - simulate/simulate.ts: simulateInvocation() — replays fixture RawGatewayEnvelopes through the same shimEnvelope the runner uses, dispatches to the handler with runner-equivalent semantics (unsupported envelopes skipped + reported, per-envelope error isolation), and emits one run record per dispatched envelope. - Run records mirror Cloud's hosted compact run shape field-for-field (cloud#1788 compactBase + detail logs) with origin:"local_dry_run" — the origin value Cloud reserved for this ingestion. summary is populated from the handler's returned value (or the error message). failureClass uses Cloud's exact union; simulation-only data nests under an additive `simulation` key. - Exported via @agentworkforce/runtime root and a ./simulate subpath. 11 new node:test cases; full workspace check (lint+typecheck+test) green. P2 (CLI `workforce invoke --fixture` + docs) follows after P1 review. Refs #186, AgentWorkforce/cloud#1783, AgentWorkforce/cloud#1788 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codeant-ai · 2026-06-03T17:40:47Z

CodeAnt AI is reviewing your PR.

gemini-code-assist · 2026-06-03T17:40:47Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-06-03T17:41:10Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d53079dc-fd0f-4f70-8b26-a07b03581046

📥 Commits

Reviewing files that changed from the base of the PR and between 1e41e47 and 6eef4d1.

📒 Files selected for processing (8)

packages/runtime/package.json
packages/runtime/src/index.ts
packages/runtime/src/simulate/failure-class.ts
packages/runtime/src/simulate/index.ts
packages/runtime/src/simulate/simulate.test.ts
packages/runtime/src/simulate/simulate.ts
packages/runtime/src/simulate/subsystems.ts
packages/runtime/src/simulate/types.ts

📝 Walkthrough

Walkthrough

This PR adds a comprehensive local simulation/dry-run API to the runtime package. It enables deterministic envelope replay with side-effect recording, logging capture, and Cloud-compatible run records, including type contracts, recording subsystems, core invocation logic, and public exports.

Changes

Simulation and Dry-Run API

Layer / File(s)	Summary
Type Contracts and Failure Classification `packages/runtime/src/simulate/types.ts`, `packages/runtime/src/simulate/failure-class.ts`, `simulate.test.ts` (failure class unit test)	Core type contracts define `RecordedSideEffect`, `CapturedLogLine`, `SimulatedRunRecord`, `UnsupportedEnvelope`, `SimulationResult`, and `SimulateInvocationOptions`. Failure classification types and `deriveSimulatedRunFailureClass` function handle run success/failure inference from handler status and error presence.
Simulation Subsystems Factory `packages/runtime/src/simulate/subsystems.ts`, `simulate.test.ts` (side effects, VFS, sink tests)	Recording-only implementations of sandbox, file, LLM, memory, workflow, schedule, logging, and harness contexts. Each records side effects with timestamps into an active sink; in-memory VFS is seeded and persisted; `useSink` swaps sink attribution per envelope; contexts return inert simulated results without executing dangerous operations.
Simulation Invocation and Run Recording `packages/runtime/src/simulate/simulate.ts`, `simulate.test.ts` (invocation tests)	`simulateInvocation` accepts handler and replay envelopes, invokes handler per envelope with simulation subsystems, derives run summaries and trigger kinds, renders logs split by level, and aggregates Cloud-compatible run records. Unsupported envelopes are recorded separately without failing the overall result. Handler failures continue replay to subsequent envelopes.
Public API and Package Exports `packages/runtime/src/simulate/index.ts`, `packages/runtime/src/index.ts`, `packages/runtime/package.json`	Simulate module entrypoint consolidates all public APIs; runtime index re-exports to top level; package.json declares `./simulate` export path with TypeScript and JavaScript entry points.

Sequence Diagram

sequenceDiagram
  participant Handler as User Handler
  participant Invocation as simulateInvocation
  participant Subsystems as SimulationSubsystems
  participant Sink as SimulationSink
  participant Record as SimulatedRunRecord
  Invocation->>Invocation: Validate handler & materialize envelopes
  loop Per envelope
    Invocation->>Subsystems: Create/swap sink
    Invocation->>Subsystems: Shim envelope to WorkforceEvent
    Invocation->>Handler: Invoke with subsystems context
    Subsystems->>Sink: Record side effects & logs
    Handler-->>Invocation: Return summary or error
    Invocation->>Record: Aggregate run record with logs & side effects
  end
  Invocation->>Record: Include unsupported envelopes
  Invocation-->>Invocation: Return SimulationResult with exitCode

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The PR introduces a new, substantial simulation API with multiple interdependent modules, comprehensive type contracts, and detailed subsystem implementations. The logic is dense in places (especially simulateInvocation and subsystems factories) but well-structured across separate concerns. The test suite is thorough and covers both happy and failure paths, aiding review understanding.

Possibly related issues

#186: Implements the requested local dry-run/simulation invocation API with deterministic envelope replay, ctx.log capture, Cloud-compatible run records, and comprehensive side-effect recording across all subsystem channels.

Poem

🐰 A rabbit hops through simulations bright,
No side effects harmed in testing's light.
Envelopes shimmed, handlers invoked with care,
Cloud-compatible records float through the air!
Dry-runs complete—logs captured with delight. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the main change: adding invocation dry-run/simulation with Cloud-compatible run records. It is specific, on-topic, and reflects the core objective of the PR.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset. It explains the simulation mode, key components (subsystems, simulation invocation), Cloud-compatible records, and testing approach, all of which align with the actual changes in the PR.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/invocation-simulation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codeant-ai · 2026-06-03T17:44:44Z

+  const deployment: WorkforceDeploymentContext = {
+    id: options.deployment?.id ?? 'sim-deployment',
+    triggerKind: options.deployment?.triggerKind ?? 'inbox',
+    parentDeploymentId: options.deployment?.parentDeploymentId ?? null
+  };


🟠 Architect Review — HIGH

When deployment metadata is omitted, ctx.deployment.triggerKind is always synthesized as 'inbox' even for cron envelopes, while the emitted run records derive trigger.kind as 'clock', so simulation produces an inconsistent trigger kind between handler context and run records compared to production.

Suggestion: Derive the synthesized deployment triggerKind from the replayed envelopes' sources (e.g. 'clock' for cron events, 'inbox' for provider events), or require an explicit deployment.triggerKind, so ctx.deployment.triggerKind and the run-record trigger.kind remain aligned with each other and with production semantics.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is an **Architect / Logical Review** comment left during a code review. These reviews are first-class, important findings — not optional suggestions. Do NOT dismiss this as a 'big architectural change' just because the title says architect review; most of these can be resolved with a small, localized fix once the intent is understood. **Path:** packages/runtime/src/simulate/simulate.ts **Line:** 69:73 **Comment:** *HIGH: When deployment metadata is omitted, `ctx.deployment.triggerKind` is always synthesized as `'inbox'` even for cron envelopes, while the emitted run records derive `trigger.kind` as `'clock'`, so simulation produces an inconsistent trigger kind between handler context and run records compared to production. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. If a suggested approach is provided above, use it as the authoritative instruction. If no explicit code suggestion is given, you MUST still draft and apply your own minimal, localized fix — do not punt back with 'no suggestion provided, review manually'. Keep the change as small as possible: add a guard clause, gate on a loading state, reorder an await, wrap in a conditional, etc. Do not refactor surrounding code or expand scope beyond the finding. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

codeant-ai · 2026-06-03T17:45:05Z

+      record('sandbox.exec', { cmd, ...(opts?.cwd ? { cwd: opts.cwd } : {}) }, {
+        ...result,
+        note: SIMULATED
+      });


Suggestion: The simulated sandbox.exec recording ignores env and timeoutMs, so execution traces omit key execution context that callers passed. Record these options as part of the side-effect args to avoid incorrect or incomplete simulation observability. [incomplete implementation]

Severity Level: Major ⚠️

- ⚠️ Sandbox exec traces omit env and timeout configuration. - ⚠️ Reproducing simulated shell behavior from logs becomes harder.

Steps of Reproduction ✅

1. The sandbox execution options are defined as `SandboxExecArgs` in `packages/runtime/src/types.ts:202-206` with optional `cwd`, `env`, and `timeoutMs` fields. 2. In simulation, `createSimulationSubsystems` defines `sandbox.exec` at `subsystems.ts:87-96`, which runs `record('sandbox.exec', { cmd, ...(opts?.cwd ? { cwd: opts.cwd } : {}) }, { ...result, note: SIMULATED })` at `subsystems.ts:91-94` to store call metadata in `RecordedSideEffect.args` (see `record` at `subsystems.ts:61-71` and `RecordedSideEffect` at `simulate/types.ts:16-37`). 3. When a simulated handler (following the pattern in `simulate.test.ts:122-138`, where `await ctx.sandbox.exec('rm -rf / --no-preserve-root');` is used) is updated to call `ctx.sandbox.exec('cmd', { cwd: '/x', env: { FOO: 'bar' }, timeoutMs: 5000 })` using the `SandboxExecArgs` type, the `opts.env` and `opts.timeoutMs` values are visible to the handler but never included in the `record` call at `subsystems.ts:91-94`. 4. Running `simulateInvocation` with such a handler and inspecting `result.runs[0].simulation.sideEffects.find(e => e.kind === 'sandbox.exec')?.args` shows only `cmd` (and optionally `cwd`), with no `env` or `timeoutMs`, so the dry-run side-effect trace lacks key execution context the caller actually supplied.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** packages/runtime/src/simulate/subsystems.ts **Line:** 91:94 **Comment:** *Incomplete Implementation: The simulated `sandbox.exec` recording ignores `env` and `timeoutMs`, so execution traces omit key execution context that callers passed. Record these options as part of the side-effect args to avoid incorrect or incomplete simulation observability. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

👍 | 👎

codeant-ai · 2026-06-03T17:45:05Z

+    record('harness.run', {
+      promptChars: args.prompt.length,
+      ...(args.cwd ? { cwd: args.cwd } : {})
+    }, { exitCode: 0, note: SIMULATED });


Suggestion: The simulated harness.run recording drops inputs and env, so the recorded args no longer reflect the actual invocation contract and can produce misleading run records. Include all provided harness arguments in the recorded payload so dry-run traces stay faithful to real handler calls. [incomplete implementation]

Severity Level: Major ⚠️

- ⚠️ Harness dry-run traces omit `inputs` and `env` configuration. - ⚠️ Harder to debug harness behavior from simulation records.

Steps of Reproduction ✅

1. The harness invocation contract is defined in `packages/runtime/src/types.ts:191-199` as `HarnessRunArgs` with fields `prompt`, optional `cwd`, `inputs`, and `env`. 2. In simulation, `createSimulationSubsystems` at `subsystems.ts:50-56` defines `harnessRunner` at `subsystems.ts:181-188`, which the ctx builder wires into `ctx.harness.run` via `buildCtx` usage at `simulate.ts:84-97`. 3. When `ctx.harness.run` is called in a simulated handler (pattern shown in `simulate.test.ts:122-138`, where `await ctx.harness.run({ prompt: 'do the thing' });` is used), the simulation's `harnessRunner` receives the full `HarnessRunArgs`, but the recording call at `subsystems.ts:183-186` only persists `{ promptChars: args.prompt.length, ...(args.cwd ? { cwd: args.cwd } : {}) }` into `RecordedSideEffect.args` (see `record` implementation at `subsystems.ts:61-71` and `RecordedSideEffect.args` definition at `simulate/types.ts:16-37`). 4. To observe the loss of fidelity, extend the existing side-effect test handler at `simulate.test.ts:124-138` to pass `inputs` and `env` into `ctx.harness.run` (using the contract from `types.ts:191-199`), run `simulateInvocation`, and inspect `result.runs[0].simulation.sideEffects.find(e => e.kind === 'harness.run')?.args`: it will only contain `promptChars` (and maybe `cwd`), with no representation of the `inputs` or `env` actually used by the handler.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** packages/runtime/src/simulate/subsystems.ts **Line:** 183:186 **Comment:** *Incomplete Implementation: The simulated `harness.run` recording drops `inputs` and `env`, so the recorded `args` no longer reflect the actual invocation contract and can produce misleading run records. Include all provided harness arguments in the recorded payload so dry-run traces stay faithful to real handler calls. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

👍 | 👎

codeant-ai · 2026-06-03T17:45:05Z

+    useSink(next) {
+      sink = next;
+    },


Suggestion: Using a single mutable sink that is swapped between envelopes can misattribute side effects/logs when a handler leaves background async work running after it returns. Late async emissions from the previous invocation will be written into the next envelope's sink. Use per-invocation bound record/log functions (or an immutable sink captured at dispatch time) instead of a globally swapped sink reference. [race condition]

Severity Level: Major ⚠️

- ❌ Late async logs attached to the wrong simulated run. - ⚠️ Multi-envelope dry-run debugging sees misleading per-run logs.

Steps of Reproduction ✅

1. Entry point `simulateInvocation` at `packages/runtime/src/simulate/simulate.ts:36-38` accepts a `handler` and fixtures, then creates shared subsystems via `createSimulationSubsystems` at `simulate.ts:76-79`. 2. A single `ctx` is built at `simulate.ts:84-97` with `log` and other channels wired to the simulation subsystems from `subsystems.ts:87-198`; the log implementation closes over the mutable `sink` variable defined at `subsystems.ts:56` and mutated in `useSink` at `subsystems.ts:199-201`. 3. For each envelope, `simulateInvocation` iterates at `simulate.ts:103-156`, creating a fresh `sink: SimulationSink` at `simulate.ts:110`, then calling `subsystems.useSink(sink)` at `simulate.ts:111` before awaiting `handlerFn(ctx, event)` at `simulate.ts:117`. 4. If a user handler (following the pattern in `simulate.test.ts:52-62` where `ctx.log('info', 'saw event', ...)` is used) starts background async work that logs after it returns (e.g. `setTimeout(() => ctx.log('info','late'), 100)`), then: after the first envelope's `await handlerFn` completes, the loop advances, `useSink` is called for the second envelope (`simulate.ts:110-112`), mutating `sink`. When the background task from the first envelope finally calls `ctx.log`, the log implementation at `subsystems.ts:172-178` pushes into the *current* `sink.logs` (second envelope), misattributing the side effect to the wrong run.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** packages/runtime/src/simulate/subsystems.ts **Line:** 199:201 **Comment:** *Race Condition: Using a single mutable sink that is swapped between envelopes can misattribute side effects/logs when a handler leaves background async work running after it returns. Late async emissions from the previous invocation will be written into the next envelope's sink. Use per-invocation bound record/log functions (or an immutable sink captured at dispatch time) instead of a globally swapped sink reference. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

👍 | 👎

codeant-ai · 2026-06-03T17:46:48Z

+  const ctx = buildCtx({
+    persona: options.persona,
+    agent,
+    deployment,
+    workspaceId,
+    sandbox: subsystems.sandbox,
+    files: subsystems.files,
+    llm: subsystems.llm,
+    memory: subsystems.memory,
+    workflow: subsystems.workflow,
+    schedule: subsystems.schedule,
+    log: subsystems.log,
+    harnessRunner: subsystems.harnessRunner
+  });


Suggestion: Simulation still builds ctx through buildCtx without replacing the credentials subsystem, so ctx.credentials remains environment-backed and can throw missing-credential errors during dry-run. That breaks simulation determinism and violates the "recording/no-side-effect local run" behavior when handlers read credentials. Inject a simulated credentials context (or extend buildCtx to accept one) so simulation does not depend on host env secrets. [incomplete implementation]

Severity Level: Major ⚠️

- ❌ Simulation fails when handlers require credentials without env set. - ⚠️ Local dry-run unusable on machines lacking Cloud tokens. - ⚠️ Breaks documented no-side-effect local simulation expectations.

Steps of Reproduction ✅

1. In `packages/runtime/src/simulate/simulate.ts`, `simulateInvocation` constructs the handler context once via `const ctx = buildCtx({ ... })` at lines 84–97, passing `persona`, `agent`, `deployment`, `workspaceId`, `sandbox`, `files`, `llm`, `memory`, `workflow`, `schedule`, `log`, and `harnessRunner`, but not providing any override for credentials. 2. Inspect `buildCtx` in `packages/runtime/src/ctx.ts`: at lines 118–148 it builds the `WorkforceCtx` and unconditionally sets `credentials: credentialsFromEnv()` at line 143, meaning every context (including simulation contexts) uses process-environment-backed credentials. 3. `credentialsFromEnv` in `ctx.ts` lines 203–219 returns an object whose getters delegate to `requireRuntimeCredentials(processEnv)`, and `requireRuntimeCredentials` at lines 221–225 calls `readRuntimeCredentialSnapshot(processEnv)` and throws `new Error("Runtime credentials are required: missing ...")` when any of the expected RELAYFILE_* or CLOUD_API_* environment variables are absent (as exercised by the `ctx.credentials` tests in `packages/runtime/src/ctx.test.ts:24–45`). 4. Run `simulateInvocation` (exported for consumers via `packages/runtime/src/index.ts:5–22`) in a local environment that lacks full runtime credentials (e.g. no `RELAYFILE_TOKEN` or `CLOUD_API_ACCESS_TOKEN`) with a handler that reads credentials, such as `handler(async (ctx) => { ctx.credentials.require(); })`; when `simulateInvocation` calls `buildCtx` (simulate.ts:84–97), `ctx.credentials` is bound to `credentialsFromEnv`, and the first call to `ctx.credentials.require()` inside the simulated handler immediately throws the same "Runtime credentials are required: missing …" error as production, causing the simulation run to fail even though the rest of the simulation subsystems (`llm`, `memory`, `workflow`, `schedule`, `sandbox`, `harnessRunner`) are designed to be no-side-effect and independent of cloud configuration.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** packages/runtime/src/simulate/simulate.ts **Line:** 84:97 **Comment:** *Incomplete Implementation: Simulation still builds `ctx` through `buildCtx` without replacing the credentials subsystem, so `ctx.credentials` remains environment-backed and can throw missing-credential errors during dry-run. That breaks simulation determinism and violates the "recording/no-side-effect local run" behavior when handlers read credentials. Inject a simulated credentials context (or extend `buildCtx` to accept one) so simulation does not depend on host env secrets. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

👍 | 👎

codeant-ai · 2026-06-03T17:46:48Z

+    const sink: SimulationSink = { sideEffects: [], logs: [] };
+    subsystems.useSink(sink);
+
+    const runStarted = now();
+    let summary: string | null = null;
+    let error: string | null = null;
+    try {
+      const returned: unknown = await handlerFn(ctx, event);
+      summary = summaryFromHandlerReturn(returned);
+    } catch (err) {
+      error = err instanceof Error ? err.message : String(err);


Suggestion: Side-effect/log attribution is race-prone because a single mutable sink is swapped between envelopes and reused by the same ctx. If a handler starts background async work (e.g. timer/unawaited promise) that calls ctx.log or other ctx.* methods after the handler returns, those records will be written into the next envelope's sink, corrupting run records. Use per-envelope subsystem instances (or bind immutable per-run sink closures) so late async emissions cannot cross-contaminate runs. [race condition]

Severity Level: Major ⚠️

- ❌ Simulated run logs mis-attributed between envelopes in replay. - ⚠️ Side-effect records from one run appear under another. - ⚠️ Cloud run records become unreliable for debugging simulations.

Steps of Reproduction ✅

1. Observe the simulation subsystems implementation in `packages/runtime/src/simulate/subsystems.ts`: `createSimulationSubsystems` declares a single mutable variable `let sink: SimulationSink = { sideEffects: [], logs: [] };` at line 56 and exposes `useSink(next)` at lines 199–201, which simply reassigns this shared `sink` variable. 2. Note that all simulated side effects and logs are written through closures that capture this mutable `sink`: `record(...)` at lines 61–71 pushes into `sink.sideEffects`, and the `log` implementation at lines 172–179 pushes into `sink.logs`, both referencing whatever `sink` object `useSink` last assigned. 3. In `packages/runtime/src/simulate/simulate.ts`, `simulateInvocation` builds a single `ctx` once at lines 84–97 using `buildCtx(...)`, then in the per-envelope loop at lines 103–112 it creates a fresh `sink` object (`const sink: SimulationSink = { sideEffects: [], logs: [] };`) and calls `subsystems.useSink(sink);` before awaiting `handlerFn(ctx, event)` at lines 116–119, so the same `ctx` instance is reused for every envelope while only the shared `sink` pointer is swapped. 4. Define a handler using the public API (patterned on `simulateInvocation` tests in `packages/runtime/src/simulate/simulate.test.ts:52–63`) such as `handler(async (ctx, event) => { setTimeout(() => ctx.log('info', 'late log', { id: event.id }), 0); })`, then call `simulateInvocation` with two envelopes; after the first `handlerFn` returns, `simulateInvocation` advances to the next envelope and calls `subsystems.useSink(nextSink)` (simulate.ts:110–112), so when the `setTimeout` callback eventually runs it uses the same `ctx.log` closure but now writes into the second run's `sink.logs`, causing the first envelope's late log to appear under the second run's `simulation.capturedLogs` and leaving the first run's logs incomplete.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** packages/runtime/src/simulate/simulate.ts **Line:** 110:120 **Comment:** *Race Condition: Side-effect/log attribution is race-prone because a single mutable sink is swapped between envelopes and reused by the same `ctx`. If a handler starts background async work (e.g. timer/unawaited promise) that calls `ctx.log` or other `ctx.*` methods after the handler returns, those records will be written into the next envelope's sink, corrupting run records. Use per-envelope subsystem instances (or bind immutable per-run sink closures) so late async emissions cannot cross-contaminate runs. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

👍 | 👎

codeant-ai · 2026-06-03T17:46:53Z

CodeAnt AI finished reviewing your PR.

cubic-dev-ai

5 issues found across 8 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/runtime/src/simulate/subsystems.ts">

<violation number="1" location="packages/runtime/src/simulate/subsystems.ts:173">
P2: Captured `ctx.log` attrs are stored as-is, so non-JSON values can crash log rendering and abort simulation replay.</violation>

<violation number="2" location="packages/runtime/src/simulate/subsystems.ts:200">
P2: Per-envelope attribution is fragile because a global mutable sink can misattribute late async writes to the wrong run.</violation>
</file>

<file name="packages/runtime/src/simulate/simulate.ts">

<violation number="1" location="packages/runtime/src/simulate/simulate.ts:71">
P2: Defaulting `ctx.deployment.triggerKind` to `"inbox"` makes cron simulations report the wrong trigger kind inside handler context.</violation>

<violation number="2" location="packages/runtime/src/simulate/simulate.ts:84">
P1: Inject a simulated credentials implementation when building the simulation context; otherwise handlers that call `ctx.credentials` can fail based on host environment secrets, which breaks deterministic dry-run behavior.</violation>
</file>

<file name="packages/runtime/src/simulate/types.ts">

<violation number="1" location="packages/runtime/src/simulate/types.ts:77">
P2: `trigger.kind` is typed too loosely (`string`) instead of the documented trigger-kind union, which weakens Cloud-shape compatibility guarantees.</violation>
</file>

_{Reply with feedback, questions, or to request a fix.

Re-trigger cubic}

cubic-dev-ai · 2026-06-03T17:52:59Z

+  // buildCtx may throw for unresolved required persona inputs — that is a
+  // simulation setup error (seed inputs via `agent.inputValues`), surfaced
+  // to the caller before any run records exist.
+  const ctx = buildCtx({


P1: Inject a simulated credentials implementation when building the simulation context; otherwise handlers that call ctx.credentials can fail based on host environment secrets, which breaks deterministic dry-run behavior.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/runtime/src/simulate/simulate.ts, line 84: <comment>Inject a simulated credentials implementation when building the simulation context; otherwise handlers that call `ctx.credentials` can fail based on host environment secrets, which breaks deterministic dry-run behavior.</comment> <file context> @@ -0,0 +1,236 @@ + // buildCtx may throw for unresolved required persona inputs — that is a + // simulation setup error (seed inputs via `agent.inputValues`), surfaced + // to the caller before any run records exist. + const ctx = buildCtx({ + persona: options.persona, + agent, </file context>

cubic-dev-ai · 2026-06-03T17:52:59Z

+    log,
+    harnessRunner,
+    useSink(next) {
+      sink = next;


P2: Per-envelope attribution is fragile because a global mutable sink can misattribute late async writes to the wrong run.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/runtime/src/simulate/subsystems.ts, line 200: <comment>Per-envelope attribution is fragile because a global mutable sink can misattribute late async writes to the wrong run.</comment> <file context> @@ -0,0 +1,206 @@ + log, + harnessRunner, + useSink(next) { + sink = next; + }, + vfsSnapshot() { </file context>

cubic-dev-ai · 2026-06-03T17:52:59Z

+  };
+
+  const log: WorkforceCtx['log'] = (level, message, attrs) => {
+    sink.logs.push({


P2: Captured ctx.log attrs are stored as-is, so non-JSON values can crash log rendering and abort simulation replay.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/runtime/src/simulate/subsystems.ts, line 173: <comment>Captured `ctx.log` attrs are stored as-is, so non-JSON values can crash log rendering and abort simulation replay.</comment> <file context> @@ -0,0 +1,206 @@ + }; + + const log: WorkforceCtx['log'] = (level, message, attrs) => { + sink.logs.push({ + t: now().toISOString(), + level, </file context>

cubic-dev-ai · 2026-06-03T17:52:59Z

+
+  const deployment: WorkforceDeploymentContext = {
+    id: options.deployment?.id ?? 'sim-deployment',
+    triggerKind: options.deployment?.triggerKind ?? 'inbox',


P2: Defaulting ctx.deployment.triggerKind to "inbox" makes cron simulations report the wrong trigger kind inside handler context.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/runtime/src/simulate/simulate.ts, line 71: <comment>Defaulting `ctx.deployment.triggerKind` to `"inbox"` makes cron simulations report the wrong trigger kind inside handler context.</comment> <file context> @@ -0,0 +1,236 @@ + + const deployment: WorkforceDeploymentContext = { + id: options.deployment?.id ?? 'sim-deployment', + triggerKind: options.deployment?.triggerKind ?? 'inbox', + parentDeploymentId: options.deployment?.parentDeploymentId ?? null + }; </file context>

cubic-dev-ai · 2026-06-03T17:52:59Z

+  durationMs: number;
+  trigger: {
+    /** Workforce trigger vocabulary (`inbox` | `clock` | `radio`). */
+    kind: string;


P2: trigger.kind is typed too loosely (string) instead of the documented trigger-kind union, which weakens Cloud-shape compatibility guarantees.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/runtime/src/simulate/types.ts, line 77: <comment>`trigger.kind` is typed too loosely (`string`) instead of the documented trigger-kind union, which weakens Cloud-shape compatibility guarantees.</comment> <file context> @@ -0,0 +1,155 @@ + durationMs: number; + trigger: { + /** Workforce trigger vocabulary (`inbox` | `clock` | `radio`). */ + kind: string; + eventSource: string; + }; </file context>

codeant-ai Bot added the size:XL This PR changes 500-999 lines, ignoring generated files label Jun 3, 2026

codeant-ai Bot reviewed Jun 3, 2026

View reviewed changes

cubic-dev-ai Bot reviewed Jun 3, 2026

View reviewed changes

kjgbot mentioned this pull request Jun 3, 2026

feat(cli): workforce invoke — fixture-replay simulation CLI (#186 P2) #188

Merged

4 tasks

kjgbot merged commit e66ffe3 into main Jun 3, 2026
3 checks passed

kjgbot deleted the feat/invocation-simulation branch June 3, 2026 17:54

kjgbot mentioned this pull request Jun 3, 2026

Invocation dry-run / simulation: fixture replay, ctx.log capture, Cloud-compatible run shape (#1783 follow-up) #186

Closed

4 tasks

khaliqgant mentioned this pull request Jun 4, 2026

Fixture discoverability for invoke: runs export, --scaffold, and an envelope contract test vs cloud #189

Closed

Conversation

kjgbot commented Jun 3, 2026 • edited by codeant-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

Summary

Why

Tests

Out of scope (P2, after this review)

CodeAnt-AI Description

What Changed

Impact

Checking Your Pull Request

Talking to CodeAnt AI

Example

Preserve Org Learnings with CodeAnt

Example

Retrigger review

Check Your Repository Health

Uh oh!

codeant-ai Bot commented Jun 3, 2026

Uh oh!

gemini-code-assist Bot commented Jun 3, 2026

Uh oh!

coderabbitai Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related issues

Poem

Uh oh!

codeant-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai Bot commented Jun 3, 2026

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

kjgbot commented Jun 3, 2026 •

edited by codeant-ai Bot

Loading

coderabbitai Bot commented Jun 3, 2026 •

edited

Loading