feat(runtime): invocation dry-run/simulation with Cloud-compatible run records (#186 P1) #187
…n records (#186)

P1 of workforce#186: a TRUE invocation dry-run, distinct from the deploy-preflight --dry-run (which validates persona/config and exits without invoking the handler).

- simulate/subsystems.ts, createSimulationSubsystems(): recording, no-side-effect implementations of every ctx channel a handler can reach (harness.run, llm.complete, sandbox exec/read/write, files read/write, memory save/recall, workflow run/status, schedule at/cancel, log). Nothing leaves the process; every call is recorded with the inert simulated result the handler received. File writes go to an in-memory VFS (seedable, persists across envelopes); reads of unseeded paths fail with a seeding hint.
- simulate/simulate.ts, simulateInvocation(): replays fixture RawGatewayEnvelopes through the same shimEnvelope the runner uses, dispatches to the handler with runner-equivalent semantics (unsupported envelopes skipped + reported, per-envelope error isolation), and emits one run record per dispatched envelope.
- Run records mirror Cloud's hosted compact run shape field-for-field (cloud#1788 compactBase + detail logs) with origin:"local_dry_run", the origin value Cloud reserved for this ingestion. summary is populated from the handler's returned value (or the error message). failureClass uses Cloud's exact union; simulation-only data nests under an additive `simulation` key.
- Exported via @agentworkforce/runtime root and a ./simulate subpath.

11 new node:test cases; full workspace check (lint+typecheck+test) green. P2 (CLI `workforce invoke --fixture` + docs) follows after P1 review.

Refs #186, AgentWorkforce/cloud#1783, AgentWorkforce/cloud#1788

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 Walkthrough
This PR adds a comprehensive local simulation/dry-run API to the runtime package. It enables deterministic envelope replay with side-effect recording, logging capture, and Cloud-compatible run records, including type contracts, recording subsystems, core invocation logic, and public exports.
Changes
Simulation and Dry-Run API
Sequence Diagram
sequenceDiagram
participant Handler as User Handler
participant Invocation as simulateInvocation
participant Subsystems as SimulationSubsystems
participant Sink as SimulationSink
participant Record as SimulatedRunRecord
Invocation->>Invocation: Validate handler & materialize envelopes
loop Per envelope
Invocation->>Subsystems: Create/swap sink
Invocation->>Subsystems: Shim envelope to WorkforceEvent
Invocation->>Handler: Invoke with subsystems context
Subsystems->>Sink: Record side effects & logs
Handler-->>Invocation: Return summary or error
Invocation->>Record: Aggregate run record with logs & side effects
end
Invocation->>Record: Include unsupported envelopes
Invocation-->>Invocation: Return SimulationResult with exitCode
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
The PR introduces a new, substantial simulation API with multiple interdependent modules, comprehensive type contracts, and detailed subsystem implementations. The logic is dense in places (especially
Possibly related issues
const deployment: WorkforceDeploymentContext = {
  id: options.deployment?.id ?? 'sim-deployment',
  triggerKind: options.deployment?.triggerKind ?? 'inbox',
  parentDeploymentId: options.deployment?.parentDeploymentId ?? null
};
🟠 Architect Review — HIGH
When deployment metadata is omitted, ctx.deployment.triggerKind is always synthesized as 'inbox', even for cron envelopes, while the emitted run records derive trigger.kind as 'clock'. The simulation therefore reports one trigger kind to the handler context and another in the run records, which does not match production.
Suggestion: Derive the synthesized deployment triggerKind from the replayed envelopes' sources (e.g. 'clock' for cron events, 'inbox' for provider events), or require an explicit deployment.triggerKind, so ctx.deployment.triggerKind and the run-record trigger.kind remain aligned with each other and with production semantics.
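A minimal sketch of the first option in the suggestion, assuming each replayed envelope exposes a source string and that scheduled events can be recognized by a 'cron' source. The SimEnvelope shape, the 'cron' literal, and deriveTriggerKind are hypothetical names for illustration, not the runtime's actual types.

```typescript
// Hypothetical shapes for illustration only; the real envelope type may differ.
type TriggerKind = 'inbox' | 'clock' | 'radio';

interface SimEnvelope {
  source: string; // assumed: 'cron' for scheduled events, a provider name otherwise
}

// An explicit option always wins; otherwise infer from the first replayed envelope.
function deriveTriggerKind(
  envelopes: readonly SimEnvelope[],
  explicit?: TriggerKind
): TriggerKind {
  if (explicit !== undefined) return explicit;
  const first = envelopes[0];
  return first !== undefined && first.source === 'cron' ? 'clock' : 'inbox';
}
```

A fixture set that mixes cron and provider envelopes has no single correct answer under this rule, which is an argument for the alternative of requiring an explicit deployment.triggerKind.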
**Path:** packages/runtime/src/simulate/simulate.ts
**Line:** 69:73
record('sandbox.exec', { cmd, ...(opts?.cwd ? { cwd: opts.cwd } : {}) }, {
  ...result,
  note: SIMULATED
});
Suggestion: The simulated sandbox.exec recording ignores env and timeoutMs, so execution traces omit key execution context that callers passed. Record these options as part of the side-effect args to avoid incorrect or incomplete simulation observability. [incomplete implementation]
Severity Level: Major ⚠️
- ⚠️ Sandbox exec traces omit env and timeout configuration.
- ⚠️ Reproducing simulated shell behavior from logs becomes harder.
Steps of Reproduction ✅
1. The sandbox execution options are defined as `SandboxExecArgs` in
`packages/runtime/src/types.ts:202-206` with optional `cwd`, `env`, and `timeoutMs`
fields.
2. In simulation, `createSimulationSubsystems` defines `sandbox.exec` at
`subsystems.ts:87-96`, which runs `record('sandbox.exec', { cmd, ...(opts?.cwd ? { cwd:
opts.cwd } : {}) }, { ...result, note: SIMULATED })` at `subsystems.ts:91-94` to store
call metadata in `RecordedSideEffect.args` (see `record` at `subsystems.ts:61-71` and
`RecordedSideEffect` at `simulate/types.ts:16-37`).
3. When a simulated handler (following the pattern in `simulate.test.ts:122-138`, where
`await ctx.sandbox.exec('rm -rf / --no-preserve-root');` is used) is updated to call
`ctx.sandbox.exec('cmd', { cwd: '/x', env: { FOO: 'bar' }, timeoutMs: 5000 })` using the
`SandboxExecArgs` type, the `opts.env` and `opts.timeoutMs` values are visible to the
handler but never included in the `record` call at `subsystems.ts:91-94`.
4. Running `simulateInvocation` with such a handler and inspecting
`result.runs[0].simulation.sideEffects.find(e => e.kind === 'sandbox.exec')?.args` shows
only `cmd` (and optionally `cwd`), with no `env` or `timeoutMs`, so the dry-run
side-effect trace lacks key execution context the caller actually supplied.
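One way to record the missing options, sketched under the assumption that SandboxExecArgs carries the optional cwd, env, and timeoutMs fields named in step 1. The types and the execRecordArgs helper below are stand-ins written for this example, not the code in subsystems.ts.

```typescript
// Stand-ins for the runtime types; field names follow the review comment.
interface SandboxExecArgs {
  cwd?: string;
  env?: Record<string, string>;
  timeoutMs?: number;
}

interface RecordedSideEffect {
  kind: string;
  args: Record<string, unknown>;
}

const sideEffects: RecordedSideEffect[] = [];

// Build the recorded args, including each option only when the caller set it.
function execRecordArgs(cmd: string, opts?: SandboxExecArgs): Record<string, unknown> {
  return {
    cmd,
    ...(opts?.cwd !== undefined ? { cwd: opts.cwd } : {}),
    ...(opts?.env !== undefined ? { env: { ...opts.env } } : {}),
    ...(opts?.timeoutMs !== undefined ? { timeoutMs: opts.timeoutMs } : {})
  };
}

function recordExec(cmd: string, opts?: SandboxExecArgs): void {
  sideEffects.push({ kind: 'sandbox.exec', args: execRecordArgs(cmd, opts) });
}

recordExec('ls', { cwd: '/x', env: { FOO: 'bar' }, timeoutMs: 5000 });
```

Since env can carry secrets, recording only the variable names rather than the values may be the safer choice if run records are later ingested elsewhere.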
**Path:** packages/runtime/src/simulate/subsystems.ts
**Line:** 91:94
record('harness.run', {
  promptChars: args.prompt.length,
  ...(args.cwd ? { cwd: args.cwd } : {})
}, { exitCode: 0, note: SIMULATED });
Suggestion: The simulated harness.run recording drops inputs and env, so the recorded args no longer reflect the actual invocation contract and can produce misleading run records. Include all provided harness arguments in the recorded payload so dry-run traces stay faithful to real handler calls. [incomplete implementation]
Severity Level: Major ⚠️
- ⚠️ Harness dry-run traces omit `inputs` and `env` configuration.
- ⚠️ Harder to debug harness behavior from simulation records.
Steps of Reproduction ✅
1. The harness invocation contract is defined in `packages/runtime/src/types.ts:191-199`
as `HarnessRunArgs` with fields `prompt`, optional `cwd`, `inputs`, and `env`.
2. In simulation, `createSimulationSubsystems` at `subsystems.ts:50-56` defines
`harnessRunner` at `subsystems.ts:181-188`, which the ctx builder wires into
`ctx.harness.run` via `buildCtx` usage at `simulate.ts:84-97`.
3. When `ctx.harness.run` is called in a simulated handler (pattern shown in
`simulate.test.ts:122-138`, where `await ctx.harness.run({ prompt: 'do the thing' });` is
used), the simulation's `harnessRunner` receives the full `HarnessRunArgs`, but the
recording call at `subsystems.ts:183-186` only persists `{ promptChars:
args.prompt.length, ...(args.cwd ? { cwd: args.cwd } : {}) }` into
`RecordedSideEffect.args` (see `record` implementation at `subsystems.ts:61-71` and
`RecordedSideEffect.args` definition at `simulate/types.ts:16-37`).
4. To observe the loss of fidelity, extend the existing side-effect test handler at
`simulate.test.ts:124-138` to pass `inputs` and `env` into `ctx.harness.run` (using the
contract from `types.ts:191-199`), run `simulateInvocation`, and inspect
`result.runs[0].simulation.sideEffects.find(e => e.kind === 'harness.run')?.args`: it will
only contain `promptChars` (and maybe `cwd`), with no representation of the `inputs` or
`env` actually used by the handler.
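A possible shape for the fuller harness.run record, assuming HarnessRunArgs has the prompt, cwd, inputs, and env fields listed in step 1. The value types chosen for inputs and env, and the harnessRecordArgs helper, are assumptions made for this sketch.

```typescript
// Stand-in for the runtime type; the value types of inputs/env are assumed.
interface HarnessRunArgs {
  prompt: string;
  cwd?: string;
  inputs?: Record<string, unknown>;
  env?: Record<string, string>;
}

// Keep the existing promptChars summary and add the options that were dropped.
function harnessRecordArgs(args: HarnessRunArgs): Record<string, unknown> {
  return {
    promptChars: args.prompt.length,
    ...(args.cwd !== undefined ? { cwd: args.cwd } : {}),
    ...(args.inputs !== undefined ? { inputs: { ...args.inputs } } : {}),
    ...(args.env !== undefined ? { env: { ...args.env } } : {})
  };
}
```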
**Path:** packages/runtime/src/simulate/subsystems.ts
**Line:** 183:186
useSink(next) {
  sink = next;
},
Suggestion: Using a single mutable sink that is swapped between envelopes can misattribute side effects/logs when a handler leaves background async work running after it returns. Late async emissions from the previous invocation will be written into the next envelope's sink. Use per-invocation bound record/log functions (or an immutable sink captured at dispatch time) instead of a globally swapped sink reference. [race condition]
Severity Level: Major ⚠️
- ❌ Late async logs attached to the wrong simulated run.
- ⚠️ Multi-envelope dry-run debugging sees misleading per-run logs.
Steps of Reproduction ✅
1. Entry point `simulateInvocation` at `packages/runtime/src/simulate/simulate.ts:36-38`
accepts a `handler` and fixtures, then creates shared subsystems via
`createSimulationSubsystems` at `simulate.ts:76-79`.
2. A single `ctx` is built at `simulate.ts:84-97` with `log` and other channels wired to
the simulation subsystems from `subsystems.ts:87-198`; the log implementation closes over
the mutable `sink` variable defined at `subsystems.ts:56` and mutated in `useSink` at
`subsystems.ts:199-201`.
3. For each envelope, `simulateInvocation` iterates at `simulate.ts:103-156`, creating a
fresh `sink: SimulationSink` at `simulate.ts:110`, then calling `subsystems.useSink(sink)`
at `simulate.ts:111` before awaiting `handlerFn(ctx, event)` at `simulate.ts:117`.
4. If a user handler (following the pattern in `simulate.test.ts:52-62` where
`ctx.log('info', 'saw event', ...)` is used) starts background async work that logs after
it returns (e.g. `setTimeout(() => ctx.log('info','late'), 100)`), then: after the first
envelope's `await handlerFn` completes, the loop advances, `useSink` is called for the
second envelope (`simulate.ts:110-112`), mutating `sink`. When the background task from
the first envelope finally calls `ctx.log`, the log implementation at
`subsystems.ts:172-178` pushes into the *current* `sink.logs` (second envelope),
misattributing the side effect to the wrong run.
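The "immutable sink captured at dispatch time" idea can be sketched as follows. This is an illustration under simplified assumptions: SimulationSink is reduced to logs only, and bindLog and replay are hypothetical helpers, not the runtime's API.

```typescript
interface LogEntry { level: string; message: string; }
interface SimulationSink { logs: LogEntry[]; }
type LogFn = (level: string, message: string) => void;

// Each call returns a logger permanently bound to the sink passed in,
// so a late callback can never write into a later envelope's sink.
function bindLog(sink: SimulationSink): LogFn {
  return (level, message) => {
    sink.logs.push({ level, message });
  };
}

// Replay loop: one sink and one bound logger per envelope, no shared mutable reference.
async function replay(
  envelopeIds: readonly string[],
  handler: (log: LogFn, id: string) => Promise<void>
): Promise<SimulationSink[]> {
  const sinks: SimulationSink[] = [];
  for (const id of envelopeIds) {
    const sink: SimulationSink = { logs: [] };
    sinks.push(sink);
    await handler(bindLog(sink), id);
  }
  return sinks;
}
```

The trade-off is that the recording channels (and likely ctx itself) are built per envelope rather than once, and a late write still lands in the first run's sink after its record may already have been assembled.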
**Path:** packages/runtime/src/simulate/subsystems.ts
**Line:** 199:201
const ctx = buildCtx({
  persona: options.persona,
  agent,
  deployment,
  workspaceId,
  sandbox: subsystems.sandbox,
  files: subsystems.files,
  llm: subsystems.llm,
  memory: subsystems.memory,
  workflow: subsystems.workflow,
  schedule: subsystems.schedule,
  log: subsystems.log,
  harnessRunner: subsystems.harnessRunner
});
Suggestion: Simulation still builds ctx through buildCtx without replacing the credentials subsystem, so ctx.credentials remains environment-backed and can throw missing-credential errors during dry-run. That breaks simulation determinism and violates the "recording/no-side-effect local run" behavior when handlers read credentials. Inject a simulated credentials context (or extend buildCtx to accept one) so simulation does not depend on host env secrets. [incomplete implementation]
Severity Level: Major ⚠️
- ❌ Simulation fails when handlers require credentials without env set.
- ⚠️ Local dry-run unusable on machines lacking Cloud tokens.
- ⚠️ Breaks documented no-side-effect local simulation expectations.
Steps of Reproduction ✅
1. In `packages/runtime/src/simulate/simulate.ts`, `simulateInvocation` constructs the
handler context once via `const ctx = buildCtx({ ... })` at lines 84–97, passing
`persona`, `agent`, `deployment`, `workspaceId`, `sandbox`, `files`, `llm`, `memory`,
`workflow`, `schedule`, `log`, and `harnessRunner`, but not providing any override for
credentials.
2. Inspect `buildCtx` in `packages/runtime/src/ctx.ts`: at lines 118–148 it builds the
`WorkforceCtx` and unconditionally sets `credentials: credentialsFromEnv()` at line 143,
meaning every context (including simulation contexts) uses process-environment-backed
credentials.
3. `credentialsFromEnv` in `ctx.ts` lines 203–219 returns an object whose getters delegate
to `requireRuntimeCredentials(processEnv)`, and `requireRuntimeCredentials` at lines
221–225 calls `readRuntimeCredentialSnapshot(processEnv)` and throws `new Error("Runtime
credentials are required: missing ...")` when any of the expected RELAYFILE_* or
CLOUD_API_* environment variables are absent (as exercised by the `ctx.credentials` tests
in `packages/runtime/src/ctx.test.ts:24–45`).
4. Run `simulateInvocation` (exported for consumers via
`packages/runtime/src/index.ts:5–22`) in a local environment that lacks full runtime
credentials (e.g. no `RELAYFILE_TOKEN` or `CLOUD_API_ACCESS_TOKEN`) with a handler that
reads credentials, such as `handler(async (ctx) => { ctx.credentials.require(); })`; when
`simulateInvocation` calls `buildCtx` (simulate.ts:84–97), `ctx.credentials` is bound to
`credentialsFromEnv`, and the first call to `ctx.credentials.require()` inside the
simulated handler immediately throws the same "Runtime credentials are required: missing
…" error as production, causing the simulation run to fail even though the rest of the
simulation subsystems (`llm`, `memory`, `workflow`, `schedule`, `sandbox`,
`harnessRunner`) are designed to be no-side-effect and independent of cloud configuration.
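A rough sketch of the "extend buildCtx to accept one" option. Everything here is hypothetical: the CredentialsCtx surface is reduced to a single require() method, only RELAYFILE_TOKEN is checked, and buildCredentials stands in for the relevant part of buildCtx.

```typescript
// Hypothetical minimal credentials surface; the real ctx.credentials API may differ.
interface CredentialsCtx {
  require(): Record<string, string>;
}

// Environment-backed behavior as described in the steps above (simplified to one variable).
function credentialsFromEnv(env: Record<string, string | undefined>): CredentialsCtx {
  return {
    require() {
      const token = env['RELAYFILE_TOKEN'];
      if (token === undefined) {
        throw new Error('Runtime credentials are required: missing RELAYFILE_TOKEN');
      }
      return { RELAYFILE_TOKEN: token };
    }
  };
}

// Simulation stand-in: fixed placeholder values, never reads the host environment.
function simulatedCredentials(
  seed: Record<string, string> = { RELAYFILE_TOKEN: 'simulated-token' }
): CredentialsCtx {
  return { require: () => ({ ...seed }) };
}

// buildCtx-style selection with an optional override; env-backed stays the default.
function buildCredentials(
  env: Record<string, string | undefined>,
  override?: CredentialsCtx
): CredentialsCtx {
  return override ?? credentialsFromEnv(env);
}
```

With an override in place, a handler that reads credentials gets the same placeholder values on every machine, so the dry-run result no longer depends on which secrets the host happens to have set.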
**Path:** packages/runtime/src/simulate/simulate.ts
**Line:** 84:97
const sink: SimulationSink = { sideEffects: [], logs: [] };
subsystems.useSink(sink);

const runStarted = now();
let summary: string | null = null;
let error: string | null = null;
try {
  const returned: unknown = await handlerFn(ctx, event);
  summary = summaryFromHandlerReturn(returned);
} catch (err) {
  error = err instanceof Error ? err.message : String(err);
Suggestion: Side-effect/log attribution is race-prone because a single mutable sink is swapped between envelopes and reused by the same ctx. If a handler starts background async work (e.g. timer/unawaited promise) that calls ctx.log or other ctx.* methods after the handler returns, those records will be written into the next envelope's sink, corrupting run records. Use per-envelope subsystem instances (or bind immutable per-run sink closures) so late async emissions cannot cross-contaminate runs. [race condition]
Severity Level: Major ⚠️
- ❌ Simulated run logs mis-attributed between envelopes in replay.
- ⚠️ Side-effect records from one run appear under another.
- ⚠️ Cloud run records become unreliable for debugging simulations.
Steps of Reproduction ✅
1. Observe the simulation subsystems implementation in
`packages/runtime/src/simulate/subsystems.ts`: `createSimulationSubsystems` declares a
single mutable variable `let sink: SimulationSink = { sideEffects: [], logs: [] };` at
line 56 and exposes `useSink(next)` at lines 199–201, which simply reassigns this shared
`sink` variable.
2. Note that all simulated side effects and logs are written through closures that capture
this mutable `sink`: `record(...)` at lines 61–71 pushes into `sink.sideEffects`, and the
`log` implementation at lines 172–179 pushes into `sink.logs`, both referencing whatever
`sink` object `useSink` last assigned.
3. In `packages/runtime/src/simulate/simulate.ts`, `simulateInvocation` builds a single
`ctx` once at lines 84–97 using `buildCtx(...)`, then in the per-envelope loop at lines
103–112 it creates a fresh `sink` object (`const sink: SimulationSink = { sideEffects: [],
logs: [] };`) and calls `subsystems.useSink(sink);` before awaiting `handlerFn(ctx,
event)` at lines 116–119, so the same `ctx` instance is reused for every envelope while
only the shared `sink` pointer is swapped.
4. Define a handler using the public API (patterned on `simulateInvocation` tests in
`packages/runtime/src/simulate/simulate.test.ts:52–63`) such as `handler(async (ctx,
event) => { setTimeout(() => ctx.log('info', 'late log', { id: event.id }), 0); })`, then
call `simulateInvocation` with two envelopes; after the first `handlerFn` returns,
`simulateInvocation` advances to the next envelope and calls
`subsystems.useSink(nextSink)` (simulate.ts:110–112), so when the `setTimeout` callback
eventually runs it uses the same `ctx.log` closure but now writes into the second run's
`sink.logs`, causing the first envelope's late log to appear under the second run's
`simulation.capturedLogs` and leaving the first run's logs incomplete.
**Path:** packages/runtime/src/simulate/simulate.ts
**Line:** 110:120
5 issues found across 8 files
<file name="packages/runtime/src/simulate/subsystems.ts">
<violation number="1" location="packages/runtime/src/simulate/subsystems.ts:173">
P2: Captured `ctx.log` attrs are stored as-is, so non-JSON values can crash log rendering and abort simulation replay.</violation>
<violation number="2" location="packages/runtime/src/simulate/subsystems.ts:200">
P2: Per-envelope attribution is fragile because a global mutable sink can misattribute late async writes to the wrong run.</violation>
</file>
<file name="packages/runtime/src/simulate/simulate.ts">
<violation number="1" location="packages/runtime/src/simulate/simulate.ts:71">
P2: Defaulting `ctx.deployment.triggerKind` to `"inbox"` makes cron simulations report the wrong trigger kind inside handler context.</violation>
<violation number="2" location="packages/runtime/src/simulate/simulate.ts:84">
P1: Inject a simulated credentials implementation when building the simulation context; otherwise handlers that call `ctx.credentials` can fail based on host environment secrets, which breaks deterministic dry-run behavior.</violation>
</file>
<file name="packages/runtime/src/simulate/types.ts">
<violation number="1" location="packages/runtime/src/simulate/types.ts:77">
P2: `trigger.kind` is typed too loosely (`string`) instead of the documented trigger-kind union, which weakens Cloud-shape compatibility guarantees.</violation>
</file>
// buildCtx may throw for unresolved required persona inputs — that is a
// simulation setup error (seed inputs via `agent.inputValues`), surfaced
// to the caller before any run records exist.
const ctx = buildCtx({
P1: Inject a simulated credentials implementation when building the simulation context; otherwise handlers that call ctx.credentials can fail based on host environment secrets, which breaks deterministic dry-run behavior.
At packages/runtime/src/simulate/simulate.ts, line 84:
<file context>
@@ -0,0 +1,236 @@
+ // buildCtx may throw for unresolved required persona inputs — that is a
+ // simulation setup error (seed inputs via `agent.inputValues`), surfaced
+ // to the caller before any run records exist.
+ const ctx = buildCtx({
+ persona: options.persona,
+ agent,
</file context>
log,
harnessRunner,
useSink(next) {
  sink = next;
P2: Per-envelope attribution is fragile because a global mutable sink can misattribute late async writes to the wrong run.
At packages/runtime/src/simulate/subsystems.ts, line 200:
<file context>
@@ -0,0 +1,206 @@
+ log,
+ harnessRunner,
+ useSink(next) {
+ sink = next;
+ },
+ vfsSnapshot() {
</file context>
};

const log: WorkforceCtx['log'] = (level, message, attrs) => {
  sink.logs.push({
P2: Captured ctx.log attrs are stored as-is, so non-JSON values can crash log rendering and abort simulation replay.
At packages/runtime/src/simulate/subsystems.ts, line 173:
<file context>
@@ -0,0 +1,206 @@
+ };
+
+ const log: WorkforceCtx['log'] = (level, message, attrs) => {
+ sink.logs.push({
+ t: now().toISOString(),
+ level,
</file context>
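One way to make captured attrs safe to serialize, sketched as a standalone helper. toJsonSafe is a hypothetical name; the placeholder strings it emits are an arbitrary choice for this example, not an existing convention in the runtime.

```typescript
// Convert arbitrary log attrs into a value JSON.stringify can always handle.
// BigInt, functions, symbols and circular references are replaced with
// strings instead of throwing at render time.
function toJsonSafe(value: unknown, seen: WeakSet<object> = new WeakSet<object>()): unknown {
  if (typeof value === 'bigint') return String(value);
  if (typeof value === 'function' || typeof value === 'symbol') return `[${typeof value}]`;
  if (value === null || typeof value !== 'object') return value;
  if (value instanceof Date) return value; // JSON.stringify already handles Date
  if (value instanceof Error) return { name: value.name, message: value.message };
  if (seen.has(value)) return '[circular]';
  seen.add(value);
  let out: unknown;
  if (Array.isArray(value)) {
    out = value.map((item) => toJsonSafe(item, seen));
  } else {
    const obj: Record<string, unknown> = {};
    for (const key of Object.keys(value)) {
      obj[key] = toJsonSafe((value as Record<string, unknown>)[key], seen);
    }
    out = obj;
  }
  seen.delete(value); // only ancestors count as circular; shared siblings are fine
  return out;
}
```

Applying this when the entry is pushed into the sink keeps the failure out of the replay loop. Map, Set, and other exotic objects would still collapse to their enumerable properties and need their own cases if they matter.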
const deployment: WorkforceDeploymentContext = {
  id: options.deployment?.id ?? 'sim-deployment',
  triggerKind: options.deployment?.triggerKind ?? 'inbox',
P2: Defaulting ctx.deployment.triggerKind to "inbox" makes cron simulations report the wrong trigger kind inside handler context.
At packages/runtime/src/simulate/simulate.ts, line 71:
<file context>
@@ -0,0 +1,236 @@
+
+ const deployment: WorkforceDeploymentContext = {
+ id: options.deployment?.id ?? 'sim-deployment',
+ triggerKind: options.deployment?.triggerKind ?? 'inbox',
+ parentDeploymentId: options.deployment?.parentDeploymentId ?? null
+ };
</file context>
durationMs: number;
trigger: {
  /** Workforce trigger vocabulary (`inbox` | `clock` | `radio`). */
  kind: string;
P2: trigger.kind is typed too loosely (string) instead of the documented trigger-kind union, which weakens Cloud-shape compatibility guarantees.
At packages/runtime/src/simulate/types.ts, line 77:
<file context>
@@ -0,0 +1,155 @@
+ durationMs: number;
+ trigger: {
+ /** Workforce trigger vocabulary (`inbox` | `clock` | `radio`). */
+ kind: string;
+ eventSource: string;
+ };
</file context>
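Tightening the field could look like the sketch below, which takes the three values from the doc comment in the file context. The TRIGGER_KINDS constant, the RunTrigger interface name, and the isTriggerKind guard are introduced here for illustration and may not match names in the codebase.

```typescript
// Trigger vocabulary as documented in the doc comment under review.
const TRIGGER_KINDS = ['inbox', 'clock', 'radio'] as const;
type TriggerKind = (typeof TRIGGER_KINDS)[number];

interface RunTrigger {
  kind: TriggerKind; // was: string
  eventSource: string;
}

// Runtime guard for values that arrive as plain strings (e.g. parsed fixtures).
function isTriggerKind(value: string): value is TriggerKind {
  return (TRIGGER_KINDS as readonly string[]).indexOf(value) !== -1;
}

const example: RunTrigger = { kind: 'clock', eventSource: 'cron' };
```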
User description
Summary
P1 of #186: a TRUE invocation dry-run/simulation, distinct from the deploy-preflight --dry-run (which validates persona/config and exits without ever invoking the handler).
- createSimulationSubsystems() (packages/runtime/src/simulate/subsystems.ts): recording, no-side-effect implementations of every ctx channel a handler can reach: harness.run, llm.complete, sandbox.exec/readFile/writeFile, files.read/write, memory.save/recall, workflow.run/status, schedule.at/cancel, log. Nothing leaves the process; every call is recorded ({kind, at, args, simulatedResult}) with the inert result the handler received instead. File writes land in an in-memory VFS (seedable via files, persists across envelopes); reads of unseeded paths fail with a message naming the seeding option.
- simulateInvocation() (simulate.ts): replays fixture RawGatewayEnvelopes through the same shimEnvelope the runner uses, with runner-equivalent dispatch semantics (unsupported envelopes skipped + reported, per-envelope error isolation: one throwing handler never aborts the replay).
- Run records mirror Cloud's hosted compact run shape field-for-field (cloud#1788 compactBase + detail logs) with origin: "local_dry_run", the origin value Cloud reserved for exactly this ingestion. summary is populated from the handler's returned value (string or {summary}), falling back to the error message on failure. failureClass uses Cloud's exact 7-value union (locally only success/runner_error occur, but the type stays identical). Simulation-only data (recorded side effects, captured logs) nests under an additive simulation key so the core shape stays byte-compatible.
- Exported via the @agentworkforce/runtime root and a new ./simulate subpath.
- exitCode: 0 when every dispatched envelope succeeded, 1 when any failed; unsupported envelopes are reported but are not failures (mirrors the runner's warn-and-continue).
Why
Cloud#1783 shipped the hosted run-observability side; per its division of labor the simulation half belongs here. This gives developers "if this agent received this event, what would happen?" with captured ctx.log(...), summary, and predicted (recorded-not-executed) side effects, plus a record Cloud can later ingest alongside hosted runs without redesign.
Tests
- 11 new node:test cases: success-path golden record shape (field-for-field), handler-throw isolation + failure-class invariant (an error can never read as success), all 13 side-effect channel recordings, unseeded-read failure hint, VFS seeding + cross-envelope persistence, unsupported-envelope reporting, workspaceId fallback, explicit triggerKind override, raw-handler acceptance, per-run sink attribution.
- pnpm run check (lint + typecheck incl. examples + tests across all packages; runtime suite 49/49, CLI 211/211).
Out of scope (P2, after this review)
workforce invoke --persona <path> --fixture <event.json|.ndjsonl> CLI surface + docs/README distinguishing the two dry-run kinds.
Refs #186, AgentWorkforce/cloud#1783, AgentWorkforce/cloud#1788
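The summary and exit-code rules stated in the Summary can be sketched as below. The name summaryFromHandlerReturn appears in the reviewed code, but its body is not shown on this page, so this implementation is a guess at the described behavior; RunOutcome and exitCodeFor are hypothetical names.

```typescript
// Summary rule from the description: a returned string, or an object with a
// string `summary` field; anything else yields no summary.
function summaryFromHandlerReturn(returned: unknown): string | null {
  if (typeof returned === 'string') return returned;
  if (typeof returned === 'object' && returned !== null && 'summary' in returned) {
    const s = (returned as { summary: unknown }).summary;
    return typeof s === 'string' ? s : null;
  }
  return null;
}

// Only the two failure classes that can occur locally, per the description.
interface RunOutcome {
  failureClass: 'success' | 'runner_error';
}

// 0 when every dispatched envelope succeeded, 1 when any failed.
// Unsupported envelopes produce no run record, so they cannot fail the replay.
function exitCodeFor(runs: readonly RunOutcome[]): 0 | 1 {
  return runs.some((r) => r.failureClass !== 'success') ? 1 : 0;
}
```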
🤖 Generated with Claude Code
CodeAnt-AI Description
Add a true invocation simulation that replays fixture events and returns Cloud-style run records
What Changed
./simulate entry point
Impact
✅ Safer invocation dry-runs
✅ Clearer replay results for fixture runs
✅ Fewer surprises from unsupported events