feat(cli): workforce invoke — fixture-replay simulation CLI (#186 P2) #188

Merged
kjgbot merged 1 commit into main from feat/invoke-cli
Jun 3, 2026

@kjgbot kjgbot commented Jun 3, 2026

User description

Summary

P2 of #186, stacked on #187 (base = feat/invocation-simulation; GitHub retargets to main when #187 merges — I'll rebase if the squash requires it).

Adds the developer-facing surface for the P1 simulation runtime:

agentworkforce invoke <persona-path> --fixture <file> [--output run.json] [--input KEY=v] [--seed PATH=file] [--workspace id]
  • Same validation as deploy: reuses preflightPersona. Stages the agent bundle with the real bundleStager under .workforce/invoke-build/ inside the persona's tree — the bundle externalizes @agentworkforce/runtime, so Node must walk up to a node_modules, exactly deploy's build-dir constraint. Build dir cleaned up after every run; bundle import is cache-busted.
  • Handler extraction parity: identical logic to the generated runner.mjs (defineAgent default export → .handler; {handler} object; bare-function fallback; same error hint).
  • Fixture formats: single JSON envelope, JSON array, or NDJSON of RawGatewayEnvelopes (type now exported from the runtime root with shimEnvelope). Unknown envelope types surface via the simulation's unsupported list (runner's warn-and-continue mirrored).
  • Output contract: human summary → stderr; machine-readable Cloud-compatible run record (origin:"local_dry_run") → stdout or --output. Exit 0 = all dispatched envelopes succeeded; 1 = any handler failure or usage/setup error. Uses process.exitCode, never process.exit.
  • README: new "Simulate an invocation" section — the command, fixture format, run-record contract, and the deploy --dry-run (validate-only) vs invoke (execute-with-recorded-side-effects) distinction. CLI USAGE updated likewise.
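
The three fixture formats listed above can be told apart without a format flag. The following is a minimal sketch of one way to do that, not the code in invoke-command.ts: `EnvelopeLike` is a hypothetical stand-in for the runtime's RawGatewayEnvelope, and `parseFixtureText` and its error wording are assumptions made for illustration.

```typescript
// Hypothetical stand-in for RawGatewayEnvelope; the real type lives in the runtime.
type EnvelopeLike = { type: string; [key: string]: unknown };

function isEnvelopeLike(value: unknown): value is EnvelopeLike {
  return (
    typeof value === "object" &&
    value !== null &&
    !Array.isArray(value) &&
    typeof (value as { type?: unknown }).type === "string"
  );
}

// Accepts a single JSON envelope, a JSON array of envelopes, or NDJSON.
function parseFixtureText(text: string, label = "fixture"): EnvelopeLike[] {
  const trimmed = text.trim();
  if (trimmed === "") throw new Error(`${label}: file is empty`);

  // Try the whole file as one JSON document first (object or array).
  let whole: unknown = undefined;
  let parsedWhole = false;
  try {
    whole = JSON.parse(trimmed);
    parsedWhole = true;
  } catch {
    // Not a single JSON document; fall through to line-by-line NDJSON.
  }
  if (parsedWhole) {
    const items: unknown[] = Array.isArray(whole) ? whole : [whole];
    items.forEach((item, i) => {
      if (!isEnvelopeLike(item)) {
        throw new Error(`${label}: entry ${i + 1} is not an envelope object`);
      }
    });
    return items as EnvelopeLike[];
  }

  // NDJSON: one envelope per non-blank line; errors name the offending line.
  const envelopes: EnvelopeLike[] = [];
  text.split(/\r?\n/).forEach((line, i) => {
    if (line.trim() === "") return;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch (err) {
      throw new Error(`${label}: line ${i + 1} is not valid JSON: ${(err as Error).message}`);
    }
    if (!isEnvelopeLike(parsed)) {
      throw new Error(`${label}: line ${i + 1} is not an envelope object`);
    }
    envelopes.push(parsed);
  });
  return envelopes;
}
```

Trying the whole-document parse first means a pretty-printed single envelope spanning several lines is not mistaken for NDJSON.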

Issue acceptance mapping

  • Developer can simulate locally with a fixture event and see captured ctx.log output, summary, and predicted side effects (logs.stdout + summary + simulation.sideEffects)
  • Simulation never executes real side effects (P1 recording subsystems; no opt-in shipped yet — safe path only)
  • Emitted record matches Cloud's run shape with origin: local_dry_run (P1, exercised end-to-end here)
  • Docs/README cover the new command and distinguish it from deploy-preflight --dry-run

Tests

18 new node:test cases: arg parsing (repeatable flags, inline forms, error messages), fixture parsing (object/array/NDJSON + malformed-line labeling), handler-extraction parity (all four shapes), human-summary rendering, and 4 end-to-end tests running the real preflight → real esbuild staging → dynamic import → simulateInvocation (success record, failure isolation + exit 1, --output file, usage-error path). E2E temp personas live inside packages/cli so the externalized runtime resolves — same reason deploy stages in-tree.
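
For readers unfamiliar with the handler-extraction shapes named above, here is a rough sketch of that kind of logic. It is an illustration only: the `Handler` signature, the function name `extractHandler`, and the error text are assumptions, and the generated runner.mjs may differ in its details.

```typescript
// Assumed handler signature, for illustration only.
type Handler = (...args: unknown[]) => unknown;

// Accepts an imported module namespace (or a plain value) and returns its handler.
function extractHandler(mod: unknown): Handler {
  // Prefer the default export when present, otherwise inspect the module itself.
  const ns = mod as { default?: unknown } | null | undefined;
  const candidate: unknown = ns?.default ?? mod;

  // Bare-function fallback: the export is itself the handler.
  if (typeof candidate === "function") return candidate as Handler;

  // defineAgent-style result or a plain { handler } object.
  if (typeof candidate === "object" && candidate !== null) {
    const handler = (candidate as { handler?: unknown }).handler;
    if (typeof handler === "function") return handler as Handler;
  }

  // Hypothetical hint; the real message is whatever runner.mjs prints.
  throw new Error(
    "agent bundle does not export a handler (expected a default export with .handler, " +
      "a { handler } object, or a function)"
  );
}
```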

Full workspace pnpm run check green; CLI suite 229/229 (211 pre-existing + 18 new).

Refs #186, AgentWorkforce/cloud#1783

🤖 Generated with Claude Code


CodeAnt-AI Description

Add a local invocation command for replaying fixture events

What Changed

  • Added agentworkforce invoke to run a persona against fixture event envelope(s) and record side effects instead of executing them.
  • The command now accepts a single event, a list of events, or NDJSON, and can take extra input values, filesystem seeds, and a workspace id.
  • It prints a short human summary to stderr and a Cloud-style run record to stdout or a file, with failures reported through the command exit code.
  • Updated the CLI help text and README with the new workflow and the difference from deploy --dry-run.
  • Added test coverage for argument parsing, fixture formats, handler extraction, summaries, and end-to-end invocation runs.

Impact

✅ Local event replay without cloud side effects
✅ Clearer run summaries and error output
✅ Easier validation of persona behavior before deployment

Base automatically changed from feat/invocation-simulation to main June 3, 2026 17:54
@codeant-ai codeant-ai Bot added the size:XL This PR changes 500-999 lines, ignoring generated files label Jun 3, 2026
P2 of workforce#186, on top of the simulateInvocation runtime (P1):

- packages/cli/src/invoke-command.ts: `agentworkforce invoke
  <persona-path> --fixture <file>` — preflights the persona (same
  validation as deploy), stages the agent bundle under
  `.workforce/invoke-build/` inside the persona's tree (the bundle
  leaves @agentworkforce/runtime external, so resolution must walk up
  to a node_modules — same constraint as deploy's build dir), imports
  the handler with the same extraction the generated runner.mjs
  performs, and replays the fixture through simulateInvocation.
- Fixture formats: single JSON envelope, JSON array, or NDJSON; entries
  are RawGatewayEnvelope (now exported from the runtime root alongside
  shimEnvelope). Unknown envelope types surface via the simulation's
  unsupported list, mirroring the runner's warn-and-continue.
- Output: human summary → stderr; machine-readable Cloud-compatible run
  record (origin "local_dry_run") → stdout or --output <file>. Exit 0
  when every dispatched envelope succeeded, 1 on any handler failure or
  usage/setup error. Sets process.exitCode (never process.exit) so
  streams flush and tests drive it directly.
- Flags: --input KEY=value (persona inputs), --seed PATH=file (seed the
  simulated VFS with provider data), --workspace <id>.
- cli.ts: `invoke` dispatch entry + USAGE block distinguishing it from
  deploy --dry-run. README: "Simulate an invocation" section covering
  the command, fixture format, run-record contract, and the
  dry-run-vs-simulate distinction.
- Build dir is cleaned up after every run; bundle import is
  cache-busted so repeated invokes load the freshly staged bundle.

18 new node:test cases (parse, fixture parsing incl. malformed-line
labeling, handler extraction parity, human summary, and 4 end-to-end
tests through real preflight + esbuild staging + dynamic import).
Full workspace `pnpm run check` green; CLI suite 229/229.

Refs #186, AgentWorkforce/cloud#1783

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kjgbot kjgbot force-pushed the feat/invoke-cli branch from b9850a4 to fa5a470 Compare June 3, 2026 17:56
@kjgbot kjgbot merged commit d2285f9 into main Jun 3, 2026
2 checks passed
@kjgbot kjgbot deleted the feat/invoke-cli branch June 3, 2026 17:59
Comment on lines +71 to +72
} else if (a.startsWith('--fixture=')) {
fixturePath = expectInline('--fixture', a.slice('--fixture='.length));

Suggestion: Appending a random query string to every dynamic import creates a unique ESM module cache entry per invocation, which accumulates indefinitely in long-lived processes (tests/watch flows). This causes unbounded memory growth; avoid per-run unique import URLs or isolate execution in a short-lived worker/process. [memory leak]

Severity Level: Major ⚠️
- ⚠️ Long-running `runInvoke` loops leak memory via ESM cache.
- ⚠️ Dev/watch flows using `invoke` may slow or crash over time.
Steps of Reproduction ✅
1. In `runInvokeWithOptions` (`packages/cli/src/invoke-command.ts:45-92`), the persona is
preflighted and an agent bundle is staged to a temporary `buildDir` via
`bundleStager.stage` (`packages/cli/src/invoke-command.ts:63-67`), producing
`bundle.bundlePath`.

2. The code then constructs an import URL as `const bundleUrl =
\`${pathToFileURL(bundle.bundlePath).href}?invoke=${randomUUID()}\`;` and dynamically
imports it with `await import(bundleUrl)` (`packages/cli/src/invoke-command.ts:69-72`).
Because `randomUUID()` produces a fresh value each time, every call to
`runInvokeWithOptions` uses a unique module URL even when `bundle.bundlePath` is the same.

3. Node's ESM loader caches modules by their full URL (including query string). In a
long-lived process (for example, a watch-mode dev tool or test harness) that repeatedly
calls `runInvoke` with the same persona and fixture, each invocation stages a bundle and
imports it via a new `bundleUrl`. The ESM cache retains each imported module object
indefinitely, because each URL is unique.

4. Although `buildDir` is deleted in the `finally` block
(`packages/cli/src/invoke-command.ts:104-107`), the already-imported module instances
remain strongly referenced by the ESM module cache. Over hundreds or thousands of
invocations in a single process, this pattern causes monotonically increasing memory usage
from accumulated bundle modules, degrading performance and potentially leading to
out-of-memory conditions in long-running dev/watch flows.
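
One possible way to avoid a unique import URL per run, sketched under assumptions: key the query string on a hash of the staged bundle's contents instead of a random UUID. This is not the fix adopted in the PR, and `contentKeyedImportUrl` is a hypothetical helper. An unchanged bundle then reuses its existing ESM cache entry, while a rebuilt bundle with different contents still gets a fresh module.

```typescript
import { createHash } from "node:crypto";
import { pathToFileURL } from "node:url";

// Hypothetical helper: derive the cache-busting token from the bundle's contents,
// so the number of distinct import URLs is bounded by the number of distinct bundles.
function contentKeyedImportUrl(bundlePath: string, bundleSource: string | Uint8Array): string {
  const digest = createHash("sha256").update(bundleSource).digest("hex").slice(0, 16);
  const url = pathToFileURL(bundlePath);
  url.searchParams.set("invoke", digest);
  return url.href;
}
```

The trade-off is that module-level state in an unchanged bundle persists across runs in the same process. If each run needs a clean module instance, the reviewer's other option (a short-lived worker or child process) is the safer route.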

Comment on lines +85 to +92
} else if (a === '--seed') {
addKeyValue('--seed', expectValue('--seed', args[++i]), seeds);
} else if (a.startsWith('--seed=')) {
addKeyValue('--seed', expectInline('--seed', a.slice('--seed='.length)), seeds);
} else if (a.startsWith('--')) {
throw new Error(`invoke: unknown flag "${a}"`);
} else if (!personaPath) {
personaPath = path.resolve(a);

Suggestion: --input values are forwarded directly without validating that the keys exist in the persona's declared inputs, so typos are silently ignored during simulation. This breaks parity with deploy input handling and can produce misleading simulation results; validate override keys against preflight.persona.inputs and fail on unknown keys. [api mismatch]

Severity Level: Major ⚠️
- ⚠️ `agentworkforce invoke` ignores misspelled `--input` overrides.
- ⚠️ Simulated runs may diverge from deploy-time input semantics.
- ⚠️ Developers may misdiagnose behavior using stale default inputs.
Steps of Reproduction ✅
1. Create a persona JSON with a declared input key, e.g. `TOPICS`, and mark it optional or
give it a default so the persona is accepted without a deploy-time override. This persona
is parsed and validated by `preflightPersona` at `packages/deploy/src/preflight.ts:23-31`,
which populates `preflight.persona.inputs`.

2. Run the new CLI command through the binary defined in
`packages/cli/src/cli.ts:245-251`, e.g. `agentworkforce invoke ./persona.json --fixture
./event.json --input TOPIC=override`, intentionally misspelling the key as `TOPIC` instead
of `TOPICS`.

3. The CLI entry calls `runInvoke` (`packages/cli/src/invoke-command.ts:248-252`), which
parses arguments via `parseInvokeArgs` (`packages/cli/src/invoke-command.ts:57-85`). This
collects the flag into `opts.inputs = { TOPIC: 'override' }` even though `TOPIC` is not
declared on the persona.

4. `runInvokeWithOptions` (`packages/cli/src/invoke-command.ts:45-92`) then calls
`simulateInvocation` with `agent: { inputValues: opts.inputs }` at
`packages/cli/src/invoke-command.ts:85-91`. Inside `simulateInvocation`
(`packages/runtime/src/simulate/simulate.ts:60-88`), `buildCtx` is called, which in turn
calls `buildPersonaContext` (`packages/runtime/src/ctx.ts:447-47x`). `buildPersonaContext`
iterates only `persona.inputs` keys and ignores extra keys on `agentInputValues`, so the
`TOPIC` override is silently dropped. The simulation runs using the default/env value for
`TOPICS` with no error about the unknown override key, even though the CLI help
(`INVOKE_USAGE` at `packages/cli/src/invoke-command.ts:13-41`) and deploy design docs
(`docs/plans/deploy-v1.md:159-164`) state that only declared persona inputs should be
accepted.
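
A minimal sketch of the suggested check, assuming `preflight.persona.inputs` is an object keyed by input name (the steps above imply this, but the exact shape is not shown here). The function names and error wording are hypothetical.

```typescript
// Returns override keys that the persona does not declare.
function findUnknownInputKeys(
  declaredInputs: Record<string, unknown>,
  overrides: Record<string, string>
): string[] {
  return Object.keys(overrides).filter(
    (key) => !Object.prototype.hasOwnProperty.call(declaredInputs, key)
  );
}

// Fails fast on typos such as TOPIC vs TOPICS instead of silently dropping them.
function assertDeclaredInputs(
  declaredInputs: Record<string, unknown>,
  overrides: Record<string, string>
): void {
  const unknownKeys = findUnknownInputKeys(declaredInputs, overrides);
  if (unknownKeys.length > 0) {
    const declared = Object.keys(declaredInputs).join(", ") || "(none)";
    throw new Error(
      `invoke: unknown --input key(s): ${unknownKeys.join(", ")}; declared inputs: ${declared}`
    );
  }
}
```

Called after preflight and before the simulation starts, a check like this would turn the `--input TOPIC=override` typo from the reproduction steps into a usage error (exit 1) rather than a run with stale defaults.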
