feat(cli): honor session config overrides + SSH remote when spawning agents by chr1syy · Pull Request #888 · RunMaestro/Maestro

chr1syy · 2026-04-23T19:10:47Z

Summary

The CLI (maestro send, maestro run-playbook) previously spawned agents locally and silently ignored per-agent/per-session configuration set in the desktop app — customArgs, customEnvVars, customModel, customEffort, and sessionSshRemoteConfig were all dropped at the CLI boundary.

This was called out as a CLAUDE.md violation for SSH, and a silent capability gap for everything else. This PR brings the CLI to parity with the desktop process:spawn handler.

Config overrides: every CLI spawn now runs through applyAgentConfigOverrides() (the same merger the desktop uses), so customArgs, customEnvVars, customModel, and customEffort are all honored with session → agent-config → defaults precedence.
SSH remote execution: loaded lazily via await import() so the local spawn path never pulls in execFile/which helpers. Users can now run playbooks against remote-configured agents.
Hard-fail on unresolvable remote: when sshRemoteConfig.enabled is true but the remote can't be resolved, the CLI returns a clear error instead of silently falling back to local. A user who opts into SSH doesn't want their prompt leaking onto the controller's machine if the config is wrong.
Env-var precedence fix (follow-up 01938078c): agent defaultEnvVars were being returned as part of effectiveCustomEnvVars and then Object.assign'd unconditionally over process.env, clobbering shell-exported values. resolveAgentOverrides() now returns only the truly user-configured overrides; defaults are layered separately with "shell wins" semantics.

Changes

SessionInfo extended with customArgs, customEnvVars, sessionSshRemoteConfig so the CLI can read what the desktop persists.
spawnAgent() refactored: builds pre-override args, then calls applyAgentConfigOverrides() to merge configs; removes the now-redundant explicit def.modelArgs() call.
applyAgentConfigOverrides signature loosened to accept AgentDefinition (narrower type the CLI has on hand — the function only reads configOptions/defaultEnvVars).
New resolveAgentOverrides(), applyEnvLayers(), buildSshEnvForRemote(), sshUnresolvedFailure() helpers in agent-spawner.ts.
send.ts and batch-processor.ts (task run + synopsis) thread all overrides through to spawnAgent.

Tests

28 new tests covering session/agent-level customArgs and customEnvVars precedence, shell-quote parsing, customModel flow via configOptions (Claude + Codex), SSH wrapper invocation, stdin passthrough for large prompts, hard-fail on unresolvable remote, baseline stream-json flags, read-only mode, resume vs fresh session, Codex working-dir/resume args, plus three regression tests locking in the env-var precedence fix (shell > agent defaults, defaults fill blanks, defaults still reach SSH remote).
Full CLI suite: 721/721 tests passing.

Docs

CLAUDE.md SSH section now documents CLI parity and the hard-fail expectation.
docs/agent-guides/CLI-PLAYBOOKS.md "CLI vs Desktop Spawning" section corrected to reflect what is and isn't honored.

Test plan

Automated

tsc -p tsconfig.cli.json --noEmit clean
tsc -p tsconfig.main.json --noEmit clean (pre-existing unrelated js-yaml/picomatch warnings only)
eslint src/ exit 0
prettier --check clean on all edited files
vitest run src/__tests__/cli — 721/721 pass

Manual verification — 2026-04-24, Linux

End-to-end run against this branch's freshly-built CLI in an isolated sandbox (XDG_CONFIG_HOME redirect to a scratch dir — production Maestro config untouched), using a shim "Claude" that logged actual argv/cwd/env per spawn, and a real SSH remote for the remote-execution scenarios. All ten scenarios pass.

#	Scenario	Pass signal	Result
A	Session `customArgs` threaded through	argv contains `--foo`, `has spaces`, `--bar` (shell-quote parsed)	✅
B1	Session `customEnvVars` win over shell env	`TEST_USER_VAR=from-session` in spawn env, shell had `from-shell`	✅
B2	Shell env wins over agent `defaultEnvVars` (regression fix)	`OPENCODE_CONFIG_CONTENT=shell-set-value` survives spawn	✅
B3	Agent `defaultEnvVars` fill in when shell absent	opencode's JSON default reaches the spawn env	✅
C	Session `customModel` applied	`--model opus` in argv	✅
F1	Session-level beats agent-level `customArgs`	`--from-session` in argv, `--from-agent` absent	✅
F2	Agent-level `customArgs` applied when session absent	`--from-agent` in argv	✅
D	SSH enabled + valid remote	`Wrapping spawn with SSH remote execution` log, full remote `ssh ... claude ...` dispatched, real response + usage stats returned; local spawn log empty	✅
D+	Session `customEnvVars` reach SSH remote	`PR888_REMOTE_CANARY='reached-the-remote' claude ...` visible in dispatched `remoteCommand`	✅
E	SSH enabled + unresolvable remote → hard-fail, no prompt leak	CLI returns `success:false` with exact expected error; local spawn log empty — prompt "sensitive prompt that must not leak locally" never reached the local machine	✅

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Session-level overrides for custom CLI args, env vars, model and effort now apply consistently to all agent runs (including follow-ups).
- CLI playbooks gain SSH remote execution support with dynamic SSH wrapper loading.
Bug Fixes
- CLI now fails explicitly when a configured SSH remote cannot be resolved instead of silently falling back to local execution.
Documentation
- Updated guidance for CLI playbooks and SSH/agent configuration parity.
Tests
- Expanded tests for spawn options, SSH behavior, stdin passthrough, and regression cases.

…agents The CLI (`maestro send`, `maestro run-playbook`) previously spawned agents locally and silently ignored per-agent/per-session configuration. This was called out as a CLAUDE.md violation for SSH, and a silent capability gap for everything else: customArgs, customEnvVars, customModel, and customEffort set in the desktop app were simply dropped at the CLI boundary. This change brings the CLI to parity with the desktop process handler. Changes: - Extend `SessionInfo` with `customArgs`, `customEnvVars`, and `sessionSshRemoteConfig` so the CLI can read everything the desktop persists. - `spawnAgent()` now runs every spawn through `applyAgentConfigOverrides()` (the same merger the desktop uses), layering session → agent-config → defaults just like `process:spawn` does. Covers customArgs (shell-quote aware), customEnvVars (merged into local env and forwarded to SSH), customModel and customEffort (applied via agent `configOptions` argBuilders, so `def.modelArgs()` is no longer called twice). - Loosen `applyAgentConfigOverrides` signature to accept `AgentDefinition` in addition to `AgentConfig`. The function only reads `configOptions`/`defaultEnvVars`; accepting the narrower type lets the CLI pass the static agent definition directly. - SSH remote execution is wrapped via `wrapSpawnWithSsh()`, loaded lazily through `await import()` so the local spawn path never pulls in `execFile`/`which` helpers. - Hard-fail when `sshRemoteConfig.enabled` is true but the remote can't be resolved. Previously the desktop logged a warning and silently fell back to local execution — the CLI now returns a clear error instead, because a user who opted into SSH doesn't want their prompt running on the controller's machine if the remote is misconfigured. - Thread all overrides from `send.ts` and `batch-processor.ts` (both the task run and the synopsis call) so playbook automation respects the same config the desktop uses. Tests: - 25 new tests in `src/__tests__/cli/services/agent-spawner.test.ts` covering: session/agent-level customArgs and customEnvVars precedence, shell-quote parsing, customModel flow via configOptions, SSH wrapper invocation, stdin passthrough for large prompts, hard-fail on unresolvable remote, baseline flags preserved, read-only mode, resume vs fresh session, Codex working-dir/resume args. - Full CLI suite: 718 tests passing. Docs: - `CLAUDE.md` SSH section now documents the CLI-parity requirement and the hard-fail expectation. - `docs/agent-guides/CLI-PLAYBOOKS.md` corrects the "CLI vs Desktop Spawning" section to reflect what is and isn't honored. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-23T19:11:38Z

📝 Walkthrough

Walkthrough

This PR aligns CLI agent spawning with desktop per-session overrides (model, effort, args, env), adds optional SSH remote execution via a dynamically imported SSH wrapper, and changes semantics so an enabled-but-unresolvable SSH remote produces an explicit failure instead of falling back to local execution.

Changes

Cohort / File(s)	Summary
Docs `CLAUDE.md`, `docs/agent-guides/CLI-PLAYBOOKS.md`	Documented SSH failure semantics, CLI/desktop parity for per-session overrides, SSH wrapping via dynamic import, and required parameters threaded to `spawnAgent(...)`.
Types & Utilities `src/shared/types.ts`, `src/main/utils/agent-args.ts`	Added per-session fields (`customArgs`, `customEnvVars`, `sessionSshRemoteConfig`) and widened `applyAgentConfigOverrides` input types to accept `AgentConfigOverridable`/`AgentDefinition`/`AgentConfig`/null.
CLI Agent Spawner `src/cli/services/agent-spawner.ts`	CLI spawners now apply `applyAgentConfigOverrides`, use layered env merging that preserves shell values, accept `customEffort/customArgs/customEnvVars/sshRemoteConfig` in `SpawnAgentOptions`, dynamically import and conditionally wrap spawns with SSH, fail on unresolvable remotes, and write `sshStdinScript` to stdin before ending.
CLI Command & Batch `src/cli/commands/send.ts`, `src/cli/services/batch-processor.ts`	Forwarded `customEffort`, `customArgs`, `customEnvVars`, and `sshRemoteConfig` from session to `spawnAgent` for main and synopsis agent spawns.
Tests `src/__tests__/cli/services/agent-spawner.test.ts`	Expanded tests: mocked agent/SSH behavior, stdin write spy, async spawn helpers; asserted override precedence, model flag translation, SSH wrapping and stdin-script passthrough, and explicit failure for unresolved remotes; added regression checks for flags and resume behavior.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI as CLI Command
    participant Spawner as Agent Spawner
    participant SSHW as SSH Wrapper (dynamic)
    participant Remote as Remote Host
    participant Local as Local Agent

    User->>CLI: Invoke agent with session overrides + sessionSshRemoteConfig
    CLI->>Spawner: spawnAgent(options with customArgs/customEnvVars/sshRemoteConfig)
    Spawner->>Spawner: applyAgentConfigOverrides(session > agent > defaults)
    Spawner->>Spawner: Merge env (shell <- agent defaults <- customEnvVars)

    alt SSH enabled
        Spawner->>SSHW: dynamic import & wrapSpawnWithSsh(command, args, remoteConfig)
        SSHW->>SSHW: resolve remote
        alt remote resolved
            SSHW->>Remote: SSH exec wrapped command (write sshStdinScript then end stdin)
            Remote-->>SSHW: stdout/close
            SSHW-->>Spawner: proxy results
        else remote not resolved
            SSHW-->>Spawner: unresolved error (no local fallback)
            Spawner-->>CLI: sshUnresolvedFailure
        end
    else SSH disabled
        Spawner->>Local: spawn(command + customArgs, env)
        Local-->>Spawner: stdout/close
    end
    Spawner-->>CLI: Agent output or failure

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: token-efficient system prompt delivery for CLI agents #596 — Updates SSH spawn/stdin wiring and stdin-script passthrough; directly related to SSH wrapper integration.
fix(cli): capture response from claude-code in maestro-cli send #679 — Modifies spawnClaudeAgent/spawnJsonLineAgent stdout/close buffering and related tests; overlapping test and spawn behavior changes.
fix(cli): disable background tasks for Claude Code batch spawns #862 — Changes env construction/merging for CLI batch spawns; overlaps with refactored env layering and precedence here.

Poem

🐰 I hop through args and env with care,
I wrap the spawn in SSH when told to dare,
If remotes hide, I don't pretend—
I fail cleanly, then wait to mend. 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 70.59% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main changes: CLI agent spawning now honors session configuration overrides and supports SSH remote execution, bringing parity with desktop behavior.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-04-23T19:15:15Z

Greptile Summary

This PR brings the CLI spawner (agent-spawner.ts) to parity with the desktop process:spawn handler by threading customArgs, customEnvVars, customModel, customEffort, and sessionSshRemoteConfig through spawnAgent() for both send and batch-processor call sites, and adds a hard-fail when a configured SSH remote can't be resolved.

P1: applyAgentConfigOverrides bundles agent.defaultEnvVars into effectiveCustomEnvVars, which is then applied unconditionally via Object.assign in applyEnvLayers — overriding any matching shell env keys and breaking the documented "shell wins for agent defaults" semantics on local spawns. Only user-configured vars (agent-level + session customEnvVars) should be passed as the userEnvVars arg; agent defaults should remain in the if (!env[k]) path only.

Confidence Score: 4/5

Safe to merge after fixing the env-precedence regression; SSH and config-override logic is otherwise well-structured.

One P1 defect: agent defaultEnvVars unconditionally override shell env values for local spawns, contradicting the documented precedence. All other findings are P2 or informational. The hard-fail SSH behavior, config-override precedence, and test coverage are solid.

src/cli/services/agent-spawner.ts — the applyEnvLayers call sites at lines ~293-299 and ~969-975 need to pass only user-configured env vars, not the full effectiveCustomEnvVars that bundles agent defaults.

Important Files Changed

Filename	Overview
src/cli/services/agent-spawner.ts	Core refactor adding SSH and config-override support; contains a P1 precedence bug where agent defaultEnvVars unconditionally override shell env on local spawns due to effectiveCustomEnvVars bundling defaults in before Object.assign
src/main/utils/agent-args.ts	Signature of applyAgentConfigOverrides widened to accept AgentDefinition alongside AgentConfig; effectiveCustomEnvVars bundles defaultEnvVars in — appropriate for SSH but the root cause of the local-spawn precedence bug
src/cli/commands/send.ts	Correctly threads customEffort, customArgs, customEnvVars, and sessionSshRemoteConfig through to spawnAgent
src/cli/services/batch-processor.ts	Both the main task spawn and the synopsis spawn now forward all session overrides including SSH config; synopsis resuming via agentSessionId over SSH looks correct
src/shared/types.ts	SessionInfo extended with customArgs, customEnvVars, and sessionSshRemoteConfig to give the CLI access to desktop-persisted per-session configuration
src/tests/cli/services/agent-spawner.test.ts	25 new tests covering config override precedence, SSH wrapper invocation, stdin passthrough, hard-fail on unresolvable remote, and baseline regression cases; good coverage of the new code paths

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[spawnAgent] --> B{toolType}
    B -->|claude-code| C[spawnClaudeAgent]
    B -->|usesJsonLineOutput| D[spawnJsonLineAgent]
    B -->|other| E[return error]

    C --> F[resolveAgentOverrides\napplyAgentConfigOverrides]
    D --> F

    F --> G[effectiveCustomEnvVars\n= agentDefaults + userVars]

    G --> H[applyEnvLayers\nlocal env build]
    G --> I{sshRemoteConfig\n.enabled?}

    H -->|agentDefaults shell wins| H1[if not env_k : set default]
    H -->|batchModeDefaults shell wins| H2[if not env_k : set default]
    H -->|effectiveCustomEnvVars UNCONDITIONAL| H3[Object.assign overrides shell]

    I -->|yes| J[maybeWrapSpawnWithSsh\n+ buildSshEnvForRemote]
    I -->|no| K[local spawn\nenv = applyEnvLayers result]

    J --> L{sshRemoteUsed?}
    L -->|null| M[sshUnresolvedFailure\nhard error]
    L -->|resolved| N[applySshWrapResult\nenv = process.env for ssh client]

    N --> O[spawn ssh command\nfinalizeAgentStdin]
    K --> O

_{Reviews (1): Last reviewed commit: "feat(cli): honor session config override..." | Re-trigger Greptile}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/services/agent-spawner.ts`:
- Around line 57-72: resolveAgentOverrides is returning the merged
effectiveCustomEnvVars produced by applyAgentConfigOverrides, but that value
already includes agent.defaultEnvVars so passing it forward causes
applyEnvLayers/Object.assign to treat defaults as explicit user overrides;
change resolveAgentOverrides to return only the explicit sessionCustomEnvVars
(i.e., the true custom overrides provided by the caller) rather than the merged
effectiveCustomEnvVars from applyAgentConfigOverrides so that applyEnvLayers
preserves shell-provided values over agent defaults; locate
resolveAgentOverrides and ensure the returned effectiveCustomEnvVars contains
only the explicit overrides (or return undefined when none) and leave merging of
defaults to the later applyEnvLayers logic.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f0c24845-06af-4508-adcb-4662371c8089

📥 Commits

Reviewing files that changed from the base of the PR and between 8a4b361 and c55730a.

📒 Files selected for processing (8)

CLAUDE.md
docs/agent-guides/CLI-PLAYBOOKS.md
src/__tests__/cli/services/agent-spawner.test.ts
src/cli/commands/send.ts
src/cli/services/agent-spawner.ts
src/cli/services/batch-processor.ts
src/main/utils/agent-args.ts
src/shared/types.ts

pedramamini · 2026-04-23T19:21:48Z

Thanks for the thorough work on this, @chr1syy — the CLI/desktop parity gap has been a sore spot and the hard-fail on unresolvable SSH remote is exactly the right call. The test coverage is also great.

One issue before merging, flagged by Greptile as P1 and I've verified it against the code:

Agent defaultEnvVars now override shell env on local CLI spawns (regression vs. pre-PR behavior).

applyAgentConfigOverrides() in src/main/utils/agent-args.ts:180-187 builds effectiveCustomEnvVars as { ...agentDefaultEnvVars, ...userEnvVars } — it bundles the agent's defaultEnvVars in alongside the user-configured overrides.

resolveAgentOverrides() forwards that merged record, and applyEnvLayers() then does:

if (agentDefaults) {
  for (const [k, v] of Object.entries(agentDefaults)) {
    if (!env[k]) env[k] = v;   // shell wins ✅
  }
}
// ...
if (userEnvVars) Object.assign(env, userEnvVars);  // ← agent defaults ride in here and clobber shell ❌

So for a local spawn with no user-configured env vars, any def.defaultEnvVars key will unconditionally overwrite the shell's value, even though the if (!env[k]) layer above deliberately tried to preserve shell precedence. The comment on applyEnvLayers ("Agent defaults are preserved only when the shell didn't already set them — matches today's CLI precedence") describes the pre-PR behavior, not the actual new behavior.

Before this PR, spawnClaudeAgent/spawnJsonLineAgent applied defaultEnvVars only via the if (!env[k]) guard, so shell values won. That contract is now silently broken.

Suggested fix: have resolveAgentOverrides return only the explicit user overrides (session > agent-config customEnvVars) rather than the merged effectiveCustomEnvVars. The applyEnvLayers default-layer already handles def.defaultEnvVars correctly on its own. The buildSshEnvForRemote() path can keep using the bundled record since the remote host has no shell of its own to respect.

Everything else looks solid — nice surgical refactor around the existing code paths.

applyAgentConfigOverrides() folds agent.defaultEnvVars into its effectiveCustomEnvVars return value. The CLI spawner was forwarding that merged set through applyEnvLayers() as userEnvVars, so the final Object.assign(env, userEnvVars) was unconditionally overwriting shell env with agent defaults — the opposite of the documented "shell wins for defaults" precedence. resolveAgentOverrides() now returns only the truly user-configured env vars (session customEnvVars → agent-level customEnvVars). Defaults continue to be layered separately by applyEnvLayers() (local) and buildSshEnvForRemote() (SSH), where they correctly yield to shell / user values. Also removes the redundant defaultEnvVars double-spread in buildSshEnvForRemote(). Adds three regression tests: - shell env wins over agent defaultEnvVars when no user customEnvVars - agent defaultEnvVars still applied when shell has not set the key - SSH wrapper still receives agent defaultEnvVars when no user customEnvVars (remote has no shell env to fall back on) Resolves PR RunMaestro#888 review comments from greptile-apps (P1 + P2) and coderabbitai (major). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/__tests__/cli/services/agent-spawner.test.ts (1)
1854-1872: Nit: sshWrapResult parameter type omits prompt.

The Partial<{...}> type on Lines 1855–1862 doesn't declare a prompt field, but the returned object spreads prompt: undefined on Line 1868. If a future test wants to override prompt via the overrides arg, TS will reject it. Either drop the prompt: undefined line (since it's always undefined and callers can't change it) or add prompt?: string to the override type to match the real wrapSpawnWithSsh return shape.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/__tests__/cli/services/agent-spawner.test.ts` around lines 1854 - 1872,
The helper sshWrapResult defines an overrides Partial that omits the prompt
field but the returned object includes prompt: undefined, preventing tests from
overriding prompt; update the overrides type to include prompt?: string (or
prompt?: string | undefined) so it matches the actual wrapSpawnWithSsh return
shape and callers can override prompt, or alternatively remove the explicit
prompt: undefined line from the returned object—apply the change to the
sshWrapResult helper where it constructs the mocked wrapSpawnWithSsh result.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/__tests__/cli/services/agent-spawner.test.ts`:
- Around line 1753-1767: The test mutates process.env.MAESTRO_TEST_ENV but
doesn't guarantee restoration on failure; wrap the mutation and assertions in a
try/finally so the original value is restored even if an assertion fails:
capture the previous value, set process.env.MAESTRO_TEST_ENV = 'from-shell', run
spawnAgent('claude-code', ...), driveSpawnToCompletion(p, 0, CLAUDE_OK()),
assert using spawnCall(), and then restore the original value in a finally block
(delete if prev was undefined or set back to prev). Ensure this mirrors the
sibling test's try/finally pattern and references the same helpers (spawnAgent,
driveSpawnToCompletion, spawnCall).

---

Nitpick comments:
In `@src/__tests__/cli/services/agent-spawner.test.ts`:
- Around line 1854-1872: The helper sshWrapResult defines an overrides Partial
that omits the prompt field but the returned object includes prompt: undefined,
preventing tests from overriding prompt; update the overrides type to include
prompt?: string (or prompt?: string | undefined) so it matches the actual
wrapSpawnWithSsh return shape and callers can override prompt, or alternatively
remove the explicit prompt: undefined line from the returned object—apply the
change to the sshWrapResult helper where it constructs the mocked
wrapSpawnWithSsh result.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2be98080-7d0e-4fa0-b315-95e3f62304aa

📥 Commits

Reviewing files that changed from the base of the PR and between c55730a and 0193807.

📒 Files selected for processing (2)

src/__tests__/cli/services/agent-spawner.test.ts
src/cli/services/agent-spawner.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/cli/services/agent-spawner.ts

If the assertion inside the "applies session customEnvVars to local spawn env (wins over shell env)" test ever throws, the restore block would never run and MAESTRO_TEST_ENV='from-shell' would leak to subsequent tests. Mirrors the try/finally pattern used by the sibling regression tests added in 0193807. Resolves CodeRabbit minor review on PR RunMaestro#888. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chr1syy · 2026-04-23T21:02:25Z

Manual Testing Strategy — PR #888

All automated coverage is CLI-side with mocked spawn. These manual tests exercise the real spawn path end-to-end, including dynamic SSH wrapper loading. Do them on the feat/cli-ssh-remote-agents branch after a local npm run build (CLI output lands in dist/cli/index.js; the binary is maestro-cli).

Preflight

git checkout feat/cli-ssh-remote-agents
npm run build
which maestro-cli   # or: alias maestro-cli="node $(pwd)/dist/cli/index.js"

Config files used below (Linux paths — substitute ~/Library/Application Support/Maestro on macOS, %APPDATA%/Maestro on Windows):

~/.config/Maestro/maestro-sessions.json — session list; hand-edit per scenario
~/.config/Maestro/maestro-agent-configs.json — agent-level defaults
~/.config/Maestro/maestro-settings.json — contains sshRemotes[]

Back up your configs first: cp -r ~/.config/Maestro ~/.config/Maestro.bak

Shim agent (makes every scenario observable)

Drop a scriptable stand-in for any agent so we can inspect the exact argv and env the CLI produced. Claude Code is a good target because it's the default:

cat > /tmp/claude-shim.sh <<'SHIM'
#!/usr/bin/env bash
{
  echo "=== SHIM ARGS ==="
  printf '  %q\n' "$@"
  echo "=== SHIM CWD ==="
  pwd
  echo "=== SHIM ENV (filtered) ==="
  env | grep -E '^(TEST_|MAESTRO_|CLAUDE_|ANTHROPIC_|OPENCODE_|CODEX_|NODE_)' | sort
} >&2
# Emit a well-formed stream-json result so the CLI treats it as success
printf '{"type":"result","result":"shim-ok","session_id":"shim-session"}\n'
exit 0
SHIM
chmod +x /tmp/claude-shim.sh

Point Claude at the shim in maestro-agent-configs.json:

{
  "configs": {
    "claude-code": { "customPath": "/tmp/claude-shim.sh" }
  }
}

Every CLI invocation below should print the shim's stderr block — that's our ground-truth for what spawn() actually received.

Scenario A — session `customArgs` threaded through

Edit maestro-sessions.json:

{
  "sessions": [{
    "id": "test-args",
    "name": "test-args",
    "toolType": "claude-code",
    "cwd": "/tmp",
    "customArgs": "--foo \"has spaces\" --bar"
  }]
}

Run:

maestro-cli send test-args "hello"

Expected (in shim stderr): --foo, has spaces, --bar all present as separate args; base stream-json flags (--print --verbose --output-format stream-json) still there.

Scenario B — session `customEnvVars` + env precedence regression (new fix)

Add to the same session: "customEnvVars": { "TEST_USER_VAR": "from-session" }. Then:

TEST_USER_VAR=from-shell maestro-cli send test-args "hello"

Expected: TEST_USER_VAR=from-session in shim env block — user override wins over shell.

Now remove customEnvVars from the session and switch the agent to opencode (it ships with OPENCODE_CONFIG_CONTENT in defaultEnvVars). Re-point the shim at opencode in agent-configs:

OPENCODE_CONFIG_CONTENT='shell-set-value' maestro-cli send test-args "hello"

Expected (regression for the fix just landed): shim prints OPENCODE_CONFIG_CONTENT=shell-set-value. Before the fix in 01938078c this would have been silently overwritten by the agent default.

Finally, unset the shell var and rerun:

unset OPENCODE_CONFIG_CONTENT; maestro-cli send test-args "hello"

Expected: shim prints the agent's JSON default (contains "permission") — defaults still reach the process when shell hasn't claimed the key.

Scenario C — session `customModel` / `customEffort`

{ "customModel": "opus", "customEffort": "high" }

Run maestro-cli send test-args "hello" and confirm the shim sees --model opus (Claude) — for Codex/OpenCode, the customEffort flag varies per agent's configOptions.argBuilder.

Scenario D — SSH remote enabled + valid

Prereq: a reachable host with the real agent binary in $PATH and key-based auth (ssh-agent or configured key). Add it to maestro-settings.json:

{ "sshRemotes": [{ "id": "test-remote", "name": "test", "host": "user@yourhost" }] }

Update the session:

{ "sessionSshRemoteConfig": { "enabled": true, "remoteId": "test-remote" } }

Run:

maestro-cli send test-args "list files in /etc with ls"

Expected:

Locally: ps -ef | grep ssh shows an ssh user@yourhost … child while the command runs (or capture via MAESTRO_DEBUG=1 logs if available).
On the remote: agent CLI process actually ran (check shell history / reverse-proxy logs / ~/.claude/logs if available).
Output references remote filesystem reality, not local.

Scenario E — SSH remote enabled + unresolvable (hard-fail)

Point the session at a remoteId that doesn't exist in sshRemotes, or mark the remote disabled: true:

{ "sessionSshRemoteConfig": { "enabled": true, "remoteId": "does-not-exist" } }

Run:

maestro-cli send test-args "sensitive prompt that must not leak"

Expected:

CLI exits non-zero with the exact message shape: SSH remote execution is enabled for this session but the configured remote "does-not-exist" could not be resolved. …
Confirm the local agent was never invoked: pgrep -af claude / the shim's stderr block must be absent from output. This is the security-critical assertion — no prompt leak to the local machine.

Scenario F — agent-level vs session-level precedence

Set customArgs on both the agent config and the session:

// maestro-agent-configs.json
{ "configs": { "claude-code": { "customArgs": "--from-agent", "customPath": "/tmp/claude-shim.sh" } } }
// maestro-sessions.json
{ "customArgs": "--from-session" }

Run and confirm shim args contain --from-session and not --from-agent. Then drop the session customArgs and rerun — shim should now show --from-agent.

Cleanup

rm -rf ~/.config/Maestro && mv ~/.config/Maestro.bak ~/.config/Maestro
rm /tmp/claude-shim.sh

Signal summary

Scenario	Pass signal	Fail signal
A customArgs	shim argv has the flags	flags missing → override dropped
B user env wins	shim env = session value	shim env = shell value → not threaded
B shell beats default	shim env = shell value	shim env = agent default → regression
B default fills blank	shim env = agent default	key missing → defaults dropped
C customModel	`--model` arg present	arg missing
D SSH remote	ssh child visible, remote output	local spawn / no ssh process
E SSH hard-fail	clear error, no local spawn	falls back to local → security bug
F precedence	session beats agent	agent value leaks through

coderabbitai

🧹 Nitpick comments (3)

src/__tests__/cli/services/agent-spawner.test.ts (3)
1856-1874: prompt: undefined isn't in the overrides Partial type.

Minor: the factory signature declares Partial<{ command; args; cwd; customEnvVars; sshStdinScript; sshRemoteUsed }> but the returned object carries prompt: undefined (Line 1870) and spreads overrides over it. Callers can't override prompt via this helper, and no test needs to — so either drop the field or add it to the Partial type so the shape is self-consistent. Purely cosmetic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/__tests__/cli/services/agent-spawner.test.ts` around lines 1856 - 1874,
The helper sshWrapResult returns an object that includes prompt: undefined but
its Partial<> signature does not allow overriding prompt; update the factory so
its shape is consistent by either adding prompt?: string | undefined to the
Partial type in the sshWrapResult signature or remove the prompt property from
the returned object; modify the sshWrapResult function accordingly so callers
can override prompt or the field is omitted.
1670-1685: driveSpawnToCompletion silently swallows a rejecting resultPromise.

Promise.race([waitForSpawnCall(), resultPromise.then(() => {})]) attaches a .then with no .catch. spawnAgent is designed to always resolve with an AgentResult today, so this is fine in practice — but if a future refactor ever lets it reject (e.g., a programmer error thrown synchronously from the new SSH path), this helper would produce an unhandled promise rejection warning and the test would hang on attempts * yieldTick before falling through. Cheap guard:
♻️ Proposed tweak
-		await Promise.race([waitForSpawnCall(), resultPromise.then(() => {})]);
+		await Promise.race([
+			waitForSpawnCall(),
+			resultPromise.then(
+				() => {},
+				() => {}
+			),
+		]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/__tests__/cli/services/agent-spawner.test.ts` around lines 1670 - 1685,
The helper driveSpawnToCompletion attaches resultPromise.then(() => {}) inside
Promise.race which can cause unhandled rejections if resultPromise rejects;
modify the race to swallow rejections safely by replacing resultPromise.then(()
=> {}) with a rejection-safe handler (e.g., resultPromise.then(() => {}, () =>
{} ) or resultPromise.catch(() => {})) so waitForSpawnCall() races against a
promise that will not produce an unhandled rejection; update references in
driveSpawnToCompletion and keep existing behavior with waitForSpawnCall and
subsequent mock spawn handling.
2074-2082: Weaker UUID regex than the sibling test at Lines 1409–1414.

/^[0-9a-f-]{36}$/ will happily accept e.g. 36 dashes or a v1 UUID. The earlier uniqueness test uses the full v4 pattern (^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$). Worth tightening here for consistency so both tests fail the same way if session-id generation ever regresses off v4.
♻️ Proposed tweak
-		expect(args[idx + 1]).toMatch(/^[0-9a-f-]{36}$/);
+		expect(args[idx + 1]).toMatch(
+			/^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/
+		);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/__tests__/cli/services/agent-spawner.test.ts` around lines 2074 - 2082,
The test "Claude generates fresh --session-id when no agentSessionId is
provided" uses a too-permissive regex; update the assertion that checks args[idx
+ 1] to use the stricter v4 UUID pattern used by the sibling test (the regex
matching 8-4-4-4-12 hex with the '4' and variant nibble constraints) so
session-id generation is validated as a v4 UUID; locate this in the test that
calls spawnAgent('claude-code', '/p', 'hi') and uses spawnCall(),
driveSpawnToCompletion(), and replace the current `/^[0-9a-f-]{36}$/` check with
the v4 UUID regex.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/__tests__/cli/services/agent-spawner.test.ts`:
- Around line 1856-1874: The helper sshWrapResult returns an object that
includes prompt: undefined but its Partial<> signature does not allow overriding
prompt; update the factory so its shape is consistent by either adding prompt?:
string | undefined to the Partial type in the sshWrapResult signature or remove
the prompt property from the returned object; modify the sshWrapResult function
accordingly so callers can override prompt or the field is omitted.
- Around line 1670-1685: The helper driveSpawnToCompletion attaches
resultPromise.then(() => {}) inside Promise.race which can cause unhandled
rejections if resultPromise rejects; modify the race to swallow rejections
safely by replacing resultPromise.then(() => {}) with a rejection-safe handler
(e.g., resultPromise.then(() => {}, () => {} ) or resultPromise.catch(() => {}))
so waitForSpawnCall() races against a promise that will not produce an unhandled
rejection; update references in driveSpawnToCompletion and keep existing
behavior with waitForSpawnCall and subsequent mock spawn handling.
- Around line 2074-2082: The test "Claude generates fresh --session-id when no
agentSessionId is provided" uses a too-permissive regex; update the assertion
that checks args[idx + 1] to use the stricter v4 UUID pattern used by the
sibling test (the regex matching 8-4-4-4-12 hex with the '4' and variant nibble
constraints) so session-id generation is validated as a v4 UUID; locate this in
the test that calls spawnAgent('claude-code', '/p', 'hi') and uses spawnCall(),
driveSpawnToCompletion(), and replace the current `/^[0-9a-f-]{36}$/` check with
the v4 UUID regex.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b3e0e64a-b238-4bee-9b2b-ba83099c1ee5

📥 Commits

Reviewing files that changed from the base of the PR and between 0193807 and 2c7244b.

📒 Files selected for processing (1)

src/__tests__/cli/services/agent-spawner.test.ts

Resolves conflict in src/cli/services/agent-spawner.ts. Both branches fixed the same PR RunMaestro#888 review issue (don't fold agent defaultEnvVars into the "user override" slot of the env layer merge) but with slightly different shapes. Adopts upstream's cleaner single-field return (`userCustomEnvVars`) since our branch's parallel `effectiveCustomEnvVars` field had no remaining callers after 8a36a97. Keeps our `applyEnvLayers` fix (merged defaults+ batchMode iteration so batch-mode wins on shared keys) and the precedence-bug docstring on `buildSshEnvForRemote`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

greptile-apps Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread src/cli/services/agent-spawner.ts

Comment thread src/cli/services/agent-spawner.ts

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread src/cli/services/agent-spawner.ts

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread src/__tests__/cli/services/agent-spawner.test.ts

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

chr1syy mentioned this pull request Apr 24, 2026

feat(cli): add gist create to publish session transcripts to GitHub #898

Merged

8 tasks

chr1syy added the ready to merge This PR is ready to merge label Apr 26, 2026

pedramamini merged commit 59910e5 into RunMaestro:rc Apr 27, 2026
3 checks passed

coderabbitai Bot mentioned this pull request May 5, 2026

fix(codex): place -C working directory flag before exec subcommand #960

Open

4 tasks

Conversation

chr1syy commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Tests

Docs

Test plan

Automated

Manual verification — 2026-04-24, Linux

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

pedramamini commented Apr 23, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chr1syy commented Apr 23, 2026

Manual Testing Strategy — PR #888

Preflight

Shim agent (makes every scenario observable)

Scenario A — session customArgs threaded through

Scenario B — session customEnvVars + env precedence regression (new fix)

Scenario C — session customModel / customEffort

Scenario D — SSH remote enabled + valid

Scenario E — SSH remote enabled + unresolvable (hard-fail)

Scenario F — agent-level vs session-level precedence

Cleanup

Signal summary

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

chr1syy commented Apr 23, 2026 •

edited

Loading

coderabbitai Bot commented Apr 23, 2026 •

edited

Loading

Scenario A — session `customArgs` threaded through

Scenario B — session `customEnvVars` + env precedence regression (new fix)

Scenario C — session `customModel` / `customEffort`