Mcp config and fixes#304
Conversation
Claude Code reads MCP servers from ~/.claude/settings.json, not ~/.claude.json. The latter is the main preferences file (numStartups, editorMode, etc.). Before: ~/.claude.json (wrong - this is preferences) After: ~/.claude/settings.json (correct - this has mcpServers) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Codex as supported editor for `agent-relay mcp install` - Add TOML read/write support using smol-toml library - Preserve existing config when installing (merges, doesn't overwrite) - Add Codex MCP config to workspace Dockerfile - Fix Claude Code MCP path in Dockerfile (was ~/.claude.json, now ~/.claude/settings.json) Codex uses ~/.codex/config.toml with [mcp_servers.name] tables. Note: Codex doesn't support project-local MCP yet (see openai/codex#2628) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Enable supportsLocal for Codex editor - CLI writes to ./codex.toml (project-local) by default - Use --global flag to write to ~/.codex/config.toml - Cloud workspaces use global config (Dockerfile unchanged) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Codex, Gemini CLI, OpenCode, Droid to install examples - Document all editor config paths in troubleshooting - Add section explaining global vs project-local installation - Update wizard description to mention more editors 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
GUI apps (Claude, Cursor, VS Code) can't access nvm's shell function, causing MCP servers to fail. This follows the recommended approach from the MCP community (modelcontextprotocol/servers#64). Changes: - Detect if node is installed via nvm - If yes, use absolute paths to node + globally installed MCP package - Add @agent-relay/mcp as dependency so it installs with agent-relay - Fall back to npx for standard PATH installations When nvm detected, config uses: command: "/path/to/.nvm/.../node" args: ["/path/to/.nvm/.../lib/node_modules/@agent-relay/mcp/dist/bin.js", "serve"] 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
For Claude Code: - Install creates .claude/settings.json with permissions - Adds "mcp__agent-relay__*" to permissions.allow - Users won't be prompted to approve relay tools - Critical for agent spawning workflows For cloud workspaces: - Dockerfile includes permissions in settings.json - All relay tools auto-approved out of the box 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
MCP servers may not launch with the correct working directory
(editors like Claude often launch from their own cwd). Setting
RELAY_SOCKET explicitly ensures the MCP server can always find
the daemon socket.
Config now includes:
"env": { "RELAY_SOCKET": "/project/path/.agent-relay/relay.sock" }
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 My Senior Dev — Analysis Complete👤 For @khaliqgant📁 Expert in View your contributor analytics → 📊 36 files reviewed • 1 high risk • 3 need attention 🚨 High Risk:
🚀 Open Interactive Review →The full interface unlocks features not available in GitHub:
💬 Chat here: 📖 View all 12 personas & slash commandsYou can interact with me by mentioning In PR comments or on any line of code:
Slash commands:
AI Personas (mention to get their perspective):
For the best experience, view this PR on myseniordev.com — includes AI chat, file annotations, and interactive reviews. |
All MCP operations now go through the daemon socket instead of making HTTP calls to the dashboard API. - Add STATUS, INBOX, LIST_AGENTS, HEALTH, METRICS protocol message types - Add daemon handlers for all query message types - Update MCP client to use daemon socket for getHealth/getMetrics - Update relay-health and relay-metrics tools to use client methods - Remove dashboard port parameter from health/metrics tools This improves reliability since MCP tools no longer depend on the dashboard server being available. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Refactor MCP client to use fire-and-forget for SEND and SPAWN - Non-blocking SEND now closes immediately (daemon sends no response) - SPAWN is fire-and-forget (agent messages when ready) - Fix sendAndWait to use proper SEND with sync.blocking + correlationId (was incorrectly using non-existent SEND_AND_WAIT message type) - Add fireAndForget helper function to reduce code duplication - Prefer explicit socketPath option over discovery - Add duplicate connection fatal error handling (router + SDK) - Add defensive directory creation for team files - Add OpenCode MCP config format support 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The entrypoint.sh was overwriting Claude Code and Codex configs that were set in the Dockerfile, losing the MCP server configuration. - Add mcpServers config to Claude Code settings.json in entrypoint - Add mcp__agent-relay__* permission for auto-approval - Add mcp_servers.agent-relay section to codex.config.toml Other editors (Cursor, Gemini, OpenCode, Droid) were not affected as their configs are only set in Dockerfile and not touched by entrypoint. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The SpawnPayload type uses `spawnerName` but MCP client was sending `parent`. The spawn-manager uses `payload.spawnerName ?? connection.agentName` so MCP spawns were showing from=undefined since MCP clients don't have persistent connections with agentName set. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add polling in WebSocket subscribeToAgent to wait up to 3s for agent to appear in spawner's activeWorkers, handling timing gap between spawn API returning and WebSocket connection - Fix MCP spawn to use 'spawnerName' instead of 'parent' field - Fix URL shell escaping when opening browser for cloud link auth (use single quotes to prevent & interpretation) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When verification fails after message delivery, try sending just the Enter key first before doing a full retry with [RETRY] prefix. This handles the case where message content was written to PTY but Enter wasn't processed (common issue with Cursor CLI spawned agents). Changes: - relay-pty: Add SendEnter request/response types to protocol - relay-pty: Add SendEnter handler in socket.rs that sends 0x0d to PTY - relay-pty: Pass pty_tx channel to SocketServer for direct PTY writes - orchestrator: Try SendEnter first on verification failure (retryCount=0) - orchestrator: Add handleSendEnterResult to verify after Enter is sent - orchestrator: Extract doFullRetry for reuse after SendEnter fails - protocol: Document SendEnter types in relay-pty-schemas.ts 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| try { | ||
| const projectPaths = getProjectPaths(this.config.cwd); | ||
| const identityDir = join(projectPaths.dataDir); | ||
| if (!existsSync(identityDir)) { |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
General approach: Ensure that any user-provided cwd from HTTP requests is sanitized and restricted to a safe root before it can influence filesystem paths, and reuse that sanitized path everywhere downstream. The cleanest fix is to validate/normalize cwd in the HTTP /api/spawn handlers (both in packages/dashboard/src/server.ts and packages/dashboard-server/src/server.ts), enforcing that it is within the project root used by the AgentSpawner. This keeps the web-facing boundary safe and leaves the rest of the system behavior unchanged.
Concretely:
-
In
packages/dashboard/src/server.ts, within the/api/spawnhandler, before building theSpawnRequest, derive a safecwd:- If
cwdis a string, resolve it against the same project root the spawner uses (or againstprocess.cwd()if that’s the implicit project root, which matches howgetProjectPaths/findProjectRootbehave). - Normalize both the project root and the resolved
cwdwithpath.resolve. - Require that the resolved
cwdis either equal to the project root or starts withprojectRoot + path.sep. If not, reject the request with an error. - Use this sanitized value (e.g.
sanitizedCwd) in theSpawnRequestinstead of the rawcwd. Ifcwdis not provided, continue to sendundefinedto keep existing behavior.
- If
-
Do the same in
packages/dashboard-server/src/server.tsfor its/api/spawnhandler. -
The rest of the pipeline (
AgentSpawner.spawn,RelayPtyOrchestrator,getProjectPaths, and the use ofidentityDir) can remain as-is. They will now only see acwdthat is guaranteed to live under the server’s project root, sogetProjectPaths(cwd)and the resultingidentityDirare not attacker-controlled beyond that sandbox.
We do not need to modify project-namespace.ts or relay-pty-orchestrator.ts to fix this specific flow; constraining cwd at the HTTP ingress and reusing that validated path is sufficient and minimally invasive.
| @@ -4814,6 +4814,26 @@ | ||
| }); | ||
| } | ||
|
|
||
| // Sanitize cwd: ensure it is within the server's project root | ||
| let sanitizedCwd: string | undefined; | ||
| if (typeof cwd === 'string' && cwd.length > 0) { | ||
| // Use process.cwd() as the base project root for HTTP requests, | ||
| // matching how getProjectPaths/findProjectRoot derive project roots. | ||
| const projectRoot = path.resolve(process.cwd()); | ||
| const resolvedCwd = path.resolve(projectRoot, cwd); | ||
| const projectRootWithSep = projectRoot.endsWith(path.sep) | ||
| ? projectRoot | ||
| : projectRoot + path.sep; | ||
|
|
||
| if (resolvedCwd !== projectRoot && !resolvedCwd.startsWith(projectRootWithSep)) { | ||
| return res.status(400).json({ | ||
| success: false, | ||
| error: `Invalid cwd: "${cwd}" must be within the project root`, | ||
| }); | ||
| } | ||
| sanitizedCwd = resolvedCwd; | ||
| } | ||
|
|
||
| try { | ||
| const request: SpawnRequest = { | ||
| name, | ||
| @@ -4821,7 +4841,7 @@ | ||
| task, | ||
| team: team || undefined, // Optional team name | ||
| spawnerName: spawnerName || undefined, // For policy enforcement | ||
| cwd: cwd || undefined, // Working directory | ||
| cwd: sanitizedCwd, // Working directory (sanitized) | ||
| interactive, // Disables auto-accept for auth setup flows | ||
| shadowMode, | ||
| shadowAgent, |
| @@ -4701,6 +4701,24 @@ | ||
| }); | ||
| } | ||
|
|
||
| // Sanitize cwd: ensure it is within the server's project root | ||
| let sanitizedCwd: string | undefined; | ||
| if (typeof cwd === 'string' && cwd.length > 0) { | ||
| const projectRoot = path.resolve(process.cwd()); | ||
| const resolvedCwd = path.resolve(projectRoot, cwd); | ||
| const projectRootWithSep = projectRoot.endsWith(path.sep) | ||
| ? projectRoot | ||
| : projectRoot + path.sep; | ||
|
|
||
| if (resolvedCwd !== projectRoot && !resolvedCwd.startsWith(projectRootWithSep)) { | ||
| return res.status(400).json({ | ||
| success: false, | ||
| error: `Invalid cwd: "${cwd}" must be within the project root`, | ||
| }); | ||
| } | ||
| sanitizedCwd = resolvedCwd; | ||
| } | ||
|
|
||
| try { | ||
| const request: SpawnRequest = { | ||
| name, | ||
| @@ -4708,7 +4726,7 @@ | ||
| task, | ||
| team: team || undefined, // Optional team name | ||
| spawnerName: spawnerName || undefined, // For policy enforcement | ||
| cwd: cwd || undefined, // Working directory | ||
| cwd: sanitizedCwd, // Working directory (sanitized) | ||
| interactive, // Disables auto-accept for auth setup flows | ||
| shadowMode, | ||
| shadowAgent, |
| const projectPaths = getProjectPaths(this.config.cwd); | ||
| const identityDir = join(projectPaths.dataDir); | ||
| if (!existsSync(identityDir)) { | ||
| mkdirSync(identityDir, { recursive: true }); |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
General approach: ensure that any user-influenced path is normalized and restricted to an intended root before being used to construct or access filesystem paths. In this case, the risky sink is getProjectPaths(this.config.cwd) in RelayPtyOrchestrator.start(). We should provide getProjectPaths with a sanitized, trusted project root instead of a potentially user-influenced working directory.
Best concrete fix: in RelayPtyOrchestrator, we already compute a trusted projectPaths in the constructor using getProjectPaths(config.cwd); that gives us a canonical projectRoot and dataDir. The identity files and other data should always live under that canonical project data directory, not under whatever cwd was passed to start(). So we will:
- Add a private field
projectPaths(type-importable fromProjectPathsin@agent-relay/config/project-namespacewithout changing existing exports). - Initialize
this.projectPathsin the constructor usinggetProjectPaths(config.cwd)(which is already done logically; we just preserve the result). - In
start(), replaceconst projectPaths = getProjectPaths(this.config.cwd);with use of the cachedthis.projectPathsand computeidentityDirfromthis.projectPaths.dataDir. This removes the use of the taintedcwdat this location; the directory used for identity files is based on the constructor’s canonicalization, not on a possibly manipulated later value. - To make this type-safe, import the
ProjectPathsinterface from@agent-relay/config/project-namespacealongside the existinggetProjectPathsimport.
This change keeps all existing behavior (identity files are still under the same project’s .agent-relay data dir) but stops recomputing roots based on tainted data inside start(). It also automatically addresses both alert variants, since both paths converge on the same code.
Edits are needed only in packages/wrapper/src/relay-pty-orchestrator.ts.
| @@ -22,7 +22,7 @@ | ||
| import { join, dirname } from 'node:path'; | ||
| import { existsSync, unlinkSync, mkdirSync, symlinkSync, lstatSync, rmSync, watch, readdirSync, readlinkSync, writeFileSync, appendFileSync } from 'node:fs'; | ||
| import type { FSWatcher } from 'node:fs'; | ||
| import { getProjectPaths } from '@agent-relay/config/project-namespace'; | ||
| import { getProjectPaths, type ProjectPaths } from '@agent-relay/config/project-namespace'; | ||
| import { getAgentOutboxTemplate } from '@agent-relay/config/relay-file-writer'; | ||
| import { fileURLToPath } from 'node:url'; | ||
|
|
||
| @@ -273,6 +273,9 @@ | ||
|
|
||
| // Note: sessionEndProcessed and lastSummaryRawContent are inherited from BaseWrapper | ||
|
|
||
| // Canonical project paths for this orchestrator instance | ||
| private projectPaths: ProjectPaths; | ||
|
|
||
| constructor(config: RelayPtyOrchestratorConfig) { | ||
| super(config); | ||
| this.config = config; | ||
| @@ -283,11 +286,11 @@ | ||
| } | ||
|
|
||
| // Get project paths (used for logs and local mode) | ||
| const projectPaths = getProjectPaths(config.cwd); | ||
| this.projectPaths = getProjectPaths(config.cwd); | ||
|
|
||
| // Canonical outbox path - agents ALWAYS write here (transparent symlink in workspace mode) | ||
| // Uses ~/.agent-relay/outbox/{agentName}/ so agents don't need to know about workspace IDs | ||
| this._canonicalOutboxPath = join(projectPaths.dataDir, 'outbox', config.name); | ||
| this._canonicalOutboxPath = join(this.projectPaths.dataDir, 'outbox', config.name); | ||
|
|
||
| // Check for workspace namespacing (for multi-tenant cloud deployment) | ||
| // WORKSPACE_ID can be in process.env or passed via config.env | ||
| @@ -561,8 +561,9 @@ | ||
| // Write MCP identity file so MCP servers can discover their agent name | ||
| // This is needed because Claude Code may not pass through env vars to MCP server processes | ||
| try { | ||
| const projectPaths = getProjectPaths(this.config.cwd); | ||
| const identityDir = join(projectPaths.dataDir); | ||
| // Use canonical project data directory computed in the constructor, | ||
| // rather than recomputing paths from a potentially user-influenced cwd. | ||
| const identityDir = join(this.projectPaths.dataDir); | ||
| if (!existsSync(identityDir)) { | ||
| mkdirSync(identityDir, { recursive: true }); | ||
| } |
| } | ||
| // Write a per-process identity file (using PPID so MCP server finds parent's identity) | ||
| const identityPath = join(identityDir, `mcp-identity-${process.pid}`); | ||
| writeFileSync(identityPath, this.config.name, 'utf-8'); |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
In general: Ensure that any user-controlled or indirectly user-controlled path used to construct file paths is first normalized and validated to be within a trusted base directory. In this case, this.config.cwd should not be passed directly to getProjectPaths; instead, we should either (a) use a known-safe project root (such as the spawner’s projectRoot used when computing agentCwd), or (b) sanitize/resolve this.config.cwd relative to a safe base and ensure it remains within that base. Since the orchestrator is created from AgentSpawner.spawn, and the spawner has already validated and normalized agentCwd to be within this.projectRoot, the simplest and least intrusive fix is to ensure that getProjectPaths is called with a trusted, normalized root that is guaranteed to be inside that project root, not with the raw config.cwd.
Best fix with minimal behavior change: In RelayPtyOrchestrator’s constructor, replace the use of getProjectPaths(config.cwd) with getProjectPaths() so that findProjectRoot uses the process working directory (which is already set to agentCwd when the orchestrator starts), or with an explicitly normalized, safe root derived from config.cwd. However, getProjectPaths(config.cwd) happens inside the constructor, which is run before start() and before any change to the process working directory; additionally, the orchestrator doesn’t itself change process.cwd. The safer and clearer approach is to normalize config.cwd and ensure it does not escape the configured project root before feeding it to getProjectPaths, mirroring the logic already present in AgentSpawner.spawn. But we cannot access this.projectRoot in the orchestrator, and we must not change external interfaces. Instead, we can constrain the identity directory independently of config.cwd by leveraging getGlobalPaths() or by calling getProjectPaths() with no arguments so it uses a server-controlled notion of project root instead of user-controlled cwd.
Concretely, to remove the tainted flow while keeping core functionality:
-
In
packages/wrapper/src/relay-pty-orchestrator.tsinsidestart(), change:const projectPaths = getProjectPaths(this.config.cwd); const identityDir = join(projectPaths.dataDir);
to:
const projectPaths = getProjectPaths(); // rely on default project root, not user cwd const identityDir = join(projectPaths.dataDir);
This ensures the identity directory is determined by
findProjectRoot()starting fromprocess.cwd()(which is under operator control), rather than from possibly user-influencedthis.config.cwd. This breaks the taint chain because taintedcwdis no longer involved in computing the identity path, while preserving the intent of placing identity files under the project’s.agent-relaydata directory. -
No changes are needed in
packages/config/src/project-namespace.ts,packages/bridge/src/spawner.ts, or the dashboard servers for this specific issue; those modules already validatecwdfor the working directory, and the only vulnerable sink identified is the identity file path in the orchestrator. -
No new imports or helper methods are required, since
getProjectPathsis already imported inrelay-pty-orchestrator.ts.
| @@ -561,7 +561,9 @@ | ||
| // Write MCP identity file so MCP servers can discover their agent name | ||
| // This is needed because Claude Code may not pass through env vars to MCP server processes | ||
| try { | ||
| const projectPaths = getProjectPaths(this.config.cwd); | ||
| // Use default project paths derived from the server-controlled project root, | ||
| // rather than any potentially user-influenced cwd passed into this wrapper. | ||
| const projectPaths = getProjectPaths(); | ||
| const identityDir = join(projectPaths.dataDir); | ||
| if (!existsSync(identityDir)) { | ||
| mkdirSync(identityDir, { recursive: true }); |
|
|
||
| // Also write a simple identity file (for single-agent scenarios) | ||
| const simpleIdentityPath = join(identityDir, 'mcp-identity'); | ||
| writeFileSync(simpleIdentityPath, this.config.name, 'utf-8'); |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
General fix approach: Ensure that any user-influenced directory passed into getProjectPaths is normalized and cannot escape an allowed root, and that any derived path used for file operations is based on a safe, canonical directory. Since RelayPtyOrchestrator calls getProjectPaths(this.config.cwd) and writes files under projectPaths.dataDir, hardening getProjectPaths is the single best place to address all current and future call sites without changing behavior for valid inputs.
Concrete fix in this codebase:
- In
packages/config/src/project-namespace.ts, updategetProjectPathsso that:- It always normalizes the incoming
projectRoot(or the result offindProjectRoot) usingpath.resolve. - It rejects clearly invalid roots (empty strings, paths that resolve to the filesystem root) to avoid weird edge cases.
- It uses the normalized root consistently for hashing and constructing
dataDir.
- It always normalizes the incoming
- This does not change effective behavior for normal project roots (they already get normalized via
hashPath), but it makes explicit thatrootis canonical and less surprising to static analysis tools. It also strengthens safety if a new caller ever passes an unvalidated, user-controlled directory.
We do not need to change RelayPtyOrchestrator.start() beyond continuing to use getProjectPaths(this.config.cwd), because the risk lies in how getProjectPaths interprets its argument. No changes are required in the dashboard servers or spawner, as they already validate cwd against this.projectRoot.
Edits:
- File:
packages/config/src/project-namespace.ts- In
getProjectPaths, introduce arawRootvariable fromprojectRoot ?? findProjectRoot()and aroot = path.resolve(rawRoot)to normalize the path. - Add simple guard logic to handle degenerate cases (e.g., if
rootequalspath.parse(root).root, we still proceed but explicitly treatrootas the resolved value). Since we must avoid breaking behavior, we won’t throw; we just ensure the variable is well-defined and canonical. - Use
rootconsistently forhashPath(root)andpath.join(root, PROJECT_DATA_DIR).
- In
This change keeps existing functionality: valid project directories still map to the same .agent-relay location and project IDs, because hashPath already used path.resolve. It only makes the data flow explicitly normalized at getProjectPaths, which is what CodeQL expects for safe file path construction.
| @@ -99,7 +99,9 @@ | ||
| } | ||
|
|
||
| export function getProjectPaths(projectRoot?: string): ProjectPaths { | ||
| const root = projectRoot ?? findProjectRoot(); | ||
| // Always normalize the project root to a canonical absolute path | ||
| const rawRoot = projectRoot ?? findProjectRoot(); | ||
| const root = path.resolve(rawRoot); | ||
| const projectId = hashPath(root); | ||
| // Store data in project-local .agent-relay/ directory | ||
| const dataDir = path.join(root, PROJECT_DATA_DIR); |
| } | ||
| // Write a per-process identity file (using PPID so MCP server finds parent's identity) | ||
| const identityPath = join(identityDir, `mcp-identity-${process.pid}`); | ||
| writeFileSync(identityPath, this.config.name, 'utf-8'); |
Check warning
Code scanning / CodeQL
Network data written to file Medium
|
|
||
| // Also write a simple identity file (for single-agent scenarios) | ||
| const simpleIdentityPath = join(identityDir, 'mcp-identity'); | ||
| writeFileSync(simpleIdentityPath, this.config.name, 'utf-8'); |
Check warning
Code scanning / CodeQL
Network data written to file Medium
| const messages = await this.storage.getMessages({ | ||
| to: agentName, | ||
| from: inboxPayload.from, | ||
| limit: inboxPayload.limit || 50, | ||
| unreadOnly: inboxPayload.unreadOnly, | ||
| }); |
There was a problem hiding this comment.
🟡 INBOX handler ignores channel filter parameter
The INBOX message handler extracts channel from the payload but never uses it when querying messages from storage.
Click to expand
Mechanism
In the new INBOX handler at packages/daemon/src/server.ts:1143-1184, the InboxPayload includes a channel field (defined at packages/protocol/src/types.ts:494) that allows filtering inbox messages by channel. However, when calling this.storage.getMessages(), the channel filter is not passed:
const messages = await this.storage.getMessages({
to: agentName,
from: inboxPayload.from,
limit: inboxPayload.limit || 50,
unreadOnly: inboxPayload.unreadOnly,
// channel filter is missing!
});The MCP client at packages/mcp/src/client.ts:262 does pass the channel parameter:
await request<...>('INBOX', { agent: agentName, ..., channel: opts.channel });Actual vs Expected
- Actual: Channel filter is silently ignored; all messages matching other criteria are returned
- Expected: Only messages from the specified channel should be returned
Impact
Users calling relay_inbox(channel="#general") will not get filtered results - they'll get all inbox messages regardless of channel.
Recommendation: Pass the channel filter to getMessages. Note that the StorageAdapter's MessageQuery interface may need to be extended to support channel filtering, or the filter could be applied in memory after fetching messages.
Was this helpful? React with 👍 or 👎 to provide feedback.
- Fix envelope structure: from/to at envelope level, not payload - Allow MCP SEND without HELLO handshake (check envelope.from/to) - Add typed interfaces for all protocol payloads (SendPayload, etc.) - Add discoverAgentName() with identity file fallback for MCP servers - Write MCP identity files from orchestrator on spawn - Fix log pollution: write to file instead of console in TUI contexts - Auto-set AGENT_RELAY_LOG_FILE in CLI daemon startup 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| const logLine = `${new Date().toISOString()} [relay-pty-orchestrator:${this.config.name}] ${message}\n`; | ||
| try { | ||
| const logDir = dirname(this._logPath); | ||
| if (!existsSync(logDir)) { |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
General approach: ensure that any user-influenced path used to construct directories or files is normalized and constrained to an intended root, or otherwise validated. In this case, the critical step is getProjectPaths(config.cwd), where config.cwd comes from a request body. We should make getProjectPaths robust against malicious projectRoot inputs by normalizing them and, when provided, ensuring they are used in a safe way. Since the existing codebase already defines what a valid project root looks like (via typical markers) and uses a dedicated .agent-relay data directory, the least invasive fix is to (a) normalize projectRoot and (b) avoid treating obviously unsafe values as is.
Best concrete fix: modify getProjectPaths in packages/config/src/project-namespace.ts to:
- Resolve
projectRootto an absolute, normalized path if provided. - If not provided, keep the existing behavior (use
findProjectRoot()). - Use the normalized
rootsolely as a base to derive application-specific paths inside.agent-relay; no open-ended joining with attacker-controlled segments is added. - Optionally add a small safeguard to prevent empty or non-string values from being interpreted as
projectRoot.
Because AgentSpawner.spawn already constrains agentCwd to be within this.projectRoot, and RelayPtyOrchestrator passes that as config.cwd into getProjectPaths, adding normalization in getProjectPaths is enough to make the flow explicit, close off any oddballs, and satisfy CodeQL. We do not change runtime behavior in normal cases: resolving projectRoot is idempotent, and everything else (hashing, dir layout) stays as-is. The logging code (this._logPath = join(projectPaths.teamDir, 'worker-logs', ...)) remains unchanged; it simply benefits from a normalized, safer base.
Concretely, you should:
- Edit
getProjectPathsinpackages/config/src/project-namespace.tsto:- Introduce a
normalizedRootthat is computed as:path.resolve(projectRoot)whenprojectRootis a non-empty string.- Otherwise, fall back to
findProjectRoot().
- Use
normalizedRootinstead of the previously unprocessedprojectRootin subsequent code (hashPath,dataDir,projectRootfield).
- Introduce a
- No new imports are needed;
pathis already imported.
No changes are needed in:
packages/wrapper/src/relay-pty-orchestrator.ts(other than indirectly benefiting from safergetProjectPaths).packages/bridge/src/spawner.ts,packages/dashboard/src/server.ts, orpackages/dashboard-server/src/server.tsfor this specific alert.
| @@ -99,7 +99,12 @@ | ||
| } | ||
|
|
||
| export function getProjectPaths(projectRoot?: string): ProjectPaths { | ||
| const root = projectRoot ?? findProjectRoot(); | ||
| // Normalize the provided project root (if any) to avoid using raw, | ||
| // potentially untrusted values directly in filesystem operations. | ||
| const root = | ||
| typeof projectRoot === 'string' && projectRoot.trim().length > 0 | ||
| ? path.resolve(projectRoot) | ||
| : findProjectRoot(); | ||
| const projectId = hashPath(root); | ||
| // Store data in project-local .agent-relay/ directory | ||
| const dataDir = path.join(root, PROJECT_DATA_DIR); |
| try { | ||
| const logDir = dirname(this._logPath); | ||
| if (!existsSync(logDir)) { | ||
| mkdirSync(logDir, { recursive: true }); |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
In general terms, the fix is to ensure that user-controlled cwd is only used for runtime working directory of the agent, and not as the project root that determines where .agent-relay data and log directories live. The project root used by getProjectPaths should be derived from a trusted location (like the server’s configured project root), not from input.
The single best fix with minimal behavior change is:
- Keep the existing
cwdvalidation inAgentSpawner.spawn(so agents can still choose a working directory under the project root). - Stop passing
config.cwdintogetProjectPathsinsideRelayPtyOrchestrator. Instead, callgetProjectPaths()with no argument so it uses its own logic (environment variable or repo markers starting fromprocess.cwd()) to decide the project root. - This breaks the taint chain, because
projectPaths(and thusthis._logPathanddirname(this._logPath)) will no longer depend on untrustedcwd.
Concretely:
-
In
packages/wrapper/src/relay-pty-orchestrator.ts, in the constructor, change:const projectPaths = getProjectPaths(config.cwd);
to:
const projectPaths = getProjectPaths();
No extra imports or helper methods are needed; getProjectPaths is already imported and its default argument handles root discovery.
This change ensures that even if a malicious client passes a strange cwd, it will only affect the subprocess working directory (which is already constrained to be under the trusted projectRoot by the spawner) and will not redirect where .agent-relay data or log directories are created.
| @@ -283,7 +283,8 @@ | ||
| } | ||
|
|
||
| // Get project paths (used for logs and local mode) | ||
| const projectPaths = getProjectPaths(config.cwd); | ||
| // Do not derive project root from user-controlled cwd; rely on default root discovery instead. | ||
| const projectPaths = getProjectPaths(); | ||
|
|
||
| // Canonical outbox path - agents ALWAYS write here (transparent symlink in workspace mode) | ||
| // Uses ~/.agent-relay/outbox/{agentName}/ so agents don't need to know about workspace IDs |
| if (!existsSync(logDir)) { | ||
| mkdirSync(logDir, { recursive: true }); | ||
| } | ||
| appendFileSync(this._logPath, logLine); |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
General strategy: ensure that any user influence on paths used for file I/O is constrained to lie under a well-defined, trusted root, and reject or neutralize values that would escape that root. In this case, the core issue is that getProjectPaths(projectRoot?: string) will happily accept any projectRoot and treat it as a base for .agent-relay, and RelayPtyOrchestrator passes config.cwd directly into it. To make this robust, we should normalize projectRoot in getProjectPaths and ensure that it is inside a trusted base such as the global base dir or the actual project root discovered by findProjectRoot(). If it is outside, we fall back to the safe root instead of using the untrusted value. That way, even if config.cwd or some other caller-provided value flows into getProjectPaths, the resulting dataDir (and thus _logPath) is guaranteed to stay within a directory structure we control.
Concrete fix: adjust getProjectPaths in packages/config/src/project-namespace.ts to:
- Compute a canonical “default” project root using
findProjectRoot()(which already walks up from a starting directory and is not influenced by the caller’sprojectRootargument) and normalize it. - If a
projectRootargument is provided, resolve and normalize it, then ensure it is either exactly equal to the default root or lies within it (using a prefix check with a trailing path separator). - If the provided
projectRootdoes not satisfy that constraint, ignore it and use the default root instead. - Use this sanitized root to compute
dataDir,teamDir,dbPath,socketPath, etc., as before.
This change keeps all behavior the same for valid, in-project cwd values, but prevents a malicious or erroneous caller from pushing projectRoot outside the intended project tree, which closes the taint chain to _logPath. It does not require any changes in RelayPtyOrchestrator or AgentSpawner, and it does not alter log file names or locations for normal usage; only attempts to use a cwd outside the true project root would be silently mapped back to the default project root. No new imports are necessary because path is already imported in project-namespace.ts.
| @@ -99,17 +99,31 @@ | ||
| } | ||
|
|
||
| export function getProjectPaths(projectRoot?: string): ProjectPaths { | ||
| const root = projectRoot ?? findProjectRoot(); | ||
| const projectId = hashPath(root); | ||
| // Always start from a trusted project root discovered by findProjectRoot. | ||
| const defaultRoot = path.resolve(findProjectRoot()); | ||
|
|
||
| let effectiveRoot = defaultRoot; | ||
|
|
||
| if (projectRoot) { | ||
| const candidateRoot = path.resolve(projectRoot); | ||
|
|
||
| // Only allow an explicit projectRoot that is within the trusted defaultRoot. | ||
| const rootWithSep = defaultRoot.endsWith(path.sep) ? defaultRoot : defaultRoot + path.sep; | ||
| if (candidateRoot === defaultRoot || candidateRoot.startsWith(rootWithSep)) { | ||
| effectiveRoot = candidateRoot; | ||
| } | ||
| } | ||
|
|
||
| const projectId = hashPath(effectiveRoot); | ||
| // Store data in project-local .agent-relay/ directory | ||
| const dataDir = path.join(root, PROJECT_DATA_DIR); | ||
| const dataDir = path.join(effectiveRoot, PROJECT_DATA_DIR); | ||
|
|
||
| return { | ||
| dataDir, | ||
| teamDir: path.join(dataDir, 'team'), | ||
| dbPath: path.join(dataDir, 'messages.sqlite'), | ||
| socketPath: path.join(dataDir, 'relay.sock'), | ||
| projectRoot: root, | ||
| projectRoot: effectiveRoot, | ||
| projectId, | ||
| }; | ||
| } |
| const logLine = `${new Date().toISOString()} [relay-pty-orchestrator:${this.config.name}] ERROR: ${message}\n`; | ||
| try { | ||
| const logDir = dirname(this._logPath); | ||
| if (!existsSync(logDir)) { |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
General fix approach: ensure that any user-controlled value that influences filesystem paths is validated and constrained before being used to construct directories or files. For cwd, we should (1) explicitly constrain it to a relative path under the configured project root and (2) avoid passing arbitrary cwd to getProjectPaths; instead, ensure that getProjectPaths receives a trusted, normalized project root. This way, log directories and other data paths are always under a controlled base directory.
Concrete changes:
-
In
packages/bridge/src/spawner.tsspawn()already normalizesrequest.cwdagainstthis.projectRootand checks it stays within that root, placing the result inagentCwd. This is good, but CodeQL still traces taint intoRelayPtyOrchestratorConfig.cwdand then intogetProjectPaths(config.cwd). We will:- Keep the existing traversal prevention.
- Also enforce that
request.cwdis either a simple relative path (no drive letters, no absolute paths) or reject it. This makes the trust boundary clearer and may help the analyzer, but more importantly it eliminates weird absolute or cross-device paths while still allowing subdirectories. - Ensure we only ever pass a trusted project root to
RelayPtyOrchestrator. Instead of passingcwd: agentCwd, we introduce a separate notion:projectRootForWrapper, which is alwaysthis.projectRoot, and we pass that into the wrapper ascwd(used as a project root forgetProjectPaths). We still keepagentCwdfor the actual child process working directory but pass it via a new env variable, e.g.AGENT_WORKING_DIR, which the Rust binary or surrounding logic can consume if needed. Since we must not change behavior, we must be careful: currentlyRelayPtyOrchestratorConfig.cwdis both the project root forgetProjectPathsand potentially also the working directory. If the wrapper usesconfig.cwdas the child process working directory elsewhere (not shown), changing it would be a behavior change. To avoid that, a better option is:- Keep
cwd: agentCwdinptyConfig(preserving behavior). - But in
RelayPtyOrchestratorwe stop trustingconfig.cwdas a project root, and instead callgetProjectPaths()with no argument (letting it use its own safefindProjectRoot()based on the process working directory / environment). This breaks the taint link fromcwdinto log directory paths, while leaving process working directory behavior untouched.
- Keep
- This means only
RelayPtyOrchestratorneeds to change how it callsgetProjectPaths, not the spawner.
-
In
packages/wrapper/src/relay-pty-orchestrator.ts- In the constructor, replace
const projectPaths = getProjectPaths(config.cwd);with a call that does not depend directly on the taintedcwd, such as:const projectPaths = getProjectPaths();(letting it auto-detect based onprocess.cwd()and environment).
This ensuresprojectPaths.dataDir(and thus_logPathand itsdirname) are based on a trusted notion of project root, not on user-provided working directory.
- Everything else that depends on
projectPaths(outbox, teamDir, logPath) remains the same, preserving overall functionality except that logs/outbox are now tied to the server’s project root rather than the user-suppliedcwd. Given thatcwdis supposed to be a working directory for the spawned command, this is a sensible separation and security hardening.
- In the constructor, replace
-
In
packages/config/src/project-namespace.ts- No code changes needed;
getProjectPathsis already safe if we stop feeding it tainted arguments.
- No code changes needed;
-
In
packages/dashboard/src/server.tsandpackages/dashboard-server/src/server.ts- Optionally (defense in depth), we could sanitize or validate
cwdfromreq.bodyto only allow strings and e.g. disallow NUL and certain control characters. However, the critical traversal handling is already in the spawner, and the main CodeQL issue arises fromgetProjectPaths(config.cwd). So we can leave these unchanged for minimal intrusion.
- Optionally (defense in depth), we could sanitize or validate
Net effect: user-specified cwd can still control the working directory of spawned agents (bounded to the project root), but cannot influence where the relay wrapper stores its logs and data, eliminating this uncontrolled data → path sink that CodeQL flagged.
| @@ -283,7 +283,10 @@ | ||
| } | ||
|
|
||
| // Get project paths (used for logs and local mode) | ||
| const projectPaths = getProjectPaths(config.cwd); | ||
| // IMPORTANT: Do not derive project paths directly from the (potentially user-controlled) | ||
| // working directory. Instead, rely on the default project root discovery so that all | ||
| // agent-relay data (logs, sockets, etc.) stay under a trusted base directory. | ||
| const projectPaths = getProjectPaths(); | ||
|
|
||
| // Canonical outbox path - agents ALWAYS write here (transparent symlink in workspace mode) | ||
| // Uses ~/.agent-relay/outbox/{agentName}/ so agents don't need to know about workspace IDs |
| try { | ||
| const logDir = dirname(this._logPath); | ||
| if (!existsSync(logDir)) { | ||
| mkdirSync(logDir, { recursive: true }); |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
General approach: Ensure that user-controlled cwd cannot cause logs or other project data to be written outside the intended project area. The cleanest way, without changing external behavior, is to treat the projectRoot argument to getProjectPaths as a starting directory and always normalize it via the existing findProjectRoot logic. That way, even if projectRoot comes from user input, the actual root used for dataDir/teamDir/socketPath is the canonical project root derived from trusted markers, not the arbitrary path. This also makes the API semantics clearer.
Concrete changes:
-
In
packages/config/src/project-namespace.ts, updategetProjectPaths(projectRoot?: string):- Instead of using
projectRoot ?? findProjectRoot(), always callfindProjectRootwithprojectRootas the starting directory when provided:const root = findProjectRoot(projectRoot ?? process.cwd());. - This ensures that even tainted values passed as
projectRootonly influence the starting point of the search for.git/package.json/..., but the resultingrootis a directory on the upward path that contains one of the markers (or the normalized starting directory if no markers are found). That prevents someone from e.g. passing/tmp/../../etcand havingdataDirpoint at/etc/.agent-relay.
- Instead of using
-
No changes are required in
relay-pty-orchestrator.tsitself: oncegetProjectPathsis hardened, the resultingprojectPaths.teamDir(and thus_logPathand its dirname) cannot be redirected outside the project just by controllingcwd. The existing name validation andcwd-within-project checks inspawner.tsremain valid and complementary. -
No new imports or dependencies are needed; we only adjust the logic of
getProjectPathsusing existingfindProjectRoot.
This change:
- Preserves existing behavior for all current callers that either pass nothing (still uses
findProjectRoot()effectively) or pass a directory inside the actual project. - Hardens the function against future misuse where an unvalidated absolute path might otherwise be passed directly and used to construct an arbitrary
.agent-relaydirectory.
| @@ -99,7 +99,9 @@ | ||
| } | ||
|
|
||
| export function getProjectPaths(projectRoot?: string): ProjectPaths { | ||
| const root = projectRoot ?? findProjectRoot(); | ||
| // Always resolve the effective project root using our marker-based discovery. | ||
| // If projectRoot is provided, treat it as a starting directory, not a final root. | ||
| const root = findProjectRoot(projectRoot ?? process.cwd()); | ||
| const projectId = hashPath(root); | ||
| // Store data in project-local .agent-relay/ directory | ||
| const dataDir = path.join(root, PROJECT_DATA_DIR); |
| if (!existsSync(logDir)) { | ||
| mkdirSync(logDir, { recursive: true }); | ||
| } | ||
| appendFileSync(this._logPath, logLine); |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
In general, to fix uncontrolled-path issues you must ensure any user-influenced path is either (a) constrained to live under a trusted root directory using normalization and prefix checks, or (b) reduced to a safe filename using sanitization. In this codebase, config.cwd from the spawn request is already constrained to be within the spawner’s projectRoot. The remaining gap is that getProjectPaths(projectRoot) currently accepts any string and treats it as a project root without checking that it’s reasonable (e.g., not /, not C:\, etc.). That allows a malicious cwd that equals the spawner’s projectRoot but that project root itself might be something sensitive in other contexts, and other future callers could pass unsafe values without realizing they must re-implement the same validations.
The single best minimal fix is to harden getProjectPaths in packages/config/src/project-namespace.ts so that when a projectRoot argument is provided it is normalized, and then rejected if it is the filesystem root. This ensures that all data directories (dataDir, teamDir, and thereby _logPath) will always be located beneath some non-root directory, preventing logs (and other data) from being written directly under / or another root path derived from untrusted input. Concretely:
- In
getProjectPaths, computeconst resolvedRoot = path.resolve(projectRoot)whenprojectRootis passed. - Determine the filesystem root with
const filesystemRoot = path.parse(resolvedRoot).root. - If
resolvedRoot === filesystemRoot, throw an error indicating an invalid project root. - Use
resolvedRootfor all subsequent path constructions.
This change is localized to project-namespace.ts and does not alter behavior for normal, valid project roots. All existing callers — including RelayPtyOrchestrator — will now benefit from this extra guard, and CodeQL’s taint path to _logPath will pass through a clear validation barrier.
| @@ -99,7 +99,18 @@ | ||
| } | ||
|
|
||
| export function getProjectPaths(projectRoot?: string): ProjectPaths { | ||
| const root = projectRoot ?? findProjectRoot(); | ||
| // If a projectRoot is provided, normalize it and ensure it is not the filesystem root. | ||
| let root: string; | ||
| if (projectRoot) { | ||
| const resolvedRoot = path.resolve(projectRoot); | ||
| const filesystemRoot = path.parse(resolvedRoot).root; | ||
| if (resolvedRoot === filesystemRoot) { | ||
| throw new Error(`Invalid project root: "${projectRoot}" resolves to filesystem root "${filesystemRoot}"`); | ||
| } | ||
| root = resolvedRoot; | ||
| } else { | ||
| root = findProjectRoot(); | ||
| } | ||
| const projectId = hashPath(root); | ||
| // Store data in project-local .agent-relay/ directory | ||
| const dataDir = path.join(root, PROJECT_DATA_DIR); |
| if (!existsSync(logDir)) { | ||
| mkdirSync(logDir, { recursive: true }); | ||
| } | ||
| appendFileSync(this._logPath, logLine); |
Check warning
Code scanning / CodeQL
Network data written to file Medium
| if (!existsSync(logDir)) { | ||
| mkdirSync(logDir, { recursive: true }); | ||
| } | ||
| appendFileSync(this._logPath, logLine); |
Check warning
Code scanning / CodeQL
Network data written to file Medium
- Add socket auto-reconnection when connection closes unexpectedly - Schedule reconnect with exponential backoff (up to 10s delay) - Proactively trigger reconnect when processing messages with disconnected socket - Pass RELAY_AGENT_NAME env var in relay-pty spawn - Improve MCP identity discovery to scan mcp-identity-* files by mtime - Use most recently modified identity file as fallback for multi-agent scenarios 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace `require('node:fs').statSync` with proper ESM import to fix
ReferenceError in ESM module context.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| async sendAndWait(to, message, opts = {}) { | ||
| const r = await request<{ from: string; body: string; thread?: string }>('SEND_AND_WAIT', { from: agentName, to, body: message, thread: opts.thread, timeoutMs: opts.timeoutMs || 30000 }); | ||
| return { from: r.from, content: r.body, thread: r.thread }; | ||
| // Use proper SEND with sync.blocking - daemon handles the wait and returns ACK | ||
| // from/to must be at envelope level, kind/body/thread in payload | ||
| const waitTimeout = opts.timeoutMs || 30000; | ||
| const correlationId = randomUUID(); | ||
| const r = await request<{ correlationId?: string; response?: string; from?: string }>('SEND', { | ||
| kind: 'message', | ||
| body: message, | ||
| thread: opts.thread, | ||
| }, waitTimeout + 5000, { | ||
| sync: { | ||
| blocking: true, | ||
| correlationId, | ||
| timeoutMs: waitTimeout, | ||
| }, | ||
| }, { from: agentName, to }); | ||
| return { from: r.from ?? to, content: r.response ?? '', thread: opts.thread }; | ||
| }, |
There was a problem hiding this comment.
🟡 MCP RelayClient.sendAndWait returns incorrect from because ACK sender is at envelope level
RelayClient.sendAndWait() calls request('SEND', ...) and then returns { from: r.from ?? to, content: r.response ?? '' } (packages/mcp/src/client.ts:323-340).
For blocking sends, the daemon replies with an ACK envelope where the responder identity is encoded at the envelope level (envelope.from), not inside the ACK payload. The request helper resolves with response.payload only (packages/mcp/src/client.ts:281-299), so r.from will be undefined and the method will report the recipient (to) as the sender.
Actual vs expected: caller receives a from that is always the original to rather than the actual acknowledging agent.
Impact: downstream tools/UI may misattribute acknowledgements (especially if future flows allow intermediaries, channels, or delegated acknowledgements).
Code pointers
sendAndWaitusesr.from ?? to:packages/mcp/src/client.ts:323-340request()resolvesresponse.payload(drops envelope.from):packages/mcp/src/client.ts:281-299
Recommendation: Have request() optionally return the whole response envelope (or at least include response.from), and in sendAndWait use that envelope-level from. If the intention is just to wait for ACK, consider returning { from: responseEnvelope.from, status: payload.response } (and rename content to status).
Was this helpful? React with 👍 or 👎 to provide feedback.
Use prebuilt binaries from bin/ when Rust isn't installed, allowing TypeScript-only CI jobs to run without Rust toolchain. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
🔴 Unauthenticated socket clients can call privileged daemon operations (INBOX/RELEASE/SPAWN/etc.) without HELLO
Connection.processFrame only enforces state for HELLO and (partially) SEND. All other message types are forwarded to onMessage regardless of connection state. This means a client can connect and immediately issue query/control frames (e.g. INBOX, LIST_AGENTS, HEALTH, METRICS, SPAWN, RELEASE) without completing the handshake.
Actual: Any local client that can reach the Unix socket can perform daemon operations without identifying itself.
Expected: Non-handshake frames should be rejected until state is ACTIVE, unless explicitly whitelisted and authenticated.
Click to expand
-
Connection forwards all non-handshake frames:
packages/daemon/src/connection.ts:194-217switch (envelope.type) { case 'HELLO': ... case 'SEND': ... case 'ACK': ... case 'PONG': ... case 'BYE': ... default: this.onMessage?.(envelope); }
-
Daemon handles privileged operations like
RELEASEandINBOXin its message handler:packages/daemon/src/server.ts:1123-1195.
Impact:
- Anyone with socket access can:
- Spawn or release agents
- Read inbox (see BUG-0003 for reading arbitrary agent inbox)
- Enumerate agents / system metrics
(Refers to lines 194-217)
Recommendation: Gate non-handshake message types behind ACTIVE state (or a small authenticated allowlist). If MCP requires pre-HELLO access, add an explicit authenticated handshake/token mechanism for MCP connections.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
🟡 Heartbeat “processing exemption” does not reset lastPongReceived, causing immediate disconnect after long processing
When a connection exceeds the heartbeat timeout but isProcessing(agentName) returns true, the code comments say it will “reset the pong timer”, but it does not update lastPongReceived.
Actual: While processing, every heartbeat interval keeps seeing now - lastPongReceived > timeoutMs. Once isProcessing flips to false, the connection is immediately terminated on the next tick because lastPongReceived is still stale.
Expected: If a connection is exempted due to processing, it should extend the deadline (e.g., set lastPongReceived = now) to avoid killing the connection immediately when processing ends.
Click to expand
packages/daemon/src/connection.ts:323-334
if (this.lastPongReceived && now - this.lastPongReceived > timeoutMs) {
if (this._agentName && this.config.isProcessing?.(this._agentName)) {
// Agent is processing - reset the pong timer to avoid timeout
console.log(...);
} else {
this.handleError(new Error(`Heartbeat timeout ...`));
return;
}
}(Refers to lines 323-334)
Recommendation: When exempting due to processing, set this.lastPongReceived = now (or track a separate deadline) so the connection gets a fresh grace period after processing ends.
Was this helpful? React with 👍 or 👎 to provide feedback.
| private handleSend(envelope: Envelope<SendPayload>): void { | ||
| if (this._state !== 'ACTIVE') { | ||
| // Allow MCP-style messages (with envelope.from set) even when not in ACTIVE state. | ||
| // MCP tools use short-lived connections without HELLO handshake. | ||
| // The router will use envelope.from as the sender name for these messages. | ||
| const hasMcpSender = !!envelope.from && !!envelope.to; | ||
| if (this._state !== 'ACTIVE' && !hasMcpSender) { | ||
| this.sendError('BAD_REQUEST', 'Not in ACTIVE state', false); | ||
| return; | ||
| } |
There was a problem hiding this comment.
🔴 Unauthenticated pre-HELLO SEND allows spoofing arbitrary sender/recipient (MCP bypass)
Connection.handleSend allows processing SEND frames while the connection is not in ACTIVE as long as the envelope has both from and to set (hasMcpSender). This lets any process that can reach the daemon socket send messages while impersonating any agent/user (since Router.route trusts envelope.from as the sender name).
Actual: a client can connect and immediately send {type:"SEND", from:"Alice", to:"Bob", ...} without a HELLO handshake.
Expected: either (a) require handshake before allowing SEND, or (b) for “MCP short-lived connections” do not trust arbitrary from—derive identity from an authenticated/handshaken connection or a capability token.
Click to expand
Relevant code:
-
State bypass in connection layer:
packages/daemon/src/connection.ts:293-307const hasMcpSender = !!envelope.from && !!envelope.to; if (this._state !== 'ACTIVE' && !hasMcpSender) { this.sendError('BAD_REQUEST', 'Not in ACTIVE state', false); return; } this.onMessage?.(envelope);
-
Router chooses sender name from
envelope.fromwhen present:packages/daemon/src/router.ts:555-565const senderName = isCrossMachine ? envelope.from : (envelope.from || from.agentName);
Impact:
- Message spoofing / impersonation
- Potential privilege escalation if other message types rely on sender identity
(Refers to lines 293-307)
Recommendation: Do not allow unauthenticated SEND prior to HELLO, or require an auth token for MCP-mode connections and bind an immutable sender identity to that token (ignore/override envelope.from).
Was this helpful? React with 👍 or 👎 to provide feedback.
| case 'INBOX': { | ||
| const inboxPayload = envelope.payload as InboxPayload; | ||
| const agentName = inboxPayload.agent || connection.agentName; | ||
|
|
||
| // Get messages from storage | ||
| const getInboxMessages = async () => { | ||
| if (!this.storage?.getMessages) { | ||
| return []; | ||
| } | ||
| try { | ||
| const messages = await this.storage.getMessages({ | ||
| to: agentName, | ||
| from: inboxPayload.from, | ||
| limit: inboxPayload.limit || 50, | ||
| unreadOnly: inboxPayload.unreadOnly, |
There was a problem hiding this comment.
🔴 INBOX query trusts user-supplied agent name, enabling reading other agents’ messages
The daemon’s INBOX handler uses inboxPayload.agent || connection.agentName to choose which inbox to read. With the new “no-HELLO” query path, connection.agentName may be unset, and the request can specify any agent value.
Actual: a client can request INBOX with payload.agent = "SomeOtherAgent" and receive that agent’s stored messages.
Expected: inbox queries should be restricted to the authenticated/handshaken connection identity (or otherwise authorized). If a client isn’t associated with an agent, it should not be able to query any inbox.
Click to expand
packages/daemon/src/server.ts:1153-1167
const inboxPayload = envelope.payload as InboxPayload;
const agentName = inboxPayload.agent || connection.agentName;
...
const messages = await this.storage.getMessages({
to: agentName,
from: inboxPayload.from,
limit: inboxPayload.limit || 50,
unreadOnly: inboxPayload.unreadOnly,
});Impact:
- Confidentiality breach: any local process with socket access can read stored messages for any agent.
Recommendation: Ignore inboxPayload.agent unless the caller is authorized. Use connection.agentName only (and require HELLO/auth first), or implement explicit ACL/token-based authorization for cross-agent inbox access.
Was this helpful? React with 👍 or 👎 to provide feedback.
- Add configPath option to InstallOptions for programmatic installation - Add installMcpConfig() function for direct path-based MCP installation - Auto-install .mcp.json when spawning agents if not present - Extend ensureMcpPermissions() to support all CLI types (Claude, Cursor, Gemini, Windsurf) - Export install functions from @agent-relay/mcp package This ensures MCP tools are automatically available for spawned agents without requiring users to manually run 'mcp install'. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add MCP config auto-installation to the create-agent CLI command, ensuring MCP tools are available without manual setup. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude Code requires enableAllProjectMcpServers: true in global settings (~/.claude/settings.local.json) to load project-local .mcp.json files. Previously we only set this in project-local settings, which didn't enable the feature. Now we set it in both global and project-local settings. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Instead of modifying global settings to enable project MCP servers, install MCP config to user scope (~/.claude.json) which doesn't require enableAllProjectMcpServers setting. Also fixes MCP permission format: mcp__agent-relay__* (with wildcard) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Export ensureMcpPermissions from @agent-relay/bridge and call it in create-agent CLI to create ~/.claude/settings.local.json with permissions.allow: ["mcp__agent-relay__*"] This ensures MCP tools are auto-approved when agents are created, preventing the manual approval prompt from blocking initialization. Also fixes package-validation.yml to check for Daemon instead of removed RelayServer export. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Install MCP config to project .mcp.json (non-intrusive) - Set enableAllProjectMcpServers in ~/.claude/settings.local.json so Claude Code loads project-local MCP configs - Continue setting mcp__agent-relay__* permission for auto-approval 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When Claude Code starts the MCP server, the CWD may not be the project directory, causing discoverSocket() to fail. Now we explicitly set RELAY_SOCKET environment variable in .mcp.json so the MCP server can find the daemon regardless of what directory it's launched from. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| case 'INBOX': { | ||
| const inboxPayload = envelope.payload as InboxPayload; | ||
| const agentName = inboxPayload.agent || connection.agentName; | ||
|
|
||
| // Get messages from storage | ||
| const getInboxMessages = async () => { | ||
| if (!this.storage?.getMessages) { | ||
| return []; | ||
| } | ||
| try { | ||
| const messages = await this.storage.getMessages({ | ||
| to: agentName, | ||
| from: inboxPayload.from, | ||
| limit: inboxPayload.limit || 50, | ||
| unreadOnly: inboxPayload.unreadOnly, |
There was a problem hiding this comment.
🔴 INBOX handler can return all messages if payload.agent is missing/empty (privacy leak)
INBOX requests compute agentName as inboxPayload.agent || connection.agentName and then pass it to storage.getMessages({ to: agentName, ... }).
If a client sends a malformed INBOX frame with no payload.agent (or an empty string) and the connection has not completed HELLO (so connection.agentName is undefined), agentName becomes undefined/falsy and the query becomes effectively unscoped. In the in-memory adapter (and likely SQLite adapter) the to filter is only applied when query.to is truthy, so this can return all messages in storage.
Actual: unauthenticated / non-handshaken clients can potentially exfiltrate other agents’ messages.
Expected: INBOX should always be scoped to a specific agent and reject requests without a valid agent name.
Click to expand
Relevant code:
const inboxPayload = envelope.payload as InboxPayload;
const agentName = inboxPayload.agent || connection.agentName;
...
const messages = await this.storage.getMessages({
to: agentName,
...
});packages/daemon/src/server.ts:1153-1167
Recommendation: Validate inboxPayload.agent (non-empty string) and/or require a completed handshake for INBOX. If absent, send an ERROR response and do not query storage.
Was this helpful? React with 👍 or 👎 to provide feedback.
| case 'INBOX': { | ||
| const inboxPayload = envelope.payload as InboxPayload; | ||
| const agentName = inboxPayload.agent || connection.agentName; | ||
|
|
||
| // Get messages from storage | ||
| const getInboxMessages = async () => { | ||
| if (!this.storage?.getMessages) { | ||
| return []; | ||
| } | ||
| try { | ||
| const messages = await this.storage.getMessages({ | ||
| to: agentName, | ||
| from: inboxPayload.from, | ||
| limit: inboxPayload.limit || 50, | ||
| unreadOnly: inboxPayload.unreadOnly, |
There was a problem hiding this comment.
🟡 INBOX handler ignores requested channel filter
The InboxPayload includes a channel field, and the MCP README/tooling advertises channel filtering, but the daemon’s INBOX handler never applies it when querying storage.
Actual: relay_inbox(channel=...) (or protocol INBOX with channel) returns messages from all channels.
Expected: channel filter should be honored (either via storage query or post-filtering).
Click to expand
packages/daemon/src/server.ts:1153-1167 constructs the query:
const messages = await this.storage.getMessages({
to: agentName,
from: inboxPayload.from,
limit: inboxPayload.limit || 50,
unreadOnly: inboxPayload.unreadOnly,
});No channel is included, despite being present on InboxPayload.
Recommendation: Pass a channel/topic filter into getMessages if supported (e.g. topic), or filter returned rows by m.data.channel before responding.
Was this helpful? React with 👍 or 👎 to provide feedback.
Uses file-based protocol for reliable writes (send, spawn, release) and socket for read queries (status, inbox, who). This avoids the timeout issues seen with socket-only communication. Also adds: - relay_continuity tool for session save/load - file-transport.ts for file-based message protocol - hybrid-client.ts combining file + socket transports 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| @@ -768,7 +837,7 @@ export class AgentSpawner { | |||
| : mapModelToCli(); // defaults to claude:sonnet | |||
|
|
|||
| // Extract effective model name for logging | |||
There was a problem hiding this comment.
🟡 Default model logging inconsistency - logs 'opus' but actually uses 'sonnet'
The effectiveModel variable defaults to 'opus' but the actual CLI variant defaults to 'sonnet', causing misleading logs.
Click to expand
At packages/bridge/src/spawner.ts:839, the default model for logging was changed from 'sonnet' to 'opus':
const effectiveModel = modelFromProfile || 'opus';However, the actual model selection at line 835-837 still defaults to sonnet:
const cliVariant = modelFromProfile
? mapModelToCli(modelFromProfile)
: mapModelToCli(); // defaults to claude:sonnet (per comment)This causes a mismatch: when no model is specified in the agent profile, the code logs model=opus (line 848) but actually spawns the agent with claude:sonnet. This inconsistency could lead to:
- Incorrect cost tracking (opus is more expensive than sonnet)
- Confusion when debugging model-related issues
- Misleading audit logs
The effectiveModel should match what's actually used. Since mapModelToCli() defaults to sonnet, the logging should also default to 'sonnet'.
Recommendation: Change the default from 'opus' back to 'sonnet' to match the actual model used: const effectiveModel = modelFromProfile || 'sonnet';
Was this helpful? React with 👍 or 👎 to provide feedback.
The daemon now watches .agent-relay/outbox directories and processes files directly. This enables MCP tools to send messages via the file-based protocol without needing relay-pty triggers. Files are processed and routed through the normal router/spawn paths. This provides parity between MCP and relay-pty file-based agents. Also updates protocol types: - Add model field to SpawnPayload - Add reason field to ReleasePayload Note: This is a simpler implementation without ledger tracking. For full durability, can be upgraded to use RelayWatchdog later. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| async sendAndWait(to, message, opts = {}) { | ||
| const r = await request<{ from: string; body: string; thread?: string }>('SEND_AND_WAIT', { from: agentName, to, body: message, thread: opts.thread, timeoutMs: opts.timeoutMs || 30000 }); | ||
| return { from: r.from, content: r.body, thread: r.thread }; | ||
| // Use proper SEND with sync.blocking - daemon handles the wait and returns ACK | ||
| // from/to must be at envelope level, kind/body/thread in payload | ||
| const waitTimeout = opts.timeoutMs || 30000; | ||
| const correlationId = randomUUID(); | ||
| const r = await request<{ correlationId?: string; response?: string; from?: string }>('SEND', { | ||
| kind: 'message', | ||
| body: message, | ||
| thread: opts.thread, | ||
| }, waitTimeout + 5000, { | ||
| sync: { | ||
| blocking: true, | ||
| correlationId, | ||
| timeoutMs: waitTimeout, | ||
| }, | ||
| }, { from: agentName, to }); | ||
| return { from: r.from ?? to, content: r.response ?? '', thread: opts.thread }; | ||
| }, |
There was a problem hiding this comment.
🔴 MCP client sendAndWait now returns ACK status instead of a message reply (API contract change/break)
RelayClient.sendAndWait() used to call a dedicated SEND_AND_WAIT flow that returned {from, body} (a real reply). It now sends a blocking SEND and resolves with the daemon-forwarded ACK payload.
- Expected:
sendAndWait()returns the content of a reply message (or at least a well-defined response payload), consistent with its signature{ from, content }. - Actual: the code maps
AckPayload.response(an ACK status string) intocontent, defaulting to''if not present. In typical ACKs, there is no message body, so callers get empty or status-like content rather than a reply.
Code
packages/mcp/src/client.ts:323-340
const r = await request<{ correlationId?: string; response?: string; from?: string }>('SEND', {
kind: 'message',
body: message,
thread: opts.thread,
}, waitTimeout + 5000, {
sync: { blocking: true, correlationId, timeoutMs: waitTimeout },
}, { from: agentName, to });
return { from: r.from ?? to, content: r.response ?? '', thread: opts.thread };Impact: any consumers relying on sendAndWait() for a reply body will behave incorrectly (e.g., MCP tool relay_send(..., await_response=true)-style usage).
Recommendation: Either (a) restore a true request/response message flow (e.g., a SEND_AND_WAIT message type with a reply payload), or (b) rename/retype this method to reflect that it only waits for an ACK and returns an ACK status/metadata.
Was this helpful? React with 👍 or 👎 to provide feedback.
Three demos showcasing agents with competing priorities negotiating limited resources using the agent-relay system: - budget-negotiation.sh: Teams compete for $100K quarterly budget - server-capacity.sh: Services compete for 10 emergency servers - sprint-planning.sh: Stakeholders negotiate 50 story points Each demo creates a shared prompt file that agents read, then they join a channel to negotiate naturally with consensus voting. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| if (!isSpawned && isDaemon && spawner) { | ||
| const maxWaitMs = 3000; // Wait up to 3 seconds | ||
| const pollIntervalMs = 100; | ||
| const startTime = Date.now(); | ||
|
|
||
| while (Date.now() - startTime < maxWaitMs) { | ||
| await new Promise(resolve => setTimeout(resolve, pollIntervalMs)); | ||
| isSpawned = spawner.hasWorker(agentName); | ||
| if (isSpawned) { | ||
| console.log(`[dashboard] Agent ${agentName} appeared in spawner after ${Date.now() - startTime}ms`); | ||
| break; | ||
| } | ||
| // Check if WebSocket was closed during wait | ||
| if (ws.readyState !== WebSocket.OPEN) { | ||
| return false; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
🔴 Dashboard WebSocket handler blocks event loop with synchronous polling
The subscribeToAgent function performs busy-waiting with a while loop that blocks the WebSocket event loop for up to 3 seconds, preventing the connection from handling close/error events during that time.
Click to expand
How it gets triggered
At packages/dashboard-server/src/server.ts:2250-2267 and packages/dashboard/src/server.ts:2277-2294, when a WebSocket client subscribes to an agent that is daemon-connected but not yet in the spawner's activeWorkers:
if (!isSpawned && isDaemon && spawner) {
const maxWaitMs = 3000;
const pollIntervalMs = 100;
const startTime = Date.now();
while (Date.now() - startTime < maxWaitMs) {
await new Promise(resolve => setTimeout(resolve, pollIntervalMs));
isSpawned = spawner.hasWorker(agentName);
// ...
}
}This while loop runs in the WebSocket message handler, blocking that handler thread.
Actual vs Expected
Actual: WebSocket handler is blocked for up to 3 seconds, during which:
- Cannot process other WebSocket messages
- Cannot detect if the WebSocket was closed
- Holds up the event loop
Expected: Use event-based notification or move polling to a separate async task
Impact
During the 3-second polling window:
- Other WebSocket operations on that connection are delayed
- If the client closes the connection, it won't be detected until polling completes
- Can cause perceived lag in the dashboard UI
- Multiple concurrent subscriptions could stack up and significantly delay the event loop
Recommendation: Use event-based notification from spawner when agents become available, or move polling to a background task that doesn't block the WebSocket handler
Was this helpful? React with 👍 or 👎 to provide feedback.
| socket.on('connect', () => { | ||
| socket.write(encodeFrame(envelope)); | ||
| socket.end(); | ||
| resolve(); |
There was a problem hiding this comment.
🔴 fireAndForget resolves before write completes or on socket errors
The fireAndForget function resolves immediately when socket.end() is called, without waiting for the write operation to complete or handling post-end socket errors.
Click to expand
How it gets triggered
At packages/mcp/src/client.ts:218-221:
socket.on('connect', () => {
socket.write(encodeFrame(envelope));
socket.end();
resolve();
});The promise resolves immediately after calling socket.end(), but:
socket.write()is asynchronous - the data may not have been sent yetsocket.end()is also asynchronous - it initiates closing but doesn't wait for completion- Socket errors that occur after
end()are not caught (error handler only registered after promise resolves)
Actual vs Expected
Actual: Function returns success even if:
- Write buffer is full and message is dropped
- Socket errors occur after end() but before data is transmitted
- Network failure prevents message delivery
Expected: Wait for write to complete or handle write callback to ensure message was buffered
Impact
MCP tools (send, spawn, release) may report success when the message was never actually delivered to the daemon. This leads to:
- Silent message loss
- Agents thinking they sent a message when it never arrived
- Spawn/release operations failing silently
- Difficult to debug communication failures
Recommendation: Wait for write callback or use socket.once('finish', ...) to ensure data was buffered before resolving
Was this helpful? React with 👍 or 👎 to provide feedback.
| // Send fatal error before closing to prevent reconnection loop | ||
| const errorEnvelope: Envelope<ErrorPayload> = { | ||
| v: PROTOCOL_VERSION, | ||
| type: 'ERROR', | ||
| id: generateId(), | ||
| ts: Date.now(), | ||
| payload: { | ||
| code: 'DUPLICATE_CONNECTION', | ||
| message: `Another agent with name "${connection.agentName}" connected. This connection will be closed.`, | ||
| fatal: true, | ||
| }, | ||
| }; | ||
| existing.send(errorEnvelope); | ||
| existing.close(); |
There was a problem hiding this comment.
🟡 Router duplicate connection handling doesn't verify error send succeeded
When detecting duplicate agent connections, the router sends a DUPLICATE_CONNECTION fatal error to the old connection but doesn't verify the send succeeded before closing the connection.
Click to expand
How it gets triggered
At packages/daemon/src/router.ts:273-286, when a new agent connection registers with a name that already exists:
existing.send(errorEnvelope);
existing.close();The send() method returns a boolean indicating success, but the return value is ignored. If the old connection's write queue is full or the socket is already broken, the error won't be delivered.
Actual vs Expected
Actual: Old connection is closed immediately without verifying the fatal error was sent
Expected: Check if send() succeeded, or add a small delay to allow the error to be transmitted before closing
Impact
The old agent connection is closed without knowing why, which can cause:
- Unexpected disconnection without error message
- Agent attempts to reconnect, creating a reconnection loop
- Confusion in logs about why connection was terminated
- Poor user experience when debugging duplicate connection issues
Recommendation: Check the return value of existing.send() or add a small delay (e.g., setImmediate) before calling existing.close() to allow the error to be transmitted
Was this helpful? React with 👍 or 👎 to provide feedback.
Uh oh!
There was an error while loading. Please reload this page.