From 5921f58f19d9d562bc2dc46182e80c9d296230d6 Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Tue, 19 May 2026 16:53:01 -0700 Subject: [PATCH 1/9] docs(rfd): mid-turn input via session/inject (queue and steer) Lift discussion #1220 into the v2 RFD bucket per @benbrandt's offer. One method, two modes (queue and steer), one capability. Rides on the v2 prompt lifecycle (user_message echo, state_change, agent-owned messageId). Supersedes PR #484 with credit to @SteffenDE. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/docs.json | 1 + docs/rfds/v2/session-inject.mdx | 238 ++++++++++++++++++++++++++++++++ 2 files changed, 239 insertions(+) create mode 100644 docs/rfds/v2/session-inject.mdx diff --git a/docs/docs.json b/docs/docs.json index dc0fc8d2..c097177f 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -148,6 +148,7 @@ "root": "rfds/v2/overview", "pages": [ "rfds/v2/prompt", + "rfds/v2/session-inject", "rfds/message-id", "rfds/streamable-http-websocket-transport" ] diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx new file mode 100644 index 00000000..532ed3b9 --- /dev/null +++ b/docs/rfds/v2/session-inject.mdx @@ -0,0 +1,238 @@ +--- +title: "Mid-turn input: queue and steer" +--- + +Author(s): [@kennethsinder](https://github.com/kennethsinder) + +## Elevator pitch + +> What are you proposing to change? + +Add a single `session/inject` method, with a `mode` of `queue` or `steer`, so clients can hand the agent another user-role message without ending the current turn. The agent advertises which modes it supports via a capability. Delivery is echoed as a `user_message` notification so multi-client observers and `session/load` replay see one consistent history. + +This is a v2 follow-on. 
It rides on the [v2 prompt lifecycle](./prompt) (short-lived `session/prompt`, agent-owned `messageId`, `state_change` notifications) and replaces the standalone [prompt queueing RFD (#484)](https://github.com/agentclientprotocol/agent-client-protocol/pull/484), which predates that substrate and stops at the capability bit. + +## Status quo + +> How do things work today and what problems does this cause? Why would we change things? + +Coding-agent UIs have converged on two adjacent input affordances on top of "send a prompt and wait": + +- **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade — type while Cascade is working, press Enter, it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), and with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). +- **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. Codex CLI has the most fleshed-out implementation, opt-in via `/experimental`. The Codex code base has had to evolve its own pending-steers buffer, requeue-on-turn-complete, pause-on-usage-limits, and a replay edge-case surface. 
See PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815). The original Codex request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame this as "queued corrections while the current turn is still running" — the same problem. Gemini CLI has an open proposal to add the same shape under a different name ([`/inject` for asynchronous mid-stream steering, #17197](https://github.com/google-gemini/gemini-cli/issues/17197)). + +The "queue vs. steer is two distinct things" framing has surfaced explicitly across ecosystems: see `langchain-ai/deepagents` [#1390 "Disambiguate steer/queue"](https://github.com/langchain-ai/deepagents/issues/1390) and `anomalyco/opencode` [#21388 "Allow messages to be sent mid-turn — interrupt, queue, or inject"](https://github.com/anomalyco/opencode/issues/21388). The convergence is visible, but each tool is reinventing the semantics from scratch. + +ACP today has one mid-turn lever: `session/cancel` followed by a fresh `session/prompt`. That discards in-flight work and forces tool calls to surface as cancelled even when they were milliseconds from completing. There is no way to express "let this tool call finish, then add this." There is no way to express "buffer this for after the turn." The `stopReason` enum has no value for a turn interrupted because a new message landed. + +The v2 prompt lifecycle RFD already flags this gap in its motivation: + +> Queueing messages (still an ongoing discussion) would also fit much nicer in this pattern... potentially the agent could decide whether it cancels the current turn and inserts it immediately, or inserts it at the next convenient break point. 
+ +The substrate that's missing isn't queueing alone. It's the wire shape that distinguishes "deliver soon" from "deliver later," that gives each delivery a stable handle for ack and revocation, and that echoes the delivery into the session history so multi-client and replay stay coherent. That substrate is the v2 prompt lifecycle. This RFD adds the call sites on top. + +PR [#484](https://github.com/agentclientprotocol/agent-client-protocol/pull/484) (open since February 2026) takes a thinner cut: a `promptQueueing` capability plus an `end_turn` early-finish signal, with prompts queued via a parallel `session/prompt` call. It doesn't address steer, doesn't define delivery semantics, and predates the v2 prompt lifecycle's `state_change` and `user_message` notifications. The author flags in a follow-up comment that the Claude Agent SDK had no public hook for "when was my queued message inserted," which is precisely what `user_message` echoing solves at the protocol level. The proposal below is intended to supersede #484, with credit to [@SteffenDE](https://github.com/SteffenDE) for raising the question first. + +## What we propose to do about it + +> What are you proposing to improve the situation? + +One method, two modes, one capability. + +### The method + +```jsonc +{ + "jsonrpc": "2.0", + "id": 42, + "method": "session/inject", + "params": { + "sessionId": "sess_abc", + "mode": "steer", + "content": [ + { + "type": "text", + "text": "stop using the legacy auth path, see auth_v2.go", + }, + ], + }, +} +``` + +The agent responds when the inject has been **accepted into the session**, not when the model has processed it. The response carries the agent-assigned `messageId`, matching the v2 prompt lifecycle pattern: + +```jsonc +{ + "jsonrpc": "2.0", + "id": 42, + "result": { "messageId": "msg_inj_001" }, +} +``` + +At delivery (which happens later for both modes), the agent emits a `user_message` session update with the same `messageId`. 
This is the same notification the v2 prompt lifecycle defines for `session/prompt`, so clients, observers attached via [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), and `session/load` replay all see the message land in the same shape regardless of whether it arrived as a prompt or an inject. + +### The capability + +```jsonc +{ + "agentCapabilities": { + "session": { + "inject": { "modes": ["queue", "steer"] }, + }, + }, +} +``` + +An agent that only supports buffering at end-of-turn declares `["queue"]`. A tool-loop agent that can break safely between tool calls declares `["queue", "steer"]`. A streaming-only agent that can't do either declares the absence of the capability and clients fall back to `session/cancel`. + +### Semantics + +**`queue`** is the simple one. The agent buffers the content. The agent delivers it as a `user_message` notification once `state_change: idle` fires for the current turn. FIFO across multiple queued injects. If a queue lands on an already-idle session, the agent treats it as a normal user message and starts a turn, though clients should prefer `session/prompt` in that case. + +**`steer`** is the interesting one. The agent delivers the content at the next safe break-point during the running turn: + +- **Mid tool call**: complete the tool call, then deliver before the next LLM call. +- **Mid LLM stream with no tool call pending**: agent-defined. The agent may either interrupt the stream and re-prompt with the steer text included, or let the stream finish and deliver before the next iteration. The choice is a quality-of-implementation concern; both behaviors are spec-conformant. Clients should not assume one or the other. +- **Mid model-process pause** (provider rate limit, usage-limit cap): hold pending steers until the pause clears. This mirrors the shape Codex landed on in [PR #22226](https://github.com/openai/codex/pull/22226). 
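As a rough illustration of the break-point rules above, the sketch below shows one way an agent loop might pick which pending injects to deliver at a given point. Everything in it is hypothetical: `PendingInject`, `BreakPoint`, and `takeDeliverable` are not part of the proposed schema, `text` stands in for `ContentBlock[]`, and flushing leftover steers ahead of queued messages at idle is an assumption rather than something this RFD specifies.

```typescript
// Hypothetical sketch: none of these names come from the proposed schema.
type InjectMode = "queue" | "steer";

interface PendingInject {
  messageId: string;
  mode: InjectMode;
  text: string; // simplified stand-in for ContentBlock[]
}

// Points at which an agent loop might look at its pending injects.
type BreakPoint = "after_tool_call" | "paused" | "idle";

// Removes and returns the injects deliverable at `point`, preserving
// acceptance order within each mode. Mutates `pending` in place.
function takeDeliverable(pending: PendingInject[], point: BreakPoint): PendingInject[] {
  // Rate-limit or usage-limit pause: hold everything until it clears.
  if (point === "paused") return [];

  // Steers go out at any break-point; queued messages wait for idle.
  const steers = pending.filter((p) => p.mode === "steer");
  const queued = point === "idle" ? pending.filter((p) => p.mode === "queue") : [];
  const deliverable = [...steers, ...queued];

  // Keep only the injects that were not taken.
  const taken = new Set(deliverable.map((p) => p.messageId));
  const rest = pending.filter((p) => !taken.has(p.messageId));
  pending.length = 0;
  pending.push(...rest);
  return deliverable;
}

const pending: PendingInject[] = [
  { messageId: "msg_1", mode: "queue", text: "also run the tests when done" },
  { messageId: "msg_2", mode: "steer", text: "skip the migration" },
  { messageId: "msg_3", mode: "queue", text: "then update the changelog" },
];

const heldDuringPause = takeDeliverable(pending, "paused").map((p) => p.messageId);
const atBreak = takeDeliverable(pending, "after_tool_call").map((p) => p.messageId);
const atIdle = takeDeliverable(pending, "idle").map((p) => p.messageId);
```

In this run the steer (`msg_2`) goes out at the tool-call break-point even though it was accepted after `msg_1`, and the two queued messages follow at idle in acceptance order. An agent could just as well deliver one queued message per turn; the text above only asks for FIFO order among queued injects.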
+ +Both modes are fire-and-forget in the sense that the response confirms acceptance, not model-process. Both deliveries are observable via the same `user_message` notification. + +## Shiny future + +> How will things play out once this feature exists? + +A user typing into a Zed prompt box during a long agent turn no longer has to choose between "wait for the turn to finish" and "blow it up with cancel." If they want to add context for the next iteration, they queue. If they want to redirect the agent now without losing the tool call it just kicked off, they steer. + +A dashboard client attached to three concurrent sessions can route a steer into whichever session needs course-correcting without disturbing the other two. The `user_message` echo means every observer sees the steer land at the same point in history, so the side-by-side panels stay consistent. + +A `session/load` replay reconstructs the same transcript regardless of how messages originally arrived. The replay doesn't need to know that message N was a steer that interrupted a tool call vs. a prompt that started a turn. It sees `user_message`, `agent_message`, `tool_call`, `state_change` in delivery order, and renders. + +For agents, this also means the existing `cancelled` stop reason starts carrying a clearer signal. Today, "cancelled" can mean either "user gave up" or "user wanted to redirect." Once steer exists, "cancelled" goes back to meaning "user gave up." Steer-interrupted turns are not cancelled. They end normally and the steer becomes the next user message. + +## Implementation details and plan + +> Tell me more about your implementation. What is your detailed implementation plan? + +### Schema additions + +In `schema/schema.json`: + +1. Add `SessionInjectMode` enum: `"queue" | "steer"`. +2. Add `SessionInjectRequest`: + - `sessionId: SessionId` + - `mode: SessionInjectMode` + - `content: ContentBlock[]` (same shape as `PromptRequest.prompt`) +3.
Add `SessionInjectResponse`: + - `messageId: string` (UUID, agent-owned) +4. Add `agentCapabilities.session.inject`: + - `modes: SessionInjectMode[]` +5. Register `session/inject` as a request method. + +No changes to `session/update` or to existing notification types. The `user_message` notification introduced by the v2 prompt lifecycle is what carries the delivered content, with the same `messageId` that was returned from the inject response. + +### Revocation + +A queued message that hasn't been delivered yet should be cancellable. The natural primitive is the one the [request cancellation RFD](./request-cancellation) just added: `$/cancel_request` on the original `session/inject` request id. + +```jsonc +{ + "jsonrpc": "2.0", + "method": "$/cancel_request", + "params": { "requestId": 42 }, +} +``` + +If the inject hasn't been delivered, the agent drops it and responds to the original request with `-32800 Request cancelled`. If delivery already happened (the `user_message` notification has fired or is about to fire), the cancel is a no-op; the message is in the session. There is no separate "revoke a delivered message" surface; that's [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory. + +Steered messages that have already been handed to the LLM call are not revocable from the protocol's point of view. They're already in the model's input. + +### Persistence and replay + +Injected messages are part of session history. They appear in `session/load` replay as `user_message` notifications, ordered by acceptance time (which equals delivery time for queue+steer because acceptance and delivery are the same moment from the session's perspective, even though the response went out earlier). + +This means a session loaded from disk doesn't carry "this was a steer" provenance. That's deliberate. Once delivered, an injected message is just a user message in the transcript. 
The modality is observable in real time via the gap between response and `user_message`, not as a persistent attribute. + +### Multi-client interaction + +With [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), multiple controllers can inject into the same session. Ordering across controllers is by arrival time at the agent (FIFO from each controller; interleaved across controllers). Observers cannot inject. + +Every controller and observer sees the same `user_message` notification at delivery time, so transcripts converge. + +### Interaction with other in-flight requests + +- **`session/cancel` during a pending steer.** Codex [#22815](https://github.com/openai/codex/issues/22815) is a real bug surface here, and we should resolve it in the spec rather than leave it implicit. Recommended behavior: cancel applies to the in-flight turn only. Pending injects (both queue and steer) survive `session/cancel` and deliver as normal once the agent reaches idle. Clients that want to clear everything can revoke the injects via `$/cancel_request` first, then `session/cancel`. Agents should document if they deviate. +- **`session/request_permission` blocking the turn.** While the agent is awaiting a permission decision, the turn is paused but not idle. Queue accumulates as normal. Steer is held until the permission resolves, then delivered at the next break-point (which may be immediately if the permission decision is the break-point). Agents that prefer to drop steers during permission waits may do so, capability-advertised. +- **Elicitation in flight.** [Elicitation](./elicitation) is a separate request from the agent to the client. An inject is not an answer to an elicitation; they're orthogonal channels. Agents must not satisfy a pending `elicitation/create` from inject content. + +### Sub-agents + +Inject lands on the root session (the one the client knows about). 
If the agent has spawned sub-agents or delegated tool calls to nested workers, propagation is the parent agent's choice. The protocol stays out of sub-agent topology. + +### `stopReason` + +No new variant. A steer-interrupted LLM call that stops cleanly is `end_turn`. A steer that arrived while the user had also pressed cancel is `cancelled`. Existing semantics cover both. + +### Versioning + +This is a v2-only addition. It depends on the v2 prompt lifecycle's `user_message` notification and `state_change` semantics, neither of which exists in v1. v1 clients that want a forward-compatible story keep using `session/cancel` + new `session/prompt`. There is no proposed v1 backport. + +### Phased rollout + +1. Land schema and Rust SDK behind the `unstable` feature flag on v2. +2. Reference implementation in the v2 example agent: queue at minimum, steer if the example agent can demonstrate tool-call break-points cleanly. +3. Document in `docs/protocol/`. +4. Stabilize once at least two agent implementations and one client implementation have exercised both modes in real workflows. + +## Frequently asked questions + +> What questions have arisen over the course of authoring this document or during subsequent discussions? + +### What alternative approaches did you consider, and why did you settle on this one? + +**Two separate methods (`session/queue` and `session/steer`).** Cleaner type signatures, but every implementer has to wire two methods that share most of their plumbing. The `mode` parameter is more honest about the actual factoring: same delivery channel, same echo notification, different break-point policy. + +**Roll queue into the v2 prompt lifecycle RFD itself.** Tempting, because the prompt lifecycle already mentions queueing as future work. 
Rejected because steer is the harder design problem and deserves its own discussion, and pinning both to the v2 prompt lifecycle RFD would either bloat that RFD or commit to a design before the steer semantics are settled. + +**Parallel `session/prompt` calls (the PR #484 approach).** Reuses the existing method but conflates "start a turn" with "add to a turn." It also can't express steer cleanly, because a parallel `session/prompt` already implies its own turn boundary. Splitting inject out keeps `session/prompt` meaning "start a turn." + +**`session/prompt` with a `priority` field.** Same problem as parallel `session/prompt`: overloads a method whose semantics are about turn ownership, not insertion policy. + +### Why one method with a mode parameter instead of one method that always queues, where the agent decides whether to steer? + +Because the agent doesn't know the user's intent. A user typing "actually skip the migration, do it later" expects steer. A user typing "also, when you're done, run the tests" expects queue. The mode is the user's choice expressed by the client. The agent only chooses whether it supports each mode. + +### What if a client requests `steer` from an agent that only supports `queue`? + +The agent returns an error. Clients should check the capability before offering steer in the UI. Agents may, as a quality-of-implementation choice, downgrade to `queue` and return success, but the spec recommends erroring so clients don't silently lose the user's stated intent. + +### How does this interact with the [turn-complete signal RFD (#644)](https://github.com/agentclientprotocol/agent-client-protocol/pull/644)? + +The turn-complete signal gives clients a deterministic barrier for "all updates for this turn have been delivered." `queue` mode delivers after that barrier. Either the `state_change: idle` notification or the explicit `turn_complete` signal works as the trigger; they're the same point in time. 
The two RFDs are complementary, not competing. + +### How does this differ from [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214)? + +Rewind is destructive: it truncates history and changes what the agent sees as context. Inject is additive: it adds a new user message without touching what came before. They're siblings on the "mid-conversation user intervention" axis: inject for "add this," rewind for "undo that." + +### What about non-text content? Images, file references, resource links? + +`content` is a full `ContentBlock[]`, identical to `PromptRequest.prompt`. Anything you can send in a prompt, you can send in an inject. Whether the agent does something useful with an image steered mid-tool-call is up to the agent. + +### Is this related to discussion [#1224 `session/remind`](https://github.com/orgs/agentclientprotocol/discussions/1224)? + +Sibling but separate. `session/inject` carries user-role content; the agent treats it as the user speaking. `session/remind` (proposed separately) carries system-role context that the host wants the agent to see without faking a user turn. They could share the underlying break-point machinery, but the role contract differs at call sites and conflating them muddies what the agent sees in its message list. + +### Why fire-and-forget on acceptance instead of waiting for delivery? + +Because delivery for `queue` can be arbitrarily delayed (a long-running turn might mean the queue waits minutes), and forcing the JSON-RPC request to stay open for that duration is wasteful and brittle on stateful transports. The `messageId` returned on acceptance is the handle clients need; the `user_message` notification at delivery time is the signal. + +### Should the agent be allowed to drop an injected message? + +Only via revocation. Once accepted (the response has gone out with a `messageId`), the agent commits to delivering. 
If the agent crashes mid-turn and loses the queue, that's a durability bug, not a protocol-allowed behavior. + +### What about ordering when a queue inject lands after a steer inject during the same turn? + +Steer delivers at the next break-point; queue delivers at idle. The steer arrives first in the model's eyes even though both injects might have been accepted in the opposite order. This is intentional: mode determines delivery time, arrival order only breaks ties within the same mode. + +## Revision history + +- 2026-05-19: Initial draft, lifted from [discussion #1220](https://github.com/orgs/agentclientprotocol/discussions/1220) with two changes: explicit acknowledgment of and supersession over PR #484, and tightened `messageId` ownership to match the v2 prompt lifecycle (agent-owned, returned in the inject response). From 1adea7379e62c8ee3c0f427b99027dee6fb8fccc Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Wed, 20 May 2026 07:12:14 -0700 Subject: [PATCH 2/9] docs(rfd): clarify pending inject edits --- docs/rfds/v2/session-inject.mdx | 112 +++++++++++++++++++++++++++----- 1 file changed, 96 insertions(+), 16 deletions(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index 532ed3b9..478591bf 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -8,7 +8,7 @@ Author(s): [@kennethsinder](https://github.com/kennethsinder) > What are you proposing to change? -Add a single `session/inject` method, with a `mode` of `queue` or `steer`, so clients can hand the agent another user-role message without ending the current turn. The agent advertises which modes it supports via a capability. Delivery is echoed as a `user_message` notification so multi-client observers and `session/load` replay see one consistent history. +Add a single `session/inject` method, with a `mode` of `queue` or `steer`, so clients can hand the agent another user-role message without ending the current turn. 
The agent advertises which modes it supports via a capability. Delivery is echoed as a `user_message` notification so multi-client observers and `session/load` replay see one consistent history. Before delivery, the returned `messageId` is also the handle clients use to revoke the pending message, and optionally replace its content. This is a v2 follow-on. It rides on the [v2 prompt lifecycle](./prompt) (short-lived `session/prompt`, agent-owned `messageId`, `state_change` notifications) and replaces the standalone [prompt queueing RFD (#484)](https://github.com/agentclientprotocol/agent-client-protocol/pull/484), which predates that substrate and stops at the capability bit. @@ -31,7 +31,7 @@ The v2 prompt lifecycle RFD already flags this gap in its motivation: The substrate that's missing isn't queueing alone. It's the wire shape that distinguishes "deliver soon" from "deliver later," that gives each delivery a stable handle for ack and revocation, and that echoes the delivery into the session history so multi-client and replay stay coherent. That substrate is the v2 prompt lifecycle. This RFD adds the call sites on top. -PR [#484](https://github.com/agentclientprotocol/agent-client-protocol/pull/484) (open since February 2026) takes a thinner cut: a `promptQueueing` capability plus an `end_turn` early-finish signal, with prompts queued via a parallel `session/prompt` call. It doesn't address steer, doesn't define delivery semantics, and predates the v2 prompt lifecycle's `state_change` and `user_message` notifications. The author flags in a follow-up comment that the Claude Agent SDK had no public hook for "when was my queued message inserted," which is precisely what `user_message` echoing solves at the protocol level. The proposal below is intended to supersede #484, with credit to [@SteffenDE](https://github.com/SteffenDE) for raising the question first. 
+PR [#484](https://github.com/agentclientprotocol/agent-client-protocol/pull/484) (open since February 2026) takes a thinner cut: a `promptQueueing` capability plus an `end_turn` early-finish signal, with prompts queued via a parallel `session/prompt` call. It doesn't address steer, doesn't define delivery semantics, and predates the v2 prompt lifecycle's `state_change` and `user_message` notifications. The follow-up thread also asks how clients edit a queued message. This RFD answers that directly: editing is a pending-inject operation before `user_message`, not a transcript rewrite after delivery. The author also flags that the Claude Agent SDK had no public hook for "when was my queued message inserted," which is precisely what `user_message` echoing solves at the protocol level. The proposal below is intended to supersede #484, with credit to [@SteffenDE](https://github.com/SteffenDE) for raising the question first. ## What we propose to do about it @@ -59,7 +59,7 @@ One method, two modes, one capability. } ``` -The agent responds when the inject has been **accepted into the session**, not when the model has processed it. The response carries the agent-assigned `messageId`, matching the v2 prompt lifecycle pattern: +The agent responds when the inject has been accepted for pending delivery, not when the model has processed it. The response carries the agent-assigned `messageId`, matching the v2 prompt lifecycle pattern: ```jsonc { @@ -71,13 +71,18 @@ The agent responds when the inject has been **accepted into the session**, not w At delivery (which happens later for both modes), the agent emits a `user_message` session update with the same `messageId`. 
This is the same notification the v2 prompt lifecycle defines for `session/prompt`, so clients, observers attached via [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), and `session/load` replay all see the message land in the same shape regardless of whether it arrived as a prompt or an inject. +Until that `user_message` notification is emitted, the message is pending. Pending messages are not transcript history yet. They can be revoked by `messageId`, and agents may also support replacing their content while preserving the original mode and queue position. + ### The capability ```jsonc { "agentCapabilities": { "session": { - "inject": { "modes": ["queue", "steer"] }, + "inject": { + "modes": ["queue", "steer"], + "pending": { "replace": true }, + }, }, }, } @@ -85,6 +90,8 @@ At delivery (which happens later for both modes), the agent emits a `user_messag An agent that only supports buffering at end-of-turn declares `["queue"]`. A tool-loop agent that can break safely between tool calls declares `["queue", "steer"]`. A streaming-only agent that can't do either declares the absence of the capability and clients fall back to `session/cancel`. +Revoke is part of the pending-inject contract. `pending.replace` is optional. Clients should only offer edit-in-place for already-sent pending messages when it is advertised; otherwise they can keep drafts client-side longer or revoke and send a new inject when losing queue position is acceptable. + ### Semantics **`queue`** is the simple one. The agent buffers the content. The agent delivers it as a `user_message` notification once `state_change: idle` fires for the current turn. FIFO across multiple queued injects. If a queue lands on an already-idle session, the agent treats it as a normal user message and starts a turn, though clients should prefer `session/prompt` in that case. @@ -95,7 +102,7 @@ An agent that only supports buffering at end-of-turn declares `["queue"]`. 
A too - **Mid LLM stream with no tool call pending**: agent-defined. The agent may either interrupt the stream and re-prompt with the steer text included, or let the stream finish and deliver before the next iteration. The choice is a quality-of-implementation concern; both behaviors are spec-conformant. Clients should not assume one or the other. - **Mid model-process pause** (provider rate limit, usage-limit cap): hold pending steers until the pause clears. This mirrors the shape Codex landed on in [PR #22226](https://github.com/openai/codex/pull/22226). -Both modes are fire-and-forget in the sense that the response confirms acceptance, not model-process. Both deliveries are observable via the same `user_message` notification. +Both modes are fire-and-forget in the sense that the response confirms acceptance, not model-process. Both deliveries are observable via the same `user_message` notification. While an inject is pending, the `messageId` returned from `session/inject` is the only stable handle clients should use for pending operations. ## Shiny future @@ -126,41 +133,107 @@ In `schema/schema.json`: - `messageId: string` (UUID, agent-owned) 4. Add `agentCapabilities.session.inject`: - `modes: SessionInjectMode[]` -5. Register `session/inject` as a request method. + - `pending.replace: boolean` (optional; default `false`) +5. Add `SessionRevokeInjectRequest`: + - `sessionId: SessionId` + - `messageId: string` +6. Add `SessionRevokeInjectResponse`: empty object. +7. Add `SessionReplaceInjectRequest`: + - `sessionId: SessionId` + - `messageId: string` + - `content: ContentBlock[]` +8. Add `SessionReplaceInjectResponse`: + - `messageId: string` (same value, returned for symmetry with `session/inject`) +9. Register `session/inject`, `session/revoke_inject`, and `session/replace_inject` as request methods. No changes to `session/update` or to existing notification types. 
The `user_message` notification introduced by the v2 prompt lifecycle is what carries the delivered content, with the same `messageId` that was returned from the inject response. -### Revocation +### Pending injects: revoke and replace -A queued message that hasn't been delivered yet should be cancellable. The natural primitive is the one the [request cancellation RFD](./request-cancellation) just added: `$/cancel_request` on the original `session/inject` request id. +A pending inject can be revoked by `messageId`: ```jsonc { "jsonrpc": "2.0", - "method": "$/cancel_request", - "params": { "requestId": 42 }, + "id": 43, + "method": "session/revoke_inject", + "params": { + "sessionId": "sess_abc", + "messageId": "msg_inj_001", + }, } ``` -If the inject hasn't been delivered, the agent drops it and responds to the original request with `-32800 Request cancelled`. If delivery already happened (the `user_message` notification has fired or is about to fire), the cancel is a no-op; the message is in the session. There is no separate "revoke a delivered message" surface; that's [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory. +A successful response means the agent will not emit a future `user_message` for that `messageId`: -Steered messages that have already been handed to the LLM call are not revocable from the protocol's point of view. They're already in the model's input. +```jsonc +{ + "jsonrpc": "2.0", + "id": 43, + "result": {}, +} +``` + +Revoke is ordered against delivery at the agent: + +- If revoke wins, the pending inject is dropped. No later `user_message` is emitted for that `messageId`. +- If delivery wins, the agent returns an error. The recommended shape is `-32602 Invalid params` with `error.data.reason: "already_delivered"` and the `messageId`. The client should keep the delivered message in the transcript. 
+- If the `messageId` is unknown for that session, return `-32602 Invalid params` with `error.data.reason: "unknown_message_id"`. +- If the message was already revoked and the agent still remembers the tombstone, return success. This makes revoke safe to retry after transport loss. + +Normal session errors still apply. If the session is unknown, closed, or no longer accepts input, return the same error the agent uses for other session requests. + +A steer that has been emitted as `user_message`, or otherwise committed to the next model input, is already delivered from the protocol's point of view. It is not revocable through `session/revoke_inject`. There is no separate "revoke a delivered message" surface; that is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory. + +Agents may also support replacing pending content. Replacement keeps the same `messageId`, mode, and queue position; only `content` changes. The eventual `user_message` carries the final content, not the edit history. + +```jsonc +{ + "jsonrpc": "2.0", + "id": 44, + "method": "session/replace_inject", + "params": { + "sessionId": "sess_abc", + "messageId": "msg_inj_001", + "content": [ + { + "type": "text", + "text": "actually use auth_v3.go", + }, + ], + }, +} +``` + +```jsonc +{ + "jsonrpc": "2.0", + "id": 44, + "result": { "messageId": "msg_inj_001" }, +} +``` + +The same failure cases apply to replace: `already_delivered`, `unknown_message_id`, and unsupported capability. If `pending.replace` was not advertised, clients should not call the method; agents may still return `Method not found` or `-32602 Invalid params` with `error.data.reason: "replace_not_supported"`. If replace races with delivery, the agent either delivers the old content and returns `already_delivered`, or accepts the replacement and later emits `user_message` with the new content. It must not report success and then deliver the old content. 
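The revoke and replace outcomes above can be sketched as a small in-memory store. This is a minimal illustration, not the RFD's implementation: the class `PendingInjects`, its methods, the `Outcome` type, and the `msg_inj_N` id format are all hypothetical. Only the `-32602` code and the `reason` strings come from the text. It also assumes that replacing an already-revoked inject reports `unknown_message_id`, which the text does not specify.

```typescript
// Illustrative only: every name here is hypothetical except the error code
// and the `reason` strings quoted from the text above.
type InjectMode = "queue" | "steer";
type ContentBlock = { type: "text"; text: string };
type Reason = "already_delivered" | "unknown_message_id" | "replace_not_supported";

type Outcome =
  | { ok: true; messageId: string }
  | { ok: false; code: -32602; reason: Reason; messageId: string };

interface InjectRecord {
  mode: InjectMode;
  content: ContentBlock[];
  state: "pending" | "delivered" | "revoked";
}

class PendingInjects {
  private records = new Map<string, InjectRecord>();
  private nextId = 1;
  private supportsReplace: boolean;

  constructor(supportsReplace: boolean) {
    this.supportsReplace = supportsReplace;
  }

  // session/inject: accept the content and hand back the agent-owned handle.
  accept(mode: InjectMode, content: ContentBlock[]): string {
    const messageId = `msg_inj_${this.nextId++}`;
    this.records.set(messageId, { mode, content, state: "pending" });
    return messageId;
  }

  // Delivery point: returns the content to emit as `user_message`, or
  // undefined when the inject is unknown, revoked, or already delivered.
  deliver(messageId: string): ContentBlock[] | undefined {
    const rec = this.records.get(messageId);
    if (!rec || rec.state !== "pending") return undefined;
    rec.state = "delivered";
    return rec.content;
  }

  // session/revoke_inject: whichever of revoke or deliver runs first wins.
  revoke(messageId: string): Outcome {
    const rec = this.records.get(messageId);
    if (!rec) return this.fail("unknown_message_id", messageId);
    if (rec.state === "delivered") return this.fail("already_delivered", messageId);
    rec.state = "revoked"; // tombstone: a retried revoke still succeeds
    return { ok: true, messageId };
  }

  // session/replace_inject: same messageId, mode, and position; only content changes.
  replace(messageId: string, content: ContentBlock[]): Outcome {
    if (!this.supportsReplace) return this.fail("replace_not_supported", messageId);
    const rec = this.records.get(messageId);
    // Assumption: a revoked inject is treated as unknown here; the text does not say.
    if (!rec || rec.state === "revoked") return this.fail("unknown_message_id", messageId);
    if (rec.state === "delivered") return this.fail("already_delivered", messageId);
    rec.content = content;
    return { ok: true, messageId };
  }

  private fail(reason: Reason, messageId: string): Outcome {
    return { ok: false, code: -32602, reason, messageId };
  }
}
```

Because each method mutates the record synchronously, a successful replace is always visible to a later `deliver`, so the sketch cannot report success and then deliver the old content. A real agent would need the same ordering guarantee across whatever concurrency it actually has.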
+ +`$/cancel_request` still applies to an in-flight `session/inject` request before the accept response is sent. Once `session/inject` has returned a `messageId`, request cancellation is no longer the right tool; the request is complete, and the message is managed by `session/revoke_inject` or `session/replace_inject`. ### Persistence and replay -Injected messages are part of session history. They appear in `session/load` replay as `user_message` notifications, ordered by acceptance time (which equals delivery time for queue+steer because acceptance and delivery are the same moment from the session's perspective, even though the response went out earlier). +Delivered injected messages are part of session history. They appear in `session/load` replay as `user_message` notifications, ordered by delivery time. + +Pending messages are not replayed as history. Revoked messages are not replayed. Replaced messages replay once, as the final `user_message` content that was actually delivered. This means a session loaded from disk doesn't carry "this was a steer" provenance. That's deliberate. Once delivered, an injected message is just a user message in the transcript. The modality is observable in real time via the gap between response and `user_message`, not as a persistent attribute. ### Multi-client interaction -With [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), multiple controllers can inject into the same session. Ordering across controllers is by arrival time at the agent (FIFO from each controller; interleaved across controllers). Observers cannot inject. +With [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), multiple controllers can inject into the same session. Ordering across controllers is by arrival time at the agent (FIFO from each controller; interleaved across controllers). Observers cannot inject, revoke, or replace. 
-Every controller and observer sees the same `user_message` notification at delivery time, so transcripts converge. +Every controller and observer sees the same `user_message` notification at delivery time, so transcripts converge. A revoked inject has no transcript update. If an agent restricts revocation or replacement to the controller that created the pending inject, it should reject other controllers with a permission error instead of silently doing nothing. ### Interaction with other in-flight requests -- **`session/cancel` during a pending steer.** Codex [#22815](https://github.com/openai/codex/issues/22815) is a real bug surface here, and we should resolve it in the spec rather than leave it implicit. Recommended behavior: cancel applies to the in-flight turn only. Pending injects (both queue and steer) survive `session/cancel` and deliver as normal once the agent reaches idle. Clients that want to clear everything can revoke the injects via `$/cancel_request` first, then `session/cancel`. Agents should document if they deviate. +- **`session/cancel` during a pending steer.** Codex [#22815](https://github.com/openai/codex/issues/22815) is a real bug surface here, and we should resolve it in the spec rather than leave it implicit. Recommended behavior: cancel applies to the in-flight turn only. Pending injects (both queue and steer) survive `session/cancel` and deliver as normal once the agent reaches idle. Clients that want to clear everything can call `session/revoke_inject` for pending injects first, then `session/cancel`. Agents should document if they deviate. - **`session/request_permission` blocking the turn.** While the agent is awaiting a permission decision, the turn is paused but not idle. Queue accumulates as normal. Steer is held until the permission resolves, then delivered at the next break-point (which may be immediately if the permission decision is the break-point). 
Agents that prefer to drop steers during permission waits may do so, capability-advertised. - **Elicitation in flight.** [Elicitation](./elicitation) is a separate request from the agent to the client. An inject is not an answer to an elicitation; they're orthogonal channels. Agents must not satisfy a pending `elicitation/create` from inject content. @@ -225,6 +298,12 @@ Sibling but separate. `session/inject` carries user-role content; the agent trea Because delivery for `queue` can be arbitrarily delayed (a long-running turn might mean the queue waits minutes), and forcing the JSON-RPC request to stay open for that duration is wasteful and brittle on stateful transports. The `messageId` returned on acceptance is the handle clients need; the `user_message` notification at delivery time is the signal. +### Can a queued message be edited? + +Yes, while it is pending. If the agent advertises `pending.replace`, clients can call `session/replace_inject` and keep the same queue position. If replacement is not advertised, clients can still keep unsent drafts editable on their side, or revoke and send a new inject when appending at the end is acceptable. + +After `user_message` fires, the message is transcript history. Editing or removing it is a rewind or fork operation, not an inject operation. + ### Should the agent be allowed to drop an injected message? Only via revocation. Once accepted (the response has gone out with a `messageId`), the agent commits to delivering. If the agent crashes mid-turn and loses the queue, that's a durability bug, not a protocol-allowed behavior. @@ -235,4 +314,5 @@ Steer delivers at the next break-point; queue delivers at idle. The steer arrive ## Revision history +- 2026-05-20: Replaced request-cancellation-based revocation with `messageId`-based pending revoke/replace semantics, including race outcomes for already-delivered messages. 
- 2026-05-19: Initial draft, lifted from [discussion #1220](https://github.com/orgs/agentclientprotocol/discussions/1220) with two changes: explicit acknowledgment of and supersession over PR #484, and tightened `messageId` ownership to match the v2 prompt lifecycle (agent-owned, returned in the inject response). From ae3226e1a0be8299463ab2069c8fec8e06b6213e Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Wed, 20 May 2026 07:15:08 -0700 Subject: [PATCH 3/9] docs(rfd): polish session inject wording --- docs/rfds/v2/session-inject.mdx | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index 478591bf..6fd5c9de 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -18,7 +18,7 @@ This is a v2 follow-on. It rides on the [v2 prompt lifecycle](./prompt) (short-l Coding-agent UIs have converged on two adjacent input affordances on top of "send a prompt and wait": -- **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade — type while Cascade is working, press Enter, it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), and with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). 
+- **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade: type while Cascade is working, press Enter, and it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), and with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). - **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. Codex CLI has the most fleshed-out implementation, opt-in via `/experimental`. The Codex code base has had to evolve its own pending-steers buffer, requeue-on-turn-complete, pause-on-usage-limits, and a replay edge-case surface. See PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815). The original Codex request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame this as "queued corrections while the current turn is still running" — the same problem. 
Gemini CLI has an open proposal to add the same shape under a different name ([`/inject` for asynchronous mid-stream steering, #17197](https://github.com/google-gemini/gemini-cli/issues/17197)). The "queue vs. steer is two distinct things" framing has surfaced explicitly across ecosystems: see `langchain-ai/deepagents` [#1390 "Disambiguate steer/queue"](https://github.com/langchain-ai/deepagents/issues/1390) and `anomalyco/opencode` [#21388 "Allow messages to be sent mid-turn — interrupt, queue, or inject"](https://github.com/anomalyco/opencode/issues/21388). The convergence is visible, but each tool is reinventing the semantics from scratch. @@ -29,7 +29,7 @@ The v2 prompt lifecycle RFD already flags this gap in its motivation: > Queueing messages (still an ongoing discussion) would also fit much nicer in this pattern... potentially the agent could decide whether it cancels the current turn and inserts it immediately, or inserts it at the next convenient break point. -The substrate that's missing isn't queueing alone. It's the wire shape that distinguishes "deliver soon" from "deliver later," that gives each delivery a stable handle for ack and revocation, and that echoes the delivery into the session history so multi-client and replay stay coherent. That substrate is the v2 prompt lifecycle. This RFD adds the call sites on top. +The missing piece is not queueing alone. It is the wire shape that distinguishes "deliver soon" from "deliver later," gives each delivery a stable handle for ack and revocation, and echoes the delivery into the session history so multi-client and replay stay coherent. The v2 prompt lifecycle gives us that base. This RFD adds the call sites on top. PR [#484](https://github.com/agentclientprotocol/agent-client-protocol/pull/484) (open since February 2026) takes a thinner cut: a `promptQueueing` capability plus an `end_turn` early-finish signal, with prompts queued via a parallel `session/prompt` call. 
It doesn't address steer, doesn't define delivery semantics, and predates the v2 prompt lifecycle's `state_change` and `user_message` notifications. The follow-up thread also asks how clients edit a queued message. This RFD answers that directly: editing is a pending-inject operation before `user_message`, not a transcript rewrite after delivery. The author also flags that the Claude Agent SDK had no public hook for "when was my queued message inserted," which is precisely what `user_message` echoing solves at the protocol level. The proposal below is intended to supersede #484, with credit to [@SteffenDE](https://github.com/SteffenDE) for raising the question first. @@ -130,7 +130,7 @@ In `schema/schema.json`: - `mode: SessionInjectMode` - `content: ContentBlock[]` (same shape as `PromptRequest.prompt`) 3. Add `SessionInjectResponse`: - - `messageId: string` (UUID, agent-owned) + - `messageId: string` (opaque, agent-owned) 4. Add `agentCapabilities.session.inject`: - `modes: SessionInjectMode[]` - `pending.replace: boolean` (optional; default `false`) @@ -183,7 +183,7 @@ Revoke is ordered against delivery at the agent: Normal session errors still apply. If the session is unknown, closed, or no longer accepts input, return the same error the agent uses for other session requests. -A steer that has been emitted as `user_message`, or otherwise committed to the next model input, is already delivered from the protocol's point of view. It is not revocable through `session/revoke_inject`. There is no separate "revoke a delivered message" surface; that is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory. +Delivery is the point where the agent commits the inject to session history and emits, or has irrevocably queued, the matching `user_message`. From that point, the content may already be in the next model input. 
Revoke must return `already_delivered`, and the client should expect the `user_message` if it has not seen it yet. There is no separate "revoke a delivered message" surface; that is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory. Agents may also support replacing pending content. Replacement keeps the same `messageId`, mode, and queue position; only `content` changes. The eventual `user_message` carries the final content, not the edit history. @@ -229,7 +229,7 @@ This means a session loaded from disk doesn't carry "this was a steer" provenanc With [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), multiple controllers can inject into the same session. Ordering across controllers is by arrival time at the agent (FIFO from each controller; interleaved across controllers). Observers cannot inject, revoke, or replace. -Every controller and observer sees the same `user_message` notification at delivery time, so transcripts converge. A revoked inject has no transcript update. If an agent restricts revocation or replacement to the controller that created the pending inject, it should reject other controllers with a permission error instead of silently doing nothing. +Every controller and observer sees the same `user_message` notification at delivery time, so transcripts converge. A revoked inject has no transcript update. This RFD does not add a shared pending-queue list or pending-message broadcast; clients can manage the pending handles returned to them, and delivery remains the shared signal. If an agent restricts revocation or replacement to the controller that created the pending inject, it should reject other controllers with a permission error instead of silently doing nothing. 
### Interaction with other in-flight requests From 48a629f77cb81edfccb794ab45f0d488cec0109e Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Sun, 24 May 2026 12:34:29 -0700 Subject: [PATCH 4/9] docs(rfd): address review feedback on session-inject Resolves PR #1261 inline comments from @SteffenDE: - Reframe PR #484 comparison: acknowledge that #484's early end_turn is a legitimate steer-via-yield design. Spell out the trade-off (forced turn boundary at every steer, client drives the re-prompt) rather than dismissing it as not addressing steer. - Make revoke mandatory if session/inject is supported; keep replace opt-in via pending.replace. Explain the asymmetry (drop is cheap, content edit is harder once partially serialized). - Drop the "irrevocably queued" hedge. Delivery is defined strictly as user_message emission; no phantom committed-but-not-delivered state. - Drop the agent-may-drop-steers-during-permission-wait carve-out. Adapter buffers when the underlying runtime can't. - Add FAQ on adapter burden for messageId: addresses Claude Agent SDK not having a stable ID until delivery, explains why the alternatives (client-provided ID, deferred ID) are worse. - Add FAQ on why queue lives in the protocol rather than client-side: multi-client FIFO ordering authority, replay/reconnect, headless agents. Also a prose pass: tighten em-dashes to colons where natural, drop rule-of-three flourishes, remove "not X. It is Y" constructions, match the matter-of-fact tone of the v2 prompt lifecycle RFD. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/rfds/v2/session-inject.mdx | 71 ++++++++++++++++++++++----------- 1 file changed, 48 insertions(+), 23 deletions(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index 6fd5c9de..76088f67 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -8,30 +8,32 @@ Author(s): [@kennethsinder](https://github.com/kennethsinder) > What are you proposing to change? 
-Add a single `session/inject` method, with a `mode` of `queue` or `steer`, so clients can hand the agent another user-role message without ending the current turn. The agent advertises which modes it supports via a capability. Delivery is echoed as a `user_message` notification so multi-client observers and `session/load` replay see one consistent history. Before delivery, the returned `messageId` is also the handle clients use to revoke the pending message, and optionally replace its content. +Add a single `session/inject` method with a `mode` of `queue` or `steer`, so clients can hand the agent another user-role message without ending the current turn. The agent advertises which modes it supports via a capability. Delivery is echoed as a `user_message` notification, so multi-client observers and `session/load` replay see one consistent history. Before delivery, the returned `messageId` is also the handle clients use to revoke the pending message, and optionally replace its content. -This is a v2 follow-on. It rides on the [v2 prompt lifecycle](./prompt) (short-lived `session/prompt`, agent-owned `messageId`, `state_change` notifications) and replaces the standalone [prompt queueing RFD (#484)](https://github.com/agentclientprotocol/agent-client-protocol/pull/484), which predates that substrate and stops at the capability bit. +This is a v2 follow-on. It rides on the [v2 prompt lifecycle](./prompt): short-lived `session/prompt`, agent-owned `messageId`, `state_change` notifications. It also aims to supersede the standalone [prompt queueing RFD (#484)](https://github.com/agentclientprotocol/agent-client-protocol/pull/484), which predates that substrate. ## Status quo > How do things work today and what problems does this cause? Why would we change things? 
-Coding-agent UIs have converged on two adjacent input affordances on top of "send a prompt and wait": +Coding-agent UIs keep landing on two ways to add input on top of "send a prompt and wait": - **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade: type while Cascade is working, press Enter, and it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), and with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). -- **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. Codex CLI has the most fleshed-out implementation, opt-in via `/experimental`. The Codex code base has had to evolve its own pending-steers buffer, requeue-on-turn-complete, pause-on-usage-limits, and a replay edge-case surface. See PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815). 
The original Codex request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame this as "queued corrections while the current turn is still running" — the same problem. Gemini CLI has an open proposal to add the same shape under a different name ([`/inject` for asynchronous mid-stream steering, #17197](https://github.com/google-gemini/gemini-cli/issues/17197)). +- **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. Codex CLI has the most fleshed-out implementation, opt-in via `/experimental`. The Codex code base has evolved its own pending-steers buffer, a requeue-on-turn-complete path, a pause-on-usage-limits behavior, and a replay edge-case surface (see PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815)). The original request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame the feature as "queued corrections while the current turn is still running," which is the same problem stated from the user's seat. Gemini CLI has an open proposal for the same shape under a different name ([`/inject` for asynchronous mid-stream steering, #17197](https://github.com/google-gemini/gemini-cli/issues/17197)). -The "queue vs. 
steer is two distinct things" framing has surfaced explicitly across ecosystems: see `langchain-ai/deepagents` [#1390 "Disambiguate steer/queue"](https://github.com/langchain-ai/deepagents/issues/1390) and `anomalyco/opencode` [#21388 "Allow messages to be sent mid-turn — interrupt, queue, or inject"](https://github.com/anomalyco/opencode/issues/21388). The convergence is visible, but each tool is reinventing the semantics from scratch. +The "queue and steer are distinct" framing has surfaced explicitly across ecosystems: see `langchain-ai/deepagents` [#1390 "Disambiguate steer/queue"](https://github.com/langchain-ai/deepagents/issues/1390) and `anomalyco/opencode` [#21388 "Allow messages to be sent mid-turn — interrupt, queue, or inject"](https://github.com/anomalyco/opencode/issues/21388). Every tool is reinventing the semantics from scratch. -ACP today has one mid-turn lever: `session/cancel` followed by a fresh `session/prompt`. That discards in-flight work and forces tool calls to surface as cancelled even when they were milliseconds from completing. There is no way to express "let this tool call finish, then add this." There is no way to express "buffer this for after the turn." The `stopReason` enum has no value for a turn interrupted because a new message landed. +ACP today has one mid-turn lever: `session/cancel` followed by a fresh `session/prompt`. That discards in-flight work and forces tool calls to surface as cancelled even when they were milliseconds from completing. There is no way to express "let this tool call finish, then add this," and no way to express "buffer this for after the turn." The `stopReason` enum has no value for a turn interrupted because a new message landed. -The v2 prompt lifecycle RFD already flags this gap in its motivation: +The v2 prompt lifecycle RFD already flags the gap in its motivation: > Queueing messages (still an ongoing discussion) would also fit much nicer in this pattern... 
potentially the agent could decide whether it cancels the current turn and inserts it immediately, or inserts it at the next convenient break point. -The missing piece is not queueing alone. It is the wire shape that distinguishes "deliver soon" from "deliver later," gives each delivery a stable handle for ack and revocation, and echoes the delivery into the session history so multi-client and replay stay coherent. The v2 prompt lifecycle gives us that base. This RFD adds the call sites on top. +Queueing alone isn't the only gap. The wire shape needs to distinguish "deliver soon" from "deliver later," give each delivery a stable handle for ack and revocation, and echo the delivery into session history so multi-client and replay stay aligned. The v2 prompt lifecycle gives us most of that already; this RFD adds the call sites on top. -PR [#484](https://github.com/agentclientprotocol/agent-client-protocol/pull/484) (open since February 2026) takes a thinner cut: a `promptQueueing` capability plus an `end_turn` early-finish signal, with prompts queued via a parallel `session/prompt` call. It doesn't address steer, doesn't define delivery semantics, and predates the v2 prompt lifecycle's `state_change` and `user_message` notifications. The follow-up thread also asks how clients edit a queued message. This RFD answers that directly: editing is a pending-inject operation before `user_message`, not a transcript rewrite after delivery. The author also flags that the Claude Agent SDK had no public hook for "when was my queued message inserted," which is precisely what `user_message` echoing solves at the protocol level. The proposal below is intended to supersede #484, with credit to [@SteffenDE](https://github.com/SteffenDE) for raising the question first. 
+PR [#484](https://github.com/agentclientprotocol/agent-client-protocol/pull/484) (open since February 2026) sketches a different shape for the same problem: a `promptQueueing` capability paired with an early-finish `end_turn`, where the next message arrives as a parallel `session/prompt`. That's a legitimate steer design: the agent yields, and the client's parallel prompt becomes the next turn. The trade-off is that every steer forces a turn boundary, so the in-flight tool call surfaces as completing a turn instead of being absorbed into the running one, and the client has to drive the re-prompt rather than handing the agent a payload. + +The follow-up thread on #484 also asks how clients edit a queued message before it lands, and notes that the Claude Agent SDK doesn't expose a hook for "when did my queued message get inserted." Those are the two things `messageId` + `user_message` echo are built for. This RFD aims to supersede #484, with credit to [@SteffenDE](https://github.com/SteffenDE) for raising the question first; the timing of closing #484 is a coordination question for whoever ends up championing this. ## What we propose to do about it @@ -90,16 +92,18 @@ Until that `user_message` notification is emitted, the message is pending. Pendi An agent that only supports buffering at end-of-turn declares `["queue"]`. A tool-loop agent that can break safely between tool calls declares `["queue", "steer"]`. A streaming-only agent that can't do either declares the absence of the capability and clients fall back to `session/cancel`. -Revoke is part of the pending-inject contract. `pending.replace` is optional. Clients should only offer edit-in-place for already-sent pending messages when it is advertised; otherwise they can keep drafts client-side longer or revoke and send a new inject when losing queue position is acceptable. +Revoke is mandatory: any agent that supports `session/inject` must also support `session/revoke_inject`. 
If the agent can accept a pending inject, it can drop one before delivery; the fallback is a tombstone flag plus skipping the `user_message` emit. Agents that genuinely can't do that shouldn't advertise `session/inject` at all. + +Replace is opt-in via `pending.replace`. Content editing is genuinely harder than dropping, because the content may already be partially serialized into a prompt envelope by the time the replace lands. Clients that want edit-in-place should check `pending.replace`. Otherwise they can hold drafts client-side longer, or revoke and re-inject when losing queue position is acceptable. ### Semantics -**`queue`** is the simple one. The agent buffers the content. The agent delivers it as a `user_message` notification once `state_change: idle` fires for the current turn. FIFO across multiple queued injects. If a queue lands on an already-idle session, the agent treats it as a normal user message and starts a turn, though clients should prefer `session/prompt` in that case. +**`queue`** is the simple one. The agent buffers the content and delivers it as a `user_message` notification once `state_change: idle` fires for the current turn. FIFO across multiple queued injects. If a queue lands on an already-idle session, the agent treats it as a normal user message and starts a turn, though clients should prefer `session/prompt` in that case. **`steer`** is the interesting one. The agent delivers the content at the next safe break-point during the running turn: - **Mid tool call**: complete the tool call, then deliver before the next LLM call. -- **Mid LLM stream with no tool call pending**: agent-defined. The agent may either interrupt the stream and re-prompt with the steer text included, or let the stream finish and deliver before the next iteration. The choice is a quality-of-implementation concern; both behaviors are spec-conformant. Clients should not assume one or the other. +- **Mid LLM stream with no tool call pending**: agent-defined. 
The agent may either interrupt the stream and re-prompt with the steer text included, or let the stream finish and deliver before the next iteration. Both behaviors are spec-conformant; clients should not assume one or the other. - **Mid model-process pause** (provider rate limit, usage-limit cap): hold pending steers until the pause clears. This mirrors the shape Codex landed on in [PR #22226](https://github.com/openai/codex/pull/22226). Both modes are fire-and-forget in the sense that the response confirms acceptance, not model-process. Both deliveries are observable via the same `user_message` notification. While an inject is pending, the `messageId` returned from `session/inject` is the only stable handle clients should use for pending operations. @@ -108,13 +112,13 @@ Both modes are fire-and-forget in the sense that the response confirms acceptanc > How will things will play out once this feature exists? -A user typing into a Zed prompt box during a long agent turn no longer has to choose between "wait for the turn to finish" and "blow it up with cancel." If they want to add context for the next iteration, they queue. If they want to redirect the agent now without losing the tool call it just kicked off, they steer. +A user typing into a Zed prompt box during a long agent turn no longer has to choose between "wait for the turn to finish" and "blow it up with cancel." Add-context-for-later is queue; redirect-now-without-losing-the-tool-call is steer. -A dashboard client attached to three concurrent sessions can route a steer into whichever session needs course-correcting without disturbing the other two. The `user_message` echo means every observer sees the steer land at the same point in history, so the side-by-side panels stay consistent. +A dashboard client attached to several concurrent sessions can route a steer into whichever session needs course-correcting without disturbing the others. 
Because the `user_message` echo lands at the same point in history for every observer, side-by-side panels stay consistent. -A `session/load` replay reconstructs the same transcript regardless of how messages originally arrived. The replay doesn't need to know that message N was a steer that interrupted a tool call vs. a prompt that started a turn. It sees `user_message`, `agent_message`, `tool_call`, `state_change` in delivery order, and renders. +A `session/load` replay reconstructs the same transcript regardless of how messages originally arrived. The replay doesn't need to know that message N was a steer that interrupted a tool call versus a prompt that started a turn; it sees `user_message`, `agent_message`, `tool_call`, `state_change` in delivery order and renders. -For agents, this also means the existing `cancelled` stop reason starts carrying a clearer signal. Today, "cancelled" can mean either "user gave up" or "user wanted to redirect." Once steer exists, "cancelled" goes back to meaning "user gave up." Steer-interrupted turns are not cancelled. They end normally and the steer becomes the next user message. +For agents, the existing `cancelled` stop reason starts carrying a clearer signal. Today it can mean either "user gave up" or "user wanted to redirect." Once steer exists, `cancelled` goes back to meaning "user gave up": steer-interrupted turns end normally, and the steer becomes the next user message. ## Implementation details and plan @@ -183,7 +187,7 @@ Revoke is ordered against delivery at the agent: Normal session errors still apply. If the session is unknown, closed, or no longer accepts input, return the same error the agent uses for other session requests. -Delivery is the point where the agent commits the inject to session history and emits, or has irrevocably queued, the matching `user_message`. From that point, the content may already be in the next model input. 
Revoke must return `already_delivered`, and the client should expect the `user_message` if it has not seen it yet. There is no separate "revoke a delivered message" surface; that is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory. +Delivery is defined as the point where the agent emits the matching `user_message` notification. Before that point, revoke succeeds and no `user_message` is emitted for that `messageId`. At or after that point, revoke returns `already_delivered`, and the client should expect the `user_message` if it has not already seen it. There is no third "committed but not delivered" state in the protocol; any internal commit the agent does before emitting `user_message` is its own bookkeeping, not a visible state. Removing or rewriting a delivered message is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory, not inject. Agents may also support replacing pending content. Replacement keeps the same `messageId`, mode, and queue position; only `content` changes. The eventual `user_message` carries the final content, not the edit history. @@ -233,8 +237,8 @@ Every controller and observer sees the same `user_message` notification at deliv ### Interaction with other in-flight requests -- **`session/cancel` during a pending steer.** Codex [#22815](https://github.com/openai/codex/issues/22815) is a real bug surface here, and we should resolve it in the spec rather than leave it implicit. Recommended behavior: cancel applies to the in-flight turn only. Pending injects (both queue and steer) survive `session/cancel` and deliver as normal once the agent reaches idle. Clients that want to clear everything can call `session/revoke_inject` for pending injects first, then `session/cancel`. Agents should document if they deviate. 
-- **`session/request_permission` blocking the turn.** While the agent is awaiting a permission decision, the turn is paused but not idle. Queue accumulates as normal. Steer is held until the permission resolves, then delivered at the next break-point (which may be immediately if the permission decision is the break-point). Agents that prefer to drop steers during permission waits may do so, capability-advertised. +- **`session/cancel` during a pending steer.** Codex [#22815](https://github.com/openai/codex/issues/22815) is a real bug surface that the spec should resolve rather than leave implicit. Recommended behavior: cancel applies to the in-flight turn only. Pending injects (both queue and steer) survive `session/cancel` and deliver normally once the agent reaches idle. Clients that want to clear everything can call `session/revoke_inject` for the pending injects first, then `session/cancel`. Agents should document any deviation from this. +- **`session/request_permission` blocking the turn.** While the agent is awaiting a permission decision, the turn is paused but not idle. Queue accumulates as normal. Steer is held until the permission resolves, then delivered at the next break-point (which may be immediately, if the permission decision is itself the break-point). If the underlying agent runtime can't hold a steer during a permission wait, the ACP adapter buffers it; the protocol semantics are uniform, with no per-agent carve-out. - **Elicitation in flight.** [Elicitation](./elicitation) is a separate request from the agent to the client. An inject is not an answer to an elicitation; they're orthogonal channels. Agents must not satisfy a pending `elicitation/create` from inject content. ### Sub-agents @@ -262,11 +266,11 @@ This is a v2-only addition. It depends on the v2 prompt lifecycle's `user_messag ### What alternative approaches did you consider, and why did you settle on this one? 
-**Two separate methods (`session/queue` and `session/steer`).** Cleaner type signatures, but every implementer has to wire two methods that share most of their plumbing. The `mode` parameter is more honest about the actual factoring: same delivery channel, same echo notification, different break-point policy. +**Two separate methods (`session/queue` and `session/steer`).** Cleaner type signatures, but every implementer has to wire two methods that share most of their plumbing. The `mode` parameter is more honest about the factoring: same delivery channel, same echo notification, different break-point policy. -**Roll queue into the v2 prompt lifecycle RFD itself.** Tempting, because the prompt lifecycle already mentions queueing as future work. Rejected because steer is the harder design problem and deserves its own discussion, and pinning both to the v2 prompt lifecycle RFD would either bloat that RFD or commit to a design before the steer semantics are settled. +**Roll queue into the v2 prompt lifecycle RFD itself.** Tempting, since the prompt lifecycle already mentions queueing as future work. Rejected because steer is the harder design problem and deserves its own discussion; pinning both to the v2 prompt lifecycle RFD would either bloat that RFD or commit to a design before steer semantics are settled. -**Parallel `session/prompt` calls (the PR #484 approach).** Reuses the existing method but conflates "start a turn" with "add to a turn." It also can't express steer cleanly, because a parallel `session/prompt` already implies its own turn boundary. Splitting inject out keeps `session/prompt` meaning "start a turn." +**Parallel `session/prompt` calls (the #484 approach).** Reuses the existing method but conflates "start a turn" with "add to a turn." Can't express absorb-into-running-turn steer cleanly, because a parallel `session/prompt` already implies its own turn boundary. Splitting inject out keeps `session/prompt` meaning "start a turn." 
**`session/prompt` with a `priority` field.** Same problem as parallel `session/prompt`: overloads a method whose semantics are about turn ownership, not insertion policy. @@ -284,7 +288,7 @@ The turn-complete signal gives clients a deterministic barrier for "all updates ### How does this differ from [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214)? -Rewind is destructive: it truncates history and changes what the agent sees as context. Inject is additive: it adds a new user message without touching what came before. They're siblings on the "mid-conversation user intervention" axis: inject for "add this," rewind for "undo that." +Rewind is destructive: it truncates history and changes what the agent sees as context. Inject is additive: it adds a new user message without touching what came before. They're siblings on the "mid-conversation user intervention" axis: inject for "add this", rewind for "undo that". ### What about non-text content? Images, file references, resource links? @@ -298,6 +302,26 @@ Sibling but separate. `session/inject` carries user-role content; the agent trea Because delivery for `queue` can be arbitrarily delayed (a long-running turn might mean the queue waits minutes), and forcing the JSON-RPC request to stay open for that duration is wasteful and brittle on stateful transports. The `messageId` returned on acceptance is the handle clients need; the `user_message` notification at delivery time is the signal. +### Why must the agent return `messageId` synchronously when some agent SDKs don't have one until delivery? + +The case in mind is the Claude Agent SDK, which doesn't assign a stable message ID until the moment it inserts the message into context, which can be well after the inject is accepted. For those SDKs, the ACP adapter has to mint its own wire ID at accept-time and remember the mapping until the underlying SDK delivers. 
That's a real cost, and it's the same shape adapters already pay for streaming chunks under the [message-id](../message-id) RFD. + +The alternatives are worse. A client-provided `messageId` splits collision-avoidance across both sides and breaks fan-out to observers, which is what `message-id` considered and explicitly rejected. A deferred-id flow (accept now, the ID arrives in the `user_message` notification) leaves revoke without a handle until the moment revoke stops being useful. + +So the adapter burden stays: the wire-protocol ID is the durable one, and what the underlying SDK uses internally is the adapter's mapping problem. + +### Does `queue` need to be in the protocol at all? Couldn't clients just buffer and call `session/prompt` on idle? + +For one client driving one agent, client-side buffering plus `session/prompt` on `state_change: idle` covers the user-visible behavior, and the protocol stays smaller. Fair point. The protocol-level queue earns its keep in the cases where client-side buffering can't. + +**Multi-client fan-out.** With [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), several controllers can push into the same session while a turn runs. If each maintains its own queue and races to `session/prompt` on idle, there is no agreed FIFO across them, and "what order did the user actually want these in" becomes a distributed-systems question without a clean answer. Putting the queue at the agent makes the agent the single ordering authority: the same place that already owns the transcript. + +**Replay and reconnect.** A queue held at the agent survives client restart and is observable to clients that attach after the queue was populated. A client-side queue evaporates with the process that held it. + +**Headless and daemon agents.** Agents that serve multiple thin surfaces (an editor pane, a CLI, a dashboard) want one queue at the agent, not one per surface that has to be reconciled when something idles. 
+ +`queue` is the smaller half of this RFD relative to `steer`, so standardizing it is a modest cost, and the cases where it matters are exactly the ones ACP intends to support first-class. + ### Can a queued message be edited? Yes, while it is pending. If the agent advertises `pending.replace`, clients can call `session/replace_inject` and keep the same queue position. If replacement is not advertised, clients can still keep unsent drafts editable on their side, or revoke and send a new inject when appending at the end is acceptable. @@ -314,5 +338,6 @@ Steer delivers at the next break-point; queue delivers at idle. The steer arrive ## Revision history +- 2026-05-24: Incorporated review feedback from PR [#1261](https://github.com/agentclientprotocol/agent-client-protocol/pull/1261). Reframed the comparison with PR #484 (acknowledging its early `end_turn` as a legitimate steer-via-yield design). Made revoke mandatory and replace opt-in, with the asymmetry explained. Removed the "irrevocably queued" hedge so delivery is defined strictly as `user_message` emission. Dropped the agent-may-drop-steers-during-permission carve-out in favor of adapter-buffers. Added FAQ entries on the adapter burden for `messageId` (Claude Agent SDK case) and on why `queue` is in the protocol rather than client-side only. - 2026-05-20: Replaced request-cancellation-based revocation with `messageId`-based pending revoke/replace semantics, including race outcomes for already-delivered messages. - 2026-05-19: Initial draft, lifted from [discussion #1220](https://github.com/orgs/agentclientprotocol/discussions/1220) with two changes: explicit acknowledgment of and supersession over PR #484, and tightened `messageId` ownership to match the v2 prompt lifecycle (agent-owned, returned in the inject response). 
From c3c99808eb31d231f53b2236623d54147397c9dc Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Sun, 24 May 2026 12:41:48 -0700 Subject: [PATCH 5/9] docs(rfd): smooth bookkeeping sentence Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/rfds/v2/session-inject.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index 76088f67..9a529b6b 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -187,7 +187,7 @@ Revoke is ordered against delivery at the agent: Normal session errors still apply. If the session is unknown, closed, or no longer accepts input, return the same error the agent uses for other session requests. -Delivery is defined as the point where the agent emits the matching `user_message` notification. Before that point, revoke succeeds and no `user_message` is emitted for that `messageId`. At or after that point, revoke returns `already_delivered`, and the client should expect the `user_message` if it has not already seen it. There is no third "committed but not delivered" state in the protocol; any internal commit the agent does before emitting `user_message` is its own bookkeeping, not a visible state. Removing or rewriting a delivered message is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory, not inject. +Delivery is defined as the point where the agent emits the matching `user_message` notification. Before that point, revoke succeeds and no `user_message` is emitted for that `messageId`. At or after that point, revoke returns `already_delivered`, and the client should expect the `user_message` if it has not already seen it. There is no third "committed but not delivered" state in the protocol; any internal commit the agent does before emitting `user_message` is purely for its own bookkeeping. 
Removing or rewriting a delivered message is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory, not inject. Agents may also support replacing pending content. Replacement keeps the same `messageId`, mode, and queue position; only `content` changes. The eventual `user_message` carries the final content, not the edit history. From a0903705ede6453a0d048a70dc5083654b1cb6d3 Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Sun, 24 May 2026 12:52:47 -0700 Subject: [PATCH 6/9] docs(rfd): expand prior art with shipped managed-runtime primitives Cover the convergence that landed in the past quarter: - Codex app-server `turn/steer` and `turn/interrupt` methods (the closest existing analogue to session/inject, by name). - Cursor 3 "Glass" keybind split (Alt+Enter queues, Cmd+Enter interrupts-and-sends). - Replit Agent 4 "Queue" with explicit queue/steer separation. - Claude Managed Agents user.message events + idle/running/etc. statuses. - Devin sessions API as the single-mode counterpoint. Replace the now-superseded Gemini CLI #17197 reference with #18782 (experimental steering hints, system-role variant), and call out the user-role vs system-role design split in the session/remind FAQ. Strengthen the adapter FAQ with the concrete bug symptom from claude-agent-sdk-typescript#67: messages yielded into the async generator get processed but never appear in the visible transcript, which is exactly the gap user_message echo + stable wire ID closes. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/rfds/v2/session-inject.mdx | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index 9a529b6b..2dd88cd1 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -18,10 +18,10 @@ This is a v2 follow-on. 
It rides on the [v2 prompt lifecycle](./prompt): short-l Coding-agent UIs keep landing on two ways to add input on top of "send a prompt and wait": -- **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade: type while Cascade is working, press Enter, and it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), and with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). -- **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. Codex CLI has the most fleshed-out implementation, opt-in via `/experimental`. The Codex code base has evolved its own pending-steers buffer, a requeue-on-turn-complete path, a pause-on-usage-limits behavior, and a replay edge-case surface (see PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815)). 
The original request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame the feature as "queued corrections while the current turn is still running," which is the same problem stated from the user's seat. Gemini CLI has an open proposal for the same shape under a different name ([`/inject` for asynchronous mid-stream steering, #17197](https://github.com/google-gemini/gemini-cli/issues/17197)). +- **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade: type while Cascade is working, press Enter, and it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). Shipped in [Replit Agent 4 as "Queue"](https://blog.replit.com/introducing-queue-a-smarter-way-to-work-with-agent), with explicit framing of queue and steer as separate user intents. +- **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. 
Codex CLI shipped this opt-in via `/experimental`, and the code base has evolved its own pending-steers buffer, requeue-on-turn-complete path, pause-on-usage-limits behavior, and replay edge-case surface (see PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815)). Codex's [app-server protocol](https://developers.openai.com/codex/app-server) lifts the same behavior into named methods: `turn/steer` (append user input to an in-flight turn without starting a new one) and `turn/interrupt` (cancel the turn). Cursor 3 "Glass" surfaces the same distinction as a keyboard split: Alt+Enter queues, Cmd+Enter interrupts-and-sends ([1.4 changelog](https://cursor.com/changelog/1-4)). Gemini CLI has an experimental version under [#18782](https://github.com/google-gemini/gemini-cli/issues/18782) that takes a different shape: injects the steer as a hidden system-role hint rather than a user message. The original Codex request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame the feature as "queued corrections while the current turn is still running," which is the same problem stated from the user's seat. -The "queue and steer are distinct" framing has surfaced explicitly across ecosystems: see `langchain-ai/deepagents` [#1390 "Disambiguate steer/queue"](https://github.com/langchain-ai/deepagents/issues/1390) and `anomalyco/opencode` [#21388 "Allow messages to be sent mid-turn — interrupt, queue, or inject"](https://github.com/anomalyco/opencode/issues/21388). Every tool is reinventing the semantics from scratch. +Server-side managed runtimes have converged on similar primitives. 
Anthropic's [Claude Managed Agents](https://platform.claude.com/docs/en/managed-agents/overview) (`managed-agents-2026-04-01` beta) accepts mid-execution `user.message` events on `/v1/sessions/{id}/events` while the session is `running`, with `idle` / `running` / `rescheduling` / `terminated` statuses that line up with the v2 prompt lifecycle's `state_change`. [Devin's sessions API](https://docs.devin.ai/api-reference/v1/sessions/send-a-message-to-an-existing-devin-session) takes a single-mode cut: `POST /v1/sessions/{id}/message` regardless of whether the session is idle or running. The "queue and steer are distinct" framing has also surfaced explicitly across ecosystems (see `langchain-ai/deepagents` [#1390 "Disambiguate steer/queue"](https://github.com/langchain-ai/deepagents/issues/1390) and `anomalyco/opencode` [#21388 "Allow messages to be sent mid-turn — interrupt, queue, or inject"](https://github.com/anomalyco/opencode/issues/21388)). Most tools are landing on it independently. ACP today has one mid-turn lever: `session/cancel` followed by a fresh `session/prompt`. That discards in-flight work and forces tool calls to surface as cancelled even when they were milliseconds from completing. There is no way to express "let this tool call finish, then add this," and no way to express "buffer this for after the turn." The `stopReason` enum has no value for a turn interrupted because a new message landed. @@ -296,7 +296,7 @@ Rewind is destructive: it truncates history and changes what the agent sees as c ### Is this related to discussion [#1224 `session/remind`](https://github.com/orgs/agentclientprotocol/discussions/1224)? -Sibling but separate. `session/inject` carries user-role content; the agent treats it as the user speaking. `session/remind` (proposed separately) carries system-role context that the host wants the agent to see without faking a user turn. 
They could share the underlying break-point machinery, but the role contract differs at call sites and conflating them muddies what the agent sees in its message list. +Sibling but separate. `session/inject` carries user-role content; the agent treats it as the user speaking. `session/remind` (proposed separately) carries system-role context that the host wants the agent to see without faking a user turn. The split shows up in the wild: Codex's `turn/steer` and Cursor 3 "Glass" keep the steer in the user-role channel, while Gemini CLI's experimental hints ([#18782](https://github.com/google-gemini/gemini-cli/issues/18782)) take the system-role path. Both are coherent; `session/inject` and `session/remind` are the two ACP surfaces for that split. They could share the underlying break-point machinery, but the role contract differs at call sites, and conflating them muddies what the agent sees in its message list. ### Why fire-and-forget on acceptance instead of waiting for delivery? @@ -304,7 +304,7 @@ Because delivery for `queue` can be arbitrarily delayed (a long-running turn mig ### Why must the agent return `messageId` synchronously when some agent SDKs don't have one until delivery? -The case in mind is the Claude Agent SDK, which doesn't assign a stable message ID until the moment it inserts the message into context, which can be well after the inject is accepted. For those SDKs, the ACP adapter has to mint its own wire ID at accept-time and remember the mapping until the underlying SDK delivers. That's a real cost, and it's the same shape adapters already pay for streaming chunks under the [message-id](../message-id) RFD. +The case in mind is the Claude Agent SDK, which doesn't assign a stable message ID until the moment it inserts the message into context, which can be well after the inject is accepted. 
The downstream symptom is visible in [claude-agent-sdk-typescript#67](https://github.com/anthropics/claude-agent-sdk-typescript/issues/67): messages yielded into the async generator mid-turn get processed by the agent but never appear in the visible transcript, leaving clients with no way to know which messages actually landed. The `user_message` echo plus a stable wire ID is exactly the protocol-level hook that gap calls for. For those SDKs, the ACP adapter has to mint its own wire ID at accept-time and remember the mapping until the underlying SDK delivers. That's a real cost, and it's the same shape adapters already pay for streaming chunks under the [message-id](../message-id) RFD. The alternatives are worse. A client-provided `messageId` splits collision-avoidance across both sides and breaks fan-out to observers, which is what `message-id` considered and explicitly rejected. A deferred-id flow (accept now, the ID arrives in the `user_message` notification) leaves revoke without a handle until the moment revoke stops being useful. @@ -338,6 +338,6 @@ Steer delivers at the next break-point; queue delivers at idle. The steer arrive ## Revision history -- 2026-05-24: Incorporated review feedback from PR [#1261](https://github.com/agentclientprotocol/agent-client-protocol/pull/1261). Reframed the comparison with PR #484 (acknowledging its early `end_turn` as a legitimate steer-via-yield design). Made revoke mandatory and replace opt-in, with the asymmetry explained. Removed the "irrevocably queued" hedge so delivery is defined strictly as `user_message` emission. Dropped the agent-may-drop-steers-during-permission carve-out in favor of adapter-buffers. Added FAQ entries on the adapter burden for `messageId` (Claude Agent SDK case) and on why `queue` is in the protocol rather than client-side only. +- 2026-05-24: Incorporated review feedback from PR [#1261](https://github.com/agentclientprotocol/agent-client-protocol/pull/1261). 
Reframed the comparison with PR #484 (acknowledging its early `end_turn` as a legitimate steer-via-yield design). Made revoke mandatory and replace opt-in, with the asymmetry explained. Removed the "irrevocably queued" hedge so delivery is defined strictly as `user_message` emission. Dropped the agent-may-drop-steers-during-permission carve-out in favor of adapter-buffers. Added FAQ entries on the adapter burden for `messageId` (Claude Agent SDK case) and on why `queue` is in the protocol rather than client-side only. Refreshed prior-art coverage with Codex's app-server `turn/steer` / `turn/interrupt` methods, Cursor 3 "Glass" keybind split, Replit Agent 4 Queue, Claude Managed Agents, and Devin as a single-mode counterpoint. Strengthened the adapter FAQ with the concrete SDK-bug symptom from [claude-agent-sdk-typescript#67](https://github.com/anthropics/claude-agent-sdk-typescript/issues/67) and noted the user-role vs system-role design split. - 2026-05-20: Replaced request-cancellation-based revocation with `messageId`-based pending revoke/replace semantics, including race outcomes for already-delivered messages. - 2026-05-19: Initial draft, lifted from [discussion #1220](https://github.com/orgs/agentclientprotocol/discussions/1220) with two changes: explicit acknowledgment of and supersession over PR #484, and tightened `messageId` ownership to match the v2 prompt lifecycle (agent-owned, returned in the inject response). From ec3397d88843446b040d239785f6692d56f7c140 Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Sun, 24 May 2026 12:57:18 -0700 Subject: [PATCH 7/9] =?UTF-8?q?docs(rfd):=20final=20pass=20=E2=80=94=20fac?= =?UTF-8?q?t-check=20prior=20art=20+=20concise=20history?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Correct Cursor 1.4 vs Cursor 3 "Glass" conflation: the Alt/Cmd+Enter keybind split is from the 1.4 changelog, not the Glass IDE rebrand. 
- Fix Replit attribution: drop "Agent 4" and "explicit queue/steer framing" — neither is in the source. Update the redirected blog URL. - Add FAQ comparing to Codex's `turn/steer` / `turn/interrupt`: documents the threadId/expectedTurnId/input shape, calls out that Codex has no revoke/replace, and articulates why ACP's pending-edit surface matters for editor clients. - Collapse the 2026-05-24 revision history entry to one line. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/rfds/v2/session-inject.mdx | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index 2dd88cd1..eaa5f6ec 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -18,8 +18,8 @@ This is a v2 follow-on. It rides on the [v2 prompt lifecycle](./prompt): short-l Coding-agent UIs keep landing on two ways to add input on top of "send a prompt and wait": -- **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade: type while Cascade is working, press Enter, and it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). 
Shipped in [Replit Agent 4 as "Queue"](https://blog.replit.com/introducing-queue-a-smarter-way-to-work-with-agent), with explicit framing of queue and steer as separate user intents. -- **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. Codex CLI shipped this opt-in via `/experimental`, and the code base has evolved its own pending-steers buffer, requeue-on-turn-complete path, pause-on-usage-limits behavior, and replay edge-case surface (see PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815)). Codex's [app-server protocol](https://developers.openai.com/codex/app-server) lifts the same behavior into named methods: `turn/steer` (append user input to an in-flight turn without starting a new one) and `turn/interrupt` (cancel the turn). Cursor 3 "Glass" surfaces the same distinction as a keyboard split: Alt+Enter queues, Cmd+Enter interrupts-and-sends ([1.4 changelog](https://cursor.com/changelog/1-4)). Gemini CLI has an experimental version under [#18782](https://github.com/google-gemini/gemini-cli/issues/18782) that takes a different shape: injects the steer as a hidden system-role hint rather than a user message. The original Codex request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame the feature as "queued corrections while the current turn is still running," which is the same problem stated from the user's seat. +- **Queue.** The user types while the agent is mid-turn; the text is held and delivered once the agent goes idle. 
Shipped in Cursor since 1.2 as a FIFO ([forum thread](https://forum.cursor.com/t/queue-agent-messages/110883)). Shipped in Windsurf Cascade: type while Cascade is working, press Enter, and it lands after the current task ([docs](https://docs.windsurf.com/plugins/cascade/cascade-overview)). Shipped in Gemini CLI with explicit `wait_for_idle` vs `wait_for_response` settings ([PR #7867](https://github.com/google-gemini/gemini-cli/pull/7867)), with documented ordering bugs when tool calls are involved ([#17282](https://github.com/google-gemini/gemini-cli/issues/17282), [#17719](https://github.com/google-gemini/gemini-cli/issues/17719)). Shipped in Claude Code, with [known boundary bugs](https://github.com/anthropics/claude-code/issues/49373) where it flushes at every LLM pause instead of true end-of-turn ([related](https://github.com/anthropics/claude-code/issues/36326)). Shipped in [Replit Agent as "Queue"](https://replit.com/blog/introducing-queue-a-smarter-way-to-work-with-agent) with a dedicated message-queue drawer for submitting tasks while the agent is working. +- **Steer.** The user types while the agent is mid-turn; the text is delivered at the next safe break-point, typically after the in-flight tool call. Codex CLI shipped this opt-in via `/experimental`, and the code base has evolved its own pending-steers buffer, requeue-on-turn-complete path, pause-on-usage-limits behavior, and replay edge-case surface (see PRs [#12569](https://github.com/openai/codex/pull/12569), [#15235](https://github.com/openai/codex/pull/15235), [#22226](https://github.com/openai/codex/pull/22226), [#22879](https://github.com/openai/codex/pull/22879) and issues [#19975](https://github.com/openai/codex/issues/19975), [#22815](https://github.com/openai/codex/issues/22815)). 
Codex's [app-server protocol](https://developers.openai.com/codex/app-server) lifts the same behavior into named methods: `turn/steer` (append user input to an in-flight turn without starting a new one) and `turn/interrupt` (cancel the turn, status `"interrupted"`). Cursor surfaces the same distinction as a keyboard split, with Option/Alt+Enter to queue and Cmd/Ctrl+Enter to interrupt-and-send ([1.4 changelog](https://cursor.com/changelog/1-4)). Gemini CLI has an experimental version under [#18782](https://github.com/google-gemini/gemini-cli/issues/18782) that takes a different shape: injects the steer as a hidden system-role hint rather than a user message. The original Codex request issues ([#4312](https://github.com/openai/codex/issues/4312), [#9096](https://github.com/openai/codex/issues/9096)) frame the feature as "queued corrections while the current turn is still running," which is the same problem stated from the user's seat. Server-side managed runtimes have converged on similar primitives. Anthropic's [Claude Managed Agents](https://platform.claude.com/docs/en/managed-agents/overview) (`managed-agents-2026-04-01` beta) accepts mid-execution `user.message` events on `/v1/sessions/{id}/events` while the session is `running`, with `idle` / `running` / `rescheduling` / `terminated` statuses that line up with the v2 prompt lifecycle's `state_change`. [Devin's sessions API](https://docs.devin.ai/api-reference/v1/sessions/send-a-message-to-an-existing-devin-session) takes a single-mode cut: `POST /v1/sessions/{id}/message` regardless of whether the session is idle or running. The "queue and steer are distinct" framing has also surfaced explicitly across ecosystems (see `langchain-ai/deepagents` [#1390 "Disambiguate steer/queue"](https://github.com/langchain-ai/deepagents/issues/1390) and `anomalyco/opencode` [#21388 "Allow messages to be sent mid-turn — interrupt, queue, or inject"](https://github.com/anomalyco/opencode/issues/21388)). 
Most tools are landing on it independently. @@ -296,7 +296,13 @@ Rewind is destructive: it truncates history and changes what the agent sees as c ### Is this related to discussion [#1224 `session/remind`](https://github.com/orgs/agentclientprotocol/discussions/1224)? -Sibling but separate. `session/inject` carries user-role content; the agent treats it as the user speaking. `session/remind` (proposed separately) carries system-role context that the host wants the agent to see without faking a user turn. The split shows up in the wild: Codex's `turn/steer` and Cursor 3 "Glass" keep the steer in the user-role channel, while Gemini CLI's experimental hints ([#18782](https://github.com/google-gemini/gemini-cli/issues/18782)) take the system-role path. Both are coherent; `session/inject` and `session/remind` are the two ACP surfaces for that split. They could share the underlying break-point machinery, but the role contract differs at call sites, and conflating them muddies what the agent sees in its message list. +Sibling but separate. `session/inject` carries user-role content; the agent treats it as the user speaking. `session/remind` (proposed separately) carries system-role context that the host wants the agent to see without faking a user turn. The split shows up in the wild: Codex's `turn/steer` and Cursor's keybind-driven mid-turn input keep the steer in the user-role channel, while Gemini CLI's experimental hints ([#18782](https://github.com/google-gemini/gemini-cli/issues/18782)) take the system-role path. Both are coherent; `session/inject` and `session/remind` are the two ACP surfaces for that split. They could share the underlying break-point machinery, but the role contract differs at call sites, and conflating them muddies what the agent sees in its message list. + +### How does this compare to Codex's `turn/steer`? + +Codex's [app-server protocol](https://developers.openai.com/codex/app-server) is the closest existing analogue. 
Its `turn/steer` appends user input to an active in-flight turn: it takes a `threadId`, an `input` array, and an `expectedTurnId` that must match the running turn, and it rejects turn-level overrides like `model`, `cwd`, `sandboxPolicy`, or `outputSchema`. `turn/interrupt` cancels with `status: "interrupted"`. There is no revoke or replace: once `turn/steer` is called, the input is committed. + +ACP's design is structurally similar but adds two surfaces that `turn/steer` doesn't have. First, an explicit `queue` mode for "deliver after idle" alongside steer-style "deliver at next break-point," for the multi-client and replay reasons the queue FAQ covers below. Second, the `messageId` handle for revoke and (optional) replace before delivery. Pending-message edits are a real UX surface in editor clients (adding "wait, also include the test file" to a queued message before the agent picks it up is the common case), and a CLI-shaped commit-on-submit contract is brittle there. ### Why fire-and-forget on acceptance instead of waiting for delivery? @@ -338,6 +344,6 @@ Steer delivers at the next break-point; queue delivers at idle. The steer arrive ## Revision history -- 2026-05-24: Incorporated review feedback from PR [#1261](https://github.com/agentclientprotocol/agent-client-protocol/pull/1261). Reframed the comparison with PR #484 (acknowledging its early `end_turn` as a legitimate steer-via-yield design). Made revoke mandatory and replace opt-in, with the asymmetry explained. Removed the "irrevocably queued" hedge so delivery is defined strictly as `user_message` emission. Dropped the agent-may-drop-steers-during-permission carve-out in favor of adapter-buffers. Added FAQ entries on the adapter burden for `messageId` (Claude Agent SDK case) and on why `queue` is in the protocol rather than client-side only.
Refreshed prior-art coverage with Codex's app-server `turn/steer` / `turn/interrupt` methods, Cursor 3 "Glass" keybind split, Replit Agent 4 Queue, Claude Managed Agents, and Devin as a single-mode counterpoint. Strengthened the adapter FAQ with the concrete SDK-bug symptom from [claude-agent-sdk-typescript#67](https://github.com/anthropics/claude-agent-sdk-typescript/issues/67) and noted the user-role vs system-role design split. +- 2026-05-24: Review-feedback revisions from PR [#1261](https://github.com/agentclientprotocol/agent-client-protocol/pull/1261): tightened revoke/replace contract, refined delivery semantics, refreshed prior-art coverage (Codex `turn/steer`, Cursor 1.4 keybind split, Replit Queue, Claude Managed Agents, Devin), and added FAQ entries on adapter burden and queue rationale. - 2026-05-20: Replaced request-cancellation-based revocation with `messageId`-based pending revoke/replace semantics, including race outcomes for already-delivered messages. - 2026-05-19: Initial draft, lifted from [discussion #1220](https://github.com/orgs/agentclientprotocol/discussions/1220) with two changes: explicit acknowledgment of and supersession over PR #484, and tightened `messageId` ownership to match the v2 prompt lifecycle (agent-owned, returned in the inject response). From fe9c6146a7a3ecb68e4f41ea471dd151193422fa Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Sun, 24 May 2026 13:04:07 -0700 Subject: [PATCH 8/9] docs(rfd): address fresh-eyes review findings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Capability shape tightened: `modes` is required and non-empty when `inject` is present. Added the `steer_in_stream` capability (`["interrupt"]`, `["finish"]`, or both) so clients can know whether a mid-stream steer truncates the assistant turn or waits for it to finish — closes the previously "agent-defined" interop gap that produced visibly different transcripts for the same client action. 
- Multi-client ordering claim weakened to per-controller FIFO with agent-defined cross-controller order observable via `user_message` delivery — what the protocol can actually enforce. - Replace position semantics clarified to "within that mode's pending order" now that modes are segregated. - Error code recommendation moved from `-32602 Invalid params` (request wasn't malformed) into the JSON-RPC server-error range (`-32000`–`-32099`), with the exact code left to schema-definition time and `error.data.reason` as the discriminator. - Steer on an idle session is now explicitly an error; clients use `session/prompt` for input that starts a turn. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/rfds/v2/session-inject.mdx | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index eaa5f6ec..9947605f 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -83,6 +83,7 @@ Until that `user_message` notification is emitted, the message is pending. Pendi "session": { "inject": { "modes": ["queue", "steer"], + "steer_in_stream": ["interrupt", "finish"], "pending": { "replace": true }, }, }, @@ -90,7 +91,9 @@ Until that `user_message` notification is emitted, the message is pending. Pendi } ``` -An agent that only supports buffering at end-of-turn declares `["queue"]`. A tool-loop agent that can break safely between tool calls declares `["queue", "steer"]`. A streaming-only agent that can't do either declares the absence of the capability and clients fall back to `session/cancel`. +`modes` is required and non-empty whenever the `inject` key is present. An agent that only supports buffering at end-of-turn declares `["queue"]`. A tool-loop agent that can break safely between tool calls declares `["queue", "steer"]`. 
A streaming-only agent that can't do either omits the `inject` key entirely; clients without the capability fall back to `session/cancel`. + +`steer_in_stream` declares how an agent handles a steer that arrives mid-LLM-stream with no tool call pending: `["interrupt"]` means the in-flight stream is always truncated and the agent re-prompts with the steer included; `["finish"]` means the stream always completes before delivery; both values together mean the agent picks per occurrence, and clients must handle either outcome. Required and non-empty when `modes` contains `"steer"`. Revoke is mandatory: any agent that supports `session/inject` must also support `session/revoke_inject`. If the agent can accept a pending inject, it can drop one before delivery; the fallback is a tombstone flag plus skipping the `user_message` emit. Agents that genuinely can't do that shouldn't advertise `session/inject` at all. @@ -103,9 +106,11 @@ Replace is opt-in via `pending.replace`. Content editing is genuinely harder tha **`steer`** is the interesting one. The agent delivers the content at the next safe break-point during the running turn: - **Mid tool call**: complete the tool call, then deliver before the next LLM call. -- **Mid LLM stream with no tool call pending**: agent-defined. The agent may either interrupt the stream and re-prompt with the steer text included, or let the stream finish and deliver before the next iteration. Both behaviors are spec-conformant; clients should not assume one or the other. +- **Mid LLM stream with no tool call pending**: behavior is declared via the `steer_in_stream` capability (see the capability section). The agent may truncate the stream and re-prompt with the steer text included, or let the stream finish and deliver before the next iteration. Clients consult the capability to know which to expect. - **Mid model-process pause** (provider rate limit, usage-limit cap): hold pending steers until the pause clears. 
This mirrors the shape Codex landed on in [PR #22226](https://github.com/openai/codex/pull/22226). +A steer on an already-idle session has no running turn to steer; the agent returns an error. Clients should use `session/prompt` for input that starts a new turn. + Both modes are fire-and-forget in the sense that the response confirms acceptance, not model-process. Both deliveries are observable via the same `user_message` notification. While an inject is pending, the `messageId` returned from `session/inject` is the only stable handle clients should use for pending operations. ## Shiny future @@ -136,7 +141,8 @@ In `schema/schema.json`: 3. Add `SessionInjectResponse`: - `messageId: string` (opaque, agent-owned) 4. Add `agentCapabilities.session.inject`: - - `modes: SessionInjectMode[]` + - `modes: SessionInjectMode[]` (required, non-empty) + - `steer_in_stream: ("interrupt" | "finish")[]` (required and non-empty when `modes` contains `"steer"`) - `pending.replace: boolean` (optional; default `false`) 5. Add `SessionRevokeInjectRequest`: - `sessionId: SessionId` @@ -181,15 +187,15 @@ A successful response means the agent will not emit a future `user_message` for Revoke is ordered against delivery at the agent: - If revoke wins, the pending inject is dropped. No later `user_message` is emitted for that `messageId`. -- If delivery wins, the agent returns an error. The recommended shape is `-32602 Invalid params` with `error.data.reason: "already_delivered"` and the `messageId`. The client should keep the delivered message in the transcript. -- If the `messageId` is unknown for that session, return `-32602 Invalid params` with `error.data.reason: "unknown_message_id"`. +- If delivery wins, the agent returns an error in the JSON-RPC server-error range (`-32000` to `-32099`; the exact code is left to schema definition) with `error.data.reason: "already_delivered"` and the `messageId`. The client should keep the delivered message in the transcript. 
+- If the `messageId` is unknown for that session, return an error in the same range with `error.data.reason: "unknown_message_id"`. - If the message was already revoked and the agent still remembers the tombstone, return success. This makes revoke safe to retry after transport loss. Normal session errors still apply. If the session is unknown, closed, or no longer accepts input, return the same error the agent uses for other session requests. Delivery is defined as the point where the agent emits the matching `user_message` notification. Before that point, revoke succeeds and no `user_message` is emitted for that `messageId`. At or after that point, revoke returns `already_delivered`, and the client should expect the `user_message` if it has not already seen it. There is no third "committed but not delivered" state in the protocol; any internal commit the agent does before emitting `user_message` is purely for its own bookkeeping. Removing or rewriting a delivered message is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory, not inject. -Agents may also support replacing pending content. Replacement keeps the same `messageId`, mode, and queue position; only `content` changes. The eventual `user_message` carries the final content, not the edit history. +Agents may also support replacing pending content. Replacement keeps the same `messageId`, mode, and position within that mode's pending order; only `content` changes. The eventual `user_message` carries the final content, not the edit history. ```jsonc { @@ -217,7 +223,7 @@ Agents may also support replacing pending content. Replacement keeps the same `m } ``` -The same failure cases apply to replace: `already_delivered`, `unknown_message_id`, and unsupported capability. 
If `pending.replace` was not advertised, clients should not call the method; agents may still return `Method not found` or `-32602 Invalid params` with `error.data.reason: "replace_not_supported"`. If replace races with delivery, the agent either delivers the old content and returns `already_delivered`, or accepts the replacement and later emits `user_message` with the new content. It must not report success and then deliver the old content. +The same failure cases apply to replace: `already_delivered`, `unknown_message_id`, and unsupported capability. If `pending.replace` was not advertised, clients should not call the method; agents may still return `Method not found` or a server-range error with `error.data.reason: "replace_not_supported"`. If replace races with delivery, the agent either delivers the old content and returns `already_delivered`, or accepts the replacement and later emits `user_message` with the new content. It must not report success and then deliver the old content. `$/cancel_request` still applies to an in-flight `session/inject` request before the accept response is sent. Once `session/inject` has returned a `messageId`, request cancellation is no longer the right tool; the request is complete, and the message is managed by `session/revoke_inject` or `session/replace_inject`. @@ -231,7 +237,7 @@ This means a session loaded from disk doesn't carry "this was a steer" provenanc ### Multi-client interaction -With [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), multiple controllers can inject into the same session. Ordering across controllers is by arrival time at the agent (FIFO from each controller; interleaved across controllers). Observers cannot inject, revoke, or replace. +With [multi-client attach (#533)](https://github.com/agentclientprotocol/agent-client-protocol/pull/533), multiple controllers can inject into the same session. Per-controller delivery is FIFO. 
Cross-controller order is agent-defined and observable to every controller and observer via the `user_message` delivery order. Observers cannot inject, revoke, or replace. Every controller and observer sees the same `user_message` notification at delivery time, so transcripts converge. A revoked inject has no transcript update. This RFD does not add a shared pending-queue list or pending-message broadcast; clients can manage the pending handles returned to them, and delivery remains the shared signal. If an agent restricts revocation or replacement to the controller that created the pending inject, it should reject other controllers with a permission error instead of silently doing nothing. @@ -344,6 +350,6 @@ Steer delivers at the next break-point; queue delivers at idle. The steer arrive ## Revision history -- 2026-05-24: Review-feedback revisions from PR [#1261](https://github.com/agentclientprotocol/agent-client-protocol/pull/1261): tightened revoke/replace contract, refined delivery semantics, refreshed prior-art coverage (Codex `turn/steer`, Cursor 1.4 keybind split, Replit Queue, Claude Managed Agents, Devin), and added FAQ entries on adapter burden and queue rationale. +- 2026-05-24: Review-feedback revisions from PR [#1261](https://github.com/agentclientprotocol/agent-client-protocol/pull/1261): tightened revoke/replace and capability contracts, refined delivery semantics, refreshed prior-art coverage (Codex `turn/steer`, Cursor 1.4 keybind split, Replit Queue, Claude Managed Agents, Devin), and added FAQ entries on adapter burden, queue rationale, and Codex comparison. - 2026-05-20: Replaced request-cancellation-based revocation with `messageId`-based pending revoke/replace semantics, including race outcomes for already-delivered messages. 
- 2026-05-19: Initial draft, lifted from [discussion #1220](https://github.com/orgs/agentclientprotocol/discussions/1220) with two changes: explicit acknowledgment of and supersession over PR #484, and tightened `messageId` ownership to match the v2 prompt lifecycle (agent-owned, returned in the inject response). From 5f48a74f7569091431edd63156b78ecacc5a3599 Mon Sep 17 00:00:00 2001 From: Kenneth Sinder Date: Sun, 24 May 2026 13:07:55 -0700 Subject: [PATCH 9/9] docs(rfd): assign concrete error codes per ACP convention Vetted against ACP's existing ErrorCode union and the patterns in request-cancellation, next-edit-suggestions, and the multi-client-attach RFDs. The "exact code TBD" placeholder was duct tape; ACP RFDs assign specific codes paired with a `data.reason` discriminator string. Concrete assignments: - `-32002 "Resource not found"` (already defined): reused for `unknown_message_id`, since the semantics match. - `-32010 "Inject precondition failed"` (new): covers `already_delivered`, `no_running_turn`, `replace_not_supported`, with `error.data.reason` discriminating. Number chosen to leave -32001 through -32009 available for future generic-protocol codes. - `-32601 "Method not found"` (standard JSON-RPC): an acceptable response when `pending.replace` was not advertised, as an alternative to returning -32010 with `reason: "replace_not_supported"`. Updated all four error-response sites and added a schema-additions entry documenting the ErrorCode extension. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/rfds/v2/session-inject.mdx | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/rfds/v2/session-inject.mdx b/docs/rfds/v2/session-inject.mdx index 9947605f..3d5b7889 100644 --- a/docs/rfds/v2/session-inject.mdx +++ b/docs/rfds/v2/session-inject.mdx @@ -109,7 +109,7 @@ Replace is opt-in via `pending.replace`. 
Content editing is genuinely harder tha - **Mid LLM stream with no tool call pending**: behavior is declared via the `steer_in_stream` capability (see the capability section). The agent may truncate the stream and re-prompt with the steer text included, or let the stream finish and deliver before the next iteration. Clients consult the capability to know which to expect. - **Mid model-process pause** (provider rate limit, usage-limit cap): hold pending steers until the pause clears. This mirrors the shape Codex landed on in [PR #22226](https://github.com/openai/codex/pull/22226). -A steer on an already-idle session has no running turn to steer; the agent returns an error. Clients should use `session/prompt` for input that starts a new turn. +A steer on an already-idle session has no running turn to steer. The agent returns `-32010` with `error.data.reason: "no_running_turn"`. Clients should use `session/prompt` for input that starts a new turn. Both modes are fire-and-forget in the sense that the response confirms acceptance, not model-process. Both deliveries are observable via the same `user_message` notification. While an inject is pending, the `messageId` returned from `session/inject` is the only stable handle clients should use for pending operations. @@ -155,6 +155,7 @@ In `schema/schema.json`: 8. Add `SessionReplaceInjectResponse`: - `messageId: string` (same value, returned for symmetry with `session/inject`) 9. Register `session/inject`, `session/revoke_inject`, and `session/replace_inject` as request methods. +10. Extend the `ErrorCode` union with `-32010 "Inject precondition failed"`, documenting the `error.data.reason` discriminator values: `already_delivered`, `no_running_turn`, `replace_not_supported`. The existing `-32002 "Resource not found"` is reused for `unknown_message_id`. (Number chosen to leave `-32001` through `-32009` available for future generic-protocol codes; the actual value is subject to schema-PR review.) 
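
The code-plus-reason pairing in item 10 could be modeled roughly as follows. This is a minimal sketch, not the schema: the names `InjectReason`, `RpcError`, `injectErrorFor`, and `shouldKeepInTranscript` are hypothetical, and `-32010` is the value proposed in this patch, which the text itself marks as subject to schema-PR review.

```typescript
// Sketch only: these names are illustrative and not part of the ACP schema.
type InjectReason =
  | "already_delivered"
  | "no_running_turn"
  | "replace_not_supported"
  | "unknown_message_id";

interface RpcError {
  code: number;
  message: string;
  data: { reason: InjectReason; messageId?: string };
}

// Proposed in this RFD; the final number is subject to schema-PR review.
const INJECT_PRECONDITION_FAILED = -32010;
// Existing ACP code, reused for unknown_message_id.
const RESOURCE_NOT_FOUND = -32002;

// Build the error object for a given reason. Only unknown_message_id maps to
// the "Resource not found" code; every other reason shares -32010.
function injectErrorFor(reason: InjectReason, messageId?: string): RpcError {
  const notFound = reason === "unknown_message_id";
  return {
    code: notFound ? RESOURCE_NOT_FOUND : INJECT_PRECONDITION_FAILED,
    message: notFound ? "Resource not found" : "Inject precondition failed",
    data: messageId === undefined ? { reason } : { reason, messageId },
  };
}

// Client side: branch on data.reason rather than on the numeric code alone,
// since -32010 covers several distinct preconditions.
function shouldKeepInTranscript(err: RpcError): boolean {
  return err.data.reason === "already_delivered";
}
```

Keeping the discriminator in `error.data.reason` means a client that receives `-32010` can still tell a lost revoke race (`already_delivered`) from a steer sent to an idle session (`no_running_turn`) without the protocol reserving a separate code for each.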
No changes to `session/update` or to existing notification types. The `user_message` notification introduced by the v2 prompt lifecycle is what carries the delivered content, with the same `messageId` that was returned from the inject response. @@ -187,8 +188,8 @@ A successful response means the agent will not emit a future `user_message` for Revoke is ordered against delivery at the agent: - If revoke wins, the pending inject is dropped. No later `user_message` is emitted for that `messageId`. -- If delivery wins, the agent returns an error in the JSON-RPC server-error range (`-32000` to `-32099`; the exact code is left to schema definition) with `error.data.reason: "already_delivered"` and the `messageId`. The client should keep the delivered message in the transcript. -- If the `messageId` is unknown for that session, return an error in the same range with `error.data.reason: "unknown_message_id"`. +- If delivery wins, the agent returns `-32010 "Inject precondition failed"` with `error.data.reason: "already_delivered"` and the `messageId`. The client should keep the delivered message in the transcript. +- If the `messageId` is unknown for that session, return the existing `-32002 "Resource not found"` with `error.data.reason: "unknown_message_id"`. - If the message was already revoked and the agent still remembers the tombstone, return success. This makes revoke safe to retry after transport loss. Normal session errors still apply. If the session is unknown, closed, or no longer accepts input, return the same error the agent uses for other session requests. @@ -223,7 +224,7 @@ Agents may also support replacing pending content. Replacement keeps the same `m } ``` -The same failure cases apply to replace: `already_delivered`, `unknown_message_id`, and unsupported capability. 
If `pending.replace` was not advertised, clients should not call the method; agents may still return `Method not found` or a server-range error with `error.data.reason: "replace_not_supported"`. If replace races with delivery, the agent either delivers the old content and returns `already_delivered`, or accepts the replacement and later emits `user_message` with the new content. It must not report success and then deliver the old content. +The same failure cases apply to replace, using the same codes: `-32010` for `already_delivered`, `-32002` for `unknown_message_id`. If `pending.replace` was not advertised, clients should not call `session/replace_inject`; agents that receive it anyway should return `-32601 "Method not found"` or `-32010` with `error.data.reason: "replace_not_supported"`. If replace races with delivery, the agent either delivers the old content and returns `already_delivered`, or accepts the replacement and later emits `user_message` with the new content. It must not report success and then deliver the old content. `$/cancel_request` still applies to an in-flight `session/inject` request before the accept response is sent. Once `session/inject` has returned a `messageId`, request cancellation is no longer the right tool; the request is complete, and the message is managed by `session/revoke_inject` or `session/replace_inject`.
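
The revoke, replace, and delivery ordering described above can be sketched as a small agent-side store. This is an illustration under stated assumptions, not an ACP SDK API: `PendingInjects`, its method names, and the `msg_N` id format are all hypothetical, and treating a revoked id as unknown for `replace` is a choice the text does not specify.

```typescript
// Sketch only: PendingInjects and its methods are hypothetical names.
type Mode = "queue" | "steer";
type Outcome =
  | { ok: true; messageId: string }
  | { ok: false; code: number; reason: string; messageId: string };

class PendingInjects {
  private pending = new Map<string, { mode: Mode; content: string }>();
  private delivered = new Set<string>();
  private revoked = new Set<string>(); // tombstones, so a retried revoke succeeds
  private nextId = 1;
  private emitUserMessage: (messageId: string, content: string) => void;

  constructor(emitUserMessage: (messageId: string, content: string) => void) {
    this.emitUserMessage = emitUserMessage;
  }

  // Accepting returns the agent-owned handle; nothing is delivered yet.
  accept(mode: Mode, content: string): string {
    const messageId = `msg_${this.nextId++}`; // assumed id format
    this.pending.set(messageId, { mode, content });
    return messageId;
  }

  // Delivery is the user_message emission. Once it happens, revoke and
  // replace for that messageId fail with already_delivered.
  deliverNext(mode: Mode): string | undefined {
    for (const [messageId, entry] of this.pending) { // Map keeps insertion order
      if (entry.mode !== mode) continue;
      this.pending.delete(messageId);
      this.delivered.add(messageId);
      this.emitUserMessage(messageId, entry.content);
      return messageId;
    }
    return undefined;
  }

  revoke(messageId: string): Outcome {
    if (this.pending.delete(messageId)) this.revoked.add(messageId);
    if (this.revoked.has(messageId)) return { ok: true, messageId };
    return this.fail(messageId);
  }

  // Replace keeps the same id, mode, and position; only content changes.
  // Assumption: a revoked id is reported as unknown here.
  replace(messageId: string, content: string): Outcome {
    const entry = this.pending.get(messageId);
    if (!entry) return this.fail(messageId);
    entry.content = content;
    return { ok: true, messageId };
  }

  private fail(messageId: string): Outcome {
    return this.delivered.has(messageId)
      ? { ok: false, code: -32010, reason: "already_delivered", messageId }
      : { ok: false, code: -32002, reason: "unknown_message_id", messageId };
  }
}
```

Because each method runs to completion before the next one starts, a replace either lands before `deliverNext` emits (and the new content is what gets delivered) or fails with `already_delivered`; there is no path that reports success and then emits the old content. An agent with concurrent workers would need an equivalent lock or single-writer queue to keep that property.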