[codex] Add request-scoped environment context#28936

Draft
sayan-oai wants to merge 8 commits into main from codex/request-scoped-environments

@sayan-oai sayan-oai commented Jun 18, 2026

Collaborator

Why

Environment availability will eventually change between sampling requests, while TurnContext is intentionally stable for an entire turn. Before enabling those updates, one model request needs a single frozen environment view so its model-visible context, advertised tools, and actual tool execution cannot disagree.

This PR is stacked on #28683.

What changed

  • Add StepContext as the request-scoped owner of TurnEnvironmentSnapshot, and remove environments from TurnContext.
  • Use the same step for environment context, tool-router construction, tool dispatch, approvals, and compaction. ToolRouter owns the step it was built with.
  • Keep yielded code-mode cells bound to their originating router/environment.
  • Pass only attached environment selections to child threads and reviews.

Most handler, runtime, and test changes mechanically replace turn.environments with the frozen step snapshot. The main review path is session/step_context.rs -> session/turn.rs -> tools/spec_plan.rs / tools/router.rs -> tools/context.rs. Code mode and delegated reviews are the non-mechanical pieces.

This does not yet recapture steps between sampling requests, persist a StepContextBaseline, or refresh AGENTS, skills, plugins, and MCP servers when an environment attaches. Those remain follow-up work.

Test plan

  • just test -p codex-core sampling_request_keeps_one_environment_view_for_context_and_tool_execution
  • just test -p codex-core code_mode_background_keeps_running_on_later_turn_without_wait
  • just test -p codex-core snapshot_keeps_starting_environment_until_it_can_be_attached

Reviewer notes

Overall goal

A turn can involve several model requests:

model request -> tool call -> model request -> tool call -> final answer

Environment availability may eventually change between those requests. TurnContext is intentionally stable for the whole turn, so it is the wrong place for environment state that may change more frequently.

This PR introduces StepContext, which holds the environment snapshot used for a model request. Its central invariant is:

One frozen environment snapshot
        |-- model-visible environment context
        |-- tools shown to the model
        `-- environment used when those tools execute

Without that invariant, the model could see environment A and receive tools for A, yet have the actual tool call run in a newly selected environment B.

This is a consistency foundation, not the reactive update feature. Today run_turn() captures one StepContext before its sampling loop and reuses it. A later PR will capture replacements between sampling requests, inject environment changes, and add StepContextBaseline. AGENTS, skills, plugins, and MCP reconciliation are deliberately not migrated here.
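The capture-once behavior described above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: `LiveThread`, `select`, `capture_step`, and the string environment id are hypothetical stand-ins, and this `run_turn` only mimics the shape of the real function (capture before the loop, reuse inside it).

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the request-scoped snapshot.
#[derive(Debug, Clone, PartialEq)]
pub struct StepContext {
    pub environment_id: String,
}

/// Hypothetical live thread state whose selection may change at any time.
pub struct LiveThread {
    selected: Mutex<String>,
}

impl LiveThread {
    pub fn new(initial: &str) -> Self {
        Self {
            selected: Mutex::new(initial.to_string()),
        }
    }

    /// Change the live selection (for example, a newly chosen environment).
    pub fn select(&self, id: &str) {
        *self.selected.lock().unwrap() = id.to_string();
    }

    /// Freeze the current selection into an immutable, shareable step.
    pub fn capture_step(&self) -> Arc<StepContext> {
        let environment_id = self.selected.lock().unwrap().clone();
        Arc::new(StepContext { environment_id })
    }
}

/// Sketch of a turn: one step is captured before the sampling loop and reused.
/// Returns the environment that each sampling request observed.
pub fn run_turn(thread: &LiveThread, requests: usize) -> Vec<String> {
    let step = thread.capture_step();
    let mut seen = Vec::new();
    for _ in 0..requests {
        // Simulate the live selection changing while the turn is in flight.
        thread.select("B");
        // The request still reads the frozen step, not the live state.
        seen.push(step.environment_id.clone());
    }
    seen
}
```

In this sketch every request in the turn observes the environment that was selected when the step was captured, while a step captured afterwards reflects the new selection. Capturing a replacement step between requests is the follow-up work the description mentions and is not modeled here.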

Change groups

| Group | Files | What to look for |
| --- | --- | --- |
| Core ownership | session/step_context.rs, session/turn_context.rs | TurnContext.environments is removed. StepContext contains the environment snapshot, computes tool availability, and chooses an effective cwd. |
| Request lifecycle | session/turn.rs, session/mod.rs | One step is captured and passed through context construction, compaction, router construction, and sampling. |
| Router ownership | tools/spec_plan.rs, tools/router.rs, tools/context.rs, tools/parallel.rs | The router owns the frozen step. Every ToolInvocation receives that exact step from the router, making router/execution mismatch difficult to express. |
| Model context | context/environment_context.rs, context_manager/updates.rs, prompt_debug.rs, codex_thread.rs, session_startup_prewarm.rs | Environment text and permissions cwd now come from the step. Persisted TurnContextItem behavior is intentionally unchanged. |
| Compaction | compact.rs, compact_remote.rs, compact_remote_v2.rs, tasks/compact.rs | Compaction can rebuild context and tools, so it must use the same step rather than taking another environment snapshot. |
| Tool consumers | tools/handlers/{apply_patch,extension_tools,mcp,request_permissions,shell,view_image}.rs, tools/handlers/unified_exec/exec_command.rs, mcp_openai_file.rs | Mostly mechanical: replace turn.environments with invocation.step.environments. |
| Tool runtimes | tools/{orchestrator,sandboxing}.rs, tools/runtimes/{apply_patch,shell,unified_exec}.rs, unified_exec/{mod,process_manager}.rs | ToolCtx, ApprovalCtx, and UnifiedExecContext carry the already-frozen snapshot through retries, sandboxing, and approvals. They do not take a new snapshot. |
| Children and reviews | codex_delegate.rs, guardian/{review,review_session}.rs, session/review.rs, tasks/review.rs, multi-agent spawn files, agent-job files | Child agents inherit attached selections only. Review and approval code retains the originating frozen snapshot where it needs consistent path or environment resolution. |
| Code mode | tools/code_mode/{delegate,mod,execute_handler}.rs | A yielded JavaScript cell remains paired with the router and environment from the request that created it, even after a later turn starts. |
| Temporary exceptions | session/mcp.rs, tools/network_approval.rs, tasks/user_shell.rs | MCP refresh and user-shell operations intentionally read live thread state. Async MCP/network reviews also use the latest snapshot until they can be tied to an originating tool call. Comments mark these limitations. |
| Tests | Most *_tests.rs, session/tests.rs, spec_plan_tests.rs, router_tests.rs | Mostly repetitive construction of StepContext::local_for_test() after removing environments from TurnContext. |

Non-obvious pieces

StepContext::effective_cwd() chooses the attached primary environment's cwd, then the first starting environment's known cwd, then the legacy turn cwd. This lets context mention the intended workspace before its filesystem is usable, while tool availability still counts only attached environments.
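The fallback order above could look something like the sketch below. Only `StepContext` and `effective_cwd` are names from this PR; `EnvState`, the field names, and `has_attached_environment` are hypothetical. The sketch also assumes that the first attached entry is the primary environment and that "first starting environment's known cwd" means the first starting entry whose cwd is known.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical lifecycle state of one environment in the snapshot.
#[derive(Debug, Clone, PartialEq)]
pub enum EnvState {
    /// Usable now; tools may run here.
    Attached { cwd: PathBuf },
    /// Still starting; its cwd may or may not be known yet.
    Starting { cwd: Option<PathBuf> },
}

pub struct StepContext {
    /// Assumption: the first attached entry is treated as the primary.
    pub environments: Vec<EnvState>,
    pub legacy_turn_cwd: PathBuf,
}

impl StepContext {
    /// Attached primary cwd, then first known starting cwd, then legacy turn cwd.
    pub fn effective_cwd(&self) -> &Path {
        let attached = self.environments.iter().find_map(|env| match env {
            EnvState::Attached { cwd } => Some(cwd.as_path()),
            _ => None,
        });
        let starting = || {
            self.environments.iter().find_map(|env| match env {
                EnvState::Starting { cwd: Some(cwd) } => Some(cwd.as_path()),
                _ => None,
            })
        };
        attached
            .or_else(starting)
            .unwrap_or(self.legacy_turn_cwd.as_path())
    }

    /// Tool availability counts attached environments only.
    pub fn has_attached_environment(&self) -> bool {
        self.environments
            .iter()
            .any(|env| matches!(env, EnvState::Attached { .. }))
    }
}
```

With only starting environments present, this sketch reports the starting workspace as the cwd while `has_attached_environment()` stays false, which matches the split between what the context may mention and what tools may use.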

ToolRouter owns Arc<StepContext> rather than receiving one snapshot for construction and another during dispatch. This structurally enforces the main invariant.
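A minimal sketch of that ownership shape, under stated assumptions: `ToolRouter`, `ToolInvocation`, and `StepContext` are names from this PR, but the fields, the `dispatch` signature, and the string-based tool list are hypothetical simplifications. The point it illustrates is that dispatch has no step parameter, so a caller cannot hand in a different snapshot.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the frozen request-scoped snapshot.
#[derive(Debug)]
pub struct StepContext {
    pub environment_id: String,
}

/// What a tool handler receives; it carries the router's step.
pub struct ToolInvocation {
    pub tool_name: String,
    pub step: Arc<StepContext>,
}

pub struct ToolRouter {
    step: Arc<StepContext>,
    tools: Vec<String>,
}

impl ToolRouter {
    /// The router is built from one step and keeps it for its whole lifetime.
    pub fn new(step: Arc<StepContext>, tools: Vec<String>) -> Self {
        Self { step, tools }
    }

    /// Dispatch takes no step argument: the invocation always gets the
    /// snapshot the router was constructed with.
    pub fn dispatch(&self, tool_name: &str) -> Option<ToolInvocation> {
        let known = self.tools.iter().any(|tool| tool.as_str() == tool_name);
        known.then(|| ToolInvocation {
            tool_name: tool_name.to_string(),
            step: Arc::clone(&self.step),
        })
    }
}
```

Because the step is shared through `Arc`, the invocation points at the same allocation as the router's step rather than a copy, which `Arc::ptr_eq` can confirm in a test.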

Child inheritance is deliberately asymmetric: the parent request may know about starting environments, but children receive only attached selections. This PR does not share pending startup operations across threads.
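As a rough illustration of that asymmetry, a child's selections could be derived by filtering the parent's view. All names here (`EnvStatus`, `EnvSelection`, `selections_for_child`) are hypothetical and do not come from the PR.

```rust
/// Hypothetical status of an environment selection in the parent's snapshot.
#[derive(Debug, Clone, PartialEq)]
pub enum EnvStatus {
    Attached,
    Starting,
}

/// Hypothetical selection record.
#[derive(Debug, Clone, PartialEq)]
pub struct EnvSelection {
    pub id: String,
    pub status: EnvStatus,
}

/// Children receive attached selections only; starting ones stay with the parent.
pub fn selections_for_child(parent: &[EnvSelection]) -> Vec<EnvSelection> {
    parent
        .iter()
        .filter(|selection| selection.status == EnvStatus::Attached)
        .cloned()
        .collect()
}
```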

The skills/plugin path accepts StepContext only so extension input receives the correct attached environment handles. It does not rediscover skills or plugins.

Tests worth reading

sampling_request_keeps_one_environment_view_for_context_and_tool_execution starts with environment A, changes the live thread selection to B while the response is streaming, then proves the prompt, tools, and actual filesystem write all still use A.

code_mode_background_keeps_running_on_later_turn_without_wait proves a yielded cell created in workspace A does not start executing nested tools in workspace B after a later turn begins.

For an efficient review, read the core ownership, request lifecycle, router ownership, and code mode groups closely, along with these two tests. The tool-consumer and test-construction groups are largely safe to pattern-scan.

@sayan-oai force-pushed the codex/turn-environment-starting-snapshots branch 3 times, most recently from 253a6ba to bb8869d, on June 19, 2026 04:52
Base automatically changed from codex/turn-environment-starting-snapshots to main on June 19, 2026 05:06