pre-merge by TonyRHouston · Pull Request #2 · TonyRHouston/codex

TonyRHouston · 2026-02-04T13:49:18Z

External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

### Summary

- Parse all `web_search` tool actions (`search`, `find_in_page`, `open_page`).
  - Previously we only parsed + displayed `search`, which made the TUI appear to pause when the other actions were being used.
- Show in-progress `web_search` calls as `Searching the web`.
  - Previously we only showed completed tool calls.

<img width="308" height="149" alt="image" src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845" />

### Tests

Added + updated tests, tested locally.

### Follow ups

Update VSCode extension to display these as well.


hopefully final this time (at least tonight) >_<

Fix resume --last prompt parsing by dropping the clap conflict on the codex resume subcommand so a positional prompt is accepted when --last is set. This aligns interactive resume behavior with exec-mode logic and avoids the “--last cannot be used with SESSION_ID” error. This addresses #6717


[Experiment](https://console.statsig.com/50aWbk2p4R76rNX9lN5VUw/experiments/codex_web_search_rollout/summary) for default cached `web_search` completed; cached chosen as default. Update client to reflect that.

This was too much to ask for

It's overthinking so much on high and going over the context window.

## WHAT?

- Bias aggregated output toward stderr under contention (2/3 stderr, 1/3 stdout) while keeping the 1 MiB cap.
- Rebalance unused stderr share back to stdout when stderr is tiny to avoid underfilling.
- Add tests for contention, small-stderr rebalance, and under-cap ordering (stdout then stderr).

## WHY?

- Review feedback requested stderr priority under contention.
- Avoid underfilled aggregated output when stderr is small while preserving a consistent cap across exec paths.

## HOW?

- Update `aggregate_output` to compute stdout/stderr shares, then reassign unused capacity to the other stream.
- Use the helper in both Windows and async exec paths.
- Add regression tests for contention/rebalance and under-cap ordering.

## BEFORE

```rust
// Best-effort aggregate: stdout then stderr (capped).
let mut aggregated = Vec::with_capacity(
    stdout
        .text
        .len()
        .saturating_add(stderr.text.len())
        .min(EXEC_OUTPUT_MAX_BYTES),
);
append_capped(&mut aggregated, &stdout.text, EXEC_OUTPUT_MAX_BYTES);
append_capped(&mut aggregated, &stderr.text, EXEC_OUTPUT_MAX_BYTES);
let aggregated_output = StreamOutput {
    text: aggregated,
    truncated_after_lines: None,
};
```

## AFTER

```rust
fn aggregate_output(
    stdout: &StreamOutput<Vec<u8>>,
    stderr: &StreamOutput<Vec<u8>>,
) -> StreamOutput<Vec<u8>> {
    let total_len = stdout.text.len().saturating_add(stderr.text.len());
    let max_bytes = EXEC_OUTPUT_MAX_BYTES;
    let mut aggregated = Vec::with_capacity(total_len.min(max_bytes));

    if total_len <= max_bytes {
        aggregated.extend_from_slice(&stdout.text);
        aggregated.extend_from_slice(&stderr.text);
        return StreamOutput {
            text: aggregated,
            truncated_after_lines: None,
        };
    }

    // Under contention, reserve 1/3 for stdout and 2/3 for stderr; rebalance unused stderr to stdout.
    let want_stdout = stdout.text.len().min(max_bytes / 3);
    let want_stderr = stderr.text.len();
    let stderr_take = want_stderr.min(max_bytes.saturating_sub(want_stdout));
    let remaining = max_bytes.saturating_sub(want_stdout + stderr_take);
    let stdout_take = want_stdout + remaining.min(stdout.text.len().saturating_sub(want_stdout));

    aggregated.extend_from_slice(&stdout.text[..stdout_take]);
    aggregated.extend_from_slice(&stderr.text[..stderr_take]);

    StreamOutput {
        text: aggregated,
        truncated_after_lines: None,
    }
}
```

## TESTS

- [x] `just fmt`
- [x] `just fix -p codex-core`
- [x] `cargo test -p codex-core aggregate_output_`
- [x] `cargo test -p codex-core`
- [x] `cargo test --all-features`

## FIXES

Fixes #9758
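To make the share arithmetic easier to check by hand, here is a minimal standalone sketch of the same split. It is not the code from this PR: `aggregate_capped` is a hypothetical helper that works on plain byte slices and takes the cap as a parameter (`max_bytes`) so that a tiny cap can stand in for the 1 MiB limit.

```rust
/// Hypothetical stand-in for `aggregate_output`: same share logic, but on raw
/// byte slices and with the cap passed in so small values can be tested.
fn aggregate_capped(stdout: &[u8], stderr: &[u8], max_bytes: usize) -> Vec<u8> {
    let total_len = stdout.len().saturating_add(stderr.len());
    let mut out = Vec::with_capacity(total_len.min(max_bytes));

    // Under the cap: keep everything, stdout first, then stderr.
    if total_len <= max_bytes {
        out.extend_from_slice(stdout);
        out.extend_from_slice(stderr);
        return out;
    }

    // Over the cap: stdout is guaranteed at most 1/3, stderr takes what is left,
    // and any capacity stderr does not use flows back to stdout.
    let want_stdout = stdout.len().min(max_bytes / 3);
    let stderr_take = stderr.len().min(max_bytes.saturating_sub(want_stdout));
    let remaining = max_bytes.saturating_sub(want_stdout + stderr_take);
    let stdout_take = want_stdout + remaining.min(stdout.len().saturating_sub(want_stdout));

    out.extend_from_slice(&stdout[..stdout_take]);
    out.extend_from_slice(&stderr[..stderr_take]);
    out
}

fn main() {
    let stdout = vec![b'o'; 20];
    let stderr = vec![b'e'; 20];
    // With a cap of 9 and both streams oversized, stdout gets 3 bytes and stderr 6.
    let out = aggregate_capped(&stdout, &stderr, 9);
    println!("{}", String::from_utf8_lossy(&out));
}
```

With a cap of 9, two 20-byte streams yield 3 stdout bytes followed by 6 stderr bytes, while a 2-byte stderr leaves room for 7 stdout bytes, so the output is still filled to the cap.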

Adds getting config requirement to backend-client. I made a slash command to test it (not included in this PR): <img width="726" height="330" alt="Screenshot 2026-01-27 at 15 20 41" src="https://github.com/user-attachments/assets/97222e7c-5078-485a-a5b2-a6630313901e" />

Updates pnpm to 10.28.2 to address security issues in prior versions of pnpm that can allow deps to execute lifecycle scripts against policy.

I have read the CLA Document and I hereby sign the CLA

…leanly (#9944)

## Summary

Refines the bottom footer layout to keep `% context left` right-aligned while making the left side degrade cleanly.

## Behavior with empty textarea

Full width:
<img width="607" height="62" alt="Screenshot 2026-01-26 at 2 59 59 PM" src="https://github.com/user-attachments/assets/854f33b7-d714-40be-8840-a52eb3bda442" />

Less:
<img width="412" height="66" alt="Screenshot 2026-01-26 at 2 59 48 PM" src="https://github.com/user-attachments/assets/9c501788-c3a2-4b34-8f0b-8ec4395b44fe" />

Min width:
<img width="218" height="77" alt="Screenshot 2026-01-26 at 2 59 33 PM" src="https://github.com/user-attachments/assets/0bed2385-bdbf-4254-8ae4-ab3452243628" />

## Behavior with message in textarea and agent running (steer enabled)

Full width:
<img width="753" height="63" alt="Screenshot 2026-01-26 at 4 33 54 PM" src="https://github.com/user-attachments/assets/1856b352-914a-44cf-813d-1cb50c7f183b" />

Less:
<img width="353" height="61" alt="Screenshot 2026-01-26 at 4 30 12 PM" src="https://github.com/user-attachments/assets/d951c4d5-f3e7-4116-8fe1-6a6c712b3d48" />

Less:
<img width="304" height="64" alt="Screenshot 2026-01-26 at 4 30 51 PM" src="https://github.com/user-attachments/assets/1433e994-5cbc-4e20-a98a-79eee13c8699" />

Less:
<img width="235" height="61" alt="Screenshot 2026-01-26 at 4 30 56 PM" src="https://github.com/user-attachments/assets/e216c3c6-84cd-40fc-ae4d-83bf28947f0e" />

Less:
<img width="165" height="59" alt="Screenshot 2026-01-26 at 4 31 08 PM" src="https://github.com/user-attachments/assets/027de5de-7185-47ce-b1cc-5363ea33d9b1" />

## Notes / Edge Cases

- In steer mode while typing, the queue hint no longer replaces the mode label; it renders as `tab to queue message · {Mode}`.
- Collapse priorities differ by state:
  - With the queue hint active, `% context left` is hidden before shortening or dropping the queue hint.
  - In the empty + non-running state, `? for shortcuts` is dropped first, and `% context left` is only shown if `(shift+tab to cycle)` can also fit.
- Transient instructional states (`?` overlay, Esc hint, Ctrl+C/D reminders, and flash/override hints) intentionally suppress the mode label (and context) to focus the next action.

## Implementation Notes

- Renamed the base footer modes to make the state explicit: `ComposerEmpty` and `ComposerHasDraft`, and compute the base mode directly from emptiness.
- Unified collapse behavior in `single_line_footer_layout` for both base modes, with:
  - Queue-hint behavior that prefers keeping the queue hint over context.
  - A cycle-hint guard that prevents context from reappearing after `(shift+tab to cycle)` is dropped.
- Kept rendering responsibilities explicit:
  - `single_line_footer_layout` decides what fits.
  - `render_footer_line` renders a chosen line.
  - `render_footer_from_props` renders the canonical mode-to-text mapping.
- Expanded snapshot coverage:
  - Added `footer_collapse_snapshots` in `chat_composer.rs` to lock the distinct collapse states across widths.
  - Consolidated the width-aware snapshot helper usage (e.g., `snapshot_composer_state_with_width`, `snapshot_footer_with_mode_indicator`).

### Summary

- Adds an optional SOCKS5 listener via `rama-socks5`
- SOCKS5 is disabled by default and gated by config
- Reuses existing policy enforcement and blocked-request recording
- Blocks SOCKS5 in limited mode to prevent method-policy bypass
- Applies bind clamping to the SOCKS5 listener

### Config

New/used fields under `network_proxy`:

- `enable_socks5`
- `socks_url`
- `enable_socks5_udp`

### Scope

- Changes limited to `codex-rs/network-proxy` (+ `codex-rs/Cargo.lock`)

### Testing

```bash
cd codex-rs
just fmt
cargo test -p codex-network-proxy --offline
```

…til first turn (#9950)

### Overview

Currently, calling `thread/resume` always bumps the thread's `updated_at` timestamp. This PR makes the `updated_at` timestamp change only if a turn is triggered.

### Additional context

What we typically do on resuming a thread is to **always** write “initial context” to the rollout file immediately. This initial context includes:

- Developer instructions derived from sandbox/approval policy + cwd
- Optional developer instructions (if provided)
- Optional collaboration-mode instructions
- Optional user instructions (if provided)
- Environment context (cwd, shell, etc.)

This PR defers writing the “initial context” to the rollout file until the first `turn/start`, so we don't inadvertently bump the thread's `updated_at` timestamp until a turn is actually triggered.

This works even though both `thread/resume` and `turn/start` accept overrides (such as `model`, `cwd`, etc.) because the initial context is seeded from the effective `TurnContext` in memory, computed at `turn/start` time, after both sets of overrides have been applied.

**NOTE**: This is a very short-lived solution until we introduce sqlite. Then we can remove this.

Updates the `CodexOptions` passed to the `Codex()` constructor in the SDK to support a `config` property that is a map of configuration data that will be transformed into `--config` flags passed to the invocation of `codex`.

Therefore, something like this:

```typescript
const codex = new Codex({
  config: {
    show_raw_agent_reasoning: true,
    sandbox_workspace_write: { network_access: true },
  },
});
```

would result in the following args being added to the invocation of `codex`:

```shell
--config show_raw_agent_reasoning=true --config sandbox_workspace_write.network_access=true
```
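The flattening described above (nested map to dotted `--config key=value` pairs) can be sketched roughly as follows. This is an illustration in Rust rather than the SDK's TypeScript, and `ConfigValue` and `to_config_flags` are hypothetical names; only booleans and nested tables are modelled, since those are the cases the example shows.

```rust
/// Hypothetical config tree: a boolean leaf or a nested table of entries.
/// Other leaf types (numbers, strings) are left out of this sketch.
enum ConfigValue {
    Bool(bool),
    Table(Vec<(String, ConfigValue)>),
}

/// Walks the tree depth-first and appends `--config <dotted.key>=<value>` pairs.
fn to_config_flags(prefix: &str, value: &ConfigValue, out: &mut Vec<String>) {
    match value {
        ConfigValue::Table(entries) => {
            for (key, child) in entries {
                // Nested keys are joined with dots, e.g. `a.b`.
                let path = if prefix.is_empty() {
                    key.clone()
                } else {
                    format!("{prefix}.{key}")
                };
                to_config_flags(&path, child, out);
            }
        }
        ConfigValue::Bool(flag) => {
            out.push("--config".to_string());
            out.push(format!("{prefix}={flag}"));
        }
    }
}

/// Builds the same config as the TypeScript example in the PR description.
fn example_config() -> ConfigValue {
    ConfigValue::Table(vec![
        ("show_raw_agent_reasoning".to_string(), ConfigValue::Bool(true)),
        (
            "sandbox_workspace_write".to_string(),
            ConfigValue::Table(vec![("network_access".to_string(), ConfigValue::Bool(true))]),
        ),
    ])
}

fn main() {
    let mut args = Vec::new();
    to_config_flags("", &example_config(), &mut args);
    println!("{}", args.join(" "));
}
```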

- Threads sandbox updates through `OverrideTurnContext` for the active turn.
- Passes the computed sandbox type into safety/exec.

Constructors with long param lists can be hard to reason about when a number of the args are `None`, in practice. Introducing a struct to use as the args type helps make things more self-documenting.
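As a rough illustration of the pattern (not the types touched by this change), a constructor can take a single args struct so that call sites name only the fields they set. `SessionArgs`, `Session`, and the default values below are hypothetical.

```rust
/// Hypothetical args type: every field is optional and named at the call site.
#[derive(Debug, Default)]
struct SessionArgs {
    model: Option<String>,
    cwd: Option<String>,
    timeout_secs: Option<u64>,
}

/// Hypothetical type built from the args struct.
#[derive(Debug)]
struct Session {
    model: String,
    cwd: String,
    timeout_secs: u64,
}

impl Session {
    /// One struct parameter instead of `new(None, None, Some(..))`.
    fn new(args: SessionArgs) -> Self {
        Self {
            model: args.model.unwrap_or_else(|| "default-model".to_string()),
            cwd: args.cwd.unwrap_or_else(|| ".".to_string()),
            timeout_secs: args.timeout_secs.unwrap_or(30),
        }
    }
}

fn main() {
    // Only the field that matters is spelled out; the rest fall back to defaults.
    let session = Session::new(SessionArgs {
        model: Some("example-model".to_string()),
        ..Default::default()
    });
    println!("{session:?}");
}
```

Compared with a positional `new(None, None, Some(30))`, the struct form makes it clear which argument each `None` would have referred to.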

Co-authored-by: Michael Bolin <mbolin@openai.com>

Auto-enable live `web_search` tool when sandbox policy is `DangerFullAccess`. Explicitly setting `web_search` (canonical setting), or enabling `web_search_cached` or `web_search_request` still takes precedence over this sandbox-policy-driven enablement.

## What?

- Render an MCP image output cell whenever a decodable image block exists in `CallToolResult.content` (including text-before-image or malformed image before valid image).

## Why?

- Tool results that include caption text before the image currently drop the image output cell.
- A malformed image block can also suppress later valid image output.

## How?

- Iterate `content` and return the first successfully decoded image instead of only checking the first block.
- Add unit tests that cover text-before-image ordering and invalid-image-before-valid.

## Before

```rust
let image = match result {
    Ok(mcp_types::CallToolResult { content, .. }) => {
        if let Some(mcp_types::ContentBlock::ImageContent(image)) = content.first() {
            // decode image (fails -> None)
        } else {
            None
        }
    }
    _ => None,
}?;
```

## After

```rust
let image = result
    .as_ref()
    .ok()?
    .content
    .iter()
    .find_map(decode_mcp_image)?;
```

## Risk / Impact

- Low: only affects image cell creation for MCP tool results; no change for non-image outputs.

## Tests

- [x] `just fmt`
- [x] `cargo test -p codex-tui`
- [x] Rerun after branch update (2026-01-27): `just fmt`, `cargo test -p codex-tui`

Manual testing: MCP image tool result rendering (Codex TUI)

Build the rmcp stdio test server binary:

- `cd codex-rs`
- `cargo build -p codex-rmcp-client --bin test_stdio_server`

Register the server as an MCP server (absolute path to the built binary):

- `codex mcp add mcpimg -- /Users/joshka/code/codex-pr-review/codex-rs/target/debug/test_stdio_server`

Then in Codex TUI, ask it to call:

- `mcpimg.image_scenario({"scenario":"image_only"})`
- `mcpimg.image_scenario({"scenario":"text_then_image","caption":"Here is the image:"})`
- `mcpimg.image_scenario({"scenario":"invalid_base64_then_image"})`
- `mcpimg.image_scenario({"scenario":"invalid_image_bytes_then_image"})`
- `mcpimg.image_scenario({"scenario":"multiple_valid_images"})`
- `mcpimg.image_scenario({"scenario":"image_then_text","caption":"Here is the image:"})`
- `mcpimg.image_scenario({"scenario":"text_only","caption":"Here is the image:"})`

Expected:

- You should see an extra history cell: "tool result (image output)" when the tool result contains at least one decodable image block (even if earlier blocks are text or invalid images).

Fixes #9814

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
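The effect of switching from `content.first()` to `find_map` can be seen in a small standalone sketch. The `ContentBlock` enum and `decode_block` here are hypothetical stand-ins for the `mcp_types` types and `decode_mcp_image`; to stay dependency-free, the sketch treats image data as hex text rather than base64.

```rust
/// Hypothetical stand-in for the MCP content block type.
enum ContentBlock {
    Text(String),
    Image(String), // hex-encoded bytes in this sketch (the real blocks differ)
}

/// Decodes a hex string; returns None for empty, odd-length, or non-hex input.
fn decode_hex(data: &str) -> Option<Vec<u8>> {
    if data.is_empty() || data.len() % 2 != 0 {
        return None;
    }
    (0..data.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(data.get(i..i + 2)?, 16).ok())
        .collect()
}

/// Hypothetical decoder: only image blocks with valid data produce bytes.
fn decode_block(block: &ContentBlock) -> Option<Vec<u8>> {
    match block {
        ContentBlock::Image(data) => decode_hex(data),
        ContentBlock::Text(_) => None,
    }
}

/// Returns the first block that decodes, skipping text and malformed images.
fn first_image(content: &[ContentBlock]) -> Option<Vec<u8>> {
    content.iter().find_map(decode_block)
}

fn main() {
    let content = vec![
        ContentBlock::Text("Here is the image:".to_string()),
        ContentBlock::Image("zz".to_string()), // malformed, skipped
        ContentBlock::Image("89504e47".to_string()),
    ];
    println!("{:?}", first_image(&content));
}
```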

### Motivation

- The local OAuth callback server returned a generic "Invalid OAuth callback" on failures even when the query contained an `error_description`, making it hard to debug OAuth failures.

### Description

- Update `codex-rs/rmcp-client/src/perform_oauth_login.rs` to surface `error_description` values from the callback query in the HTTP response.
- Introduce a `CallbackOutcome` enum and change `parse_oauth_callback` to return it, parsing `code`, `state`, and `error_description` from the query string.
- Change `spawn_callback_server` to match on `CallbackOutcome` and return `OAuth error: <description>` with a 400 status when `error_description` is present, while preserving the existing success and invalid flows.
- Use inline formatting for the error response string.

### Testing

- Ran `just fmt` in the `codex-rs` workspace to format changes successfully.
- Ran `cargo test -p codex-rmcp-client` and all tests passed.

------

[Codex Task](https://chatgpt.com/codex/tasks/task_i_6971aadc68d0832e93159efea8cd48a9)
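A minimal sketch of the shape described above, under several assumptions: the variant names, the precedence of `error_description` over `code`/`state`, the success message, and the status for the invalid case are all guesses made for illustration, and the query is split naively without percent-decoding. Only the `OAuth error: <description>` text with a 400 status and the "Invalid OAuth callback" message come from the description.

```rust
/// Hypothetical variants; the PR only names the enum itself.
#[derive(Debug, PartialEq)]
enum CallbackOutcome {
    Success { code: String, state: String },
    Error(String),
    Invalid,
}

/// Naive query parsing (no percent-decoding), for illustration only.
fn parse_oauth_callback(query: &str) -> CallbackOutcome {
    let mut code = None;
    let mut state = None;
    let mut error_description = None;
    for pair in query.split('&') {
        let Some((key, value)) = pair.split_once('=') else {
            continue;
        };
        match key {
            "code" => code = Some(value.to_string()),
            "state" => state = Some(value.to_string()),
            "error_description" => error_description = Some(value.to_string()),
            _ => {}
        }
    }
    // Assumption: an error description wins over any code/state present.
    if let Some(description) = error_description {
        return CallbackOutcome::Error(description);
    }
    match (code, state) {
        (Some(code), Some(state)) => CallbackOutcome::Success { code, state },
        _ => CallbackOutcome::Invalid,
    }
}

/// Maps an outcome to (status, body). The 200 body is a placeholder.
fn response_for(outcome: &CallbackOutcome) -> (u16, String) {
    match outcome {
        CallbackOutcome::Success { .. } => (200, "Login complete".to_string()),
        CallbackOutcome::Error(description) => (400, format!("OAuth error: {description}")),
        CallbackOutcome::Invalid => (400, "Invalid OAuth callback".to_string()),
    }
}

fn main() {
    let outcome = parse_oauth_callback("error=access_denied&error_description=denied");
    println!("{:?}", response_for(&outcome));
}
```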

### Motivation

- Improve UX by making it explicit that `VISUAL`/`EDITOR` must be set before launching Codex, not during a running session.

### Description

- Update the external editor error text in `codex-rs/tui/src/app.rs` to: `"Cannot open external editor: set $VISUAL or $EDITOR before starting Codex."` and run `just fmt` to apply formatting.

### Testing

- Ran `just fmt` successfully; attempted `cargo test -p codex-tui` but it failed due to network errors when fetching git dependencies (tests did not complete).

------

[Codex Task](https://chatgpt.com/codex/tasks/task_i_6972c2c984948329b1a37d5c5839aff3)

…10412) This updates our generated TypeScript types to be more correct with how the server actually behaves, **specifically for JSON-RPC requests**. Before this PR, we'd generate `field: T | null`. After this PR, we will have `field?: T | null`. The latter matches how the server actually works, in that if an optional field is omitted, the server will treat it as null. This also makes it less annoying in theory for clients to upgrade to newer versions of Codex, since adding a new optional field to a JSON-RPC request should not require a client change. NOTE: This only applies to JSON-RPC requests. All other payloads (i.e. responses, notifications) will return `field: T | null` as usual.

One of our partners flagged that they were seeing the wrong order of events when running `review/start` with command exec approvals:

```
{"method":"item/commandExecution/requestApproval","id":0,"params":{"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0","itemId":"0","reason":"`/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'` requires approval: Xcode-required approval: Require explicit user confirmation for all commands.","proposedExecpolicyAmendment":null}}
{"method":"item/started","params":{"item":{"type":"commandExecution","id":"call_AEjlbHqLYNM7kbU3N6uw1CNi","command":"/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'","cwd":"/Users/devingreen/Desktop/SampleProject","processId":null,"status":"inProgress","commandActions":[{"type":"unknown","command":"git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat"}],"aggregatedOutput":null,"exitCode":null,"durationMs":null},"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0"}}
```

**Key fix**: In the review sub‑agent delegate we were forwarding exec (and patch) approvals using the parent turn id (`parent_ctx.sub_id`) as the approval call_id. That made `item/commandExecution/requestApproval.itemId` differ from the actual `item/started` id. We now forward the sub‑agent’s `call_id` from the approval event instead, so the approval item id matches the commandExecution item id in review flows.

Here’s the expected event order for an inline `review/start` that triggers an exec approval after this fix:

1. Response to review/start (JSON‑RPC response)
   - Includes `turn` (status inProgress) and `review_thread_id` (same as parent thread for inline).
2. `turn/started` notification
   - turnId is the review turn id (e.g., "0").
3. `item/started` → EnteredReviewMode
   - item.id == turnId, marks entry into review mode.
4. `item/started` → commandExecution
   - item.id == <call_id> (e.g., "review-call-1"), status: inProgress.
5. `item/commandExecution/requestApproval` request
   - JSON‑RPC request (not a notification).
   - params.itemId == <call_id> and params.turnId == turnId.
6. Client replies to approval request (Approved / Declined / etc).
7. If approved:
   - Optional `item/commandExecution/outputDelta` notifications.
   - `item/completed` → commandExecution with status and exitCode.
8. Review finishes:
   - `item/started` → ExitedReviewMode
   - `item/completed` → ExitedReviewMode
   - (Agent message items may also appear, depending on review output.)
9. `turn/completed` notification

The key point is that #4 and #5 are now in the proper order with the correct item id.

## Summary

This PR updates `request_user_input` behavior and Default-mode guidance to match current collaboration-mode semantics and reduce model confusion.

## Why

- `request_user_input` should be explicitly documented as **Plan-only**.
- Tool description and runtime availability checks should be driven by the **same centralized mode policy**.
- Default mode prompt needed stronger execution guidance and explicit instruction that `request_user_input` is unavailable.
- Error messages should report the **actual mode name** (not aliases that can read as misleading).

## What changed

- Centralized `request_user_input` mode policy in `core` handler logic:
  - Added a single allowed-modes config (`Plan` only).
  - Reused that policy for:
    - runtime rejection messaging
    - tool description text
- Updated tool description to include availability constraint:
  - `"This tool is only available in Plan mode."`
- Updated runtime rejection behavior:
  - `Default` -> `"request_user_input is unavailable in Default mode"`
  - `Execute` -> `"request_user_input is unavailable in Execute mode"`
  - `PairProgramming` -> `"request_user_input is unavailable in Pair Programming mode"`
- Strengthened Default collaboration prompt:
  - Added explicit execution-first behavior
  - Added assumptions-first guidance
  - Added explicit `request_user_input` unavailability instruction
  - Added concise progress-reporting expectations
- Simplified formatting implementation:
  - Inlined allowed-mode name collection into `format_allowed_modes()`
  - Kept `format_allowed_modes()` output for 3+ modes as CSV style (`modes: a,b,c`)
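The idea of driving both the tool description and the runtime rejection from one allowed-modes list might look roughly like this. It is a sketch, not the handler code: `Mode`, `ALLOWED_MODES`, `tool_availability_note`, and `check_mode` are hypothetical names, and the two-mode wording in `format_allowed_modes` is an assumption (the description only specifies the single-mode text and the 3+ CSV style).

```rust
/// Hypothetical mirror of the collaboration modes named in the description.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Plan,
    Default,
    Execute,
    PairProgramming,
}

impl Mode {
    /// Human-readable name, so messages report the actual mode name.
    fn display_name(self) -> &'static str {
        match self {
            Mode::Plan => "Plan",
            Mode::Default => "Default",
            Mode::Execute => "Execute",
            Mode::PairProgramming => "Pair Programming",
        }
    }
}

/// Single source of truth for where `request_user_input` is allowed.
const ALLOWED_MODES: &[Mode] = &[Mode::Plan];

fn format_allowed_modes(modes: &[Mode]) -> String {
    let names: Vec<&str> = modes.iter().map(|mode| mode.display_name()).collect();
    match names.as_slice() {
        [] => "no modes".to_string(),
        [only] => format!("{only} mode"),
        // Assumption: the two-mode wording is not given in the description.
        [first, second] => format!("{first} or {second} mode"),
        many => format!("modes: {}", many.join(",")),
    }
}

/// Tool description text derived from the policy.
fn tool_availability_note() -> String {
    format!("This tool is only available in {}.", format_allowed_modes(ALLOWED_MODES))
}

/// Runtime check derived from the same policy.
fn check_mode(mode: Mode) -> Result<(), String> {
    if ALLOWED_MODES.contains(&mode) {
        Ok(())
    } else {
        Err(format!("request_user_input is unavailable in {} mode", mode.display_name()))
    }
}

fn main() {
    println!("{}", tool_availability_note());
    println!("{:?}", check_mode(Mode::PairProgramming));
}
```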

Make Apps Gateway MCP blocking, since otherwise app mentions may not work when apps are not loaded. Messages sent before apps become available will be queued. This only takes effect when the `apps` feature is enabled.

…10189)

Today, there is a single capability SID that allows the sandbox to write to:

* workspace (cwd)
* tmp directories if enabled
* additional writable roots

This change splits those up, so that each workspace has its own capability SID, while tmp and additional roots, which are installation-wide, are still governed by the "generic" capability SID.

This isolates workspaces from each other in terms of sandbox write access. It also allows us to protect `<cwd>/.codex` when codex runs in a specific `<cwd>`.

If we don't set any explicit values for sandbox or approval policy, let's try to use a requirements-satisfying value.

## Description

### What changed

- Switch the arg0 helper root from `~/.codex/tmp/path` to `~/.codex/tmp/path2`
- Add `Arg0PathEntryGuard` to keep both the `TempDir` and an exclusive `.lock` file alive for the process lifetime
- Add a startup janitor that scans `path2` and deletes only directories whose lock can be acquired

### Tests

- `cargo clippy -p codex-arg0`
- `cargo clippy -p codex-core`
- `cargo test -p codex-arg0`
- `cargo test -p codex-core`

Add API to list / download from remote public skills

…ebsockets) (#10519)

### Motivation

- Ensure `codex exec` exits when a running turn is interrupted (e.g., Ctrl-C) so the CLI is not "immortal" when websockets/streaming are used.

### Description

- Return `CodexStatus::InitiateShutdown` when handling `EventMsg::TurnAborted` in `exec/src/event_processor_with_human_output.rs` so human-output exec mode shuts down after an interrupt.
- Treat `protocol::EventMsg::TurnAborted` as `CodexStatus::InitiateShutdown` in `exec/src/event_processor_with_jsonl_output.rs` so JSONL output mode behaves the same.
- Applied formatting with `just fmt`.

### Testing

- Ran `just fmt` successfully.
- Ran `cargo test -p codex-exec`; many unit tests ran and the test command completed, but the full test run in this environment produced `35 passed, 11 failed` where the failures are due to Landlock sandbox panics and 403 responses in the test harness (environmental/integration issues) and are not caused by the interrupt/shutdown changes.

------

[Codex Task](https://chatgpt.com/codex/tasks/task_i_698165cec4e083258d17702bd29014c1)

### Summary

* Add model upgrade to the listModel app server endpoint to support dynamically showing the model upgrade banner.

## Summary

- preserve baseline streaming behavior (smooth mode still commits one line per 50ms tick)
- extract adaptive chunking policy and commit-tick orchestration from ChatWidget into `streaming/chunking.rs` and `streaming/commit_tick.rs`
- add hysteresis-based catch-up behavior with bounded batch draining to reduce queue lag without bursty single-frame jumps
- document policy behavior, tuning guidance, and debug flow in rustdoc + docs

## Testing

- just fmt
- cargo test -p codex-tui

`codex debug app-server <user message>` forwards the message through `codex-app-server-test-client`’s `send_message_v2` library entry point, using `std::env::current_exe()` to resolve the codex binary.

Here is what it looks like:

```
celia@com-92114 codex-rs % cargo build -p codex-cli && target/debug/codex debug app-server --help
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.34s
Tooling: helps debug the app server

Usage: codex debug app-server [OPTIONS] <COMMAND>

Commands:
  send-message-v2
  help             Print this message or the help of the given subcommand(s)
```

and

```
celia@com-92114 codex-rs % cargo build -p codex-cli && target/debug/codex debug app-server send-message-v2 "hello world"
   Compiling codex-cli v0.0.0 (/Users/celia/code/codex/codex-rs/cli)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.38s
> {
>   "method": "initialize",
>   "id": "f8ba9f60-3a49-4ea9-81d6-4ab6853e3954",
>   "params": {
>     "clientInfo": {
>       "name": "codex-toy-app-server",
>       "title": "Codex Toy App Server",
>       "version": "0.0.0"
>     },
>     "capabilities": {
>       "experimentalApi": true
>     }
>   }
> }
< {
<   "id": "f8ba9f60-3a49-4ea9-81d6-4ab6853e3954",
<   "result": {
<     "userAgent": "codex-toy-app-server/0.0.0 (Mac OS 26.2.0; arm64) vscode/2.4.27 (codex-toy-app-server; 0.0.0)"
<   }
< }
< initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.2.0; arm64) vscode/2.4.27 (codex-toy-app-server; 0.0.0)" }
> {
>   "method": "thread/start",
>   "id": "203f1630-beee-4e60-b17b-9eff16b1638b",
>   "params": {
>     "model": null,
>     "modelProvider": null,
>     "cwd": null,
>     "approvalPolicy": null,
>     "sandbox": null,
>     "config": null,
>     "baseInstructions": null,
>     "developerInstructions": null,
>     "personality": null,
>     "ephemeral": null,
>     "dynamicTools": null,
>     "mockExperimentalField": null,
>     "experimentalRawEvents": false
>   }
> }
...
```

…10569)

## Summary

This PR updates the `request_user_input` TUI overlay so `Esc` is context-aware:

- When notes are visible for an option question, `Esc` now clears notes and exits notes mode.
- When notes are not visible (normal option selection UI), `Esc` still interrupts as before.

It also updates footer guidance text to match behavior.

## Changes

- Added a shared notes-clear path for option questions:
  - `Tab` and `Esc` now both clear notes and return focus to options when notes are visible.
- Updated footer hint text in notes-visible state:
  - from: `tab to clear notes | ... | esc to interrupt`
  - to: `tab or esc to clear notes | ...`
- Hid `esc to interrupt` hint while notes are visible for option questions.
- Kept `esc to interrupt` visible and functional in normal option-selection mode.
- Updated tests to assert the new `Esc` behavior in notes mode.
- Updated snapshot output for the notes-visible footer row.
- Updated docs in `docs/tui-request-user-input.md` to reflect mode-specific `Esc` behavior.

- add `thread/compact` as a trigger-only v2 RPC that submits `Op::Compact` and returns `{}` immediately.
- add v2 compaction e2e coverage for success and invalid/unknown thread ids, and update protocol schemas/docs.

Model client shouldn't be responsible for this.

## Summary

Update guidance for request_rule

## Testing

- [x] Unit tests pass

## Summary

Forgot to include this tweak.

## Testing

- [x] Unit tests pass

Summary

- add Cursor/ThreadsPage conversions so state DB listings can be mapped back into the rollout list model
- make recorder list helpers query the state DB first (archived flag included) and only fall back to file traversal if needed, along with populating head bytes lazily
- add extensive tests to ensure the DB path is honored for active and archived threads and that the fallback works

Testing

- Not run (not requested)

<img width="1196" height="693" alt="Screenshot 2026-02-03 at 20 42 33" src="https://github.com/user-attachments/assets/826b3c7a-ef11-4b27-802a-3c343695794a" />

If we want to build `/debug-config`, we'll need to know the requirements sources that supplied the values. This PR adds those sources such that we can render them in the UI.

We no longer check response items with the `user` role, as they may be instructions, etc.

…10618)

pre-merge

- Remove numeric prefixes for disabled rows in shared list rendering. These numbers are shortcuts, e.g. pressing "2" selects option `#2`. Disabled items cannot be selected, so keeping numbers on these items is misleading.
- Apply the same behavior in both tui and tui_app_server.
- Update affected snapshots for apps/plugins loading and plugin detail rows.

_**This is a global change.**_

Before:
<img width="1680" height="488" alt="image" src="https://github.com/user-attachments/assets/4bcf94ad-285f-48d3-a235-a85b58ee58e2" />

After:
<img width="1706" height="484" alt="image" src="https://github.com/user-attachments/assets/76bb6107-a562-42fe-ae94-29440447ca77" />

…i#21190)

## Why

We found this while reviewing openai#21091, but confirmed it is not introduced by that PR: the order-sensitive `current_text_with_pending()` replacement loop already existed, and `main` already allowed active same-size large pastes to use prefix-overlapping labels such as `[Pasted Content N chars]` and `[Pasted Content N chars] #2`.

openai#21091 fixes placeholder numbering after a draft is cleared, so a fresh same-size paste can reuse the base label. This PR fixes a different path: when a draft already contains multiple active same-size large pastes, the placeholders can overlap by prefix, for example `[Pasted Content N chars]` and `[Pasted Content N chars] #2`.

That overlap breaks `current_text_with_pending()` when the composer materializes the draft text for the external editor. Replacing the base placeholder first can partially rewrite the `#2` placeholder, leaving the external editor seeded with corrupted text instead of both paste payloads.

| Before | After |
|---|---|
| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 09" src="https://github.com/user-attachments/assets/88a2936c-cf00-4adc-8567-8fd8f398b4a8" /> | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 20 31" src="https://github.com/user-attachments/assets/119cff52-43c8-432a-9367-418d82f4ed82" /> |
| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 57" src="https://github.com/user-attachments/assets/026031bb-839b-4252-a0fd-9ba9616435fe" /> | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 21 31" src="https://github.com/user-attachments/assets/8cb6f2c8-3a5d-411b-8623-dca666ee3c08" /> |

## What Changed

- Changed `current_text_with_pending()` to expand pending pastes through the existing element-range based `expand_pending_pastes()` helper instead of global string replacement.
- Added a regression test with two different same-length large pastes to ensure both overlapping placeholders expand to their original payloads.

## How to Test

1. Start Codex TUI.
2. Paste a large string, for example 1004 `A` characters.

   ```shell
   perl -e 'print "A" x 1004' | pbcopy
   ```

3. Paste a second large string with the same length, for example 1004 `B` characters.

   ```shell
   perl -e 'print "B" x 1004' | pbcopy
   ```

4. Open the external editor from the composer.
5. Confirm the editor is seeded with the full `A...` payload followed by the full `B...` payload, with no literal `#2` left behind.

Targeted tests:

- `cargo test -p codex-tui current_text_with_pending_expands_overlapping_placeholders`
- `just argument-comment-lint-from-source -p codex-tui`

I also ran `cargo test -p codex-tui`; it reached the full crate suite but failed two unrelated local status tests because this machine's `/etc/codex/requirements.toml` rejects `DangerFullAccess`.
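The prefix-overlap failure can be reproduced in isolation. The sketch below is not the composer code: `expand_by_replace` and `expand_by_ranges` are hypothetical helpers that contrast global string replacement with range-based expansion (the approach the PR attributes to `expand_pending_pastes()`), using short payloads in place of 1004-character pastes.

```rust
use std::ops::Range;

/// Order-sensitive global replacement: the base label also matches the prefix
/// of the `#2` label, so the second placeholder gets partially rewritten.
fn expand_by_replace(text: &str, pastes: &[(&str, &str)]) -> String {
    let mut out = text.to_string();
    for (placeholder, payload) in pastes {
        out = out.replace(placeholder, payload);
    }
    out
}

/// Range-based expansion: each payload replaces exactly the byte range of its
/// own placeholder. Ranges must be sorted and non-overlapping.
fn expand_by_ranges(text: &str, pastes: &[(Range<usize>, &str)]) -> String {
    let mut out = String::new();
    let mut cursor = 0;
    for (range, payload) in pastes {
        out.push_str(&text[cursor..range.start]);
        out.push_str(payload);
        cursor = range.end;
    }
    out.push_str(&text[cursor..]);
    out
}

fn main() {
    let base = "[Pasted Content 4 chars]";
    let second = "[Pasted Content 4 chars] #2";
    let text = format!("{base} {second}");

    // The second paste is lost and a stray "#2" is left behind.
    let corrupted = expand_by_replace(&text, &[(base, "AAAA"), (second, "BBBB")]);
    println!("{corrupted}");

    // Each placeholder expands to its own payload.
    let ranges = [(0..base.len(), "AAAA"), (base.len() + 1..text.len(), "BBBB")];
    let correct = expand_by_ranges(&text, &ranges);
    println!("{correct}");
}
```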

sayan-oai and others added 30 commits January 27, 2026 03:33

prompt final (#9969)

f45a873


make cached web_search client-side default (#9974)

0adcd8a


tui: wrapping user input questions (#9971)

4db6da3

make plan prompt less detailed (#9977)

cabb208


Fixing main and make plan mode reasoning effort medium (#9980)

509ff1c


nit: better tool description (#9988)

742f086

nit: better unused prompt (#9991)

74ffbbe

chore: clean orchestrator prompt (#9994)

3b726d9

update pnpm to 10.28.2 to address security issues (#9992)

dd24ac6


description in role type (#9993)

067922a

remove sandbox globals. (#9797)

c40ad65


chore: introduce *Args types for new() methods (#10009)

700a29e


really fix pwd for windows codex zip (#10011)

30eb655

Co-authored-by: Michael Bolin <mbolin@openai.com>

Remove load from SKILL.toml fallback (#10007)

2f8a44b

owenlin0 and others added 26 commits February 3, 2026 11:51

[apps] Gateway MCP should be blocking. (#10289)

654fcb4


Updated bug templates and added a new one for app (#10548)

477379b

[codex] Default values from requirements if unset (#10531)

8406bd7


Fixed icon for CLI bug template (#10552)

c87c271

feat: add APIs to list and download public remote skills (#10448)

f38d181


Feat: add upgrade to app server modelList (#10556)

750ebe1


feat: log webscocket timing into runtime metrics (#10577)

fcaed4c

Add thread/compact v2 (#10445)

38a4770


Move metadata calculation out of client (#10589)

56ebfff


fix(core) updated request_rule guidance (#10379)

968c029


fix(core) Request Rule guidance tweak (#10598)

8f17b37


fix: single transaction for dyn tools injection (#10614)

3d8deee

Requirements: add source to constrained requirement values (#10568)

1eb21e2


chore: simplify user message detection (#10611)

38f6c6b


fix: make sure file exist in find_thread_path_by_id_str_in_subdir (#…

61aecdd

…10618)

nit: cleaning (#10619)

aab60a5

TonyRHouston merged commit f6da11b into TonyRHouston:merge Feb 4, 2026

TonyRHouston added a commit that referenced this pull request Feb 27, 2026

Merge pull request #2 from openai/main

2f4c755

pre-merge

TonyRHouston merged 300 commits into TonyRHouston:merge from openai:main


20 participants
