[pull] main from openai:main#58
Open
pull[bot] wants to merge 2586 commits into
## Why
`ToolExecutor` is the runtime contract that keeps a callable tool and its model-visible spec together. Leaving `spec()` optional lets a registered runtime silently omit that half of the contract, and it also overloads a missing spec as an exposure decision for tools that should stay dispatchable without being shown to the model.
## What
- Make `ToolExecutor::spec()` required and update core, extension, and test tool executors to return a concrete `ToolSpec`.
- Add `ToolExposure::Hidden` for dispatch-only tools. The legacy `shell_command` runtime in unified-exec sessions now uses that explicit exposure instead of hiding itself by omitting a spec.
- Build MCP tool specs when `McpHandler` is constructed so invalid MCP specs are skipped before the handler is registered.
- Keep tool planning aligned with the new contract for direct, deferred, hidden, code-mode, dynamic, and namespaced tool paths.
## Testing
- Added tool-plan coverage that invalid MCP tool specs are not registered.
- Updated shell-family coverage for the hidden legacy `shell_command` runtime and the affected tool executor test fixtures.
## What
Remove the exact captured request-count assertion from the `SubagentStart` hook integration test while still waiting for the child request that matches the injected hook context.
## Why
The test owns the start-hook behavior and already verifies that the child request reaches the context matcher plus that the start/session hook logs have the expected invocations. Counting every request captured by the response mock makes the test sensitive to lifecycle timing outside that contract and has been flaky in CI.
## Testing
- `cargo test -p codex-core --test all suite::subagent_notifications::subagent_start_replaces_session_start_and_injects_context -- --exact`
## Why
Tool exposure is a planning concern, but the deferred MCP path and dispatch-only legacy shell path were carrying those decisions in handler constructors and a shell-only tool-family builder. Keeping those decisions in `spec_plan` makes the core tool plan easier to follow and keeps handlers focused on runtime behavior.
## What changed
- add `PlannedTools` helpers for ordinary runtimes, exposure overrides, dispatch-only runtimes, and hosted specs
- inline shell tool assembly into `core/src/tools/spec_plan.rs` and remove the shell-only `tool_family` module
- remove exposure state and special exposure constructors from `McpHandler` and `ShellCommandHandler`
- keep hidden runtime behavior centralized in `ExposureOverride`, including disabling parallel tool calls for hidden handlers
## Testing
- Not run (refactor only)
Just updating the error message with a link to the doc
## Why
Profile v2 is taking over the user-facing profile selection path, so the CLI no longer needs to expose the transitional `--profile-v2` spelling. This switches the public args surface to `--profile` before the remaining legacy profile plumbing is removed separately.
## What
- Rebind `--profile` and `-p` to the v2 profile name argument that selects `$CODEX_HOME/<name>.config.toml`.
- Stop parsing the legacy shared CLI profile argument while keeping its implementation path in place for follow-up cleanup.
- Update CLI validation, profile-name parse errors, and the legacy-profile collision message/tests to refer to `--profile`.
## Testing
- `cargo test -p codex-cli -p codex-config -p codex-protocol -p codex-utils-cli`
Fix this eyesore where our lack of a `"description"` was causing our `README.md` to be used for previews on npm. <img width="1291" height="178" alt="image" src="https://github.com/user-attachments/assets/a9bc08c5-0def-4755-8bcc-0c90e096b9c2" />
## Summary
- route each configured MCP server through an explicit per-server `environment_id` instead of a manager-wide remote toggle
- default omitted `environment_id` to `local`, resolve named ids through `EnvironmentManager`, and fail only the affected MCP server when an explicit id is unknown
- keep local stdio on the existing local launcher path for now, while named-environment stdio uses the selected environment backend and requires an absolute `cwd`
- allow local HTTP MCP servers to keep using the ambient HTTP client when no local `Environment` is configured; named-environment HTTP MCPs use that environment's HTTP client
## Validation
- devbox Bazel build: `bazel build --bes_backend= --bes_results_url= //codex-rs/cli:codex //codex-rs/rmcp-client:test_stdio_server //codex-rs/rmcp-client:test_streamable_http_server`
- devbox app-server config matrix with real `config.toml` / `environments.toml` files covering omitted local, explicit local, omitted local under remote default, explicit remote stdio, local HTTP without local env, explicit remote HTTP, local stdio without local env, unknown explicit env, and remote stdio without `cwd`
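The per-server routing rules in the summary above can be sketched as a small planning function. This is an illustrative Python sketch and not the Rust implementation: the config keys `transport` and `cwd`, the function name, and the error strings are assumptions made for the example; only the `environment_id` field and the `local` default come from the description.

```python
from pathlib import PurePosixPath
from typing import Dict, Set, Tuple


def plan_mcp_servers(
    servers: Dict[str, dict], known_environments: Set[str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (server name -> environment id, server name -> error message)."""
    plans: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    for name, cfg in servers.items():
        # An omitted environment_id defaults to "local".
        env_id = cfg.get("environment_id", "local")
        if env_id != "local":
            if env_id not in known_environments:
                # Only this server fails; planning continues for the others.
                errors[name] = f"unknown environment_id {env_id!r}"
                continue
            cwd = cfg.get("cwd")
            is_stdio = cfg.get("transport") == "stdio"  # hypothetical config key
            if is_stdio and not (cwd and PurePosixPath(cwd).is_absolute()):
                errors[name] = "named-environment stdio requires an absolute cwd"
                continue
        plans[name] = env_id
    return plans, errors
```

Returning errors per server, rather than raising, mirrors the stated goal that an unknown explicit id should take down only the affected MCP server.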
## Why
[#23883](#23883) moved the user-facing `--profile` flag onto profile v2. The shared CLI option layer still carried the old `config_profile` slot and several CLI entrypoints still copied that value into legacy config overrides. Leaving that path around makes the CLI surface look like it still selects legacy `[profiles.*]` state even though `--profile` now means `$CODEX_HOME/<name>.config.toml`.
## What
- Remove the legacy `config_profile` field and merge/copy path from [`SharedCliOptions`](https://github.com/openai/codex/blob/95baaf72920c8db22097df8d15a0bb76c84528b6/codex-rs/utils/cli/src/shared_options.rs#L8-L177).
- Stop forwarding profile-v1 overrides from CLI, exec, TUI, doctor, debug, feature, and exec-server paths; runtime profile selection remains on `config_profile_v2` through [`loader_overrides_for_profile`](https://github.com/openai/codex/blob/95baaf72920c8db22097df8d15a0bb76c84528b6/codex-rs/cli/src/main.rs#L1606-L1619).
- Resolve local OSS provider selection from the base config in exec and TUI now that the legacy profile argument is gone.
## Testing
- Not run (cleanup-only follow-up to #23883).
## Why
The named-profile `/permissions` picker needs a small TUI action path that can select permission profiles without folding the menu UI and profile metadata into the same review.
## What changed
- Carry permission-profile selections through the TUI app event flow.
- Persist selected profiles while preserving the existing approval settings and guardrail prompts.
- Keep the legacy `/permissions` picker behavior in this layer; the profile-mode menu stays in the follow-up PR.
## Stack
1. [#22931](#22931): runtime/session/network propagation for active permission profiles.
2. **This PR**: TUI selection plumbing and guardrail flow.
3. [#21559](#21559): profile-aware `/permissions` menu and custom profile display.
<img width="1632" height="1186" alt="image" src="https://github.com/user-attachments/assets/69ddcd5e-b57c-468d-8c1d-246916323c15" />
## Validation
- `git diff --cached --check` before commit.
- Full test run skipped at the user request while pushing the split stack.
## Why
Installing `@openai/codex` currently places a Dotslash `rg` manifest at `node_modules/@openai/codex/bin/rg`, even though the native optional dependency already ships the actual helper under `vendor/<target>/codex-path/rg`. The launcher prepends that `codex-path` directory, so the top-level `bin/rg` file is redundant in the npm install.
The remaining direct consumers of the manifest are package-building paths: `scripts/codex_package/ripgrep.py` and `codex-cli/scripts/install_native_deps.py`. Keeping the manifest under `codex-cli/bin` makes it look like a shipped npm binary, so this moves it next to the package-builder code that owns it. The checked-in `@openai/codex` package metadata should likewise describe only the meta package payload; generated platform packages continue to publish `vendor`.
## What Changed
- Moved the Dotslash ripgrep manifest from `codex-cli/bin/rg` to `scripts/codex_package/rg`.
- Updated the package builder, npm native-artifact hydrator, README, and CLI help text to reference the new manifest location.
- Stopped `codex-cli/scripts/build_npm_package.py` from copying `rg` into the `@openai/codex` meta package.
- Narrowed the checked-in meta package `files` whitelist to `bin/codex.js`.
## Verification
- `python3 -m unittest discover -s scripts/codex_package -p "test_*.py"`
- `python3 -m unittest discover -s codex-cli/scripts -p "test_*.py"`
- `python3 -m py_compile codex-cli/scripts/build_npm_package.py codex-cli/scripts/install_native_deps.py scripts/codex_package/ripgrep.py scripts/codex_package/cli.py scripts/stage_npm_packages.py`
- `codex-cli/scripts/build_npm_package.py --package codex --version 0.0.0-test --pack-output <tmp>/codex-meta-no-vendor.tgz`
- `tar -tf <tmp>/codex-meta-no-vendor.tgz` showed only `package/bin/codex.js`, `package/package.json`, and `package/README.md`.
- Direct staging check showed `codex` uses `files: ["bin/codex.js"]` while `codex-darwin-arm64` still uses `files: ["vendor"]`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23833).
* #23836
* __->__ #23833
## Why
When a user runs `/goal` in a temporary session, the TUI can currently surface an internal app-server failure such as `thread/goal/get failed in TUI`. That message is technically true, but it does not explain the actual constraint: goals require a saved session because goal state is persisted with the thread.
This is especially confusing when `codex doctor` reports the background app-server as running in ephemeral mode, since that wording is easy to conflate with ephemeral thread/session behavior.
## What changed
- Added a TUI-side formatter for thread-goal RPC failures in `codex-rs/tui/src/app/thread_goal_actions.rs`.
- Detects app-server/core errors that indicate goals are unsupported for an ephemeral thread/session.
- Replaces the internal RPC failure with a user-facing explanation:
  ```text
  Goals need a saved session. This session is temporary. Run `codex` to start a saved session, or `codex resume` / `/resume` to reopen one.
  ```
- Preserves the existing generic failure wording for non-ephemeral goal errors.
## Verification
- `cargo test -p codex-tui thread_goal_error_message --lib`
I also tried `cargo test -p codex-tui`; it built successfully but the test runner aborted in an unrelated side-thread stack overflow (`app::tests::discard_side_thread_removes_agent_navigation_entry`), which reproduced when run by itself.
…ons (#23867)
## Summary
- replace the one-shot lazy remote exec-server cache with a lock-protected current client
- when the cached websocket client is already disconnected, create one fresh websocket client/session on the next `get()`
- keep existing disconnect failure behavior for old process sessions and HTTP body streams; do not add session resume or request retry
## Why
The prior PR direction was trying to grow into session restore: resume the old `session_id`, preserve existing process handles, and add reconnect retry policy. That is more machinery than we want for this slice.
For now, the useful minimum is simpler: later fresh remote operations should not be stuck behind a dead cached websocket client, but anything already attached to the dead connection should fail loudly through the existing disconnect path. The server already has detached-session cleanup via its existing TTL, so this PR does not need to add client-side session preservation.
## What Changed
- `LazyRemoteExecServerClient::get()` now keeps the current concrete client in a small mutex-protected cache plus one async connect lock.
- If that cached client is still connected, `get()` returns it.
- If that cached websocket client has observed the transport close, `get()` creates a brand-new websocket client with a brand-new exec-server session and replaces the cache.
- If that cached client is stdio-backed, behavior stays one-shot: the dead client is returned and later work surfaces the existing disconnect error.
- No `resume_session_id`, backoff, request replay, or existing `RemoteExecProcess` rebinding is added here.
- Added focused websocket coverage that proves two concurrent `get()` calls after disconnect share one fresh replacement client/session.
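The "current client plus one connect lock" shape described above can be sketched with `asyncio`. This is a Python analogy for the pattern, not the Rust `LazyRemoteExecServerClient`: `FakeClient`, `LazyClient`, and the `connect` callback are hypothetical names, and the stdio one-shot case is left out.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class FakeClient:
    """Stand-in for a websocket client; `connected` flips to False on disconnect."""

    def __init__(self, session_id: int) -> None:
        self.session_id = session_id
        self.connected = True


class LazyClient:
    def __init__(self, connect: Callable[[], Awaitable[FakeClient]]) -> None:
        self._connect = connect
        self._current: Optional[FakeClient] = None
        self._connect_lock = asyncio.Lock()

    async def get(self) -> FakeClient:
        client = self._current
        if client is not None and client.connected:
            return client  # fast path: cached client is still alive
        async with self._connect_lock:
            # Re-check under the lock so concurrent callers share one replacement.
            client = self._current
            if client is not None and client.connected:
                return client
            self._current = await self._connect()
            return self._current
```

The re-check inside the lock is what lets two callers that both saw the dead client end up with the same fresh session instead of opening two.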
## Why
Users reported that the replacement confirmation feels unnecessary when the current thread goal is already complete. In that state, `/goal <objective>` is starting fresh rather than interrupting active work.
## What changed
`/goal <objective>` now skips the replace confirmation when the existing goal has `complete` status and uses the existing fresh replacement path. Goals that are active, paused, blocked, usage-limited, or budget-limited still require confirmation before being replaced.
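As a rough illustration of the decision above, the confirmation check reduces to one predicate on the existing goal's status. The status strings and the function name below are assumptions for the sketch; the description only names the `complete` status verbatim.

```python
from typing import Optional


def needs_replace_confirmation(existing_status: Optional[str]) -> bool:
    """Return True when `/goal <objective>` should ask before replacing.

    `None` means there is no existing goal (an assumption of this sketch).
    """
    if existing_status is None:
        return False
    # A completed goal is replaced without asking; every other status asks.
    return existing_status != "complete"
```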
## Summary
- add experimental `thread/search` for local rollout-backed thread search using `rg` over JSONL rollouts
- return search-specific result rows with optional previews instead of storing preview data on `StoredThread` or ordinary `Thread` responses
- keep `thread/list` separate from full-content search and document the new app-server surface
## Testing
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server thread_search_returns_content_and_title_matches -- --nocapture`
# Why
This is a follow-up stacked on top of the `plugin_hooks` default-on change. Once we are comfortable making plugin hooks part of the normal plugin behavior, the separate feature flag stops buying us much and leaves extra branching/cache state behind.
# What
- remove the `PluginHooks` feature and generated config-schema entries
- make plugin hook loading/listing follow plugin enablement directly
- drop plugin-manager cache/state that only existed to distinguish hook-flag toggles
- remove tests and fixtures that modeled `plugin_hooks = true/false`
## Why
`rust-release` now publishes `codex-package-<target>.tar.gz` as the canonical native package payload. npm staging should consume those archives directly instead of keeping legacy synthesis code that fetched `rg`, copied standalone binaries, and rebuilt an approximate package layout.
That also means the package builder should not know the internal shape of `codex-package`. It should extract and copy the target payload wholesale so future layout changes stay localized to the archive producer.
The release job stages `codex`, `codex-responses-api-proxy`, and `codex-sdk` together, so native artifact download should be filtered, observable, and shared across component installs. Since that native hydration is now only used by release staging, keeping a separate `install_native_deps.py` CLI adds an extra wrapper without a real caller.
## What Changed
- Removed legacy `codex-package` synthesis and related compatibility flags from npm staging.
- Folded the remaining native artifact hydration code into `scripts/stage_npm_packages.py` and deleted `codex-cli/scripts/install_native_deps.py`.
- Made platform package staging copy the full extracted target directory instead of enumerating package entries.
- Kept non-`codex-package` native components under their component directory name instead of using a legacy destination map.
- Split native staging by component set while sharing one workflow-artifact cache across the invocation.
- Changed workflow artifact download to select target artifacts by name, print sizes/progress, and reuse cached artifacts.
- Removed the implicit `CI=true` default from `build_npm_package.py`; local CI-shaped runs should set that environment explicitly.
- Kept `npm pack` cache/log output in its temporary directory so packing does not write to the user npm cache.
## Verification
- `python3 -m py_compile scripts/stage_npm_packages.py codex-cli/scripts/build_npm_package.py`
- `python3 -m unittest discover -s scripts/codex_package -p "test_*.py"`
- `scripts/stage_npm_packages.py --help`
- `codex-cli/scripts/build_npm_package.py --help`
- Ran the release-shaped staging command from `rust-release.yml` against workflow run https://github.com/openai/codex/actions/runs/26240748758 with `CI=true` set locally to match GitHub Actions:
```sh
CI=true python3 ./scripts/stage_npm_packages.py \
  --release-version 0.133.0 \
  --workflow-url https://github.com/openai/codex/actions/runs/26240748758 \
  --package codex \
  --package codex-responses-api-proxy \
  --package codex-sdk
```
That completed successfully, downloaded only the six target artifacts once, reused the cache for `codex-responses-api-proxy`, and produced all nine npm tarballs. Generated tarballs and staging/artifact temp dirs were cleaned afterward.
## Summary
- make rollout content search prefilter rollout files case-insensitively
- keep the no-ripgrep fallback scan and visible snippet matcher aligned with that behavior
- cover a lowercase `thread/search` query matching mixed-case conversation content
## Why
The rollout-backed `thread/search` path used exact string matching in both its `rg` prefilter and semantic snippet generation. A content result could be missed solely because the query casing did not match the stored conversation text.
## Validation
- `just fmt`
- `cargo test -p codex-app-server thread_search_returns_content_matches`
- `cargo test -p codex-rollout`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `cargo build -p codex-cli`
- launched a local Electron dev instance with the rebuilt CLI binary
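To illustrate why the snippet matcher has to agree with the prefilter, here is a minimal Python sketch of a case-insensitive snippet finder. It is not the Rust code from the change; `find_snippet` and its `context` parameter are hypothetical. On the ripgrep side, case-insensitive matching is available through the `-i` / `--ignore-case` flag, though the description does not say which flags the prefilter actually passes.

```python
import re
from typing import Optional


def find_snippet(text: str, query: str, context: int = 20) -> Optional[str]:
    """Return the first case-insensitive match of `query` with surrounding context."""
    if not query:
        return None
    # re.escape treats the query as a literal string, not a regex pattern.
    match = re.search(re.escape(query), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(match.start() - context, 0)
    end = min(match.end() + context, len(text))
    return text[start:end]
```

If the prefilter ignored case but the snippet matcher did not, a file could pass the prefilter and then yield no visible snippet, which is the mismatch the change avoids.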
## Why
When remote control hits an auth failure such as a revoked or reused refresh token, the websocket loop falls into reconnect backoff. If the user fixes auth while that loop is sleeping, remote control can stay offline until the old retry timer expires because nothing wakes the loop or resets its exhausted auth recovery state.
## What Changed
Added an auth-change watch on `AuthManager` for refresh-relevant cached auth updates. The remote-control websocket loop now subscribes to that signal, resets `UnauthorizedRecovery` and reconnect backoff when auth changes, and retries immediately instead of waiting for the previous delay.
Updated the remote-control transport test to verify that reloading auth with the now-available account id wakes enrollment before the prior retry delay.
## Verification
`cargo test -p codex-app-server-transport remote_control_waits_for_account_id_before_enrolling`
# What
When a normal hook fires inside a thread-spawned subagent, Codex now includes these optional top-level fields in the hook input:
- `agent_id`: the child thread id
- `agent_type`: the subagent role

Root-agent hook inputs omit these fields. `SubagentStart` and `SubagentStop` keep their existing required `agent_id` and `agent_type` fields because those events are inherently subagent-scoped.
This does not change matcher behavior. Tool hooks still match on tool name, compact hooks still match on trigger, and `UserPromptSubmit` still ignores matchers. Only `SubagentStart` and `SubagentStop` match on `agent_type`.
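A hook consumer can treat the two fields as optional and branch on their presence. The sketch below builds example payloads in Python; only `agent_id` and `agent_type` come from the description, while the other keys and all values are placeholders invented for illustration.

```python
from typing import Optional


def build_hook_input(
    base: dict, agent_id: Optional[str] = None, agent_type: Optional[str] = None
) -> dict:
    """Add the subagent fields only when the hook fires inside a subagent."""
    payload = dict(base)
    if agent_id is not None and agent_type is not None:
        payload["agent_id"] = agent_id      # the child thread id
        payload["agent_type"] = agent_type  # the subagent role
    return payload


def fired_in_subagent(hook_input: dict) -> bool:
    """Root-agent hook inputs omit both fields."""
    return "agent_id" in hook_input and "agent_type" in hook_input
```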
…2915)
## Why
Experimental feature toggles and memory settings can update several related config values in one interaction. Keeping those writes local in a remote TUI session is especially dangerous because the UI can diverge from the app-server config while also leaving behind partially stale supporting keys.
This is **[3 of 4]** in a stacked series that moves TUI-owned config mutations onto app-server APIs.
## What changed
- Routed feature flag persistence through app-server batch writes, including the supporting reviewer and permission updates used by guardian approval.
- Routed Windows sandbox mode persistence and legacy Windows feature cleanup through app-server writes.
- Routed memory settings through app-server batch writes and updated the TUI tests to exercise the embedded app-server path.
## Config keys affected
- `features.<feature_key>`
- `profiles.<profile>.features.<feature_key>`
- `approval_policy`
- `sandbox_mode`
- `approvals_reviewer`
- `windows.sandbox`
- `features.experimental_windows_sandbox`
- `features.elevated_windows_sandbox`
- `features.enable_experimental_windows_sandbox`
- Profile-scoped Windows legacy feature variants under `profiles.<profile>.features.*`
- `memories.use_memories`
- `memories.generate_memories`
- Profile-scoped memory variants under `profiles.<profile>.memories.*`
## Suggested manual validation
- Connect the TUI to a remote app server, toggle guardian approval on and off, and confirm the remote config updates `features.guardian_approval`, reviewer state, approval policy, and sandbox mode coherently.
- Toggle a default-false experimental feature at the root level, disable it again, and confirm the key clears instead of lingering as an unnecessary explicit `false`.
- Change memory settings and confirm the remote config updates both memory keys while the running TUI reflects the new state.
- On Windows, switch sandbox mode through the TUI and confirm `windows.sandbox` is updated while the legacy Windows feature keys are cleared.
## Stack
1. [#22913](#22913) `[1 of 4]` primary settings writes
2. [#22914](#22914) `[2 of 4]` app and skill enablement
3. [#22915](#22915) `[3 of 4]` feature and memory toggles
4. [#22916](#22916) `[4 of 4]` startup and onboarding bookkeeping
Thread the plugin root through plugin skill loading so skill interface icons can reference shared plugin assets, such as ../../assets/logo.svg.
## Summary
- Add us-gov-west-1 to the Bedrock Mantle supported region list
- Cover the GovCloud endpoint URL in the existing base_url unit test
## Test
- cargo test -p codex-model-provider
## Summary
The auto-review runtime sync path was assigning a raw `PermissionProfile` into `runtime_permission_profile_override`, whose field now expects `RuntimePermissionProfileOverride`. That broke the TUI Bazel build.
This changes the assignment to store `RuntimePermissionProfileOverride::from_config(&self.config)`, matching the other runtime override paths and preserving the active profile and network metadata with the permission profile.
# Why
Some connector tool input schemas use local JSON Schema references and
definition tables to avoid duplicating large nested shapes. Codex
previously lowered these schemas into the supported subset in a way that
could discard `$ref`-only schema objects and lose the corresponding
definitions, which made non-strict tool registration less faithful than
the original connector schema.
This keeps the existing minimal-lowering policy: Codex still does not
raw-pass through arbitrary JSON Schema, but it now preserves local
reference structure that fits the Responses-compatible subset and prunes
definition entries that cannot be reached by following `$ref`s from the
root schema after sanitization, including refs found transitively inside
other reachable definitions. The pruning matters because Responses
parses definition tables even when entries are unused, so keeping dead
definitions wastes prompt tokens.
# What changed
- Added `$ref`, `$defs`, and legacy `definitions` fields to the tool
`JsonSchema` representation.
- Updated `parse_tool_input_schema` lowering so `$ref`-only schema
objects survive sanitization instead of becoming `{}`.
- Sanitized definition tables recursively and dropped malformed
definition tables so non-strict registration degrades gracefully.
- Added reachability pruning for root definition tables by starting from
refs outside definition tables, then following refs inside reachable
definitions.
- Added JSON Pointer decoding for local definition refs such as
`#/$defs/Foo~1Bar`.
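The reachability pruning and JSON Pointer decoding described above can be sketched in Python. This is an illustration of the technique and not the Rust code in `parse_tool_input_schema`: the function names are hypothetical, only root-level definition tables are pruned, and the input is assumed to be already sanitized (definition tables are dicts).

```python
from typing import Any, Optional, Set, Tuple

DEF_KEYS = ("$defs", "definitions")
Ref = Tuple[str, str]  # (table name, definition name)


def decode_pointer_token(token: str) -> str:
    """Decode one JSON Pointer token (RFC 6901): '~1' -> '/', then '~0' -> '~'."""
    return token.replace("~1", "/").replace("~0", "~")


def parse_local_ref(ref: str) -> Optional[Ref]:
    """Return (table, name) for refs such as '#/$defs/Foo~1Bar', else None."""
    parts = ref.split("/")
    if len(parts) == 3 and parts[0] == "#" and parts[1] in DEF_KEYS:
        return parts[1], decode_pointer_token(parts[2])
    return None


def collect_refs(node: Any, skip_def_tables: bool) -> Set[Ref]:
    """Collect local refs under `node`, optionally skipping its own def tables."""
    found: Set[Ref] = set()
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            target = parse_local_ref(ref)
            if target is not None:
                found.add(target)
        for key, value in node.items():
            if skip_def_tables and key in DEF_KEYS:
                continue
            found |= collect_refs(value, skip_def_tables=False)
    elif isinstance(node, list):
        for item in node:
            found |= collect_refs(item, skip_def_tables=False)
    return found


def prune_unreachable_defs(schema: dict) -> dict:
    """Keep only root definitions reachable from refs outside the def tables."""
    reachable: Set[Ref] = set()
    pending = list(collect_refs(schema, skip_def_tables=True))
    while pending:
        table, name = pending.pop()
        if (table, name) in reachable:
            continue
        definition = schema.get(table, {}).get(name)
        if definition is None:
            continue  # dangling ref: nothing to keep
        reachable.add((table, name))
        # Follow refs found inside a reachable definition (transitive closure).
        pending.extend(collect_refs(definition, skip_def_tables=False))
    pruned = {k: v for k, v in schema.items() if k not in DEF_KEYS}
    for table in DEF_KEYS:
        kept = {n: d for n, d in schema.get(table, {}).items() if (table, n) in reachable}
        if kept:
            pruned[table] = kept
    return pruned
```

Seeding the worklist only from refs outside the definition tables is what makes a definition that is referenced solely by another dead definition get pruned as well.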
# Verification
Ran local golden-schema probes against representative connector schemas
to validate behavior on real generated schemas:
| Golden schema | Before bytes | After bytes | `$defs` before -> after | `$ref` before -> after | Result |
|---|---:|---:|---:|---:|---|
| `google_calendar/create_space` | 7111 | 4526 | 7 -> 7 | 7 -> 7 | all definitions preserved because all are reachable |
| `figma/apply_file_variable_changes` | 4609 | 999 | 8 -> 5 | 8 -> 5 | unused defs pruned after unsupported `oneOf` shapes lower away |
| `snowflake/list_catalog_integrations` | 1380 | 404 | 3 -> 0 | 0 -> 0 | all defs pruned because none are referenced |
| `dropbox/create_shared_link` | 8894 | 1836 | 14 -> 4 | 9 -> 4 | only defs reachable from the root schema after sanitization are retained, including transitively through other retained defs |
Token increase across golden schema due to this change:
<img width="817" height="366" alt="Screenshot 2026-05-19 at 1 47 04 PM"
src="https://github.com/user-attachments/assets/d5c80fe9-da85-41e6-8ac7-a01d1e0b0f71"
/>
## Why Extension tools that need conversation context should be able to read it from the live tool invocation instead of reaching into thread persistence themselves. ## What changed - Add a `ConversationHistory` snapshot to extension `ToolCall`s and populate it from the current raw in-memory response history. - Expose all history items at this boundary so each extension can filter and bound the subset it needs before consuming or forwarding it. - Cover the adapter and registry dispatch paths and update existing extension tests that construct `ToolCall` literals. ## Test plan - `cargo test -p codex-tools` - `cargo test -p codex-extension-api` - `cargo test -p codex-goal-extension` - `cargo test -p codex-memories-extension` - `cargo test -p codex-core passes_turn_fields_to_extension_call` - `cargo test -p codex-core extension_tool_executors_are_model_visible_and_dispatchable`
## Why
The `dev/cc/ref-def` branch preserves richer JSON Schema detail for
connector tools, including `$defs` and nested shapes. That improves
fidelity, but it pushes the largest connector schemas well past the
intended tool-schema budget. This PR adds a best-effort compaction pass
for unusually large tool input schemas so the p99 and max tails stay
small while ordinary schemas are left alone.
## What Changed
- Added best-effort large-schema compaction in
`codex-rs/tools/src/json_schema.rs` after schema sanitization and
definition pruning.
- Compaction runs as a waterfall only while the compact JSON budget
proxy is exceeded:
  1. Strip schema `description` metadata.
  2. Drop root `$defs` / `definitions`.
  3. Collapse deep nested complex schema objects to `{}`.
- Kept top-level argument names and immediate schema shape where
possible.
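The waterfall above can be sketched in Python as follows. This is a hedged illustration, not the code in `codex-rs/tools/src/json_schema.rs`: `BUDGET_CHARS`, `MAX_DEPTH`, and every function name are assumptions, and the real budget proxy and depth cutoff are not stated in the description. The sketch also leaves `$ref` values untouched after dropping root definitions, since the description does not say how dangling refs are handled.

```python
import json
from typing import Any

BUDGET_CHARS = 4000  # hypothetical budget proxy, not taken from the PR
MAX_DEPTH = 2        # hypothetical nesting depth kept before collapsing
NAME_MAPS = ("properties", "$defs", "definitions")  # keys inside are names


def compact_len(schema: Any) -> int:
    """Length of the compact JSON encoding, used as the budget proxy."""
    return len(json.dumps(schema, separators=(",", ":")))


def strip_descriptions(node: Any, in_name_map: bool = False) -> Any:
    """Drop `description` keywords but keep properties that are named that way."""
    if isinstance(node, dict):
        return {
            key: strip_descriptions(value, key in NAME_MAPS and not in_name_map)
            for key, value in node.items()
            if in_name_map or key != "description"
        }
    if isinstance(node, list):
        return [strip_descriptions(item) for item in node]
    return node


def drop_root_defs(schema: dict) -> dict:
    return {k: v for k, v in schema.items() if k not in ("$defs", "definitions")}


def collapse_deep(node: Any, depth: int = 0) -> Any:
    """Replace complex schema objects nested deeper than MAX_DEPTH with {}."""
    if not isinstance(node, dict):
        return node
    if depth > MAX_DEPTH and ("properties" in node or "items" in node):
        return {}
    out = dict(node)
    if isinstance(node.get("properties"), dict):
        out["properties"] = {
            name: collapse_deep(sub, depth + 1)
            for name, sub in node["properties"].items()
        }
    if isinstance(node.get("items"), dict):
        out["items"] = collapse_deep(node["items"], depth + 1)
    return out


def compact_schema(schema: dict, budget: int = BUDGET_CHARS) -> dict:
    """Apply each pass only while the schema is still over budget."""
    for step in (strip_descriptions, drop_root_defs, collapse_deep):
        if compact_len(schema) <= budget:
            break
        schema = step(schema)
    return schema
```

Ordering the passes from least to most destructive means an ordinary schema under the budget is returned unchanged, and only the largest schemas lose nested structure.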
## Corpus Results
Scope: 2,025 schemas under `golden_schemas`, all parsed successfully.
Token count is `o200k_base` over compact JSON from
`parse_tool_input_schema`.
| Percentile | Before `origin/main` `4dbca61e20` | After branch `dev/cc/ref-def` `f9bf071758` | After this PR |
|---|---:|---:|---:|
| p0 | 9 | 9 | 9 |
| p10 | 59 | 63 | 63 |
| p25 | 81 | 86 | 86 |
| p50 | 114 | 127 | 125 |
| p75 | 174 | 205 | 202 |
| p90 | 295 | 335 | 322 |
| p95 | 391 | 526 | 422 |
| p99 | 794 | 1,303 | 689 |
| max | 2,836 | 3,337 | 887 |
After this PR, `0 / 2,025` schemas are over 1k tokens.
### Compaction Savings
These are cumulative waterfall stages over the same corpus. Later passes
only run for schemas that are still over the compact JSON budget proxy.
| Stage | Total tokens | Step savings | Schemas changed by step |
|---|---:|---:|---:|
| No compaction | 391,862 | - | - |
| Strip schema `description` metadata | 350,961 | 40,901 | 66 |
| Drop root `$defs` / `definitions` | 340,683 | 10,278 | 13 |
| Collapse deep complex schemas to `{}` | 335,875 | 4,808 | 6 |
## Summary
- Treat MCP tools with `readOnlyHint: true` as parallel-safe even when `supports_parallel_tool_calls` is unset or `false`.
- Keep server-level `supports_parallel_tool_calls` as an additive override for non-read-only tools.
- Add focused unit coverage for the MCP handler eligibility decision.
- Update RMCP integration coverage to keep the serial baseline on a mutable tool, verify read-only concurrency without server opt-in, and preserve the server opt-in concurrency path separately.
## Testing
- `just fmt`
- `cargo test -p codex-core --lib tools::handlers::mcp::tests::`
- `cargo test -p codex-core --test all stdio_mcp_read_only_tool_calls_run_concurrently_without_server_opt_in`
- `cargo test -p codex-core --test all stdio_mcp_parallel_tool_calls_opt_in_runs_concurrently`
- `cargo test -p codex-rmcp-client`
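The eligibility rule above amounts to a logical OR of the two signals. A minimal Python sketch, assuming both inputs may be absent (`None`); the function and parameter names are hypothetical and do not mirror the Rust handler:

```python
from typing import Optional


def mcp_call_is_parallel_safe(
    read_only_hint: Optional[bool], server_supports_parallel: Optional[bool]
) -> bool:
    """A tool call may run in parallel if either signal opts in."""
    if read_only_hint is True:
        return True  # read-only tools are safe even without server opt-in
    # The server-level flag stays an additive override for mutable tools.
    return server_supports_parallel is True
```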
# Why
Fixes #24529.
Completed hook output in the TUI rendered each `HookOutputEntry` as one ratatui line, so explicit newlines inside hook output were not shown as separate transcript rows. That made multiline `SessionStart.additionalContext` hard to inspect even though the model-facing context path preserved the original text.
# What
- Split completed hook output entries on explicit newlines before rendering them in `codex-rs/tui/src/history_cell/hook_cell.rs`.
- Keep the hook output prefix, such as `hook context:` or `warning:`, on the first physical line only.
- Preserve explicit blank lines and render continuation lines with the hook body indent.
- Add unit coverage for multiline context and warning output, plus a chatwidget snapshot regression for `SessionStart` history output.
# Testing
- `cargo nextest run -p codex-tui completed_hook_multiline hook_completed_before_reveal_renders_completed_without_running_flash`
- `just argument-comment-lint -p codex-tui -- --ignore-rust-version --lib --tests`
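The line-splitting rule above can be sketched in a few lines of Python. This is an illustration and not the ratatui rendering code in `hook_cell.rs`; the function name, the two-space indent, and the choice to render blank lines as empty strings are assumptions.

```python
from typing import List


def render_hook_entry(prefix: str, text: str, indent: str = "  ") -> List[str]:
    """Split hook output on explicit newlines into transcript rows."""
    first, *rest = text.split("\n")
    rows = [f"{prefix} {first}"]  # the prefix appears on the first row only
    for line in rest:
        # Blank lines are preserved; other continuation rows get the body indent.
        rows.append(f"{indent}{line}" if line else "")
    return rows
```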
## Why
Some models need to select their code-execution behavior through model catalog metadata. Models without that metadata must continue to follow the existing `CodeMode` and `CodeModeOnly` feature flags, including when a newer server sends an enum value this client does not recognize.
## What changed
- add optional `ModelInfo.tool_mode` metadata with `direct`, `code_mode`, and `code_mode_only`
- treat omitted and unknown wire values as `None`
- resolve `None` from the existing feature flags
- carry the resolved `ToolMode` directly on `TurnContext`, outside `Config`
- use the resolved value for turn creation, model switches, review turns, tool planning, and code execution
## Coverage
- add protocol coverage for omitted, known, and unknown enum values
- add focused coverage for flag fallback and explicit metadata overriding feature flags
- add core integration coverage that fetches remote model metadata through `/v1/models` and verifies the outbound `/responses` tools for explicit `direct` and `code_mode_only` selectors
## Stack
- followed by #25032
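The resolution order above (explicit metadata first, feature flags as fallback, unknown wire values treated as absent) can be sketched in Python. The enum values come from the description; the function names and the exact mapping from the two feature flags to a mode are assumptions of this sketch.

```python
from enum import Enum
from typing import Optional


class ToolMode(Enum):
    DIRECT = "direct"
    CODE_MODE = "code_mode"
    CODE_MODE_ONLY = "code_mode_only"


def parse_tool_mode(wire_value: Optional[str]) -> Optional[ToolMode]:
    """Omitted and unrecognized wire values both become None."""
    try:
        return ToolMode(wire_value)
    except ValueError:
        return None


def resolve_tool_mode(
    wire_value: Optional[str], code_mode: bool, code_mode_only: bool
) -> ToolMode:
    """Explicit metadata wins; otherwise fall back to the feature flags."""
    explicit = parse_tool_mode(wire_value)
    if explicit is not None:
        return explicit
    # Assumed flag precedence: the "only" flag is the stricter setting.
    if code_mode_only:
        return ToolMode.CODE_MODE_ONLY
    if code_mode:
        return ToolMode.CODE_MODE
    return ToolMode.DIRECT
```

Mapping unknown values to `None` instead of failing keeps an older client working when a newer server introduces a mode it does not know.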
## Why
Standalone `web.run` calls run in the extension, so they need normal web-search progress activity while a request is in flight and durable completed activity after a thread is reloaded.
Follow-up to #23823; uses the extension turn-item emission path added in #24813.
## What changed
- Emit standalone `web.run` start/completion items through the host turn-item emitter, preserving standard client delivery and rollout persistence.
- Include useful completion detail for queries, image queries, and literal-URL `open`/`find` commands.
- Render completed searches as `Searched the web` or `Searched the web for <detail>`, with snapshot coverage for the detail-free case.
- Extend the app-server round-trip test to verify completed search activity is reconstructed by `thread/read` after a fresh-process reload.
## Testing
- `just test -p codex-web-search-extension`
- `just test -p codex-app-server -E "test(standalone_web_search_round_trips_encrypted_output)"`
## Why
`core/src/config/edit.rs` owns the config edit state machine, but it also carried the TOML document helper code inline as a nested module. Moving those helpers into their own file keeps the edit orchestration easier to scan without changing the config persistence behavior.
## What changed
- Moved the existing `document_helpers` module from `core/src/config/edit.rs` into `core/src/config/edit/document_helpers.rs`.
- Added `mod document_helpers;` so the existing `pub(super)` helper API remains available to the rest of `config::edit`.
## Testing
Not run; this is a refactor-only module extraction with no intended behavior change.
Adds failure-only logging for MCP streamable HTTP post_message calls and the underlying reqwest send path, capturing the MCP method/request id, endpoint shape, auth-header presence, timeout/connect classification, and sanitized error source chain without logging headers, bodies, tokens, or full URLs.
Ensures MCP-backed `codex-core` integration tests exercise initialized servers instead of racing server startup. I've been idly investigating a few flakes and the failure modes are much more confusing when a tool call fails because of a failed server start than when the failed server start causes the test to fail directly.
…pipeline (#24972) ## Why The standalone `image_gen.imagegen` extension should behave like native image generation for artifact persistence and UI completion, while returning its save-location guidance as part of the tool result instead of injecting a developer message. ## What Changed - Added an image-generation completion hook for extension tools so core can persist generated images and emit the existing `ImageGeneration` lifecycle events. - Reused core image artifact persistence for extension output and removed extension-local save-path/file-writing logic. - Split shared image persistence from built-in finalization so native image generation keeps its existing developer-message instruction behavior. - Returned the generated image save-location instruction through the extension `FunctionCallOutput`, alongside the generated image input for model follow-up. - Preserved the existing image-generation event shape for current UI and replay compatibility. - Avoided cloning the full generated-image base64 payload when emitting the in-progress image item. - Removed dependencies no longer needed after moving persistence out of the extension crate. ## Fast Follow - Adjust the existing Extension API and add a general `TurnItem` finalization path for re-usability of code ## Validation - Ran `just fmt`. - Ran `just bazel-lock-update`. - Ran `just bazel-lock-check`. - Ran `just test -p codex-tools -p codex-extension-api -p codex-image-generation-extension`. - Ran `just test -p codex-core image_generation_publication_is_finalized_by_core`. - Ran `just test -p codex-core handle_output_item_done_records_image_save_history_message`. - Ran `just fix -p codex-tools -p codex-extension-api -p codex-core -p codex-image-generation-extension`.
## Why

Some Windows users do not have local admin access, so they cannot complete the elevated portion of the Windows sandbox setup when Codex first needs it. This adds an alpha provisioning path that an admin or IT deployment script can run ahead of time for the Codex user.

The intended managed-deployment shape is:

```powershell
codex sandbox setup --elevated --user "$env:COMPUTERNAME\Alice" --codex-home "C:\Users\Alice\.codex"
```

`--elevated` is treated as the requested sandbox setup level, not as proof that the process is elevated. The Windows sandbox setup orchestration still checks that the caller is actually elevated before launching the helper without a UAC prompt.

## What changed

- Added `codex sandbox setup --elevated` with explicit user selection via either `--current-user` or `--user ... --codex-home ...`.
- Moved the CLI implementation into `cli/src/sandbox_setup.rs` instead of growing `cli/src/main.rs`.
- Added a Windows sandbox `ProvisionOnly` helper mode that runs the elevation-required provisioning work without requiring a workspace cwd or runtime sandbox policy.
- Reused the existing elevated helper path for creating/updating sandbox users, configuring firewall/WFP rules, and applying sandbox directory ACLs.
- Persisted `windows.sandbox = "elevated"` into the target `CODEX_HOME` so the desktop app does not show the initial sandbox setup banner after pre-provisioning succeeds.

## Validation

- `cargo fmt -p codex-windows-sandbox -p codex-core -p codex-cli`
- `cargo test -p codex-cli sandbox_setup --target-dir target\sandbox-setup-check`
- `cargo test -p codex-windows-sandbox payload_accepts_provision_only_mode --target-dir target\sandbox-setup-check`
- `git diff --check`
- Manual Windows alpha flow with a standard local user (`Mandi Lavida`): ran the new setup command from an admin shell, verified the target `.codex` contents, sandbox marker/secrets, ACLs, firewall rules, and desktop startup without the sandbox setup banner once experimental network proxy requirements were disabled.

## Notes

This intentionally does not solve later elevated update coordination for IT-managed deployments. The setup command can still apply provisioning updates when run again, but a broader coordination/process story is out of scope for this alpha.
## Summary The desktop app now presents the on-request permissions mode as `Ask for approval` and the manual-review-backed mode as `Approve for me`. The TUI still exposed older/internal labels like `Default` and `Auto-review`, which made the same underlying settings look different across clients. This updates the TUI UX copy to match the app without changing the underlying default behavior. Fresh threads continue to use the existing on-request approval mode, now displayed as `Ask for approval`. The label changes cover `/permissions`, explicit profile permissions menus, status surfaces, config persistence history/error text, and the corresponding TUI snapshots. ### Before <img width="1181" height="119" alt="Screenshot 2026-05-28 at 10 19 47 PM" src="https://github.com/user-attachments/assets/0664846b-b6dd-4931-b4dd-d0af0d42058e" /> <img width="523" height="19" alt="Screenshot 2026-05-28 at 10 21 29 PM" src="https://github.com/user-attachments/assets/7899c33e-b35d-4684-8389-97e357803423" /> ### After <img width="1216" height="117" alt="Screenshot 2026-05-28 at 10 19 32 PM" src="https://github.com/user-attachments/assets/015aab43-ac97-411f-8031-75cdd887251b" /> <img width="567" height="18" alt="Screenshot 2026-05-28 at 10 20 24 PM" src="https://github.com/user-attachments/assets/28b6422c-b823-4298-b221-c83d46d09d66" />
## Why TUI users can archive saved sessions from other surfaces, but there is no in-session command for archiving the active session. Since archiving the active session also exits the TUI, the command should ask for explicit confirmation instead of firing immediately. I'm also working on [a companion PR](#25021) that adds `codex archive` and `codex unarchive` top-level CLI commands. ## What changed - Adds a new `/archive` slash command described as `archive this session and exit`. - Shows a confirmation dialog with `No, don't archive` selected first and `Yes, archive and exit` as the explicit action. - On confirmation, calls the existing `thread/archive` app-server RPC for the active main session and exits after success. - Keeps `/archive` disabled while a task is running and unavailable in side conversations. ## Verification Added focused TUI coverage for the `/archive` confirmation flow, disabled-while-task-running behavior, and the `/ar` slash-command popup snapshot.
## Why The TUI `/rename` confirmation should use the term "session" for consistency.
## Why We recently added `forked_from_thread_id`, which lets us trace where a thread's _context_ comes from, but we also want to understand subagent lineage (for example, which parent thread spawned this subagent, and what kind of subagent it is), which is an orthogonal question. This PR adds `parent_thread_id` and `subagent_kind` to the `x-codex-turn-metadata` header sent to the Responses API. ## What changed - Adds `parent_thread_id` and `subagent_kind` to core-owned `x-codex-turn-metadata`. - Restores persisted `SessionSource` and `ThreadSource` from resumed session metadata so cold-resumed subagent threads keep their lineage on later Responses API requests. - Centralizes parent-thread extraction on `SessionSource` / `SubAgentSource` and reuses it in the Responses client, analytics, agent control, and state parsing paths. - Extends reserved-key, git-enrichment, thread-spawn, and app-server v2 metadata coverage for the new lineage fields. ## Verification - Not run locally per request. - Added focused coverage in `core/src/turn_metadata_tests.rs` and `app-server/tests/suite/v2/client_metadata.rs`.
## Summary - terminate sandbox filesystem helpers when the Tokio child handle is dropped ## Why A sandbox filesystem helper can stall during process startup before reading stdin. If the owning async operation is cancelled or torn down, the spawned helper should not remain running as an orphaned process. Setting `kill_on_drop(true)` gives the filesystem helper the cleanup behavior that Tokio child processes otherwise do not enable by default. This intentionally does not add a timeout. It does not detect or recover an active hung file edit while the owning future remains alive. A more precise startup-health mechanism can be handled separately. ## Validation - `just test -p codex-exec-server` (186 tests passed; benchmark smoke passed) - `just fmt` - `just fix -p codex-exec-server` - `git diff --check`
## Summary Introduce a `CodeModeSession` interface for executing and managing code-mode cells. This moves cell lifecycle, callback delegation, termination, and shutdown behind a session abstraction. The existing in-process implementation remains in use, and the interface leaves room for an external-process implementation later. A Codex session owns one `CodeModeSession`, which in turn owns its running cells and stored code-mode state. Each cell is represented to the caller as a `StartedCell`, exposing its cell ID and initial response. This change also introduces a `CodeModeSessionDelegate` callback interface. A session uses the delegate to invoke nested host tools and emit notifications while a cell is running, allowing the runtime to communicate with its owning Codex session without depending directly on core turn handling. <img width="2121" height="1001" alt="image" src="https://github.com/user-attachments/assets/c349a819-2a59-485c-bda4-2caf68ac4c31" />
## Why `SandboxPolicy` is the legacy compatibility shape, but `codex-thread-store` still exposed it through `StoredThread`, `ThreadMetadataPatch`, and live metadata sync. That kept thread-store consumers tied to the legacy representation and meant richer permission profile data could not round-trip through thread metadata or cold rollout reconciliation. ## What Changed - Replaced thread-store `sandbox_policy` API fields with canonical `PermissionProfile` fields. - Persist new permission-profile metadata as canonical JSON in the existing SQLite metadata slot while continuing to read older legacy sandbox policy values. - Updated local, in-memory, live metadata sync, and rollout extraction paths to propagate `TurnContextItem::permission_profile()`. - Re-materialize legacy permission metadata against the final rollout cwd when rollout-derived metadata replaces stale SQLite summaries. - Updated affected app-server and core test constructors to build `PermissionProfile` values directly. ## Test Plan - `cargo test -p codex-state` - `cargo test -p codex-thread-store` - `cargo test -p codex-app-server summary_from_stored_thread_preserves_millisecond_precision --lib` - `cargo test -p codex-core realtime_context --lib`
## Why The standalone `/v1/alpha/search` request now requires a `model`, but the `web.run` extension currently omits it. Adds `model` to extension `ToolCall` invocation. Follow-up to #23823. ## What changed - Make `SearchRequest.model` required. - Expose the effective per-turn model on extension tool calls and pass it in standalone web-search requests. - Assert the model is forwarded in the app-server round-trip test. ## Testing - `just test -p codex-api -p codex-tools -p codex-web-search-extension -p codex-memories-extension -p codex-goal-extension` - `just test -p codex-core -E 'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'` - `just test -p codex-app-server -E 'test(standalone_web_search_round_trips_encrypted_output)'`
## Summary This adds `environment: issue-triage` to the Codex-calling issue workflow jobs so they can read the GitHub Environment Secret while staying on GitHub-hosted runners for public issue-triggered workflows.
## Summary - preserve macOS `__CF_USER_TEXT_ENCODING` when launching the sandboxed fs helper - keep the fs-helper env narrow; this adds only the CoreFoundation startup var instead of copying the broader MCP stdio baseline - add focused coverage that the helper keeps that var without admitting `HOME` ## Diagnosis The sandboxed fs helper is not launched like a normal child process. Exec-server rebuilds its environment from an allowlist, then calls `env_clear()` before re-execing Codex with `--codex-run-as-fs-helper`. That helper dispatches before the normal Codex startup path and only needs to boot a small Tokio runtime, read one JSON request from stdin, perform the direct filesystem operation, and write one JSON response. The reported macOS hang sampled the helper before Rust main, in CoreFoundation initialization while resolving the default text encoding: `_CFStringGetUserDefaultEncoding -> getpwuid_r -> notify_register_check -> bootstrap_look_up3 -> mach_msg2_trap`. The fs-helper allowlist kept `PATH` and temp vars for runtime needs, but it dropped macOS `__CF_USER_TEXT_ENCODING`. Other Codex subprocess launchers that intentionally build a minimal Unix baseline, such as MCP stdio, already preserve that variable. My read is that stripping `__CF_USER_TEXT_ENCODING` forced this internal helper down CoreFoundation's fallback user-lookup path, and that lookup intermittently wedged on the affected machine before the helper could read stdin or touch the target file. Preserving only this macOS startup variable avoids that fallback without broadening the fs-helper environment to shell-like vars such as `HOME`, `USER`, locale settings, terminal settings, or proxy credentials. Internal Slack thread omitted from the public PR body. ## Validation - `cd codex-rs && just fmt` - `git diff --check`
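As a rough illustration of the narrow allowlist described above, the sketch below filters a parent environment down to the variables the helper keeps. `fs_helper_env` and both constant names are hypothetical, the specific temp variable names are assumptions (the description only says "temp vars"), and the `is_macos` parameter exists only so the behavior can be checked on any platform.

```rust
/// Variables kept on every platform. `PATH` and temp vars come from the
/// description; the exact temp variable names listed here are assumptions.
const BASE_ALLOWLIST: &[&str] = &["PATH", "TMPDIR", "TEMP", "TMP"];

/// CoreFoundation startup variable preserved only on macOS.
const MACOS_ALLOWLIST: &[&str] = &["__CF_USER_TEXT_ENCODING"];

/// Build the helper environment from the parent environment.
/// Anything not on an allowlist (for example `HOME`) is dropped.
pub fn fs_helper_env<I>(parent_env: I, is_macos: bool) -> Vec<(String, String)>
where
    I: IntoIterator<Item = (String, String)>,
{
    parent_env
        .into_iter()
        .filter(|(key, _)| {
            BASE_ALLOWLIST.contains(&key.as_str())
                || (is_macos && MACOS_ALLOWLIST.contains(&key.as_str()))
        })
        .collect()
}
```

A caller would typically pass `std::env::vars()` and `cfg!(target_os = "macos")`, then apply the result with `Command::env_clear()` followed by `Command::envs(...)`.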
## Summary - add Vim normal-mode `s` support to substitute the character under the cursor and enter insert mode - fix Vim normal-mode `o` so opening below the final line moves the cursor onto the new blank line - update keymap config/schema and keymap picker snapshots for the new action ## Validation - `just fmt` - `just write-config-schema` - `just test -p codex-config` - focused `just test -p codex-tui` coverage for the Vim `s` and `o` behavior, keymap conflict handling, and keymap picker snapshots - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml` - `git diff --check` ## Notes A full `just test -p codex-tui` run still has two unrelated Guardian feature-flag failures in this checkout: - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default` - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
Provides starlark syntax highlighting and editor formatting.
## Summary - Keep the original `TOOL_SUGGEST_DISCOVERABLE_PLUGIN_ALLOWLIST` as a fallback seed list, so users with no installed plugins still get initial install suggestions. - Allow additional install suggestions from trusted marketplaces: `openai-curated` and `openai-bundled`. - Require non-fallback, non-configured marketplace candidates to share `.app.json` connector IDs with already installed plugins. - Preserve explicit configured plugin discoverables as an override, while still omitting installed, disabled, and `NOT_AVAILABLE` plugins. ## Context `list_available_plugins_to_install` controls which plugins the model can trigger via `request_plugin_install`. We want a small starter set for empty/new users, but we also want installed workflow plugins to unlock relevant source plugins without maintaining every source plugin ID by hand. This keeps the legacy plugin ID allowlist only as the starter fallback. For everything else, the trusted marketplace is the candidate boundary, and installed app connector overlap is the relevance filter. For example, an installed Sales plugin can make HubSpot and Granola suggestible when those source plugins are in `openai-curated` and share Sales app connector IDs, while an unrelated test-source plugin with an app connector not declared by Sales stays hidden. ## Test Coverage - Empty/no-installed-plugin case: returns the fallback seed plugins from the original allowlist. - Installed-app expansion: returns non-fallback marketplace plugins only when their app connector IDs overlap with an installed plugin. - Sales workflow case: installed Sales declares HubSpot and Granola apps, so `hubspot@openai-curated` and `granola@openai-curated` are returned. - Sales negative case: `test-source@openai-curated` has an app connector not declared by Sales, so it is not returned. - Existing guardrails: installed plugins, disabled suggestions, and `NOT_AVAILABLE` plugins remain omitted; explicit configured discoverables still work as an override. 
## Validation - `just fmt` - `just test -p codex-core plugins::discoverable::tests` - `just test -p codex-core` was attempted earlier, but current `main` / local env failed with unrelated existing failures around missing `test_stdio_server`, CLI/code-mode MCP tool setup, and unified_exec/shell snapshot flakes/timeouts. The touched discoverable tests pass.
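The candidate rules in this description (fallback seed list, trusted marketplaces, connector overlap) might be sketched as below. `Candidate` and `is_suggestible` are hypothetical names, the `name@marketplace` ID format for seed entries is an assumption based on the IDs quoted in the test coverage, and the configured-override and installed/disabled/`NOT_AVAILABLE` filters are deliberately left out.

```rust
use std::collections::HashSet;

/// Marketplaces named in the description as trusted candidate sources.
const TRUSTED_MARKETPLACES: &[&str] = &["openai-curated", "openai-bundled"];

/// Hypothetical view of an uninstalled marketplace plugin.
pub struct Candidate<'a> {
    pub name: &'a str,
    pub marketplace: &'a str,
    /// Connector IDs declared in the plugin's `.app.json`.
    pub connector_ids: &'a [&'a str],
}

/// Decide whether a candidate may be suggested for install.
pub fn is_suggestible(
    candidate: &Candidate<'_>,
    fallback_seed: &[&str],
    installed_connector_ids: &HashSet<&str>,
) -> bool {
    // Seed plugins stay suggestible even when nothing is installed.
    let id = format!("{}@{}", candidate.name, candidate.marketplace);
    if fallback_seed.contains(&id.as_str()) {
        return true;
    }
    // Everything else needs a trusted marketplace plus connector overlap.
    TRUSTED_MARKETPLACES.contains(&candidate.marketplace)
        && candidate
            .connector_ids
            .iter()
            .any(|connector| installed_connector_ids.contains(*connector))
}
```

The trusted marketplace acts as the candidate boundary and connector overlap as the relevance filter, which mirrors the Sales, HubSpot, and Granola example.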
# Why

Managed requirements can already constrain sandbox policy choices, but Windows sandbox implementation selection was still resolved independently from those requirements. That left the TUI able to continue through the unelevated fallback even when an organization wants to require the elevated Windows sandbox implementation.

# What

- Add `[windows].allowed_sandbox_implementations` requirements support for the Windows `elevated` and `unelevated` implementations.
- Apply that allowlist during core config resolution so disallowed configured or feature-selected Windows sandbox implementations fall back to an allowed implementation with the existing requirements warning path.
- Reuse the existing TUI Windows setup prompts to block disallowed unelevated continuation, keep required elevated setup in front of the user, and refuse to persist a TUI-selected Windows sandbox mode that requirements disallow.

# Semantics

| Allowed | Selected | Effective |
| --- | --- | --- |
| `["elevated"]` | `unelevated` / unset | `elevated` |
| `["unelevated"]` | `elevated` / unset | `unelevated` |
| `["elevated", "unelevated"]` | `elevated` | `elevated` |
| `["elevated", "unelevated"]` | `unelevated` | `unelevated` |
| `["elevated", "unelevated"]` | unset | `elevated` |

Availability is handled by interactive setup surfaces after allowlist resolution. If the effective elevated implementation is not ready, elevated-only requirements block on setup. When unelevated is also allowed, the UI may offer the existing unelevated fallback.

## TUI Screens

If elevated setup is not already complete:

```
Your organization requires the default Codex agent sandbox to continue.
Set it up to protect your files and control network access.
Learn more <https://developers.openai.com/codex/windows>

› 1. Set up default sandbox (requires Administrator permissions)
  2. Quit
```

If admin setup fails under `["elevated"]`:

```
Couldn't set up your sandbox with Administrator permissions
Your organization requires the default sandbox before Codex can continue.
Learn more <https://developers.openai.com/codex/windows>

› 1. Try setting up admin sandbox again
  2. Quit
```

# Next Steps

- extend the requirements/readout surface, such as `configRequirements/read`, so clients can inspect the loaded `[windows].allowed_sandbox_implementations` requirement instead of inferring it from Windows setup state
- consider extending `windowsSandbox/readiness` as well
- update the App startup guide, setup flow, and banner surfaces so an elevated-only requirement omits any continue-unelevated escape hatch and blocks startup until a permitted implementation is ready
- preserve the existing unelevated fallback path when requirements allow it, including the `["unelevated"]` case where elevated is disallowed
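The allowlist semantics table in this description could be sketched roughly as follows. `SandboxImpl` and `effective_impl` are hypothetical names, and returning `None` for an empty allowlist is an assumption, since the description does not cover that case.

```rust
/// The two Windows sandbox implementations named in the requirements.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxImpl {
    Elevated,
    Unelevated,
}

/// Resolve the effective implementation from the allowlist and the
/// configured or feature-selected choice (`None` means unset).
pub fn effective_impl(
    allowed: &[SandboxImpl],
    selected: Option<SandboxImpl>,
) -> Option<SandboxImpl> {
    // An allowed explicit selection is kept as-is.
    if let Some(choice) = selected {
        if allowed.contains(&choice) {
            return Some(choice);
        }
    }
    // Otherwise fall back to an allowed implementation, preferring elevated.
    [SandboxImpl::Elevated, SandboxImpl::Unelevated]
        .iter()
        .copied()
        .find(|candidate| allowed.contains(candidate))
}
```

This covers only allowlist resolution; readiness checks and setup prompts happen afterwards, as the description notes.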
## Summary - Use the session-loaded plugin app IDs as the source of connector suggestion candidates. - Remove the redundant plugin reload from `tool_suggest_connector_ids()`. - Add regression coverage for connectors declared by a loaded remote plugin, using the Databricks app case. ## Context Loaded remote plugins can declare app connector IDs in `.app.json`. The session-owned `PluginsManager` already loads those plugins and exposes their effective app IDs. The connector suggestion path was creating a separate `PluginsManager` and recomputing plugin app IDs. That new manager does not share the session manager’s remote installed plugin cache, so app IDs from loaded remote plugins were missing from connector suggestions. ## Fix Pass the already-loaded effective app IDs into connector suggestion generation and use them directly as the plugin-derived connector candidate set. Connector candidates are now built from: - App IDs declared by loaded plugins - Explicitly configured connector discoverables - Existing disabled-suggestion filtering This avoids a second plugin-manager lookup and keeps connector suggestions aligned with the plugins actually loaded for the turn. ## Behavior For example, when a plugin is loaded and its `.app.json` declares data apps, `list_available_plugins_to_install` can now return those data connectors. This does not create plugin suggestions from the plugin itself. Plugin suggestions still come from eligible uninstalled entries in the marketplace catalog and require existing matching/filtering rules. ## Validation - `just fmt` - Added regression coverage for a loaded-plugin connector ID appearing in discoverable tools - Attempted `just test -p codex-core`; the command exited unsuccessfully in the local test environment without useful failure detail captured in the run output
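A minimal sketch of the candidate set described in the Fix section, assuming connector IDs are plain strings. `connector_candidates` is a hypothetical function name, not the one used in the codebase.

```rust
use std::collections::BTreeSet;

/// Combine app IDs from already-loaded plugins with explicitly configured
/// discoverables, then drop anything whose suggestion is disabled.
pub fn connector_candidates(
    loaded_plugin_app_ids: &[&str],
    configured_discoverables: &[&str],
    disabled_suggestions: &[&str],
) -> BTreeSet<String> {
    loaded_plugin_app_ids
        .iter()
        .chain(configured_discoverables)
        .filter(|id| !disabled_suggestions.contains(*id))
        .map(|id| id.to_string())
        .collect()
}
```

Because the loaded app IDs are passed in, no second plugin manager is created, so the result stays aligned with the plugins loaded for the turn.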
## Why Users following the Amazon Bedrock API-key setup can export `AWS_BEARER_TOKEN_BEDROCK` and `AWS_REGION`, but Codex's bearer-token auth path only accepted `model_providers.amazon-bedrock.aws.region`. That made the documented env-based setup fail with a missing-region error even though the standard AWS region environment variable was present. ## What Changed - Updates Bedrock bearer-token region resolution to use `model_providers.amazon-bedrock.aws.region` first, then fall back to `AWS_REGION`, then `AWS_DEFAULT_REGION`. - Updates the missing-region error to list all supported region sources. - Adds focused coverage for config precedence, `AWS_REGION`, `AWS_DEFAULT_REGION`, and the missing-region failure.
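The precedence described above (config value, then `AWS_REGION`, then `AWS_DEFAULT_REGION`) could be sketched as follows. `resolve_bedrock_region` is a hypothetical name, the error text is illustrative rather than the actual message, and treating blank values as missing is an assumption. The environment is passed in as a lookup function so the order can be exercised without touching process state.

```rust
/// Treat blank values as missing (an assumption, not stated in the description).
fn non_empty(value: String) -> Option<String> {
    let trimmed = value.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// Resolve the Bedrock region: config first, then `AWS_REGION`,
/// then `AWS_DEFAULT_REGION`.
pub fn resolve_bedrock_region<F>(config_region: Option<&str>, env: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    config_region
        .map(str::to_string)
        .and_then(non_empty)
        .or_else(|| env("AWS_REGION").and_then(non_empty))
        .or_else(|| env("AWS_DEFAULT_REGION").and_then(non_empty))
        .ok_or_else(|| {
            // Illustrative message listing every supported source.
            "missing Bedrock region: set model_providers.amazon-bedrock.aws.region, \
             AWS_REGION, or AWS_DEFAULT_REGION"
                .to_string()
        })
}
```

In real use the lookup would be something like `|key| std::env::var(key).ok()`.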
## Summary

Experimental flag to allow toggling `request_user_input`:

```
tools.experimental_request_user_input = false
```

## Testing

- [x] Added unit tests
## Problem Saved threads can already be archived through app-server RPCs, but the command line did not expose direct archive or unarchive commands. ## Solution Add `codex archive <thread>` and `codex unarchive <thread>`, resolving UUIDs or exact thread names before calling the existing `thread/archive` and `thread/unarchive` RPCs. The commands support scoped remote flags so callers can target remote app-server endpoints when archiving or unarchiving threads. This also fixes a long-standing bug in `codex resume <thread id>` and `codex fork <thread id>` that I found when testing the new commands. These operations shouldn't be allowed on archived sessions. They now fail with an error that tells the user to run `codex unarchive <thread id>` first. ## Verification Added app-server coverage for rejecting archived thread resume by id and checking that the error includes the matching `codex unarchive <thread id>` command.
## Summary - rename the multi-agent v2 follow-up task tool surface to assign_task - update core tests and spec-plan expectations - keep rollout-trace classification backward-compatible with legacy followup_task ## Tests - just fmt - just test -p codex-core multi_agents_spec::tests::assign_task_tool_requires_message_and_has_no_output_schema - just test -p codex-rollout-trace - just fix -p codex-core - just fix -p codex-rollout-trace Note: a broad just test -p codex-core run was attempted locally, but this sandbox produced unrelated environment failures around sandbox-exec, missing test_stdio_server, and realtime timeouts.
## Description Bedrock currently only supports the implicit `default` service tier for GPT models. This PR strips non-default service tier metadata from Bedrock model catalogs so Codex does not advertise or send unsupported tiers. ## What changed - Normalize both built-in and configured Bedrock catalogs to default-only service tier behavior. - Add regression coverage for built-in and configured Bedrock catalogs. ## Validation - `just fmt` - `just test -p codex-model-provider`
…cks (#25381) ## Summary - Use normal directory loading for plugin install app metadata so install avoids forced directory refresh while still loading metadata on cold cache. - Continue force-refreshing codex_apps tools for auth state. - Add regression coverage that pre-warms the directory cache and asserts install returns cached app metadata without extra directory requests. ## Validation - just fmt - git diff --check - just test -p codex-app-server plugin_install_returns_apps_needing_auth plugin_install_filters_disallowed_apps_needing_auth (blocked locally: cargo-nextest is not installed)
## Why

Closes #25006.

`tui.keymap` currently rejects `F13` even though Codex's terminal event layer can report higher function keys. This prevents users from using common remappings such as Caps Lock to `F13`.

## What Changed

- Define a shared portable upper bound of `F24` for stored TUI keybindings.
- Accept `f13` through `f24` in config normalization and runtime parsing.
- Allow `/keymap` capture to persist `F13` through `F24`.
- Update the unsupported-function-key error and add boundary tests for `F13`, `F24`, and `F25`.

## How to Test

1. Add a binding such as:

   ```toml
   [tui.keymap.global]
   open_transcript = "f13"
   ```

2. Start Codex and press the remapped `F13` key.
3. Confirm Codex loads the config without the previous `F1 through F12` error and the action runs.
4. Open `/keymap`, capture `F13` for an action, and confirm the saved binding is `f13`.
5. As a regression check, try to capture `F25` and confirm Codex reports that only `F1` through `F24` can be stored.

Targeted tests:

- `just test -p codex-config`
- `just test -p codex-tui function_keys`

Full `just test -p codex-tui` completed with 2,752 passing tests, 4 skipped tests, and two unrelated guardian feature-flag failures:

- `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
- `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
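The `F1` through `F24` bound described in this change could be checked with a parser along these lines. `MAX_FUNCTION_KEY` and `parse_function_key` are hypothetical names, the error strings are illustrative, and accepting upper-case input is an assumption about normalization.

```rust
/// Shared upper bound for stored function-key bindings (F24 per the description).
pub const MAX_FUNCTION_KEY: u8 = 24;

/// Parse a binding such as `f13` and return the function-key number.
pub fn parse_function_key(raw: &str) -> Result<u8, String> {
    // Assumption: normalization is case-insensitive and ignores outer whitespace.
    let lower = raw.trim().to_ascii_lowercase();
    let digits = lower
        .strip_prefix('f')
        .filter(|rest| !rest.is_empty() && rest.bytes().all(|b| b.is_ascii_digit()))
        .ok_or_else(|| format!("`{}` is not a function key", raw))?;
    let unsupported = || format!("only F1 through F{} can be stored", MAX_FUNCTION_KEY);
    // Values that overflow `u8` are out of range as well.
    let number: u8 = digits.parse().map_err(|_| unsupported())?;
    if (1..=MAX_FUNCTION_KEY).contains(&number) {
        Ok(number)
    } else {
        Err(unsupported())
    }
}
```

With a single shared constant, config normalization, runtime parsing, and `/keymap` capture can all enforce the same bound.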
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)