
Support for Codex CLI by skipping unsupported Responses tools #23041

Merged: pwilkin merged 3 commits into ggml-org:master from SidShaytay:fix-responses-non-function-tools on May 15, 2026

Conversation

@SidShaytay
Contributor

Overview

This enables support for Codex CLI, which now uses the Responses API. As per https://platform.openai.com/docs/guides/tools?api-mode=responses, a tool's type can be more than just function: file_search, web_search, mcp, image_generation, namespace, etc. llama.cpp can't support every type, but instead of failing outright, we pass only the ones we can support to the backend. The patch is intentionally minimal, as there isn't a full implementation of Responses in llama.cpp as far as I can tell. It merely makes the compatibility shim (Responses <-> Chat Completions) less brittle.
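The filtering described above might look roughly like the sketch below. It is not the code from this PR: the `ResponsesTool` and `ChatTool` structs and the `keep_function_tools` helper are hypothetical stand-ins (the server itself works on JSON objects), and the sketch assumes that `function` is the only tool type forwarded. It also shows the shape change between the two APIs: Responses function tools carry `name`, `description`, and `parameters` at the top level, while Chat Completions nests them under `function`.

```cpp
#include <string>
#include <vector>

// Hypothetical flat view of one entry in a Responses `tools` array.
struct ResponsesTool {
    std::string type;         // e.g. "function", "web_search", "mcp"
    std::string name;         // only meaningful for function tools
    std::string description;
    std::string parameters;   // JSON schema kept as raw text in this sketch
};

// Hypothetical Chat Completions tool: function fields are nested.
struct ChatTool {
    std::string type;
    struct Function {
        std::string name;
        std::string description;
        std::string parameters;
    } function;
};

// Keep only function tools and convert them to the nested shape.
// Every other tool type is dropped instead of aborting the request.
std::vector<ChatTool> keep_function_tools(const std::vector<ResponsesTool> & tools) {
    std::vector<ChatTool> out;
    for (const auto & tool : tools) {
        if (tool.type != "function") {
            continue; // unsupported type: skip it
        }
        out.push_back(ChatTool{tool.type, {tool.name, tool.description, tool.parameters}});
    }
    return out;
}
```

With a request that lists one function tool plus `web_search` and `mcp`, only the function tool would reach the backend.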

Issue faced by users: #20156

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: Codex 5.5 (API) medium was used to rapidly query and understand the high-level flow and to cross-verify Responses API vs. Chat Completions API handling within llama.cpp. The unit tests were written entirely by Codex, with guidance to cover both positive and negative cases.

Contributor

@aldehir aldehir left a comment

I think this is better than the other PR, which tries to add support for everything.

OpenAI is constantly changing Codex; it is infeasible to expect llama.cpp to maintain feature parity.

Comment thread tools/server/server-chat.cpp
@pwilkin
Member

pwilkin commented May 14, 2026

Agreed with @aldehir, we should be selective about what we add. We might be able to add more later once we figure out the shape of the tooling protocols.

@SidShaytay SidShaytay requested review from aldehir May 14, 2026 17:04
@SidShaytay
Contributor Author

SidShaytay commented May 14, 2026

To clarify, the goal was never to add the 'full functionality' of the Responses API here. That's a moving target, as OpenAI continually adds server-hosted tools (web_search, code_interpreter, even mcp, which is handled on OpenAI's side, not on your client machine, etc.). Plus it's a major design change, better served by a setup where llama-server handles just the API/handshake with clients while offloading the various tools invoked within the Responses API to another service (say, a separately maintained tools container). In any case, complete Responses API functionality is outside the scope of this PR.

With that out of the way, this PR's intent is narrowly to make llama.cpp more resilient to Responses API clients like Codex. On @aldehir's suggestions:

  1. emitting warnings when unsupported tools are discovered
    => I'm aligned; it's added with minimal code

  2. continuing to throw for the Codex + gpt-oss combo
    => It's now added to the PR because I'd like to get this done and move on, BUT contrary to @aldehir's comment, I am not seeing Codex 0.130.0 advertise apply_patch at my end, even with codex exec + the gpt-oss-20b model. The apply_patch string is found inside the Codex v0.130 Rust binary, but its trigger point is opaque. I slightly favor not adding any legacy support in fresh code when the target (Codex) itself appears to have moved on. LMK; I can revert just this bit to how it was handled in the original PR (no special treatment for apply_patch).
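The first suggestion (warn when an unsupported tool is skipped) can be sketched as below. This is an illustration under assumptions, not the merged code: `ToolEntry`, `ToolFilterResult`, `filter_tools_with_warnings`, and the warning text are all hypothetical, and a real server would send the messages to its own logger rather than return them.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal view of a Responses tool: its type and, for functions, its name.
struct ToolEntry {
    std::string type;
    std::string name;
};

// Hypothetical result: names of forwarded function tools plus one warning per skipped tool.
struct ToolFilterResult {
    std::vector<std::string> kept_names;
    std::vector<std::string> warnings;
};

ToolFilterResult filter_tools_with_warnings(const std::vector<ToolEntry> & tools) {
    ToolFilterResult result;
    for (const auto & tool : tools) {
        if (tool.type == "function") {
            result.kept_names.push_back(tool.name);
        } else {
            // Record the skip so the operator can see why a tool never reached the model.
            result.warnings.push_back("skipping unsupported Responses tool type: " + tool.type);
        }
    }
    return result;
}
```

Returning the warnings rather than printing them keeps the sketch easy to check in a unit test, in the spirit of the positive and negative test cases mentioned in the PR description.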

Action: Reviewers to review

CC: @pwilkin

@aldehir
Contributor

aldehir commented May 14, 2026

Yes, it does seem that was changed in Codex. Previously it would define apply_patch as a freeform tool for gpt-oss-120b.

Strip the logic out; I apologize for the misdirection. Everything else looks good.

@SidShaytay force-pushed the fix-responses-non-function-tools branch from de6562f to 28b7457 May 14, 2026 22:26
@SidShaytay
Contributor Author

@aldehir - no worries, all good. I've reverted the gpt-oss apply_patch special handling (and its two tests). I also rebased onto the latest master, so it should merge cleanly.

If all looks good, proceed?

@pwilkin
Member

pwilkin commented May 15, 2026

@SidShaytay no, it's fine, we were referring to the other huge Responses API PR :)

@pwilkin merged commit 91e84fe into ggml-org:master May 15, 2026
50 checks passed
xxmustafacooTR pushed a commit to xxPlayground/llama-cpp-turboquant that referenced this pull request May 15, 2026
…rg#23041)

* Support for Codex CLI by skipping unsupported Responses tools

* Warn on skipped Responses tools and preserve gpt-oss apply_patch rejection

* Revert gpt-oss apply_patch special handling

Labels

examples, server, testing
