
[FEAT] Add Anthropic compatible API endpoint #18630

Merged
Fridge003 merged 8 commits into sgl-project:main from JustinTong0323:feat-authropic-api
Feb 21, 2026
Conversation

@JustinTong0323 (Collaborator) commented Feb 11, 2026

Summary

  • Add Anthropic-compatible /v1/messages and /v1/messages/count_tokens endpoints that translate between Anthropic Messages API format and
    SGLang's existing OpenAI-compatible chat completion infrastructure
  • Supports non-streaming, streaming (SSE), tool use, system messages, and all standard Anthropic request parameters
  • Enables tools like Claude Code to use SGLang-served models as drop-in Anthropic API replacements via ANTHROPIC_BASE_URL
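As a usage sketch only (the model name, port, and launch flags below are illustrative assumptions, not taken from this PR), pointing an Anthropic-compatible client such as Claude Code at an SGLang server might look like:

```shell
# Launch SGLang serving any chat model (model path and port are hypothetical)
python -m sglang.launch_server --model-path Qwen/Qwen2.5-7B-Instruct --port 30000

# Point Anthropic-compatible tooling (e.g. Claude Code) at the local server
export ANTHROPIC_BASE_URL=http://localhost:30000
export ANTHROPIC_API_KEY=dummy-key
```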

Details

Architecture: A translation layer (AnthropicServing) that delegates to OpenAIServingChat internally. Anthropic requests are converted to
ChatCompletionRequest, processed through existing infrastructure, and responses are converted back to Anthropic format.
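A minimal sketch of the request-side conversion this architecture describes (the function and field subset are illustrative, not the PR's actual code):

```python
def anthropic_to_openai(req: dict) -> dict:
    """Map core Anthropic Messages fields onto an OpenAI chat-completions
    payload (illustrative subset only)."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message in the list.
    if req.get("system"):
        messages.append({"role": "system", "content": req["system"]})
    messages.extend(req.get("messages", []))
    return {
        "model": req["model"],
        "messages": messages,
        "max_tokens": req["max_tokens"],  # required in Anthropic's API
        "stream": req.get("stream", False),
    }
```

The response path runs the same mapping in reverse, converting the chat-completion result back into Anthropic content blocks.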

New files:

  • python/sglang/srt/entrypoints/anthropic/protocol.py — Pydantic models for Anthropic Messages API
  • python/sglang/srt/entrypoints/anthropic/serving.py — Core handler with request/response conversion and streaming state machine
  • test/registered/openai_server/basic/test_anthropic_server.py — 19 basic API tests
  • test/registered/openai_server/function_call/test_anthropic_tool_use.py — 10 tool use tests
  • test/manual/vlm/test_anthropic_vision.py — visual understanding test

Modified files:

  • python/sglang/srt/entrypoints/http_server.py — Register endpoints and initialize handler

Test plan

  • 19 basic tests passing (non-streaming, streaming, system messages, content blocks, error handling, count tokens)
  • 10 tool use tests passing (tool format, tool_choice auto/any/specific, multi-turn, streaming tool calls, event sequence)
  • Manual integration test with cherry-studio anthropic client
  • Manual integration test with Claude Code via setting ANTHROPIC_BASE_URL

Note: This PR is vibed by Claude Code; I believe it is more familiar with its own API than I am 🤣

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
…odels

This commit introduces a new endpoint for counting tokens in messages compatible with the Anthropic API. It includes the implementation of the `AnthropicCountTokensRequest` and `AnthropicCountTokensResponse` models, as well as the necessary handling logic in the `AnthropicServing` class. Additionally, tests for the new endpoint have been added to ensure proper functionality and validation of token counting behavior.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@gemini-code-assist (Contributor)

Summary of Changes

Hello @JustinTong0323, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds Anthropic API compatibility to the SGLang server, allowing users to interact with SGLang models using the Anthropic Messages API. It introduces new endpoints, implements a translation layer for request/response conversion, and includes thorough testing to ensure proper functionality.

Highlights

  • New Anthropic-compatible API endpoints: Introduces /v1/messages and /v1/messages/count_tokens endpoints, enabling translation between Anthropic Messages API format and SGLang's OpenAI-compatible chat completion infrastructure.
  • Comprehensive Support: Supports non-streaming, streaming (SSE), tool use, system messages, and all standard Anthropic request parameters.
  • Seamless Integration: Allows tools like Claude Code to use SGLang-served models as drop-in Anthropic API replacements via ANTHROPIC_BASE_URL.
  • Translation Layer Architecture: Implements an AnthropicServing translation layer that delegates to OpenAIServingChat internally, converting Anthropic requests to ChatCompletionRequest and responses back to Anthropic format.
  • Extensive Testing: Includes 19 basic API tests and 10 tool use tests, ensuring comprehensive functionality and reliability.



@gemini-code-assist (Contributor) left a comment
Code Review

This pull request introduces Anthropic-compatible API endpoints by creating a translation layer over the existing OpenAI-compatible infrastructure. This is a significant and well-executed feature, covering streaming, tool use, and other standard parameters. The code is well-structured and includes a comprehensive set of tests. My review focuses on improving robustness by using uuid for ID generation, removing some dead code, and suggesting a minor refactoring in the tests to reduce code duplication. Overall, this is a high-quality contribution.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@JustinTong0323 JustinTong0323 changed the title Add Anthropic compatible api endpoint [FEAT] Add Anthropic compatible API endpoint Feb 11, 2026
@jhinpan (Collaborator) left a comment
Comprehensive Review: Anthropic-Compatible API Endpoints (PR #18630)

Thanks for this well-structured contribution! The translation-layer approach of delegating to OpenAIServingChat is clean and avoids duplicating business logic. However, this review identified several issues that need attention before merge, organized by severity.


🔴 Critical Bugs (5)

C1. Image URL conversion is broken (serving.py:128)
Raw base64 data is passed directly as a URL. OpenAI format expects a data URI: data:<media_type>;base64,<data>. All image inputs will silently fail.

# Current (broken):
"url": block.source.get("data", ""),
# Fix:
media_type = block.source.get("media_type", "image/png")
data = block.source.get("data", "")
"url": f"data:{media_type};base64,{data}",

C2. tool_result uses id instead of tool_use_id (protocol.py:40)
The Anthropic spec defines tool_use_id for tool_result blocks, but AnthropicContentBlock only has id. SDK clients sending spec-compliant tool_use_id will have it silently ignored, breaking tool-use round-trips. The serving.py code at line 151 reads block.id which won't be populated from tool_use_id in the JSON payload.
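A sketch of a spec-compliant fix for C2 (names are illustrative; the fallback to id preserves compatibility with clients sending the non-standard key):

```python
def tool_result_to_openai(block: dict) -> dict:
    """Convert an Anthropic tool_result block to an OpenAI tool message.
    Reads the spec-compliant 'tool_use_id' field first, falling back to
    'id' for non-standard clients."""
    tool_call_id = block.get("tool_use_id") or block.get("id")
    if tool_call_id is None:
        raise ValueError("tool_result block is missing tool_use_id")
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": block.get("content", ""),
    }
```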

C3. tool_result.content with list content produces garbled output (serving.py:147)
When block.content is a list[dict] (Anthropic allows content blocks inside tool_result), str() produces Python repr like [{'type': 'text', 'text': '...'}] instead of extracting text. Should iterate the list and concatenate text block values.
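A sketch of the text-extraction fix for C3 (helper name is illustrative):

```python
def flatten_tool_result_content(content) -> str:
    """Anthropic allows tool_result content to be a plain string or a list
    of content blocks; join the text blocks instead of str()-ing the list,
    which would emit a Python repr."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return "\n".join(
            b.get("text", "") for b in content if b.get("type") == "text"
        )
    return str(content)
```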

C4. Non-streaming response id uses OpenAI chatcmpl-* format (serving.py:594)
id=response.id passes through the OpenAI ID. Anthropic IDs must be msg_*. Streaming correctly generates msg_* IDs (line 361), but non-streaming doesn't. Fix: use f"msg_{uuid.uuid4().hex}" as in the streaming path.

C5. Missing thinking content block type and request parameter (protocol.py:35, 94-108)
Extended thinking is a major Anthropic API feature. The type literal only allows text|image|tool_use|tool_result, missing thinking and redacted_thinking. No thinking request parameter exists. While not strictly required for an MVP, SDK clients using thinking blocks will get validation errors — this should at least be documented as a known limitation.


🟠 Important Issues (8)

I1. stop_sequence stop_reason is never emitted (serving.py:46-50)
Both natural EOS and stop-sequence stops map to "end_turn". The code doesn't check matched_stop from the OpenAI response. Clients relying on stop_reason == "stop_sequence" will never see it.
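A sketch of the missing mapping for I1, checking matched_stop to distinguish the two cases (function name and signature are illustrative):

```python
def map_stop_reason(finish_reason: str, matched_stop=None) -> str:
    """Map an OpenAI finish_reason (plus matched_stop, set when a
    user-supplied stop sequence fired) to an Anthropic stop_reason."""
    if finish_reason == "length":
        return "max_tokens"
    if finish_reason == "tool_calls":
        return "tool_use"
    if finish_reason == "stop" and matched_stop is not None:
        return "stop_sequence"
    return "end_turn"
```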

I2. Streaming errors are silently swallowed (serving.py:410-414)
If the OpenAI stream emits an error, parsing as ChatCompletionStreamResponse fails and the continue silently skips it. The stream ends without an error event or message_stop. Should emit an Anthropic error event.

I3. Empty responses produce no content blocks (streaming state machine)
If the model emits only a finish_reason with no text/tool deltas, the stream emits message_start → message_delta → message_stop with zero content blocks, violating the Anthropic event schema which expects at least one content block.

I4. Missing "none" tool_choice type (protocol.py:74)
The Anthropic spec defines "none" to disallow all tool use. Only auto|any|tool are accepted.

I5. Missing disable_parallel_tool_use on AnthropicToolChoice
This is a commonly-used Anthropic API field with no equivalent. Should be documented as unsupported at minimum.

I6. System text blocks concatenated without separator (serving.py:105)
["Hello", "World"] becomes "HelloWorld". Should use "\n" as separator.

I7. Tight coupling to OpenAIServingChat private methods (serving.py:248, 260, 269, 351, 652)
Five underscore-prefixed private methods are called. Any internal refactor of OpenAIServingChat will silently break this. Consider making these methods a stable internal API (e.g., remove underscore prefix and document them) or add a comment acknowledging the coupling.

I8. Exception details exposed in error responses (serving.py:77, 277, 323, 669)
message=str(e) can leak internal details (stack frames, module paths) to clients. Use generic messages for 500 errors; only expose specifics for 400 errors.
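A sketch of the suggested sanitization for I8 (error-type strings follow Anthropic's published error taxonomy; the helper itself is illustrative):

```python
def to_anthropic_error(exc: Exception, status: int) -> dict:
    """Build an Anthropic-style error body; expose exception details only
    for client (4xx) errors and return a generic message for 5xx so
    internals are not leaked."""
    if 400 <= status < 500:
        err_type = "invalid_request_error"
        message = str(exc)
    else:
        err_type = "api_error"
        message = "Internal server error"
    return {"type": "error", "error": {"type": err_type, "message": message}}
```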


🟡 Minor Issues (7)

M1. x-api-key header is never validated — Anthropic clients using only x-api-key (the standard Anthropic auth pattern) without Authorization: Bearer will be rejected. Tests misleadingly send both headers.

M2. AnthropicUsage is missing cache_creation_input_tokens and cache_read_input_tokens in the message_delta usage tracking (they're defined in the model but never populated from the OpenAI response).

M3. response.choices[0] at serving.py:564 can IndexError on empty choices (defensive check recommended).

M4. AnthropicError.type should be a Literal enum of known error types (invalid_request_error, authentication_error, permission_error, not_found_error, rate_limit_error, api_error, overloaded_error).

M5. input_schema validator mutates input (adds type: "object") rather than rejecting invalid schemas.

M6. message_delta usage includes both input_tokens and output_tokens — Anthropic spec says message_delta should only have output_tokens (input_tokens goes in message_start).

M7. No ping event generation — useful for keep-alive on long-running streams.


🟢 Test Coverage Gaps

  1. No streaming error test — no test for error events during streaming
  2. No auth failure test — wrong/missing API key not tested
  3. No tool_result with list content test — only string content tested
  4. No stop_sequence stop_reason assertion — test_stop_sequences only checks status 200
  5. No concurrent request test
  6. _parse_sse_events silently swallows parse errors — could mask test failures
  7. Several tool use tests have conditional assertions (e.g. if len(tool_use_starts) > 0:) that pass even if no tool use occurred

✅ What's Done Well

  • Clean architecture — Translation-layer pattern is correct; no duplicated business logic
  • SSE event format — Correct event: <type>\ndata: <json>\n\n formatting
  • Content block lifecycle — start→deltas→stop ordering is mostly correct for both text and tool_use
  • tool_choice mapping — auto→auto, any→required, tool→function is correct
  • stream_options injection — Properly enables usage tracking for streaming
  • Streaming arg reconstruction test — test_tool_use_streaming_args_parsing is excellent
  • Comparative token count test — Smart approach that's model-independent
  • No route conflicts — /v1/messages and /v1/messages/count_tokens are clean additions
  • Test isolation — Each test class manages its own server lifecycle
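The event: <type> / data: <json> framing praised above can be sketched as (helper name is illustrative):

```python
import json

def sse_event(event_type: str, payload: dict) -> str:
    """Serialize one Anthropic-style SSE event: a named event line,
    a JSON data line, and a blank-line terminator."""
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"
```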

Recommendation

Request changes. The 5 critical bugs (especially C1-C4) will produce incorrect behavior for common API usage patterns. C1 breaks all image inputs. C2 breaks spec-compliant tool use round-trips. C3 produces garbled tool results. C4 violates the ID format contract. These should be fixed before merge. The important issues (I1-I3 especially) should also be addressed to prevent silent failures in production.

JustinTong0323 and others added 2 commits February 12, 2026 08:16
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@JustinTong0323 (Collaborator, Author)

/tag-and-rerun-ci

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@github-actions github-actions bot added the Multi-modal multi-modal language model label Feb 16, 2026
JustinTong0323 and others added 2 commits February 16, 2026 01:36
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@JustinTong0323 (Collaborator, Author)

/rerun-failed-ci

@JustinTong0323 (Collaborator, Author)

/rerun-failed-ci

@Fridge003 Fridge003 merged commit cc45167 into sgl-project:main Feb 21, 2026
217 of 235 checks passed

Labels

high priority, Multi-modal (multi-modal language model), run-ci
