[RadixTree][4/N Refactor]: Move available_and_evictable_str to individual radix cache classes #17852

Merged: ispobock merged 4 commits into sgl-project:main from pansicheng:hybrid-radix-cache-simplification on Feb 19, 2026

Conversation

@pansicheng
Collaborator

Motivation

Modifications

The available_and_evictable_str method is now an abstract method in BasePrefixCache that each radix cache implementation must provide, rather than a generic function in common.py with isinstance checks.
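The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the names `BasePrefixCache` and `available_and_evictable_str` come from the description, while `ToyRadixCache`, `ToySWARadixCache`, their fields, and the returned string formats are hypothetical stand-ins.

```python
from abc import ABC, abstractmethod


class BasePrefixCache(ABC):
    """Simplified stand-in for the real base class (assumption: it is an ABC)."""

    @abstractmethod
    def available_and_evictable_str(self) -> str:
        """Return a human-readable summary of available and evictable tokens."""


class ToyRadixCache(BasePrefixCache):
    """Hypothetical full-attention cache; the fields are illustrative only."""

    def __init__(self, available: int, evictable: int) -> None:
        self.available = available
        self.evictable = evictable

    def available_and_evictable_str(self) -> str:
        return f"available={self.available}, evictable={self.evictable}"


class ToySWARadixCache(BasePrefixCache):
    """Hypothetical sliding-window cache that reports two pools separately."""

    def __init__(self, full_available: int, swa_available: int) -> None:
        self.full_available = full_available
        self.swa_available = swa_available

    def available_and_evictable_str(self) -> str:
        return (
            f"full_available={self.full_available}, "
            f"swa_available={self.swa_available}"
        )


def log_cache_state(cache: BasePrefixCache) -> str:
    # The caller no longer needs isinstance checks: each class formats itself.
    return cache.available_and_evictable_str()
```

With this shape, a new cache type only has to implement the method itself. A subclass that omits it cannot be instantiated, so a missing implementation surfaces as a `TypeError` at construction time instead of falling through an `isinstance` chain.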

Accuracy Tests

python3 benchmark/gsm8k/bench_sglang.py --num-shots 8 --num-questions 1319 --parallel 1319

Qwen3-Next-80B-A3B-Instruct mamba
Accuracy: 0.949
Invalid: 0.000
Latency: 26.027 s
Output throughput: 8175.272 token/s

Qwen3-32B full
Accuracy: 0.927
Invalid: 0.000
Latency: 34.917 s
Output throughput: 5637.058 token/s

gpt-oss-120b swa
Accuracy: 0.803
Invalid: 0.043
Latency: 48.075 s
Output throughput: 9391.664 token/s

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

…classes

The available_and_evictable_str method is now an abstract method in BasePrefixCache that each radix cache implementation must provide, rather than a generic function in common.py with isinstance checks.
@pansicheng force-pushed the hybrid-radix-cache-simplification branch from 3c78567 to 8b21325 on January 28, 2026 03:50
@hzh0425
Collaborator

hzh0425 commented Feb 3, 2026

/tag-and-rerun-ci

@hzh0425 self-assigned this on Feb 3, 2026
@github-actions bot added the run-ci label on Feb 3, 2026
def is_tree_cache(self) -> bool:
    return not self.is_chunk_cache()

def available_and_evictable_str(self) -> str:
Collaborator

Could you also test the chunk cache case (radix cache disabled)?

Collaborator Author

Results with --disable-radix-cache:

Qwen3-Next-80B-A3B-Instruct mamba
Accuracy: 0.939
Invalid: 0.000
Latency: 69.834 s
Output throughput: 3050.199 token/s

Qwen3-32B full
Accuracy: 0.927
Invalid: 0.000
Latency: 166.645 s
Output throughput: 1167.150 token/s

gpt-oss-120b swa
Accuracy: 0.800
Invalid: 0.052
Latency: 66.569 s
Output throughput: 6632.504 token/s
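For a quick side-by-side of the two runs, the output throughput figures reported in this thread can be compared directly. This is only an illustrative calculation: the numbers are copied from the results above, the helper name `throughput_ratios` is made up for this sketch, and each configuration was run once, so the ratios are indicative rather than a careful benchmark.

```python
# Output throughput (token/s) copied from the GSM8K results in this thread.
radix_enabled = {
    "Qwen3-Next-80B-A3B-Instruct mamba": 8175.272,
    "Qwen3-32B full": 5637.058,
    "gpt-oss-120b swa": 9391.664,
}
radix_disabled = {
    "Qwen3-Next-80B-A3B-Instruct mamba": 3050.199,
    "Qwen3-32B full": 1167.150,
    "gpt-oss-120b swa": 6632.504,
}


def throughput_ratios(enabled: dict, disabled: dict) -> dict:
    """Ratio of radix-cache throughput to --disable-radix-cache throughput."""
    return {name: enabled[name] / disabled[name] for name in enabled}


ratios = throughput_ratios(radix_enabled, radix_disabled)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

On these single runs the radix-cache configuration is roughly 2.7x, 4.8x, and 1.4x faster in output throughput for the three models, while accuracy stays within about 0.01 in each case.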

@ispobock merged commit 48642d5 into sgl-project:main on Feb 19, 2026
209 of 222 checks passed
5 participants