[Performance] Optimize radix cache eviction performance #14339

Merged
stmatengss merged 1 commit into sgl-project:main from YiXR:devel
Feb 3, 2026

@YiXR
Contributor

@YiXR YiXR commented Dec 3, 2025

Motivation

Currently, the RadixCache.evict method calls _collect_leaves() to traverse the entire Radix Tree to find evictable nodes. This operation has a time complexity of O(N), where N is the total number of nodes in the tree.

In high-concurrency scenarios where the GPU memory is fully utilized, evict is triggered frequently. The O(N) traversal causes significant CPU overhead and leads to latency jitter (spikes) during the decoding phase, especially when the Radix Tree is large.
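For illustration, the full-traversal pattern described above can be sketched on a toy tree. This is a minimal sketch, not the SGLang code: `ToyNode`, `add_child`, and `collect_leaves_by_traversal` are hypothetical names, and the evictability rule (a non-root leaf with `lock_ref == 0`) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class ToyNode:
    """Hypothetical stand-in for a radix tree node (not SGLang's node class)."""
    key: str
    parent: Optional["ToyNode"] = None
    children: Dict[str, "ToyNode"] = field(default_factory=dict)
    lock_ref: int = 0


def add_child(parent: ToyNode, key: str) -> ToyNode:
    """Attach a new child under `parent` and return it."""
    child = ToyNode(key, parent=parent)
    parent.children[key] = child
    return child


def collect_leaves_by_traversal(root: ToyNode) -> List[ToyNode]:
    """Walk the whole tree and return unlocked leaves; cost is O(N) per call."""
    leaves: List[ToyNode] = []
    stack = [root]
    while stack:
        node = stack.pop()
        # Assumed rule: a non-root node with no children and no locks is evictable.
        if node is not root and not node.children and node.lock_ref == 0:
            leaves.append(node)
        stack.extend(node.children.values())
    return leaves


root = ToyNode("root")
a = add_child(root, "a")
b = add_child(a, "b")
c = add_child(a, "c")
d = add_child(root, "d")
c.lock_ref = 1  # a locked leaf is in use and must not be evicted

evictable_keys = sorted(n.key for n in collect_leaves_by_traversal(root))
```

Every call visits all five nodes even though only two of them are candidates, which is the cost that grows with tree size.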

Modifications

Introduced self.evictable_leaves:

Replaced the per-eviction search of the tree with an incrementally maintained set of evictable leaves.

Incremental Updates:

Updated insert, _delete_leaf, inc_lock_ref, and dec_lock_ref methods to add/remove nodes from self.evictable_leaves dynamically when their state changes (e.g., reference count drops to 0 or a node becomes a leaf).
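The incremental bookkeeping can be sketched as follows. This is a simplified illustration under stated assumptions, not the merged implementation: `ToyNode` and `ToyRadixCache` are hypothetical, locking here touches a single node only, and the evictability rule (non-root, no children, `lock_ref == 0`) is assumed for the example. Only the method names `insert`, `inc_lock_ref`, `dec_lock_ref`, `_update_leaf_status` and the attribute `evictable_leaves` come from this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass(eq=False)  # eq=False keeps identity-based hashing so nodes fit in a set
class ToyNode:
    """Hypothetical stand-in for a radix tree node."""
    key: str
    parent: Optional["ToyNode"] = None
    children: Dict[str, "ToyNode"] = field(default_factory=dict)
    lock_ref: int = 0


class ToyRadixCache:
    """Toy cache that keeps the set of evictable leaves up to date on each change."""

    def __init__(self) -> None:
        self.root = ToyNode("root")
        self.evictable_leaves: Set[ToyNode] = set()

    def _update_leaf_status(self, node: ToyNode) -> None:
        # Re-derive membership for one node in O(1); the root is never evictable.
        if node is self.root:
            return
        if not node.children and node.lock_ref == 0:
            self.evictable_leaves.add(node)
        else:
            self.evictable_leaves.discard(node)

    def insert(self, parent: ToyNode, key: str) -> ToyNode:
        child = ToyNode(key, parent=parent)
        parent.children[key] = child
        # Structure changed: update the new node, then its parent.
        self._update_leaf_status(child)
        self._update_leaf_status(parent)
        return child

    def delete_leaf(self, node: ToyNode) -> None:
        del node.parent.children[node.key]
        self.evictable_leaves.discard(node)
        # The parent may have just become a leaf.
        self._update_leaf_status(node.parent)

    def inc_lock_ref(self, node: ToyNode) -> None:
        node.lock_ref += 1
        self._update_leaf_status(node)

    def dec_lock_ref(self, node: ToyNode) -> None:
        node.lock_ref -= 1
        self._update_leaf_status(node)


cache = ToyRadixCache()
a = cache.insert(cache.root, "a")
b = cache.insert(a, "b")      # "a" stops being a leaf, "b" becomes one
cache.inc_lock_ref(b)         # locked: no evictable leaves remain
locked_snapshot = set(cache.evictable_leaves)
cache.dec_lock_ref(b)         # unlocked again: "b" is evictable
cache.delete_leaf(b)          # "a" becomes a leaf and turns evictable
```

With this layout, eviction can read candidates directly from `evictable_leaves` instead of walking the tree.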

Benchmarking and Profiling

Radix Cache Test

Test on H20 TP8 Qwen/Qwen3-0.6B

Before optimization: 7 ms per eviction
After optimization: 0.5 ms per eviction
image

HiCache Test

image

Checklist

@XucSh
Collaborator

XucSh commented Dec 3, 2025

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Dec 3, 2025
@YiXR YiXR force-pushed the devel branch 8 times, most recently from 86533f5 to 3a846cf Compare December 4, 2025 05:58
@xiezhq-hermann xiezhq-hermann self-assigned this Dec 4, 2025
@xiezhq-hermann
Collaborator

let's get this rebased after merging this PR: #13334

@stmatengss
Collaborator

/tag-and-rerun-ci


for child in node.children.values():
    if not child.evicted:
        if node in self.evictable_leaves:
Collaborator

Should we remove this check? As long as there is a non-evicted child, the node should be removed from the list.

Contributor Author

When a new leaf node (x) is added to the device pool, both _update_leaf_status(x) and _update_leaf_status(x.parent) should be called, as the parent node needs to update its status based on the child node's state. (In HiCache Insert() func)

Collaborator

I mean, should we remove line 756?

Collaborator

Or alternatively, should we do the leaf status update in a different order: update the node itself first, and then check the parent node?

Contributor Author

@YiXR YiXR Dec 29, 2025

If the parent node was not previously in the list (e.g., when it had existing children and a new child is added), the remove operation will throw an error without this validation.

If we only increment or decrement the lock reference count on a node without changing the tree structure, there is no need to update the parent's status.

I will always update the node itself and then check its parent, i.e. _update_leaf_status(x) + _update_leaf_status(x.parent), but only when the tree structure changes (insert, delete, evict, promote).
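To make the point about the guard concrete, here is a minimal sketch of the failure mode, assuming `evictable_leaves` is a built-in Python `set` as the PR description suggests. The `parent` object is a hypothetical placeholder for a node that was never added to the set.

```python
evictable_leaves = set()
parent = object()  # hypothetical node that was never added to the set

# Unguarded removal: set.remove raises KeyError for a missing element.
try:
    evictable_leaves.remove(parent)
    unguarded_raised = False
except KeyError:
    unguarded_raised = True

# Guarded removal, in the style of the `if node in self.evictable_leaves` check.
if parent in evictable_leaves:
    evictable_leaves.remove(parent)

# set.discard is a no-op for a missing element, so it needs no guard.
evictable_leaves.discard(parent)
```

Either the membership check or `discard` avoids the error when a parent that already had children gains another child.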

@XucSh
Collaborator

XucSh commented Jan 6, 2026

/rerun-failed-ci

@XucSh
Collaborator

XucSh commented Jan 27, 2026

/rerun-failed-ci

Signed-off-by: Xingrui Yi <yixingrui@linux.alibaba.com>
Co-authored-by: Xuchun Shang <xuchun.shang@gmail.com>
@YiXR YiXR reopened this Feb 2, 2026
@XucSh
Collaborator

XucSh commented Feb 2, 2026

/rerun-failed-ci

@XucSh
Collaborator

XucSh commented Feb 3, 2026

/rerun-failed-ci

@stmatengss stmatengss merged commit fd983b0 into sgl-project:main Feb 3, 2026
218 of 279 checks passed