sglang on Ascend 910C runs less than half the speed of vllm-ascend #207

@weiyangdaren

Hi, I encountered a significant performance gap when running sglang on Ascend 910C compared to vllm-ascend.

Test Results

| Hardware | Framework | Model | Speed (tokens/sec) |
| --- | --- | --- | --- |
| NVIDIA 4090D | vLLM | Qwen3-8B | ~1055 |
| NVIDIA 4090D | sglang | Qwen3-8B | ~1109 |
| Ascend 910C | vllm-ascend | Qwen3-32B | ~567 |
| Ascend 910C | sglang | Qwen3-32B | ~244 |

On the 4090D, vLLM and sglang perform very similarly (sglang is about 5% faster).
On the Ascend 910C, however, sglang reaches only about 43% of vllm-ascend's throughput, i.e. less than half the speed.
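
For reference, the relative gap can be worked out directly from the numbers in the table above. This is only a minimal sketch of the arithmetic; the dictionary layout and the helper name `sglang_ratio` are illustrative, not part of either framework:

```python
# Throughput figures (tokens/sec) copied from the table above.
results = {
    ("NVIDIA 4090D", "Qwen3-8B"): {"vLLM": 1055, "sglang": 1109},
    ("Ascend 910C", "Qwen3-32B"): {"vllm-ascend": 567, "sglang": 244},
}


def sglang_ratio(frameworks):
    """Return sglang throughput divided by the other framework's throughput."""
    # Hypothetical helper: assumes exactly one non-sglang entry per hardware row.
    baseline = next(v for k, v in frameworks.items() if k != "sglang")
    return frameworks["sglang"] / baseline


ratios = {key: round(sglang_ratio(fw), 2) for key, fw in results.items()}

for (hardware, model), ratio in ratios.items():
    print(f"{hardware} ({model}): sglang / baseline = {ratio:.2f}")
```

A ratio near 1.0 means the two frameworks are on par; a ratio well below 1.0 means sglang is slower on that hardware.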

What factors could be causing this performance difference on the Ascend 910C?

Thank you for your help!

Below are screenshots of the experimental results.

[Images: two screenshots of the benchmark results]
