[BUGFIX] Fix dp size > 1 for qwen3 vl model #17624

Merged
ispobock merged 11 commits into sgl-project:main from zju-stu-lizheng:fix_dp
Jan 30, 2026

Conversation

@zju-stu-lizheng
Contributor

Question

In my local tests, the Qwen3-VL service fails to start when both --mm-enable-dp-encoder and --enable-dp-attention are enabled.

I came across the related PR: #17157
After applying that PR and setting tp = dp, enabling both options works without any issues.

However, in configurations where tp != dp (for example, tp=8 and dp=4), enabling both options leads to precision/accuracy problems.

Environment:
- Model: Qwen3-VL
- Command-line options: --mm-enable-dp-encoder --enable-dp-attention
- Observed: service startup failure (before the PR); precision issues (after the PR) in tp != dp setups

Bug fix

This branch is intended to fix the problem described above.

Many thanks to yizhang2077 for identifying and fixing the mrope_positions padding issue.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@yhyang201
Collaborator

/tag-and-rerun-ci

tp_size = get_tensor_model_parallel_world_size()
from sglang.srt.layers.dp_attention import (
get_attention_tp_group,
get_attention_tp_rank,
Collaborator

Move this import into the file header.

Contributor Author

Moving it into the header causes a bug:
ImportError: cannot import name 'get_global_server_args' from partially initialized module 'sglang.srt.server_args' (most likely due to a circular import)
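The error above is the classic circular-import failure, and deferring the import into the function that needs it is the standard workaround. A minimal, self-contained sketch (module names mod_a/mod_b are illustrative stand-ins, not sglang's real modules):

```python
# Demo: a function-local (deferred) import sidesteps a circular dependency.
# mod_a / mod_b are hypothetical modules created on the fly for illustration.
import sys
import tempfile
import textwrap
from pathlib import Path

tmp = Path(tempfile.mkdtemp())
(tmp / "mod_a.py").write_text(textwrap.dedent("""
    import mod_b  # imported while mod_a is still initializing

    VALUE = 41

    def get_value():
        return VALUE
"""))
(tmp / "mod_b.py").write_text(textwrap.dedent("""
    # A top-level `from mod_a import get_value` here would raise:
    # ImportError: cannot import name 'get_value' from partially
    # initialized module 'mod_a' (most likely due to a circular import)
    def use_a():
        from mod_a import get_value  # deferred: resolved at call time
        return get_value() + 1
"""))
sys.path.insert(0, str(tmp))

import mod_a  # loads cleanly; the cycle is only resolved inside use_a()
print(mod_a.mod_b.use_a())  # prints 42
```

This is why keeping the import inside the function, as the PR does, is the pragmatic choice here until the module dependency graph is untangled.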

self.mrope_positions = self._pad_tensor_to_size(self.mrope_positions, bs)
self.mrope_positions = torch.cat(
[
self.mrope_positions,
Collaborator

This should be compatible with self._pad_tensor_to_size; maybe you could write it like:
self.mrope_positions = self._pad_tensor_to_size(self.mrope_positions.t(), bs).t()

Contributor Author

Yes, it can be refactored as:

self.mrope_positions = self._pad_tensor_to_size(
    self.mrope_positions.transpose(0, 1), num_tokens
).transpose(0, 1)
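The transpose-pad-transpose pattern exists because mrope_positions has shape [3, num_tokens] while the padding helper works along dim 0. A minimal sketch using plain lists (pad_rows is a hypothetical stand-in for ForwardBatch._pad_tensor_to_size, which is assumed to pad along dim 0):

```python
# Sketch: pad the token dimension of a [3, num_tokens] structure by moving
# it to dim 0 first, padding, then moving it back.

def pad_rows(rows, target_len, pad_value=0):
    # Pad along dim 0 (append rows) until there are target_len rows;
    # stands in for the real _pad_tensor_to_size helper.
    pad = [[pad_value] * len(rows[0]) for _ in range(target_len - len(rows))]
    return rows + pad

def transpose(m):
    return [list(col) for col in zip(*m)]

mrope_positions = [[1, 2], [3, 4], [5, 6]]  # shape [3, 2]: t/h/w rows, 2 tokens
padded = transpose(pad_rows(transpose(mrope_positions), 4))
# padded has shape [3, 4]: each of the three component rows grew from 2 to 4 tokens
```

Padding mrope_positions directly (without the transpose) would instead add extra component rows, corrupting the [3, num_tokens] layout, which is the padding issue the discussion above is about.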

@yhyang201
Collaborator

/rerun-failed-ci

@yhyang201
Collaborator

I’m using this PR and running the following command:

python -m sglang.launch_server \
  --model-path Qwen/Qwen3-VL-30B-A3B-Instruct \
  --tp 4 \
  --enable-dp-attention \
  --dp 4 \
  --ep 4 \
  --disable-cuda-graph

However, the server fails to launch. Could you please help check what might be wrong? Thank you!

@zju-stu-lizheng
Contributor Author

Thanks for the report; this is indeed a known issue with the current PR.
After adding support for DP attention, the problem should be resolved.
I will work on this and include the fix in an upcoming commit.

@github-actions github-actions bot added the Multi-modal multi-modal language model label Jan 25, 2026
@yhyang201
Collaborator

I tested the following command with four different DP configurations (by varying --dp and whether --mm-enable-dp-encoder is enabled), and all of them worked fine:

python -m sglang.launch_server \
  --model-path Qwen/Qwen3-VL-30B-A3B-Instruct \
  --tp 4 \
  --enable-dp-attention \
  --ep 4 \
  [--dp 4 | --dp 2 | --dp 2 --mm-enable-dp-encoder | --dp 4 --mm-enable-dp-encoder]

No issues on my side. Many thanks for the fix!

@yizhang2077
Collaborator

/rerun-failed-ci

@yhyang201
Collaborator

While making the vision part compatible with DP attention for kimi-k2.5, I modified the vision.py file, so this PR may now need to be rebased.
I think the approach in this PR is better, so we can revert the vision.py changes in https://github.com/sgl-project/sglang/pull/17789/changes and then apply the changes from the current PR.

@ispobock ispobock merged commit 0c5a81a into sgl-project:main Jan 30, 2026
297 of 327 checks passed