[BUGFIX] Fix dp size > 1 for qwen3 vl model #17624

Merged
ispobock merged 11 commits into sgl-project:main from zju-stu-lizheng:fix_dp
Jan 30, 2026

Conversation

@zju-stu-lizheng
Contributor

Question

In my local tests, the Qwen3-VL service fails to start when both --mm-enable-dp-encoder and --enable-dp-attention are enabled.

I came across the related PR: #17157
After applying that PR and setting tp = dp, enabling both options works without any issues.

However, in configurations where tp != dp (for example, tp=8 and dp=4), enabling both options leads to precision/accuracy problems.

Environment:
- Model: Qwen3-VL
- Command-line options: --mm-enable-dp-encoder --enable-dp-attention
- Observed: service startup failure (before the PR); precision issues (after the PR) in tp != dp setups

Bug fix

This branch is intended to fix the problem described above.

Many thanks to yizhang2077 for identifying and fixing the mrope_positions padding issue.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@yhyang201
Collaborator

/tag-and-rerun-ci

tp_size = get_tensor_model_parallel_world_size()
from sglang.srt.layers.dp_attention import (
get_attention_tp_group,
get_attention_tp_rank,
Collaborator

Move this import into the file header.

Contributor Author

Moving it into the header causes a bug:
ImportError: cannot import name 'get_global_server_args' from partially initialized module 'sglang.srt.server_args' (most likely due to a circular import)
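The error above is the classic circular-import failure, and deferring the import into the function that needs it is the standard workaround. A minimal, self-contained sketch (module names mod_a/mod_b are illustrative stand-ins, not sglang's real modules):

```python
# Demo: a function-local (deferred) import sidesteps a circular dependency.
# mod_a / mod_b are hypothetical modules created on the fly for illustration.
import sys
import tempfile
import textwrap
from pathlib import Path

tmp = Path(tempfile.mkdtemp())
(tmp / "mod_a.py").write_text(textwrap.dedent("""
    import mod_b  # imported while mod_a is still initializing

    VALUE = 41

    def get_value():
        return VALUE
"""))
(tmp / "mod_b.py").write_text(textwrap.dedent("""
    # A top-level `from mod_a import get_value` here would raise:
    # ImportError: cannot import name 'get_value' from partially
    # initialized module 'mod_a' (most likely due to a circular import)
    def use_a():
        from mod_a import get_value  # deferred: resolved at call time
        return get_value() + 1
"""))
sys.path.insert(0, str(tmp))

import mod_a  # loads cleanly; the cycle is only resolved inside use_a()
print(mod_a.mod_b.use_a())  # prints 42
```

This is why keeping the import inside the function, as the PR does, is the pragmatic choice here until the module dependency graph is untangled.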

self.mrope_positions = self._pad_tensor_to_size(self.mrope_positions, bs)
self.mrope_positions = torch.cat(
[
self.mrope_positions,
Collaborator

This should be compatible with self._pad_tensor_to_size; maybe you could write it like:
self.mrope_positions = self._pad_tensor_to_size(self.mrope_positions.t(), bs).t()

Contributor Author

Yes, it can be refactored as:

self.mrope_positions = self._pad_tensor_to_size(
    self.mrope_positions.transpose(0, 1), num_tokens
).transpose(0, 1)
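The transpose-pad-transpose pattern exists because mrope_positions has shape [3, num_tokens] while the padding helper works along dim 0. A minimal sketch using plain lists (pad_rows is a hypothetical stand-in for ForwardBatch._pad_tensor_to_size, which is assumed to pad along dim 0):

```python
# Sketch: pad the token dimension of a [3, num_tokens] structure by moving
# it to dim 0 first, padding, then moving it back.

def pad_rows(rows, target_len, pad_value=0):
    # Pad along dim 0 (append rows) until there are target_len rows;
    # stands in for the real _pad_tensor_to_size helper.
    pad = [[pad_value] * len(rows[0]) for _ in range(target_len - len(rows))]
    return rows + pad

def transpose(m):
    return [list(col) for col in zip(*m)]

mrope_positions = [[1, 2], [3, 4], [5, 6]]  # shape [3, 2]: t/h/w rows, 2 tokens
padded = transpose(pad_rows(transpose(mrope_positions), 4))
# padded has shape [3, 4]: each of the three component rows grew from 2 to 4 tokens
```

Padding mrope_positions directly (without the transpose) would instead add extra component rows, corrupting the [3, num_tokens] layout, which is the padding issue the discussion above is about.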

@yhyang201
Collaborator

/rerun-failed-ci

@yhyang201
Collaborator

I’m using this PR and running the following command:

python -m sglang.launch_server \
  --model-path Qwen/Qwen3-VL-30B-A3B-Instruct \
  --tp 4 \
  --enable-dp-attention \
  --dp 4 \
  --ep 4 \
  --disable-cuda-graph

However, the server fails to launch. Could you please help check what might be wrong? Thank you!

@zju-stu-lizheng
Contributor Author

Thanks for the report; this is indeed a known issue with the current PR.
After adding support for DP attention, the problem should be resolved.
I will work on this and include the fix in an upcoming commit.

@github-actions github-actions bot added the Multi-modal multi-modal language model label Jan 25, 2026
@yhyang201
Collaborator

I tested the following command with four different DP configurations (by varying --dp and whether --mm-enable-dp-encoder is enabled), and all of them worked fine:

python -m sglang.launch_server \
  --model-path Qwen/Qwen3-VL-30B-A3B-Instruct \
  --tp 4 \
  --enable-dp-attention \
  --ep 4 \
  [--dp 4 | --dp 2 | --dp 2 --mm-enable-dp-encoder | --dp 4 --mm-enable-dp-encoder]

No issues on my side. Many thanks for the fix!

@yizhang2077
Collaborator

/rerun-failed-ci

@yhyang201
Collaborator

While making the vision part compatible with DP attention for kimi-k2.5, I modified the vision.py file, so this PR may now need to be rebased.
I think the approach in this PR is better, so we can revert the vision.py changes in https://github.com/sgl-project/sglang/pull/17789/changes and then apply the changes from the current PR.

@ispobock ispobock merged commit 0c5a81a into sgl-project:main Jan 30, 2026
297 of 327 checks passed