[BUGFIX] Fix dp size > 1 for qwen3 vl model#17624
ispobock merged 11 commits into sgl-project:main
Conversation
Warning: You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

/tag-and-rerun-ci
tp_size = get_tensor_model_parallel_world_size()
from sglang.srt.layers.dp_attention import (
    get_attention_tp_group,
    get_attention_tp_rank,
Moving it into the header causes a bug:
ImportError: cannot import name 'get_global_server_args' from partially initialized module 'sglang.srt.server_args' (most likely due to a circular import)
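The circular-import error above is the standard symptom of two modules that import each other at the top level. A minimal, self-contained sketch of the fix discussed here (deferring the import into the function body) follows; the two throwaway modules `mod_a`/`mod_b` are stand-ins for `sglang.srt.server_args` and `sglang.srt.layers.dp_attention`, not the real code:

```python
import pathlib
import sys
import tempfile
import textwrap

# mod_a imports mod_b at the top, and mod_b needs a name from mod_a.
# A top-level `from mod_a import VALUE` in mod_b would raise the same
# "partially initialized module" ImportError; deferring the import into
# the function body avoids the cycle.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "mod_a.py").write_text(textwrap.dedent("""
    import mod_b  # starts the cycle while mod_a is still initializing
    VALUE = 42
"""))
(tmp / "mod_b.py").write_text(textwrap.dedent("""
    def get_value():
        # Lazy import: resolved at call time, when mod_a is fully loaded.
        import mod_a
        return mod_a.VALUE
"""))
sys.path.insert(0, str(tmp))

import mod_a
print(mod_a.mod_b.get_value())  # 42
```

This is why the PR keeps the `dp_attention` import inside the function instead of hoisting it into the file header.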
self.mrope_positions = self._pad_tensor_to_size(self.mrope_positions, bs)
self.mrope_positions = torch.cat(
    [
        self.mrope_positions,
It should be compatible with self._pad_tensor_to_size; maybe you could write it like:
self.mrope_positions = self._pad_tensor_to_size(self.mrope_positions.t(), bs).t()
Yes, it can be refactored as:
self.mrope_positions = self._pad_tensor_to_size(
    self.mrope_positions.transpose(0, 1), num_tokens
).transpose(0, 1)
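The transpose-pad-transpose trick above works because the padding helper extends dim 0, while mrope positions are stored as (3, num_tokens). A runnable sketch with a stand-in for `_pad_tensor_to_size` (its exact signature in sglang is assumed here):

```python
import torch

def pad_tensor_to_size(t: torch.Tensor, size: int) -> torch.Tensor:
    # Zero-pads dim 0 of `t` up to `size`; stand-in for the
    # ForwardBatch._pad_tensor_to_size helper referenced in the review.
    if t.shape[0] >= size:
        return t
    pad = t.new_zeros(size - t.shape[0], *t.shape[1:])
    return torch.cat([t, pad], dim=0)

# mrope_positions is laid out as (3, num_tokens); the helper pads dim 0,
# so transpose to (num_tokens, 3), pad the token dim, transpose back.
mrope_positions = torch.arange(12).reshape(3, 4)
num_tokens = 6
padded = pad_tensor_to_size(
    mrope_positions.transpose(0, 1), num_tokens
).transpose(0, 1)
print(padded.shape)  # torch.Size([3, 6])
```

The padded columns are zeros and the original four token positions are preserved, which is what the DP-attention padding path expects.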
/rerun-failed-ci
I'm using this PR and running the following command: However, the server fails to launch. Could you please help check what might be wrong? Thank you!
Thanks for the report; this is indeed a known issue with the current PR.
I tested the following command with four different DP configurations. No issues on my side. Many thanks for the fix!
/rerun-failed-ci
While making the vision part compatible with DP attention for kimi-k2.5, I modified the vision.py file, so this PR may now need to be rebased.
Co-authored-by: yizhang2077 <1109276519@qq.com>

Question
In my local tests, the Qwen3-VL service fails to start when both --mm-enable-dp-encoder and --enable-dp-attention are enabled.
I came across the related PR: #17157.
After applying this PR and setting tp = dp, enabling both options works without any issues.
However, when using configurations where tp != dp (for example, tp=8 and dp=4), enabling both options leads to precision/accuracy problems (and sometimes a bug).
Environment:
- Model: Qwen3-VL
- Command-line options: --mm-enable-dp-encoder --enable-dp-attention
- Observed: service startup failure (before the PR); precision issues (after the PR) in tp != dp setups
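For reference, a launch command along these lines reproduces the tp=8, dp=4 setup described above; the model path is a placeholder and the exact checkpoint used in the report is an assumption:

```shell
# Hypothetical reproduction command; --tp/--dp mirror the tp=8, dp=4 case,
# and the model path is a placeholder, not taken from the report.
python -m sglang.launch_server \
  --model-path Qwen/Qwen3-VL-30B-A3B-Instruct \
  --tp 8 \
  --dp 4 \
  --mm-enable-dp-encoder \
  --enable-dp-attention
```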
Bugfix
This branch is intended to fix the problem mentioned in this issue.
Many thanks to yizhang2077 for identifying and fixing the mrope_positions padding issue.