
fix: correct weight loading prefix mapping for Qwen3-VL#18024

Merged
Kangyan-Zhou merged 3 commits into sgl-project:main from Lollipop:fix/qwen3-vl-weight-loading
Feb 2, 2026

Conversation

@Lollipop
Contributor

@Lollipop Lollipop commented Jan 31, 2026

Summary

Fix Qwen3-VL-8B model producing garbage output due to incorrect weight loading.

Fixes #17887

Problem

The weight loading code unconditionally copies embed_tokens.weight to lm_head.weight:

```python
if self.pp_group.is_last_rank and "model.embed_tokens.weight" in name:
    if "lm_head.weight" in params_dict:
        # copies embed_tokens to lm_head unconditionally
        ...
```

This is incorrect for models with tie_word_embeddings=False (like Qwen3-VL-8B), where lm_head has independent weights that should not be overwritten.

| Model | `tie_word_embeddings` | lm_head weights |
| --- | --- | --- |
| Qwen3-VL-2B | True | Shared with embed_tokens |
| Qwen3-VL-8B | False | Independent (should NOT be overwritten) |

Fix

Add a check to only copy when tie_word_embeddings=True:

```python
if (
    self.pp_group.is_last_rank
    and "model.embed_tokens.weight" in name
    and self.config.tie_word_embeddings  # <-- added check
):
    ...
```
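The effect of the guard can be sketched in isolation. This is a minimal, self-contained model of the copy path (hypothetical helper names, plain lists standing in for tensors; not the actual SGLang loader):

```python
# Minimal sketch of the guarded embed_tokens -> lm_head copy.
# Names (Config, load_weight) are illustrative, not SGLang's API.

class Config:
    def __init__(self, tie_word_embeddings: bool):
        self.tie_word_embeddings = tie_word_embeddings

def load_weight(config, params_dict, name, loaded_weight, is_last_rank=True):
    """Store a weight; mirror embed_tokens into lm_head only when tied."""
    params_dict[name] = loaded_weight
    if (
        is_last_rank
        and name == "model.embed_tokens.weight"
        and config.tie_word_embeddings  # the added guard
    ):
        params_dict["lm_head.weight"] = loaded_weight

# Tied checkpoint (e.g. a 2B-style config): lm_head mirrors embed_tokens.
tied = {"lm_head.weight": [0.0]}
load_weight(Config(True), tied, "model.embed_tokens.weight", [1.0, 2.0])
assert tied["lm_head.weight"] == [1.0, 2.0]

# Untied checkpoint (e.g. an 8B-style config): lm_head keeps its own weights.
untied = {"lm_head.weight": [9.0]}
load_weight(Config(False), untied, "model.embed_tokens.weight", [1.0, 2.0])
assert untied["lm_head.weight"] == [9.0]
```

Without the `tie_word_embeddings` check, the second case would silently replace the independent `lm_head` weights, which is exactly the failure mode described above.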


@JustinTong0323
Collaborator

This doesn't seem to be a valid fix; my test still shows garbage output.

@Lollipop Lollipop force-pushed the fix/qwen3-vl-weight-loading branch from 400c549 to ce0e92f Compare February 2, 2026 05:22
liuxiaoming added 2 commits February 2, 2026 13:26
…to lm_head

The weight loading code unconditionally copied embed_tokens.weight to
lm_head.weight, which is incorrect for models with tie_word_embeddings=False
(e.g. Qwen3-VL-8B). This caused garbage output from the 8B model.

Add a check for self.config.tie_word_embeddings to ensure embed_tokens
is only copied to lm_head when they are supposed to share weights.

Fixes sgl-project#17887
@Lollipop Lollipop force-pushed the fix/qwen3-vl-weight-loading branch from ce0e92f to 670aec5 Compare February 2, 2026 05:27
@JustinTong0323
Collaborator

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Feb 2, 2026
@Lollipop
Contributor Author

Lollipop commented Feb 2, 2026

@JustinTong0323 Thanks for testing! The previous fix was incorrect; I've updated the PR with the correct fix now.

Root Cause Clarification

The issue only affects models with tie_word_embeddings=False. Here's the difference:

| Model | `tie_word_embeddings` | Affected |
| --- | --- | --- |
| Qwen3-VL-2B | True | No |
| Qwen3-VL-4B | True | No |
| Qwen3-VL-8B | False | Yes |

For models with tie_word_embeddings=True, embed_tokens and lm_head share the same weights, so copying is correct.

For Qwen3-VL-8B (tie_word_embeddings=False), lm_head has its own independent weights. The unconditional copy overwrites these weights with embed_tokens, causing garbage output.
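To check which path a given checkpoint takes, the flag can be read straight from its `config.json`. A small sketch (the default-to-True fallback mirrors the usual Hugging Face convention and is an assumption here):

```python
# Hypothetical helper: decide whether a checkpoint ties embed_tokens
# and lm_head, given the raw contents of its config.json.
import json

def embeddings_tied(config_json: str) -> bool:
    cfg = json.loads(config_json)
    # HF-style configs commonly default tie_word_embeddings to True
    # when the key is absent (assumption, not verified per-model).
    return cfg.get("tie_word_embeddings", True)

# 8B-style config: independent lm_head.
assert embeddings_tied('{"tie_word_embeddings": false}') is False
# 2B-style config: shared weights.
assert embeddings_tied('{"tie_word_embeddings": true}') is True
```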

Updated Fix

The new fix adds a check for self.config.tie_word_embeddings:

```python
if (
    self.pp_group.is_last_rank
    and "model.embed_tokens.weight" in name
    and self.config.tie_word_embeddings  # <-- only copy when weights are shared
):
    ...
```

Could you please test again with Qwen3-VL-8B specifically? The 2B/4B models should work fine with or without this fix.

Collaborator

@JustinTong0323 JustinTong0323 left a comment


LGTM

@Kangyan-Zhou Kangyan-Zhou merged commit 522e13b into sgl-project:main Feb 2, 2026
40 of 78 checks passed
@Lollipop Lollipop deleted the fix/qwen3-vl-weight-loading branch February 3, 2026 07:12
charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Feb 5, 2026
…18024)

Co-authored-by: liuxiaoming <liuxiaoming@modelbest.cn>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
sfiisf pushed a commit to sfiisf/sglang that referenced this pull request Feb 5, 2026
…18024)

Co-authored-by: liuxiaoming <liuxiaoming@modelbest.cn>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026
…18024)

Co-authored-by: liuxiaoming <liuxiaoming@modelbest.cn>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>


Successfully merging this pull request may close these issues.

[Bug] the model result is wrong when using sglang to serve qwen3-vl-8b-instruct
