
convert float16 weight to bfloat16 for FP8 models#4276

Merged
lvhan028 merged 2 commits into InternLM:main from lvhan028:half-fp8-workround
Jan 15, 2026

Conversation

@lvhan028
Collaborator

fix #4261

In the model Qwen/Qwen3-4B-Instruct-2507-FP8, some parameters, such as "*.weight_scale_inv", are stored in half precision (float16), but the turbomind FP8 kernel is only compatible with bfloat16.
This PR implements a temporary workaround by converting half-precision weights to bfloat16.
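The workaround boils down to a plain dtype cast. A minimal sketch of what such a cast looks like in PyTorch (the tensor values here are illustrative, not taken from the model): bfloat16 keeps float32's 8-bit exponent but has only 7 mantissa bits, so the cast preserves the dynamic range of the scales while rounding their low-order bits.

```python
import torch

# Illustrative half-precision scale values (not from the actual checkpoint).
scale = torch.tensor([0.0078125, 1.5, 3.0e-5], dtype=torch.float16)

# The workaround: cast float16 -> bfloat16 so the turbomind FP8 kernel accepts it.
converted = scale.to(dtype=torch.bfloat16)

# Values survive the cast up to bfloat16's reduced mantissa precision.
print(converted.dtype)  # torch.bfloat16
```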

Contributor

Copilot AI left a comment


Pull request overview

This PR addresses issue #4261 by adding support for converting float16 weights to bfloat16 format in FP8 models, specifically for the Qwen/Qwen3-4B-Instruct-2507-FP8 model where parameters like *.weight_scale_inv are stored in half precision but require bfloat16 for compatibility with turbomind FP8 kernels.

Changes:

  • Added float16 to bfloat16 conversion in the process_fp8 function


return x.view(dtype=torch.uint8)
elif kind != 'weight_scale_inv' and x.dtype == torch.float:
return x.to(dtype=torch.bfloat16)
elif x.dtype == torch.float16:

Copilot AI Jan 15, 2026


The new float16 to bfloat16 conversion does not respect the 'weight_scale_inv' exclusion that exists for float32 tensors (line 65). This inconsistency means that weight_scale_inv parameters will be converted from float16 to bfloat16, but not from float32 to bfloat16. Consider whether this condition should also check kind != 'weight_scale_inv' to maintain consistency with the existing logic.

Suggested change
elif x.dtype == torch.float16:
elif kind != 'weight_scale_inv' and x.dtype == torch.float16:
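To make the branch ordering concrete, here is a hypothetical reconstruction of the dtype-normalization logic the excerpt comes from (the function name, parameter names, and surrounding structure are assumptions, not the actual `process_fp8` source). The comments mark the branch this PR adds and the inconsistency the review points out:

```python
import torch

def convert_dtype(x: torch.Tensor, kind: str) -> torch.Tensor:
    # Hypothetical sketch of the dtype handling discussed in this review.
    # Guard the FP8 dtype lookup so the sketch also runs on older torch builds.
    fp8 = getattr(torch, 'float8_e4m3fn', None)
    if fp8 is not None and x.dtype == fp8:
        # FP8 weights are reinterpreted as raw bytes for the kernel.
        return x.view(dtype=torch.uint8)
    elif kind != 'weight_scale_inv' and x.dtype == torch.float:
        # float32 tensors are downcast, except the inverse-scale tensors.
        return x.to(dtype=torch.bfloat16)
    elif x.dtype == torch.float16:
        # This PR's workaround: turbomind FP8 kernels accept bfloat16 only.
        # Unlike the float32 branch above, this one also converts
        # 'weight_scale_inv' tensors -- the inconsistency Copilot flagged.
        return x.to(dtype=torch.bfloat16)
    return x
```

Tracing the branches shows the asymmetry: a float32 `weight_scale_inv` falls through unchanged, while a float16 `weight_scale_inv` is converted to bfloat16.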

@lvhan028 lvhan028 merged commit 0e335e0 into InternLM:main Jan 15, 2026
5 checks passed

Development

Successfully merging this pull request may close these issues.

[Bug] Qwen/Qwen3-4B-Instruct-2507-FP8 outputs garbled text on lmdeploy 10.2

3 participants