Feat: Qwen2.5 VIT hipblaslt swizzle #685
base: main
Conversation
Pull request overview
Adds ROCm/hipBLASLt "swizzle" support for Qwen2.5-VL vision attention weights, and swaps the vision attention `Linear` layers for ROCm-optimized implementations after weights are loaded.
Changes:
- Override multimodal weight loading to optionally swizzle attention weights based on an env flag.
- Add `_replace_with_rocm_linear()` to vision attention modules and invoke it post-load on ROCm.
- Introduce a ROCm platform check (`is_hip`) to gate the ROCm-specific path.
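The gating described above (swizzle only on a ROCm build, and only when the env flag is set) can be sketched roughly as follows. This is a minimal stand-in, not the PR's actual code: the flag name `HIPBLASLT_SWIZZLE` and the `PYTORCH_ROCM` stand-in for the real build check (in PyTorch, typically `torch.version.hip is not None`) are assumptions for illustration.

```python
import os

def is_hip() -> bool:
    """Detect a ROCm build. The PR checks the actual PyTorch build
    (e.g. torch.version.hip); an env var stands in here so the sketch
    runs anywhere."""
    return os.environ.get("PYTORCH_ROCM", "0") == "1"  # hypothetical stand-in

def swizzle_enabled() -> bool:
    # Hypothetical flag name; the real env flag in the PR may differ.
    return os.environ.get("HIPBLASLT_SWIZZLE", "0") == "1"

def should_swizzle_weights() -> bool:
    # Swizzle only on ROCm *and* when explicitly requested, so the
    # default CUDA/CPU weight-loading path is left untouched.
    return is_hip() and swizzle_enabled()
```

The point of the double gate is that the swizzled (hipBLASLt-preferred) weight layout is only valid for the ROCm `Linear` replacement, so it must never run on the stock path.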
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| rtp_llm/models/qwen2_5_vl/qwen2_5_vl.py | Adds swizzle-aware multimodal weight loading and post-load replacement of attention Linear layers. |
| rtp_llm/models/qwen2_5_vl/modeling_qwen2_5_vl.py | Adds ROCm Linear replacement helpers to vision attention classes and ROCm gating via is_hip(). |
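The post-load replacement pattern in `modeling_qwen2_5_vl.py` can be sketched with plain classes. This is an illustrative stand-in only: the real code operates on `torch.nn.Linear` modules and a ROCm-optimized linear class whose names are not shown in this review, so `RocmLinear`, `qkv`, and `proj` here are assumptions.

```python
class Linear:
    """Stand-in for torch.nn.Linear: just holds a loaded weight."""
    def __init__(self, weight):
        self.weight = weight

class RocmLinear(Linear):
    """Hypothetical ROCm-optimized drop-in replacement."""
    @classmethod
    def from_linear(cls, lin: Linear) -> "RocmLinear":
        # Adopt the already-loaded (and possibly swizzled) weight tensor
        # instead of reinitializing, so no second checkpoint load is needed.
        return cls(lin.weight)

class VisionAttention:
    def __init__(self):
        self.qkv = Linear([1.0])   # hypothetical attention projections
        self.proj = Linear([2.0])

    def _replace_with_rocm_linear(self):
        # Called after weight load: checkpoints load into standard Linear
        # first, then the optimized implementation takes over the tensors.
        self.qkv = RocmLinear.from_linear(self.qkv)
        self.proj = RocmLinear.from_linear(self.proj)
```

Doing the swap after loading (rather than constructing ROCm layers up front) keeps the checkpoint-loading code path identical across platforms.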
Internal source has been updated, please review the changes!
Force-pushed 5a13105 to 2f7a4e3