MXFP4 support for turbomind GEMM library#3927

Merged
lvhan028 merged 21 commits into InternLM:main from lzhangzz:mxfp4a on Sep 4, 2025

Conversation

Collaborator

@lzhangzz lzhangzz commented Sep 2, 2025

  • Grouped MXFP4 * half/bfloat16 for sm_70 ... sm_90
  • Support official gpt-oss (MXFP4) models
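
For reference, MXFP4 (as used by gpt-oss) packs weights into 32-element blocks of E2M1 values (1 sign bit, 2 exponent bits, 1 mantissa bit, i.e. magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) that share a single E8M0 power-of-two scale. The following is a minimal host-side C++ sketch of the decode path only; it is not turbomind's kernel code, and the low-nibble-first packing order is an assumption:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Decode one E2M1 nibble: 1 sign bit, 2 exponent bits, 1 mantissa bit.
// The 8 representable magnitudes are enumerated in a lookup table.
float decode_e2m1(uint8_t nibble) {
    static const float lut[8] = {0.0f, 0.5f, 1.0f, 1.5f, 2.0f, 3.0f, 4.0f, 6.0f};
    float m = lut[nibble & 0x7];
    return (nibble & 0x8) ? -m : m;
}

// Decode the shared E8M0 scale: a biased power-of-two exponent, 2^(e - 127).
// (The NaN encoding from the OCP MX spec is omitted here.)
float decode_e8m0(uint8_t e) {
    return std::ldexp(1.0f, int(e) - 127);
}

// Dequantize one 32-element MX block: 16 packed bytes plus 1 scale byte.
// Assumes the low nibble holds the even-indexed element.
void dequant_block(const uint8_t packed[16], uint8_t scale, float out[32]) {
    float s = decode_e8m0(scale);
    for (int i = 0; i < 16; ++i) {
        out[2 * i]     = s * decode_e2m1(packed[i] & 0xF);  // low nibble
        out[2 * i + 1] = s * decode_e2m1(packed[i] >> 4);   // high nibble
    }
}
```

In the actual GEMM kernels this conversion would happen per-fragment during the mainloop rather than as a separate pass, with the product accumulated in half/bfloat16 or fp32.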

Collaborator

lvhan028 commented Sep 2, 2025

FINALLY!!!!

@lvhan028 lvhan028 added the enhancement New feature or request label Sep 2, 2025
Collaborator

lvhan028 commented Sep 3, 2025

The build failed when compiling for compute_86:

ptxas /tmp/tmpxft_0003abdf_00000000-8_sm80_mxfp4.compute_86.ptx, line 2626; error   : Feature 'mul.bf16x2' requires .target sm_90 or higher
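
The error indicates that the packed-bfloat16 multiply instruction `mul.bf16x2` was emitted for a pre-Hopper target, where it is unavailable. How the PR resolved it is not shown here; a typical workaround on older architectures is to widen bfloat16 to fp32, multiply, and narrow back. A minimal host-side C++ sketch of that fallback (bfloat16 is the upper 16 bits of an IEEE-754 float):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Widen bfloat16 (stored as raw bits) to fp32 by shifting into the high half.
float bf16_to_f32(uint16_t b) {
    uint32_t u = uint32_t(b) << 16;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Narrow fp32 to bfloat16 by truncation (round-to-nearest-even omitted).
uint16_t f32_to_bf16(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return uint16_t(u >> 16);
}

// Emulated bfloat16 multiply via fp32 -- the usual fallback when the
// hardware instruction is not available on the target architecture.
uint16_t bf16_mul(uint16_t a, uint16_t b) {
    return f32_to_bf16(bf16_to_f32(a) * bf16_to_f32(b));
}
```

In device code the same idea is expressed with `__bfloat162float`/`__float2bfloat16` intrinsics guarded by `__CUDA_ARCH__`, so the fast path is only compiled where the instruction exists.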

@lvhan028 lvhan028 mentioned this pull request Sep 4, 2025
@lvhan028 lvhan028 merged commit 693c5b9 into InternLM:main Sep 4, 2025
7 of 9 checks passed
littlegy pushed a commit to littlegy/lmdeploy that referenced this pull request Sep 11, 2025
* unify gemm test

* mxfp468 conversion

* mxfp4 gemm

* add group gemm tests

* new sm70 tile scheduler

* mxfp4 model loading

* fix tp

* fix dispatch heuristic

* add half x mxfp4

* fix configs for sm70-89

* remove unused

* minor

* disable cache miss warning by default

* fix

* fix

* fuse bias

* fix split-k

* fix lint

* fix kernel names

* stochastic rounding experiment

* disable debug info
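
One commit above mentions a stochastic rounding experiment. As background (this is not the PR's implementation), stochastic rounding to a low-precision format adds a uniform random value to the bits that will be discarded before truncating, so a value rounds up with probability proportional to its distance past the lower representable value. A minimal C++ sketch for fp32-to-bfloat16:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stochastically round fp32 to bfloat16: add a uniform random 16-bit value
// to the low (discarded) bits, then truncate. A carry into the kept bits
// occurs with probability equal to the discarded fraction.
uint16_t f32_to_bf16_stochastic(float f, uint32_t rnd16) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    u += (rnd16 & 0xFFFF);  // may carry into the upper 16 bits
    return uint16_t(u >> 16);
}
```

This keeps quantization error zero-mean in expectation, which is why it shows up in low-precision training and quantization experiments.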
