[diffusion] feat: support nunchaku for Z-Image-Turbo and flux.1 (int4) by mickqian · Pull Request #18959 · sgl-project/sglang

mickqian · 2026-02-18T06:04:39Z

Motivation

Modifications

Performance and Accuracy Tests

Flux.1-dev

original:

nunchaku:

Z-Image-Turbo

original:

nunchaku:

Model	Resolution	Original (s)	Nunchaku (s)	Speedup
FLUX.1-dev	256×256	61.79	8.21	~7.5×
Z-Image-Turbo	512×512	5.84	2.06	~2.8×
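
The Speedup column can be reproduced from the two timing columns. A minimal sketch, assuming the speedup is simply the original time divided by the nunchaku time (the `speedup` helper is illustrative and not part of the PR):

```python
# Timings copied from the table above (seconds per generation).
timings = {
    "FLUX.1-dev": (61.79, 8.21),
    "Z-Image-Turbo": (5.84, 2.06),
}


def speedup(original_s: float, nunchaku_s: float) -> float:
    """Return how many times faster the quantized run is."""
    return original_s / nunchaku_s


for model, (original_s, nunchaku_s) in timings.items():
    print(f"{model}: ~{speedup(original_s, nunchaku_s):.1f}x")
```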

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

ping1jing2 · 2026-02-19T06:18:21Z

python/sglang/multimodal_gen/configs/models/dits/flux.py

+            # HF diffusers format
+            r"^transformer\.(\w*)\.(.*)$": r"\1.\2",
+            # transformer_blocks nunchaku format (raw export - before internal conversion)
+            r"^transformer_blocks\.(\d+)\.mlp_fc1\.(.*)$": r"transformer_blocks.\1.ff.net.0.proj.\2",
+            r"^transformer_blocks\.(\d+)\.mlp_fc2\.(.*)$": r"transformer_blocks.\1.ff.net.2.\2",
+            r"^transformer_blocks\.(\d+)\.mlp_context_fc1\.(.*)$": r"transformer_blocks.\1.ff_context.net.0.proj.\2",
+            r"^transformer_blocks\.(\d+)\.mlp_context_fc2\.(.*)$": r"transformer_blocks.\1.ff_context.net.2.\2",
+            r"^transformer_blocks\.(\d+)\.qkv_proj\.(.*)$": r"transformer_blocks.\1.attn.to_qkv.\2",
+            r"^transformer_blocks\.(\d+)\.qkv_proj_context\.(.*)$": r"transformer_blocks.\1.attn.to_added_qkv.\2",
+            r"^transformer_blocks\.(\d+)\.out_proj\.(.*)$": r"transformer_blocks.\1.attn.to_out.0.\2",
+            r"^transformer_blocks\.(\d+)\.out_proj_context\.(.*)$": r"transformer_blocks.\1.attn.to_add_out.\2",
+            r"^transformer_blocks\.(\d+)\.norm_q\.(.*)$": r"transformer_blocks.\1.attn.norm_q.\2",
+            r"^transformer_blocks\.(\d+)\.norm_k\.(.*)$": r"transformer_blocks.\1.attn.norm_k.\2",
+            r"^transformer_blocks\.(\d+)\.norm_added_q\.(.*)$": r"transformer_blocks.\1.attn.norm_added_q.\2",
+            r"^transformer_blocks\.(\d+)\.norm_added_k\.(.*)$": r"transformer_blocks.\1.attn.norm_added_k.\2",
+            # transformer_blocks nunchaku format (already converted with convert_flux_state_dict)
+            r"^transformer_blocks\.(\d+)\.attn\.add_qkv_proj\.(.*)$": r"transformer_blocks.\1.attn.to_added_qkv.\2",
+            # single_transformer_blocks nunchaku format (raw export - before internal conversion)
+            r"^single_transformer_blocks\.(\d+)\.qkv_proj\.(.*)$": r"single_transformer_blocks.\1.attn.to_qkv.\2",
+            r"^single_transformer_blocks\.(\d+)\.out_proj\.(.*)$": r"single_transformer_blocks.\1.attn.to_out.0.\2",
+            r"^single_transformer_blocks\.(\d+)\.norm_q\.(.*)$": r"single_transformer_blocks.\1.attn.norm_q.\2",
+            r"^single_transformer_blocks\.(\d+)\.norm_k\.(.*)$": r"single_transformer_blocks.\1.attn.norm_k.\2",
+            # nunchaku quantization parameter name conversions (apply to all blocks)
+            r"^(.*)\.smooth_orig$": r"\1.smooth_factor_orig",
+            r"^(.*)\.smooth$": r"\1.smooth_factor",
+            r"^(.*)\.lora_down$": r"\1.proj_down",
+            r"^(.*)\.lora_up$": r"\1.proj_up",
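
To make the effect of these rules concrete, here is a small sketch that applies a handful of them to sample checkpoint keys. The `remap` helper is hypothetical and chains every matching rule in dict order; the actual loader in sglang may apply the mapping differently (for example, stopping at the first match).

```python
import re

# A few rules copied from the diff above; the full mapping has more entries.
RULES = {
    r"^transformer\.(\w*)\.(.*)$": r"\1.\2",
    r"^transformer_blocks\.(\d+)\.mlp_fc1\.(.*)$": r"transformer_blocks.\1.ff.net.0.proj.\2",
    r"^single_transformer_blocks\.(\d+)\.qkv_proj\.(.*)$": r"single_transformer_blocks.\1.attn.to_qkv.\2",
    r"^(.*)\.smooth$": r"\1.smooth_factor",
    r"^(.*)\.lora_down$": r"\1.proj_down",
}


def remap(name: str, rules: dict[str, str]) -> str:
    """Hypothetical helper: apply each rule in dict order; non-matching rules are no-ops."""
    for pattern, replacement in rules.items():
        name = re.sub(pattern, replacement, name)
    return name


for key in (
    "transformer.x_embedder.weight",
    "transformer_blocks.3.mlp_fc1.lora_down",
    "single_transformer_blocks.0.qkv_proj.smooth",
):
    print(key, "->", remap(key, RULES))
```

Under this chaining assumption, a raw nunchaku key first has its module path rewritten and then its quantization suffix renamed.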


According to the DRY principle, I think it might be better to change the code as below:

def get_param_names_mapping():
    # Per-block rules. Each pattern is appended to a prefix that already
    # captures the block index as group 1, so the trailing (.*) is group 2.
    block_rules = {
        # MLP related
        r"mlp_fc1\.(.*)$": r"ff.net.0.proj.\2",
        r"mlp_fc2\.(.*)$": r"ff.net.2.\2",
        r"mlp_context_fc1\.(.*)$": r"ff_context.net.0.proj.\2",
        r"mlp_context_fc2\.(.*)$": r"ff_context.net.2.\2",
        # Attention related
        r"qkv_proj\.(.*)$": r"attn.to_qkv.\2",
        r"qkv_proj_context\.(.*)$": r"attn.to_added_qkv.\2",
        r"attn\.add_qkv_proj\.(.*)$": r"attn.to_added_qkv.\2",
        r"out_proj\.(.*)$": r"attn.to_out.0.\2",
        r"out_proj_context\.(.*)$": r"attn.to_add_out.\2",
        # Norm related
        r"norm_q\.(.*)$": r"attn.norm_q.\2",
        r"norm_k\.(.*)$": r"attn.norm_k.\2",
        r"norm_added_q\.(.*)$": r"attn.norm_added_q.\2",
        r"norm_added_k\.(.*)$": r"attn.norm_added_k.\2",
    }
    mapping = {
        # HF diffusers format
        r"^transformer\.(\w*)\.(.*)$": r"\1.\2",
    }
    prefixes = ["transformer_blocks", "single_transformer_blocks"]
    for prefix in prefixes:
        for pattern, replacement in block_rules.items():
            mapping[rf"^{prefix}\.(\d+)\.{pattern}"] = rf"{prefix}.\1.{replacement}"
    # nunchaku quantization parameter name conversions (apply to all blocks)
    suffix_rules = {
        r"smooth_orig$": r"smooth_factor_orig",
        r"smooth$": r"smooth_factor",
        r"lora_down$": r"proj_down",
        r"lora_up$": r"proj_up",
    }
    for old, new in suffix_rules.items():
        mapping[rf"^(.*)\.{old}"] = rf"\1.{new}"
    return mapping


# nunchaku checkpoint uses different weight names; map to sglang flux layout
param_names_mapping: dict = field(default_factory=get_param_names_mapping)
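
One detail worth checking when rules are composed this way: the prefix contributes its own capture group for the block index, so in the combined pattern the tail's `(.*)` is group 2, not group 1. A minimal sketch with a single rule (the `compose` helper and the sample key are illustrative, not from the PR) shows the difference:

```python
import re


def compose(prefix: str, tail_pattern: str, tail_replacement: str) -> tuple[str, str]:
    """Hypothetical helper: prepend a block prefix whose (\\d+) becomes group 1."""
    pattern = rf"^{prefix}\.(\d+)\.{tail_pattern}"
    replacement = rf"{prefix}.\1.{tail_replacement}"
    return pattern, replacement


name = "transformer_blocks.7.mlp_fc1.weight"

# Tail replacement references group 2: the parameter suffix is preserved.
pattern, good = compose("transformer_blocks", r"mlp_fc1\.(.*)$", r"ff.net.0.proj.\2")
renamed = re.sub(pattern, good, name)

# Tail replacement references group 1: the block index is repeated instead.
_, bad = compose("transformer_blocks", r"mlp_fc1\.(.*)$", r"ff.net.0.proj.\1")
mangled = re.sub(pattern, bad, name)

print(renamed)
print(mangled)
```

Comparing a generated mapping against the hand-written one on a few sample keys is a cheap way to catch this kind of off-by-one in group numbering.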

ping1jing2 · 2026-02-19T06:29:50Z

python/sglang/multimodal_gen/runtime/models/dits/zimage.py

+try:
+    from nunchaku.models.attention import NunchakuFeedForward  # type: ignore[import]
+except Exception:
+    NunchakuFeedForward = None


NunchakuFeedForward is set to None if nunchaku is not available, but I see we have self.feed_forward = NunchakuFeedForward(ff, **nunchaku_kwargs) below, which would fail if that path is reached without nunchaku. Maybe we can change the code like this?

_nunchaku_enabled = is_nunchaku_available()
if _nunchaku_enabled:
    from nunchaku.models.attention import NunchakuFeedForward  # type: ignore[import]

mickqian · 2026-02-20T03:42:38Z

/tag-and-rerun-ci

mickqian requested review from BBuf, ping1jing2, yhyang201 and yingluosanqian as code owners February 18, 2026 06:04

github-actions bot added the diffusion SGLang Diffusion label Feb 18, 2026

ping1jing2 reviewed Feb 19, 2026

mickqian changed the title ~~[diffusion] feat: support nunchaku for Z-Image-Turbo and flux.1~~ [diffusion] feat: support nunchaku for Z-Image-Turbo and flux.1 (int4) Feb 19, 2026

mickqian added 6 commits February 19, 2026 23:29

[diffusion] feat: support nunchaku for Z-Image-Turbo

1b1467c

fix z-image-turbo

bc91251

flux half correct: very blurry

d588c43

upd

bc117af

upd

506fd95

upd

c12e364

mickqian force-pushed the diffusion-nunchaku branch from f2cb425 to c12e364 Compare February 20, 2026 02:43

refactor

04bb9d2

github-actions bot added the run-ci label Feb 20, 2026

mickqian added 3 commits February 20, 2026 16:35

upd

1ccd426

upd

e2ad62a

upd

7b2ea0a

mickqian merged commit 8d789b5 into main Feb 20, 2026
92 of 94 checks passed

mickqian deleted the diffusion-nunchaku branch February 20, 2026 13:16

mickqian merged 10 commits into main from diffusion-nunchaku

2 participants
