
[diffusion] feat: support nunchaku for Z-Image-Turbo and flux.1 (int4)#18959

Merged
mickqian merged 10 commits into main from diffusion-nunchaku on Feb 20, 2026

Conversation

@mickqian
Collaborator

@mickqian mickqian commented Feb 18, 2026

Motivation

Modifications

Performance and Accuracy Tests

Flux.1-dev

original:
image

nunchaku:
image

Z-Image-Turbo

original:
image

nunchaku:

image
| Model | Resolution | Original (s) | Nunchaku (s) | Speedup |
| --- | --- | --- | --- | --- |
| FLUX.1-dev | 256×256 | 61.79 | 8.21 | ~7.5× |
| Z-Image-Turbo | 512×512 | 5.84 | 2.06 | ~2.8× |
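The speedup column follows directly from the raw latencies; a quick sanity check (values copied from the table above):

```python
# Recompute the speedup column from the raw latencies in the table above.
timings = {
    "FLUX.1-dev": (61.79, 8.21),     # (original_s, nunchaku_s)
    "Z-Image-Turbo": (5.84, 2.06),
}
for model, (original_s, nunchaku_s) in timings.items():
    speedup = original_s / nunchaku_s
    print(f"{model}: {speedup:.1f}x")
```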

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.


@github-actions github-actions bot added the diffusion SGLang Diffusion label Feb 18, 2026
Comment on lines +29 to +55
# HF diffusers format
r"^transformer\.(\w*)\.(.*)$": r"\1.\2",
# transformer_blocks nunchaku format (raw export - before internal conversion)
r"^transformer_blocks\.(\d+)\.mlp_fc1\.(.*)$": r"transformer_blocks.\1.ff.net.0.proj.\2",
r"^transformer_blocks\.(\d+)\.mlp_fc2\.(.*)$": r"transformer_blocks.\1.ff.net.2.\2",
r"^transformer_blocks\.(\d+)\.mlp_context_fc1\.(.*)$": r"transformer_blocks.\1.ff_context.net.0.proj.\2",
r"^transformer_blocks\.(\d+)\.mlp_context_fc2\.(.*)$": r"transformer_blocks.\1.ff_context.net.2.\2",
r"^transformer_blocks\.(\d+)\.qkv_proj\.(.*)$": r"transformer_blocks.\1.attn.to_qkv.\2",
r"^transformer_blocks\.(\d+)\.qkv_proj_context\.(.*)$": r"transformer_blocks.\1.attn.to_added_qkv.\2",
r"^transformer_blocks\.(\d+)\.out_proj\.(.*)$": r"transformer_blocks.\1.attn.to_out.0.\2",
r"^transformer_blocks\.(\d+)\.out_proj_context\.(.*)$": r"transformer_blocks.\1.attn.to_add_out.\2",
r"^transformer_blocks\.(\d+)\.norm_q\.(.*)$": r"transformer_blocks.\1.attn.norm_q.\2",
r"^transformer_blocks\.(\d+)\.norm_k\.(.*)$": r"transformer_blocks.\1.attn.norm_k.\2",
r"^transformer_blocks\.(\d+)\.norm_added_q\.(.*)$": r"transformer_blocks.\1.attn.norm_added_q.\2",
r"^transformer_blocks\.(\d+)\.norm_added_k\.(.*)$": r"transformer_blocks.\1.attn.norm_added_k.\2",
# transformer_blocks nunchaku format (already converted with convert_flux_state_dict)
r"^transformer_blocks\.(\d+)\.attn\.add_qkv_proj\.(.*)$": r"transformer_blocks.\1.attn.to_added_qkv.\2",
# single_transformer_blocks nunchaku format (raw export - before internal conversion)
r"^single_transformer_blocks\.(\d+)\.qkv_proj\.(.*)$": r"single_transformer_blocks.\1.attn.to_qkv.\2",
r"^single_transformer_blocks\.(\d+)\.out_proj\.(.*)$": r"single_transformer_blocks.\1.attn.to_out.0.\2",
r"^single_transformer_blocks\.(\d+)\.norm_q\.(.*)$": r"single_transformer_blocks.\1.attn.norm_q.\2",
r"^single_transformer_blocks\.(\d+)\.norm_k\.(.*)$": r"single_transformer_blocks.\1.attn.norm_k.\2",
# nunchaku quantization parameter name conversions (apply to all blocks)
r"^(.*)\.smooth_orig$": r"\1.smooth_factor_orig",
r"^(.*)\.smooth$": r"\1.smooth_factor",
r"^(.*)\.lora_down$": r"\1.proj_down",
r"^(.*)\.lora_up$": r"\1.proj_up",
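For context, a regex mapping like the one in this hunk is typically applied by trying each pattern against every checkpoint key; a minimal standalone sketch (the two mapping entries and the sample key are copied from the diff, while the `apply_mapping` helper is hypothetical, not the actual loader code):

```python
import re

# Two-rule subset of the param-name mapping from the diff above.
PARAM_NAMES_MAPPING = {
    r"^transformer_blocks\.(\d+)\.qkv_proj\.(.*)$": r"transformer_blocks.\1.attn.to_qkv.\2",
    r"^(.*)\.lora_down$": r"\1.proj_down",
}

def apply_mapping(name: str, mapping: dict) -> str:
    """Hypothetical helper: rewrite a checkpoint key using the first matching rule."""
    for pattern, repl in mapping.items():
        new_name, n = re.subn(pattern, repl, name)
        if n:
            return new_name
    return name  # keys with no matching rule pass through unchanged

print(apply_mapping("transformer_blocks.0.qkv_proj.weight", PARAM_NAMES_MAPPING))
# transformer_blocks.0.attn.to_qkv.weight
```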
Collaborator

Following the DRY principle, I think it might be better to restructure the code below:

def get_param_names_mapping():
    block_rules = {
        # MLP related
        r"mlp_fc1\.(.*)$": r"ff.net.0.proj.\1",
        r"mlp_fc2\.(.*)$": r"ff.net.2.\1",
        r"mlp_context_fc1\.(.*)$": r"ff_context.net.0.proj.\1",
        r"mlp_context_fc2\.(.*)$": r"ff_context.net.2.\1",

        # Attention related
        r"qkv_proj\.(.*)$": r"attn.to_qkv.\1",
        r"qkv_proj_context\.(.*)$": r"attn.to_added_qkv.\1",
        r"attn\.add_qkv_proj\.(.*)$": r"attn.to_added_qkv.\1",
        r"out_proj\.(.*)$": r"attn.to_out.0.\1",
        r"out_proj_context\.(.*)$": r"attn.to_add_out.\1",

        # Norm related
        r"norm_q\.(.*)$": r"attn.norm_q.\1",
        r"norm_k\.(.*)$": r"attn.norm_k.\1",
        r"norm_added_q\.(.*)$": r"attn.norm_added_q.\1",
        r"norm_added_k\.(.*)$": r"attn.norm_added_k.\1",
    }

    mapping = {
        # HF diffusers format
        r"^transformer\.(\w*)\.(.*)$": r"\1.\2",
    }

    prefixes = ["transformer_blocks", "single_transformer_blocks"]
    for prefix in prefixes:
        for pattern, replacement in block_rules.items():
            # Note: prepending (\d+) shifts the suffix capture group from \1 to \2.
            shifted = replacement.replace(r"\1", r"\2")
            mapping[rf"^{prefix}\.(\d+)\.{pattern}"] = rf"{prefix}.\1.{shifted}"

    # nunchaku quantization parameter name conversions (apply to all blocks)
    suffix_rules = {
        r"smooth_orig$": r"smooth_factor_orig",
        r"smooth$": r"smooth_factor",
        r"lora_down$": r"proj_down",
        r"lora_up$": r"proj_up",
    }
    for old, new in suffix_rules.items():
        mapping[rf"^(.*)\.{old}"] = rf"\1.{new}"

    return mapping

# nunchaku checkpoint uses different weight names; map to sglang flux layout
param_names_mapping: dict = field(default_factory=get_param_names_mapping)
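A quick way to sanity-check a generated mapping like this is to resolve a sample key through it. A standalone sketch with a one-rule subset (the `apply_first_match` helper is hypothetical; the group-shift handling mirrors the note that prepending `(\d+)` moves the suffix group from \1 to \2):

```python
import re

# One-rule subset of the generator suggested above.
block_rules = {r"qkv_proj\.(.*)$": r"attn.to_qkv.\1"}
mapping = {}
for prefix in ["transformer_blocks", "single_transformer_blocks"]:
    for pattern, replacement in block_rules.items():
        # Prepending (\d+) shifts the suffix capture group from \1 to \2.
        shifted = replacement.replace(r"\1", r"\2")
        mapping[rf"^{prefix}\.(\d+)\.{pattern}"] = rf"{prefix}.\1.{shifted}"

def apply_first_match(name: str) -> str:
    """Hypothetical helper: rewrite a key with the first rule whose pattern matches."""
    for pattern, repl in mapping.items():
        new_name, n = re.subn(pattern, repl, name)
        if n:
            return new_name
    return name

print(apply_first_match("single_transformer_blocks.3.qkv_proj.weight"))
# single_transformer_blocks.3.attn.to_qkv.weight
```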

try:
    from nunchaku.models.attention import NunchakuFeedForward  # type: ignore[import]
except Exception:
    NunchakuFeedForward = None
Collaborator

NunchakuFeedForward is set to None when nunchaku is not available, but I saw that we have self.feed_forward = NunchakuFeedForward(ff, **nunchaku_kwargs) below, so maybe we can change the code like this?

_nunchaku_enabled = is_nunchaku_available()
if _nunchaku_enabled:
    from nunchaku.models.attention import NunchakuFeedForward  # type: ignore[import]
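For reference, an availability check like `is_nunchaku_available()` is commonly built on `importlib.util.find_spec`, which probes for a package without importing it. A minimal sketch — the function name mirrors the one used above, but this body is an assumption, not sglang's actual implementation:

```python
import importlib.util

def is_nunchaku_available() -> bool:
    """Assumed implementation: True if the `nunchaku` package can be imported.

    find_spec only locates the package on the import path; it does not
    execute the import, so the check is cheap and side-effect free.
    """
    return importlib.util.find_spec("nunchaku") is not None

if is_nunchaku_available():
    from nunchaku.models.attention import NunchakuFeedForward  # type: ignore[import]
else:
    # Callers must guard any NunchakuFeedForward(...) construction on the flag.
    NunchakuFeedForward = None
```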

@mickqian mickqian changed the title [diffusion] feat: support nunchaku for Z-Image-Turbo and flux.1 [diffusion] feat: support nunchaku for Z-Image-Turbo and flux.1 (int4) Feb 19, 2026
@mickqian
Collaborator Author

/tag-and-rerun-ci

@mickqian mickqian merged commit 8d789b5 into main Feb 20, 2026
92 of 94 checks passed
@mickqian mickqian deleted the diffusion-nunchaku branch February 20, 2026 13:16

Labels

diffusion SGLang Diffusion run-ci
