[diffusion] Skip negative prompt encoding when guidance_scale <= 1.0 or negative_prompt is None#16919

Merged
mickqian merged 5 commits into sgl-project:main from zejunchen-zejun:yuhua/neg_text_encode_cond on Jan 21, 2026
Conversation

@zhuyuhua-v (Contributor)

Motivation

Classifier-free guidance (CFG) is enabled only when both guidance_scale > 1.0 and negative_prompt is not None. Consequently, the negative prompt needs to be encoded only under this condition.

When either guidance_scale <= 1.0 or negative_prompt is None, the unconditional noise prediction (noise_pred_uncond) is not used, making the negative prompt encoding redundant. Skipping it in these cases avoids unnecessary computation in the text encoder and improves inference efficiency.

Modifications

  • Skip the negative prompt text encoding if guidance_scale <= 1.0 or negative_prompt is None.
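The guard can be sketched as follows (a minimal illustration with hypothetical names such as `encode_prompts` and `encode_fn` — the actual change lives in the SGLang diffusion text-encoding stage):

```python
def encode_prompts(batch, encode_fn):
    """Encode the positive prompt always, and the negative prompt only
    when classifier-free guidance will actually consume it
    (illustrative helper, not the real stage code)."""
    prompt_embeds = encode_fn(batch["prompt"])
    negative_prompt_embeds = None
    # CFG is active only when both conditions hold; otherwise the
    # unconditional branch is never used, so skip the encode entirely.
    if batch["guidance_scale"] > 1.0 and batch["negative_prompt"] is not None:
        negative_prompt_embeds = encode_fn(batch["negative_prompt"])
    return prompt_embeds, negative_prompt_embeds
```

With guidance_scale == 1 (as in the test below), the text encoder runs once instead of twice per request.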

Accuracy Tests

Test script:

```python
from sglang.multimodal_gen import DiffGenerator

def main():
    generator = DiffGenerator.from_pretrained(
        model_path="/mnt/models/Qwen/Qwen-Image-Edit",
        num_gpus=2,
        ulysses_degree=2,
        tp_size=1,
        enable_torch_compile=True,
        dit_cpu_offload=False,
        text_encoder_cpu_offload=False,
        vae_cpu_offload=False,
        image_encoder_cpu_offload=False,
        image_encoder_precision="bf16",
        vae_precision="bf16",
    )

    generator.generate(
        sampling_params_kwargs=dict(
            prompt="make the clothes to red",
            image_path="input/768x1024.png",
            output_path="qwen_image_edit/",
            height=1024,
            width=768,
            num_inference_steps=8,
            guidance_scale=1,
        )
    )

if __name__ == "__main__":
    main()
```

input image: [image]

prompt: make the clothes to red

output with this PR: [image]

output without this PR: [image]

Benchmarking and Profiling

Performance without this PR on MI308:

```
[01-12 02:38:53] [ImageEncodingStage] finished in 0.7408 seconds
[01-12 02:39:05] Pixel data generated successfully in 12.66 seconds
```

Performance with this PR on MI308:

```
[01-12 02:33:32] [ImageEncodingStage] finished in 0.3976 seconds
[01-12 02:33:44] Pixel data generated successfully in 12.33 seconds
```

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>
@github-actions bot added the `diffusion` (SGLang Diffusion) label on Jan 12, 2026
@gemini-code-assist (bot)

Summary of Changes

Hello @zhuyuhua-v, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request introduces a significant optimization to the diffusion pipeline by making the negative prompt encoding conditional. By only performing this computationally intensive step when classifier-free guidance is actively engaged (i.e., when a guidance_scale greater than 1.0 and a negative_prompt are present), the system avoids unnecessary processing, leading to improved inference efficiency without compromising the quality of the generated output.

Highlights

  • Performance Optimization: Implemented conditional encoding of negative prompts, skipping the process when guidance_scale <= 1.0 or negative_prompt is None. This avoids redundant computation in the text encoder, as the unconditional noise prediction is not utilized under these conditions.



@gemini-code-assist bot left a comment:

Code Review

This pull request introduces a valuable optimization by skipping the negative prompt encoding when classifier-free guidance is not active. The logic is sound and the performance benefits are clear from the benchmarks. My review includes one suggestion to improve code maintainability by using a pre-existing flag, do_classifier_free_guidance, instead of re-evaluating the condition in two places. This will make the code cleaner and more robust against future changes.

Comment on lines +153 to +168:

```diff
 if batch.guidance_scale > 1.0 and batch.negative_prompt is not None:
     neg_outputs = self.text_encoder(
         input_ids=neg_image_inputs.input_ids,
         attention_mask=neg_image_inputs.attention_mask,
         pixel_values=neg_image_inputs.pixel_values,
         image_grid_thw=neg_image_inputs.image_grid_thw,
         output_hidden_states=True,
     )
 batch.prompt_embeds.append(
     self.encoding_qwen_image_edit(outputs, image_inputs)
 )
-batch.negative_prompt_embeds.append(
-    self.encoding_qwen_image_edit(neg_outputs, neg_image_inputs)
-)
+if batch.guidance_scale > 1.0 and batch.negative_prompt is not None:
+    batch.negative_prompt_embeds.append(
+        self.encoding_qwen_image_edit(neg_outputs, neg_image_inputs)
+    )
```
@gemini-code-assist bot commented (severity: medium):

This change correctly adds the optimization, but the condition batch.guidance_scale > 1.0 and batch.negative_prompt is not None is duplicated, which can be a maintenance risk.

The Req class (which batch is an instance of) already computes a do_classifier_free_guidance boolean flag for this exact purpose in its __post_init__ method.

Using this existing flag batch.do_classifier_free_guidance would make the code cleaner and more robust by centralizing the logic. This is a recommended practice to avoid logic duplication.

Suggested change:

```diff
-if batch.guidance_scale > 1.0 and batch.negative_prompt is not None:
+if batch.do_classifier_free_guidance:
     neg_outputs = self.text_encoder(
         input_ids=neg_image_inputs.input_ids,
         attention_mask=neg_image_inputs.attention_mask,
         pixel_values=neg_image_inputs.pixel_values,
         image_grid_thw=neg_image_inputs.image_grid_thw,
         output_hidden_states=True,
     )
 batch.prompt_embeds.append(
     self.encoding_qwen_image_edit(outputs, image_inputs)
 )
-if batch.guidance_scale > 1.0 and batch.negative_prompt is not None:
+if batch.do_classifier_free_guidance:
     batch.negative_prompt_embeds.append(
         self.encoding_qwen_image_edit(neg_outputs, neg_image_inputs)
     )
```
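For illustration, such a flag can be derived once at request construction time (a sketch assuming a dataclass-style request object — not the actual Req definition in the repository):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Req:
    """Hypothetical request object; only the CFG-related fields are shown."""
    prompt: str
    guidance_scale: float = 1.0
    negative_prompt: Optional[str] = None
    do_classifier_free_guidance: bool = field(init=False)

    def __post_init__(self):
        # Centralize the CFG condition so callers never re-derive it.
        self.do_classifier_free_guidance = (
            self.guidance_scale > 1.0 and self.negative_prompt is not None
        )
```

Computing the flag once keeps the text-encoding and denoising stages in agreement if the CFG condition ever changes.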

Comment on context:

```python
        image_grid_thw=neg_image_inputs.image_grid_thw,
        output_hidden_states=True,
    )
if batch.guidance_scale > 1.0 and batch.negative_prompt is not None:
```
Collaborator: could we generalize this logic to something like batch::should_do_cfg, and apply that logic to the denoising stage as well?
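In the denoising stage, the same flag would decide whether the unconditional forward pass is run and combined at all. A minimal sketch of the standard CFG combination, with an illustrative `predict` callable standing in for the DiT forward pass:

```python
def guided_noise_pred(predict, latents, guidance_scale, do_cfg):
    """Combine conditional/unconditional predictions per the standard
    CFG formula, skipping the unconditional pass when CFG is off
    (illustrative helper, not the SGLang denoising code)."""
    noise_cond = predict(latents, conditional=True)
    if not do_cfg:
        # Without CFG the unconditional forward pass is wasted work.
        return noise_cond
    noise_uncond = predict(latents, conditional=False)
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)
```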

Collaborator: actually, we already have a do_classifier_free_guidance flag.

Contributor (author): Thanks a lot for the notification! I've updated these changes to use do_classifier_free_guidance for the check.

Collaborator: in this case, should we skip these too?

Contributor (author): updated

Collaborator: also here

Contributor (author): updated

@mickqian (Collaborator): /tag-and-rerun-ci

@mickqian mickqian merged commit 2c1b164 into sgl-project:main Jan 21, 2026
210 of 225 checks passed