
[Diffusion] fix serving image_edit get input image bug #18109

Merged
BBuf merged 1 commit into main from fix_serving_image_edit_bug
Feb 3, 2026
@BBuf BBuf commented Feb 2, 2026

Motivation

Server:

#!/bin/bash
# export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcuda.so.1:$LD_PRELOAD
# export FLASHINFER_DISABLE_VERSION_CHECK=1

# Set default values
MODEL_PATH="/nas/shared/models/FLUX.1-Kontext-dev"

PORT=30000
HOST="0.0.0.0"
NUM_GPUS=1
TP_SIZE=1
USP_SIZE=1
RING_SIZE=1

# Launch the multimodal_gen server with optimizations enabled
python3 -m sglang.multimodal_gen.runtime.launch_server \
  --model-path "$MODEL_PATH" \
  --port "$PORT" \
  --host "$HOST" \
  --num-gpus "$NUM_GPUS" \
  --tp-size "$TP_SIZE" \
  --ulysses-degree "$USP_SIZE" \
  --ring-degree "$RING_SIZE" \
  --attention-backend fa3 \
  --pin-cpu-memory \
  --warmup \
  --enable-torch-compile

Client:

#!/usr/bin/env python3
"""Test script for FLUX.1-Kontext-dev (image-to-image editing)"""

import argparse
import base64
import requests
import os
import time


def main():
    parser = argparse.ArgumentParser(description="Test SGLang FLUX.1-Kontext-dev image editing")
    parser.add_argument("--host", type=str, default="127.0.0.1", help="Server host")
    parser.add_argument("--port", type=int, default=30000, help="Server port")
    parser.add_argument("--input-image", type=str, default="/nas/bbuf/cat.png", help="Input image path")
    parser.add_argument("--prompt", type=str, default="Add a hat to the cat", help="Edit instruction prompt")
    parser.add_argument("--guidance", type=float, default=2.5, help="Guidance scale (Kontext uses 2.5)")
    parser.add_argument("--steps", type=int, default=28, help="Number of inference steps")
    parser.add_argument("--seed", type=int, default=42, help="Random seed")
    parser.add_argument("--output", type=str, default="kontext-output.png", help="Output filename")
    args = parser.parse_args()

    # Check if input image exists
    if not os.path.exists(args.input_image):
        print(f"❌ Error: Input image not found: {args.input_image}")
        return 1

    # Prepare multipart form data
    url = f"http://{args.host}:{args.port}/v1/images/edits"
    
    # Read input image
    with open(args.input_image, "rb") as f:
        image_bytes = f.read()
    
    # Prepare files - use simple 'image' field (single file, not list)
    files = {
        "image": (os.path.basename(args.input_image), image_bytes, "image/png")
    }
    
    data = {
        "prompt": args.prompt,
        "guidance_scale": args.guidance,
        "num_inference_steps": args.steps,
        "seed": args.seed,
        "response_format": "b64_json",
        "n": 1
    }

    # Generate edited image
    print(f"🎨 FLUX.1-Kontext-dev: Editing image with {args.steps} steps (guidance={args.guidance})...")
    print(f"🖼️  Input image: {args.input_image}")
    print(f"📝 Prompt: {args.prompt}")
    print(f"🔢 Seed: {args.seed}")
    
    start_time = time.time()
    try:
        response = requests.post(url, files=files, data=data, timeout=600)
        response.raise_for_status()
        elapsed = time.time() - start_time
        
        # Save output image
        result = response.json()
        image_data = base64.b64decode(result["data"][0]["b64_json"])
        with open(args.output, "wb") as f:
            f.write(image_data)
        
        print(f"✅ Success! Edited image saved to: {args.output}")
        print(f"⏱️  Time: {elapsed:.2f}s")
        return 0
        
    except requests.exceptions.HTTPError as e:
        print(f"❌ HTTP Error: {e}")
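        # A requests.Response is falsy for 4xx/5xx statuses, so compare against None explicitly
        print(f"Response: {e.response.text if e.response is not None else 'No response'}")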
        return 1
    except Exception as e:
        print(f"❌ Error: {e}")
        return 1


if __name__ == "__main__":
    import sys
    sys.exit(main())

The input image is:

[image]

After sending the request, I got this error:

[screenshot: 20260202-170023]

And the result is bad:

[image]

After this PR's fix, I get the correct result:

[image]

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@github-actions github-actions bot added the diffusion SGLang Diffusion label Feb 2, 2026
@gemini-code-assist

Summary of Changes

Hello @BBuf, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug within the diffusion server's image editing functionality. The change ensures that the system correctly interprets the input image path, even when it is unexpectedly provided as a list. This fix prevents errors during image processing, allowing image editing requests to execute successfully and produce the intended visual results.

Highlights

  • Bug Fix: Image Path Handling: Fixed an issue in the image editing pipeline where the input image path (batch.image_path) was occasionally received as a list instead of a string, causing processing errors.
  • Robust Input Processing: Ensured that if batch.image_path is a list, the first element is correctly extracted and used as the actual image path, preventing failures in image loading.



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes a bug in the image editing functionality where the input image path was not handled correctly when passed as a list. The change correctly extracts the path from the list. However, the implementation has a potential issue with certain edge cases (e.g., a list containing None) which could lead to a runtime error. I've suggested a more robust implementation to handle this. Additionally, the current approach modifies the input batch object in-place, which could be avoided by using a local variable for better code safety.

Comment on lines +335 to +337
    if isinstance(batch.image_path, list):
        batch.image_path = batch.image_path[0]

Priority: high

This change correctly handles a list of image paths, but it's vulnerable to an AttributeError if the list contains a None value (e.g., [None]). In that scenario, batch.image_path becomes None, and the subsequent call to .startswith() on line 339 will crash.

To make this more robust, we should add a check for a falsy value after potentially unwrapping the list.

As a minor style point, modifying the batch object in-place can sometimes have unintended side effects. Using a local variable for the path would be a safer pattern, though it would require a slightly larger refactoring of this function.

Suggested change
-   if isinstance(batch.image_path, list):
-       batch.image_path = batch.image_path[0]
+   if isinstance(batch.image_path, list):
+       batch.image_path = batch.image_path[0]
+   if not batch.image_path: return None
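
The local-variable pattern mentioned in the comment above might look roughly like the sketch below. This is only an illustration, not the code in the PR: `resolve_image_path` is a hypothetical helper name, and `SimpleNamespace` stands in for the real batch object. It assumes the path can arrive as a string, a list, or None.

```python
from types import SimpleNamespace
from typing import List, Optional, Union


def resolve_image_path(
    image_path: Union[str, List[Optional[str]], None],
) -> Optional[str]:
    """Return a single usable path, or None if no path was supplied.

    Hypothetical helper: it only reads the value and never mutates the batch.
    """
    if isinstance(image_path, list):
        # Serving may wrap the path in a list; take the first entry if there is one.
        image_path = image_path[0] if image_path else None
    # Maps None, "" and an unwrapped [None] to None.
    return image_path or None


# Stand-in for the real batch object, used only for illustration.
batch = SimpleNamespace(image_path=["/tmp/cat.png"])
path = resolve_image_path(batch.image_path)
```

The caller can then check `path is None` before calling string methods such as `.startswith()`, while `batch.image_path` keeps its original value.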


@mickqian mickqian left a comment


Could you also modify an existing testcase to cover this case?


BBuf commented Feb 2, 2026

> Could you also modify an existing testcase to cover this case?

I'll do it in a future PR when I have bandwidth. In serving, batch.image_path is always a list, which is why the bug existed.
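
For reference, a test covering this case could be sketched along the following lines. This is an assumption-laden illustration rather than a test from the repository: `unwrap_image_path` is a hypothetical wrapper around the two lines added in this PR, and `SimpleNamespace` stands in for the real batch class.

```python
from types import SimpleNamespace


def unwrap_image_path(batch):
    """Apply the same two-line unwrapping as the PR's fix, in place."""
    if isinstance(batch.image_path, list):
        batch.image_path = batch.image_path[0]
    return batch


def test_list_path_is_unwrapped():
    # Serving passes the path wrapped in a list.
    batch = unwrap_image_path(SimpleNamespace(image_path=["cat.png"]))
    assert batch.image_path == "cat.png"
    # String methods such as startswith() are usable after unwrapping.
    assert not batch.image_path.startswith("http")


def test_string_path_is_unchanged():
    # A plain string path should pass through untouched.
    batch = unwrap_image_path(SimpleNamespace(image_path="cat.png"))
    assert batch.image_path == "cat.png"
```

Both functions follow the pytest naming convention, so they would be collected automatically if placed in a test module.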


mickqian commented Feb 2, 2026

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Feb 2, 2026
@BBuf BBuf merged commit eedd472 into main Feb 3, 2026
217 of 263 checks passed
@BBuf BBuf deleted the fix_serving_image_edit_bug branch February 3, 2026 04:17
yuki-brook pushed a commit to scitix/sglang that referenced this pull request Feb 3, 2026
charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Feb 5, 2026
sfiisf pushed a commit to sfiisf/sglang that referenced this pull request Feb 5, 2026
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026

Labels

diffusion SGLang Diffusion run-ci
