feat: diffusion router #2

Merged
zhaochenyang20 merged 2 commits into zhaochenyang20:main from alphabetc1:feat/sglang-d-router
Feb 21, 2026

@alphabetc1
Collaborator

Extract the SGLang diffusion router into a standalone package with CLI entry point, load-balancing algorithms, health checking, SSRF protection, unit tests, benchmarks, and pre-commit hooks for code quality. Derived from radixark/miles#544.

@gemini-code-assist
Contributor

Summary of Changes

Hello @alphabetc1, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a dedicated and feature-rich router for SGLang diffusion workers, enhancing the system's scalability, reliability, and maintainability. By centralizing worker management, load distribution, and health monitoring, it provides a robust layer for handling diffusion generation requests, while also improving code quality and testability through new development tooling and comprehensive testing.

Highlights

  • Standalone Diffusion Router Package: The SGLang diffusion router has been extracted into a new standalone Python package, sglang-diffusion-routing, complete with its own CLI entry point and installation instructions.
  • Load Balancing and Health Checking: Implemented robust load-balancing algorithms ('least-request' by default, 'round-robin', 'random') and background health checks with worker quarantine after repeated failures, ensuring high availability and efficient request distribution.
  • Security Enhancements: Incorporated SSRF (Server-Side Request Forgery) protection by validating worker URLs to prevent malicious redirection to internal metadata hosts or link-local addresses.
  • Code Quality and Testing Infrastructure: Added pre-commit hooks for code formatting (Black, Ruff, isort), spelling (codespell), and general code quality checks. Comprehensive unit tests and benchmark scripts for the router and its algorithms have also been introduced.
  • API for Worker Management and Proxying: The router exposes APIs for dynamic worker registration (/add_worker), listing workers (/list_workers), aggregated and per-worker health status (/health, /health_workers), and proxying of generation requests (/generate, /generate_video) and weight updates (/update_weights_from_disk).
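
The three routing algorithms named in the highlights can be illustrated with a minimal sketch. This is not the PR's DiffusionRouter code: the `WorkerPicker` class, its method names, and the tie-breaking rule (first registered worker wins) are assumptions made for illustration only.

```python
import itertools
import random


class WorkerPicker:
    """Hypothetical helper illustrating the three algorithm names; not the PR's API."""

    ALGORITHMS = ("least-request", "round-robin", "random")

    def __init__(self, worker_urls, algorithm="least-request", seed=None):
        if algorithm not in self.ALGORITHMS:
            raise ValueError(f"unknown routing algorithm: {algorithm}")
        self.algorithm = algorithm
        # In-flight request count per worker URL (insertion order is preserved).
        self.in_flight = {url: 0 for url in worker_urls}
        self._round_robin = itertools.cycle(list(worker_urls))
        self._rng = random.Random(seed)

    def pick(self):
        """Choose a worker and count the request as in flight."""
        if not self.in_flight:
            raise RuntimeError("no workers registered")
        if self.algorithm == "round-robin":
            url = next(self._round_robin)
        elif self.algorithm == "random":
            url = self._rng.choice(list(self.in_flight))
        else:
            # least-request: fewest in-flight requests; ties go to the
            # worker registered first (an assumption of this sketch).
            url = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[url] += 1
        return url

    def release(self, url):
        """Call when a proxied request finishes, successfully or not."""
        self.in_flight[url] = max(0, self.in_flight[url] - 1)
```

In a sketch like this, least-request only works if `release` runs on every completion path, including errors; otherwise the counts drift upward and a worker that failed once looks permanently busy.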
Changelog
  • .codespellrc
    • Added a configuration file for codespell to ignore specific words during spell checking.
  • .gitignore
    • Uncommented the entry for .vscode/ to ensure Visual Studio Code specific files are ignored by Git.
  • .pre-commit-config.yaml
    • Added a new configuration file for pre-commit hooks, integrating various tools like trailing-whitespace, end-of-file-fixer, check-toml, check-yaml, check-ast, check-added-large-files, check-merge-conflict, debug-statements, detect-private-key, no-commit-to-branch, isort, ruff, black, and codespell to enforce code quality standards.
  • README.md
    • Completely rewrote the README to reflect the new standalone diffusion router, detailing its features, installation, quick start guide, API endpoints, update_weights_from_disk behavior, benchmark scripts, and project layout.
  • docs/update_weights_from_disk.md
    • Added a new documentation file explaining the behavior and requirements for the POST /update_weights_from_disk endpoint.
  • pyproject.toml
    • Added a new pyproject.toml file to define the project metadata, build system, dependencies (FastAPI, httpx, uvicorn), and a CLI entry point for the sglang-d-router.
  • src/sglang_diffusion_routing/__init__.py
    • Added the package's __init__.py file, exposing the DiffusionRouter class as part of the public API.
  • src/sglang_diffusion_routing/__main__.py
    • Added a __main__.py file to enable direct execution of the package as a script.
  • src/sglang_diffusion_routing/cli/__init__.py
    • Added the __init__.py file for the CLI subpackage.
  • src/sglang_diffusion_routing/cli/main.py
    • Added the main CLI script for the diffusion router. It parses arguments for host, port, worker URLs, routing algorithm, and other router configuration, and runs the FastAPI application with Uvicorn.
  • src/sglang_diffusion_routing/router/__init__.py
    • Added the __init__.py file for the router subpackage, exposing the DiffusionRouter class.
  • src/sglang_diffusion_routing/router/diffusion_router.py
    • Added the core DiffusionRouter class, implementing FastAPI routes, worker registration, load balancing (least-request, round-robin, random), background health checks with quarantine logic, request forwarding, and SSRF protection for worker URLs.
  • tests/benchmarks/diffusion_router/bench_router.py
    • Added a new benchmark script to launch diffusion workers and the router, then run a serving benchmark against the router, including port management and process control.
  • tests/benchmarks/diffusion_router/bench_routing_algorithms.py
    • Added a new benchmark script to compare different routing algorithms by running bench_router.py for each and collecting performance results in JSON and CSV formats.
  • tests/unit/test_cli.py
    • Added unit tests for the CLI argument parsing in sglang_diffusion_routing.cli.main.
  • tests/unit/test_diffusion_router.py
    • Added unit tests for the core DiffusionRouter logic, covering load-balancing algorithms, worker selection, error handling, and request count management.
  • tests/unit/test_router_endpoints.py
    • Added unit tests for the router's FastAPI endpoints, including worker registration, URL normalization, and broadcast functionality.
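
The quarantine logic described for diffusion_router.py (a worker is taken out of rotation after repeated health-check failures) can be sketched roughly as follows. The `WorkerHealth` class, the default threshold of 3, and the choice to keep a worker quarantined even after a later success are all assumptions of this sketch; the summary does not state the PR's actual threshold or re-admission policy.

```python
class WorkerHealth:
    """Hypothetical tracker for consecutive health-check failures."""

    def __init__(self, failure_threshold=3):
        # Threshold value is an assumption; the PR summary does not give one.
        self.failure_threshold = failure_threshold
        self.consecutive_failures = {}
        self.quarantined = set()

    def record(self, url, healthy):
        """Record the result of one health probe for a worker."""
        if healthy:
            # A success resets the streak but, in this sketch, does not
            # lift an existing quarantine.
            self.consecutive_failures[url] = 0
            return
        failures = self.consecutive_failures.get(url, 0) + 1
        self.consecutive_failures[url] = failures
        if failures >= self.failure_threshold:
            self.quarantined.add(url)

    def routable(self, urls):
        """Filter a list of worker URLs down to those not quarantined."""
        return [u for u in urls if u not in self.quarantined]
```

A background task would call `record` after each probe, and the request path would pass its worker list through `routable` before applying the routing algorithm.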
Activity
  • The pull request introduces a new standalone package for the SGLang diffusion router, derived from an existing PR ([Do Not Merge]: adding diffuser router for diffusion RL radixark/miles#544).
  • It includes comprehensive changes across multiple files, setting up the project structure, CLI, core router logic, documentation, and testing.
  • No specific human activity (comments, reviews) is provided in the context for this pull request.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new SGLang diffusion router and the infrastructure for load-balancing diffusion workers. The core is a DiffusionRouter class with FastAPI endpoints for worker registration, health checks, request proxying (image and video generation), and broadcasting update_weights_from_disk operations. It supports the 'least-request', 'round-robin', and 'random' routing algorithms and runs background health checks with worker quarantine. The PR also adds a CLI for the router, detailed README.md documentation, benchmark scripts for comparing routing algorithms, and unit tests covering CLI argument parsing, router core logic, and API endpoint behavior.

The review comments raise the following issues:

  • Outdated rev versions for the pre-commit hooks (pre-commit-hooks, ruff-pre-commit, black, codespell), which should be updated to stable versions.
  • Insufficient security validation in _normalize_worker_url for the add_worker endpoint: it does not block loopback addresses or require a port, which could lead to SSRF.
  • No authentication or authorization on sensitive endpoints such as /add_worker, /update_weights_from_disk, and the catch-all proxy.
  • A suggestion to expand the ruff configuration to include more code quality checks (e.g., E, F, W).
  • A minor suggestion to simplify the pytest command in README.md.
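
The stricter worker-URL checks the review asks for might look something like the sketch below. It is not the PR's _normalize_worker_url: the function name `validate_worker_url`, the `allow_loopback` flag, and the single entry in `BLOCKED_HOSTS` are assumptions for illustration. It only inspects literal hosts and does not resolve DNS, so a hostname that resolves to an internal address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

# Example metadata hostname; a real deny-list would be longer (assumption).
BLOCKED_HOSTS = {"metadata.google.internal"}


def validate_worker_url(url, allow_loopback=False):
    """Return scheme://host:port if the URL passes basic SSRF checks, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if "@" in parts.netloc:
        raise ValueError("credentials in worker URL are not allowed")
    host = parts.hostname  # already lowercased by urlsplit
    if not host:
        raise ValueError("worker URL has no host")
    # Accessing .port raises ValueError itself if the port is malformed.
    if parts.port is None:
        raise ValueError("worker URL must include an explicit port")
    if host in BLOCKED_HOSTS:
        raise ValueError(f"blocked host: {host}")
    if host == "localhost" and not allow_loopback:
        raise ValueError("loopback host is not allowed")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # a hostname, not an IP literal; not resolved in this sketch
    if ip is not None:
        if ip.is_link_local:
            raise ValueError(f"link-local address is not allowed: {host}")
        if ip.is_loopback and not allow_loopback:
            raise ValueError(f"loopback address is not allowed: {host}")
    # Drop any path, query, or fragment so workers are stored in one form.
    return f"{parts.scheme}://{parts.netloc}"
```

Blocking loopback outright would break single-machine setups where workers run on 127.0.0.1, which is why this sketch makes it an opt-in flag rather than a hard rule.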

@alphabetc1 alphabetc1 closed this Feb 20, 2026
@alphabetc1 alphabetc1 reopened this Feb 20, 2026
@zhaochenyang20 zhaochenyang20 merged commit bcc37e2 into zhaochenyang20:main Feb 21, 2026
1 check passed
@alphabetc1 alphabetc1 deleted the feat/sglang-d-router branch February 21, 2026 03:51

2 participants