Labels: good first issue, help wanted
Description
Motivation
To support large-scale RL rollouts and high-throughput generation, we need to implement a dedicated router for the diffusion engine, tentatively named Diffusion Router.
This router will build upon the concepts of Cache-Aware Load Balancing and Data Parallel (DP) Routing used in the SGLang LLM engine. The goal is to implement a "workload-minimal" routing strategy that ensures requests are distributed to the most available or appropriate engine instances while maintaining system health.
Goals
- Core Infrastructure: Create a standalone demo/implementation of a router tailored for `sglang-diffusion` instances.
- Interface Support: Implement the following three critical API interfaces:
  - `health_check`: Monitor the status of downstream diffusion workers.
  - `generate`: Route generation requests based on current workload/availability.
  - `update_weights_from_disk`: Interface placeholder (can be a stub for now) to support future dynamic weight updates.
- Minimalist Routing: Focus on low-latency, workload-aware distribution to minimize generation bottlenecks during rollouts.
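As a starting point for discussion, the three interfaces above could be sketched roughly as follows. This is a minimal in-process sketch, not the actual router: the `DiffusionRouter` and `Worker` names, the `in_flight` counter, and the injected `probe` and `send` callables (standing in for whatever HTTP calls the real workers expose) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Worker:
    """Bookkeeping for one downstream sglang-diffusion instance (hypothetical)."""
    url: str
    healthy: bool = True
    in_flight: int = 0  # requests currently being served by this worker


class DiffusionRouter:
    """Hypothetical router skeleton; transport is injected so it runs without a server."""

    def __init__(
        self,
        urls: List[str],
        probe: Callable[[str], bool],        # assumed: returns True if the worker is up
        send: Callable[[str, dict], dict],   # assumed: forwards a payload, returns the reply
    ):
        self.workers = [Worker(u) for u in urls]
        self._probe = probe
        self._send = send

    def health_check(self) -> Dict[str, bool]:
        """Probe every worker and record whether it is reachable."""
        for w in self.workers:
            try:
                w.healthy = bool(self._probe(w.url))
            except Exception:
                # A failing probe marks the worker unhealthy instead of crashing the router.
                w.healthy = False
        return {w.url: w.healthy for w in self.workers}

    def generate(self, payload: dict) -> dict:
        """Forward the request to the healthy worker with the fewest in-flight requests."""
        candidates = [w for w in self.workers if w.healthy]
        if not candidates:
            raise RuntimeError("no healthy diffusion workers available")
        worker = min(candidates, key=lambda w: w.in_flight)
        worker.in_flight += 1
        try:
            return self._send(worker.url, payload)
        finally:
            # Always release the slot, even if the worker call raised.
            worker.in_flight -= 1

    def update_weights_from_disk(self, model_path: str) -> dict:
        """Placeholder only, as the issue allows a stub for now."""
        raise NotImplementedError("update_weights_from_disk is a stub in this sketch")
```

Injecting `probe` and `send` keeps the routing logic testable without running any backend. Note that `in_flight` only becomes informative once `generate` is called concurrently; a real implementation would also need locking or an async design around that counter.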
Technical Tasks
- Study the existing SGLang Router implementation for LLMs (see resources below).
- Develop the `miles-diffusion-router` script/module.
- Implement basic load balancing logic (e.g., Least-Request or Round-Robin as a baseline, moving toward workload-minimal).
- Create a demo script showing the router coordinating multiple `sglang-diffusion` backends.
- Document the setup process and API usage.
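To make the load balancing task concrete, here is a small sketch comparing a Round-Robin baseline with a least-load policy. The class names, the `select` interface, and the `assign` helper are hypothetical. The per-request `cost` is also an assumption: with a cost of 1 per request the least-load policy behaves as Least-Request, while a cost estimate (for example, derived from step count or resolution) is one possible reading of "workload-minimal".

```python
import itertools
from typing import Dict, List, Tuple


class RoundRobinPolicy:
    """Baseline: cycle through workers in order, ignoring their load."""

    def __init__(self, workers: List[str]):
        self._cycle = itertools.cycle(workers)

    def select(self, load: Dict[str, float]) -> str:
        return next(self._cycle)


class LeastLoadPolicy:
    """Pick the worker with the smallest tracked load."""

    def select(self, load: Dict[str, float]) -> str:
        # min() returns the first minimal key, so ties break by insertion order.
        return min(load, key=load.get)


def assign(policy, workers: List[str], costs: List[float]) -> Tuple[List[str], Dict[str, float]]:
    """Place each request in turn and accumulate its cost on the chosen worker.

    Simplification: no request finishes during the simulation, so load only grows.
    """
    load = {w: 0.0 for w in workers}
    placement = []
    for cost in costs:
        chosen = policy.select(load)
        load[chosen] += cost
        placement.append(chosen)
    return placement, load


if __name__ == "__main__":
    workers = ["w0", "w1"]
    costs = [5, 1, 1, 1]  # one heavy request followed by three light ones
    print(assign(RoundRobinPolicy(workers), workers, costs))
    print(assign(LeastLoadPolicy(), workers, costs))
```

With one heavy request followed by three light ones, Round-Robin keeps sending work to the worker that already holds the heavy request, while the least-load policy steers the light requests to the other worker.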
Resources
- Concept: [SGLang v0.4 Cache-Aware Load Balancer](https://lmsys.org/blog/2024-12-04-sglang-v0-4/#cache-aware-load-balancer)
- Reference Implementation: [SGLang Router Documentation & Source](https://github.com/sgl-project/sglang/blob/b6e0cfb5e1c355f9526defdf9bbee430c0bfebaa/docs/router/router.md)
Calling community members interested in distributed systems and RL infrastructure! Help us build the backbone of the Miles rollout system.