[Feature] Implement "Miles Diffusion Router" for Workload-Aware Rollouts #541

### Motivation

To support large-scale RL rollouts and high-throughput generation, we need a dedicated router for the diffusion engine, tentatively named the **Miles Diffusion Router**.

This router will build on the **Cache-Aware Load Balancing** and **Data Parallel (DP) Routing** concepts used in the SGLang LLM engine. The goal is a "workload-minimal" routing strategy: each request goes to the most available or otherwise most appropriate engine instance, and the router tracks worker health so that requests are not sent to failing instances.

### Goals

1. **Core Infrastructure:** Create a standalone demo/implementation of a router tailored for `sglang-diffusion` instances.
2. **Interface Support:** Implement the following three critical API interfaces:
   * `health_check`: Monitor the status of downstream diffusion workers.
   * `generate`: Route generation requests based on current workload/availability.
   * `update_weights_from_disk`: Interface placeholder (can be a stub for now) to support future dynamic weight updates.
3. **Minimalist Routing:** Focus on low-latency, workload-aware distribution to minimize generation bottlenecks during rollouts.
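
To make the three interfaces concrete, here is a minimal in-process sketch of what the router surface could look like. Everything beyond the three method names is an assumption made for illustration: the `Worker` dataclass, the injected `send` callable (standing in for a real HTTP call to an `sglang-diffusion` instance), the `make_fake_send` helper, and the shape of the returned dictionaries are all hypothetical and not taken from any existing SGLang or Miles code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Worker:
    """Hypothetical handle for one sglang-diffusion backend."""
    url: str
    # Injected callable that performs the request. A real router would
    # issue an HTTP call here; this sketch keeps everything in-process.
    send: Callable[[str, dict], dict]
    healthy: bool = True
    in_flight: int = 0


class DiffusionRouter:
    def __init__(self, workers: List[Worker]):
        self.workers = workers

    def health_check(self) -> Dict[str, bool]:
        """Probe every worker and record whether it responded."""
        for w in self.workers:
            try:
                w.send("health_check", {})
                w.healthy = True
            except Exception:
                w.healthy = False
        return {w.url: w.healthy for w in self.workers}

    def generate(self, payload: dict) -> dict:
        """Forward to the healthy worker with the fewest in-flight requests."""
        candidates = [w for w in self.workers if w.healthy]
        if not candidates:
            raise RuntimeError("no healthy diffusion workers available")
        target = min(candidates, key=lambda w: w.in_flight)
        target.in_flight += 1
        try:
            return target.send("generate", payload)
        finally:
            target.in_flight -= 1

    def update_weights_from_disk(self, model_path: str) -> dict:
        """Stub: accepts the call but does not forward it yet."""
        return {"success": False, "message": "not implemented",
                "model_path": model_path}


def make_fake_send(name: str, fail: bool = False) -> Callable[[str, dict], dict]:
    """Build a fake backend for local experiments (illustrative only)."""
    def send(endpoint: str, payload: dict) -> dict:
        if fail:
            raise ConnectionError(f"{name} is unreachable")
        return {"worker": name, "endpoint": endpoint}
    return send
```

With one reachable and one unreachable fake backend, `health_check()` marks the second as unhealthy and `generate()` only considers the first. This sketch is single-threaded; a real router serving concurrent requests would need to protect the `in_flight` counters.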

### Technical Tasks

* [ ] Study the existing SGLang Router implementation for LLMs (see resources below).
* [ ] Develop the `miles-diffusion-router` script/module.
* [ ] Implement basic load balancing logic (e.g., Least-Request or Round-Robin as a baseline, moving toward workload-minimal).
* [ ] Create a demo script showing the router coordinating multiple `sglang-diffusion` backends.
* [ ] Document the setup process and API usage.
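
As a starting point for the load-balancing task, the two baseline policies named above can be sketched as small thread-safe classes. The class names, the `acquire()` context-manager interface, and the worker labels are assumptions made for this example, not an existing API. The in-flight request count is only a rough proxy for workload; a later workload-minimal policy could replace it with an estimated per-request cost.

```python
import itertools
import threading
from contextlib import contextmanager
from typing import Dict, Iterator, List


class RoundRobinPolicy:
    """Baseline: cycle through workers regardless of their current load."""

    def __init__(self, workers: List[str]):
        self._cycle = itertools.cycle(workers)
        self._lock = threading.Lock()

    @contextmanager
    def acquire(self) -> Iterator[str]:
        with self._lock:
            worker = next(self._cycle)
        yield worker


class LeastRequestPolicy:
    """Pick the worker that currently has the fewest in-flight requests."""

    def __init__(self, workers: List[str]):
        self._in_flight: Dict[str, int] = {w: 0 for w in workers}
        self._lock = threading.Lock()

    @contextmanager
    def acquire(self) -> Iterator[str]:
        with self._lock:
            # Ties resolve to the earliest-listed worker (dict order).
            worker = min(self._in_flight, key=self._in_flight.get)
            self._in_flight[worker] += 1
        try:
            yield worker
        finally:
            # Release the slot even if the request raised.
            with self._lock:
                self._in_flight[worker] -= 1


policy = LeastRequestPolicy(["w0", "w1"])
with policy.acquire() as first:        # w0 is now busy
    with policy.acquire() as second:   # so the next request goes to w1
        print(first, second)           # prints "w0 w1"
```

Wrapping selection in a context manager keeps the counter accurate when a request fails partway through, which matters for long-running diffusion requests where a leaked slot would skew routing for the rest of the rollout.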

### Resources

* **Concept:** [SGLang v0.4 Cache-Aware Load Balancer](https://lmsys.org/blog/2024-12-04-sglang-v0-4/#cache-aware-load-balancer)
* **Reference Implementation:** [SGLang Router Documentation & Source](https://github.com/sgl-project/sglang/blob/b6e0cfb5e1c355f9526defdf9bbee430c0bfebaa/docs/router/router.md)

**Calling community members interested in distributed systems and RL infrastructure!** Help us build the backbone of the Miles rollout system.
