Add concurrency tracking to runner utilization report#17963

Merged
Kangyan-Zhou merged 1 commit into sgl-project:main from Kangyan-Zhou:add-concurrency-tracking
Jan 30, 2026
Conversation

@Kangyan-Zhou
Collaborator

  • Add calculate_concurrency_metrics() using a sweep line algorithm to track:

    • Peak concurrent runners in use
    • Average concurrent runners over time
    • Saturation time (when all runners busy)
    • Peak queue depth (jobs waiting)
  • Use parallel API fetching with ThreadPoolExecutor for faster data collection

  • Add effective runner capacity based on observed peak (handles offline runners)

  • Add Concurrency Analysis section and Recommendations to report output
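The sweep line idea behind these metrics can be sketched as follows. This is a simplified illustration, not the PR's exact implementation; it assumes job dicts carrying `start`/`end` datetimes and computes only the peak and time-weighted average concurrency:

```python
from datetime import datetime, timedelta

def peak_and_avg_concurrency(jobs, window_start, window_end):
    """Sweep line: emit a +1 event at each job start and a -1 at each end,
    then walk the events in time order."""
    events = []
    for job in jobs:
        start = max(job["start"], window_start)
        end = min(job["end"], window_end)
        if start >= end:
            continue  # job falls entirely outside the window
        events.append((start, 1))
        events.append((end, -1))
    # Sort by time; at equal timestamps, process ends (-1) before starts (+1)
    events.sort(key=lambda e: (e[0], e[1]))
    peak = current = 0
    area = 0.0  # concurrency integrated over time (runner-seconds)
    prev_time = window_start
    for t, delta in events:
        area += current * (t - prev_time).total_seconds()
        current += delta
        peak = max(peak, current)
        prev_time = t
    window_seconds = (window_end - window_start).total_seconds()
    avg = area / window_seconds if window_seconds > 0 else 0.0
    return peak, avg
```

Saturation time falls out of the same walk by accumulating the intervals during which `current` equals the runner capacity.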



Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello @Kangyan-Zhou, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the runner utilization report by integrating advanced concurrency tracking and optimization features. It provides a deeper understanding of how runners are being utilized, identifies bottlenecks through queue depth and saturation analysis, and offers practical recommendations for scaling runner pools. The changes also improve the report generation speed by parallelizing API calls, making the tool more efficient and insightful for CI/CD pipeline management.

Highlights

  • Concurrency Metrics: Introduced a new calculate_concurrency_metrics() function utilizing a sweep line algorithm to track key performance indicators such as peak concurrent runners, average concurrent runners over time, saturation time (when all runners are busy), and peak queue depth.
  • Parallel API Fetching: Implemented parallel API fetching for job data using ThreadPoolExecutor, significantly improving the speed of data collection for the runner utilization report.
  • Effective Runner Capacity: Added logic to determine 'effective runner capacity' based on observed peak concurrency, which provides a more realistic view of available resources by accounting for offline or underutilized runners.
  • Enhanced Reporting: The report output now includes a dedicated 'Concurrency Analysis' section and a 'Recommendations' section, offering actionable insights based on saturation levels and queue buildup.
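The parallel-fetching pattern the highlights describe can be sketched with the standard library. This is a generic illustration, not the PR's code: `fetch_page` and page-number pagination are hypothetical stand-ins for whatever API calls the report makes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_all_pages(fetch_page, page_count, max_workers=8):
    """Fetch API pages concurrently. `fetch_page` is a hypothetical
    callable taking a page number and returning a list of jobs."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_page, p): p for p in range(page_count)}
        for future in as_completed(futures):
            page = futures[future]
            results[page] = future.result()  # re-raises any fetch error here
    # Flatten in page order so output is deterministic despite completion order
    return [job for p in sorted(results) for job in results[p]]
```

Since the work is I/O-bound HTTP, threads give near-linear speedup up to the API's rate limit; `as_completed` lets slow pages overlap instead of serializing.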




@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request significantly enhances the runner utilization report by introducing concurrency analysis, which is a great addition. The use of a sweep-line algorithm is well-suited for these metrics, and parallelizing the job fetching will substantially improve the script's performance. The introduction of an "effective runner capacity" is a clever way to achieve more accurate saturation metrics. My review includes a few suggestions to address a logic bug, enhance code clarity, and refactor for improved maintainability.

Comment on lines +376 to +384
        effective_runners = min(num_runners, concurrency_initial["peak_concurrent"])
        if effective_runners < num_runners and effective_runners > 0:
            # Recalculate with effective capacity for accurate saturation
            concurrency = calculate_concurrency_metrics(
                jobs, window_start, window_end, effective_runners
            )
        else:
            concurrency = concurrency_initial
            effective_runners = num_runners

Severity: high

There's a logic bug in the calculation of effective_runners. In the else block, effective_runners is incorrectly reset to num_runners. This happens when peak_concurrent is 0, causing effective_runners to be reported as num_runners instead of 0. The logic can be simplified to fix this bug and correctly handle all cases.

        effective_runners = min(num_runners, concurrency_initial["peak_concurrent"])
        if 0 < effective_runners < num_runners:
            # Recalculate with effective capacity for accurate saturation
            concurrency = calculate_concurrency_metrics(
                jobs, window_start, window_end, effective_runners
            )
        else:
            concurrency = concurrency_initial

Comment on lines +142 to +159
    if not jobs:
        return {
            "peak_concurrent": 0,
            "avg_concurrent": 0.0,
            "saturation_seconds": 0,
            "saturation_pct": 0.0,
            "peak_queue": 0,
        }

    window_seconds = (window_end - window_start).total_seconds()
    if window_seconds <= 0:
        return {
            "peak_concurrent": 0,
            "avg_concurrent": 0.0,
            "saturation_seconds": 0,
            "saturation_pct": 0.0,
            "peak_queue": 0,
        }

Severity: medium

To improve maintainability and reduce code duplication, you can define the dictionary for empty results as a constant at the beginning of the function and reuse it in the early return paths.

    EMPTY_RESULT = {
        "peak_concurrent": 0,
        "avg_concurrent": 0.0,
        "saturation_seconds": 0,
        "saturation_pct": 0.0,
        "peak_queue": 0,
    }
    if not jobs:
        return EMPTY_RESULT

    window_seconds = (window_end - window_start).total_seconds()
    if window_seconds <= 0:
        return EMPTY_RESULT

Comment on lines +161 to +186
    # Create events for running jobs: +1 at start, -1 at end
    running_events = []
    for job in jobs:
        start = job["start"]
        end = job["end"]
        # Clamp to window
        if end < window_start or start > window_end:
            continue
        clamped_start = max(start, window_start)
        clamped_end = min(end, window_end)
        running_events.append((clamped_start, 1, "start"))  # +1 for start
        running_events.append((clamped_end, -1, "end"))  # -1 for end

    # Create events for queue tracking (jobs created but not started)
    queue_events = []
    for job in jobs:
        created_at = job.get("created_at")
        started_at = job["start"]
        if created_at and created_at < started_at:
            # Clamp to window
            if started_at < window_start or created_at > window_end:
                continue
            clamped_created = max(created_at, window_start)
            clamped_started = min(started_at, window_end)
            queue_events.append((clamped_created, 1, "queued"))
            queue_events.append((clamped_started, -1, "dequeued"))

Severity: medium

For better performance and readability, the two separate loops over jobs to create running_events and queue_events can be combined into a single loop.

    # Create events for running and queued jobs
    running_events = []
    queue_events = []
    for job in jobs:
        start = job["start"]
        end = job["end"]

        # Running events
        if not (end < window_start or start > window_end):
            clamped_start = max(start, window_start)
            clamped_end = min(end, window_end)
            running_events.append((clamped_start, 1, "start"))
            running_events.append((clamped_end, -1, "end"))

        # Queue events
        created_at = job.get("created_at")
        if created_at and created_at < start:
            if not (start < window_start or created_at > window_end):
                clamped_created = max(created_at, window_start)
                clamped_started = min(start, window_end)
                queue_events.append((clamped_created, 1, "queued"))
                queue_events.append((clamped_started, -1, "dequeued"))
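For context, queue events of this shape are typically folded into a peak depth with a second small sweep. This is a sketch under the same event convention, not the PR's actual code; at equal timestamps the -1 (dequeue) sorts before the +1 (enqueue), so a job handed off at the same instant another arrives does not inflate the peak:

```python
def peak_queue_depth(queue_events):
    """Walk (timestamp, delta, label) queue events in time order and
    track the maximum number of jobs simultaneously waiting."""
    events = sorted(queue_events, key=lambda e: (e[0], e[1]))
    peak = current = 0
    for _, delta, _ in events:
        current += delta
        peak = max(peak, current)
    return peak
```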

Comment on lines +471 to +472
if not has_recommendations and results:
lines.append("All runner pools have healthy utilization.")

Severity: medium

The summary message "All runner pools have healthy utilization." is redundant because each healthy runner pool already gets a "✓ Healthy utilization..." message. Removing this summary line will make the recommendations section cleaner when all pools are healthy.

@Kangyan-Zhou Kangyan-Zhou merged commit 2cd2c31 into sgl-project:main Jan 30, 2026
55 of 59 checks passed
charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Jan 30, 2026
Chen-0210 pushed a commit to Chen-0210/sglang that referenced this pull request Jan 30, 2026
sfiisf pushed a commit to sfiisf/sglang that referenced this pull request Feb 5, 2026
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026