[WIP] Promote SaturationDetector to an extension point. #2605
LukeAVanDrie wants to merge 5 commits into kubernetes-sigs:main
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: LukeAVanDrie.
Implements the Filter extension point in the `UtilizationDetector` to enable per-pod saturation guarding. The filter allows the scheduler to bypass endpoints that exceed queue depth or KV-cache utilization thresholds. It introduces a configurable 'Headroom' parameter to provide burst tolerance (e.g., 20% above base limits) for scheduling flexibility.
…sted within FlowControlConfig.
…ramework interface package.
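To illustrate the headroom behavior described in the filter commit message above, here is a minimal Go sketch. It is not the code in this PR: the `Endpoint` and `Thresholds` types, their fields, and the `FilterSaturated` helper are hypothetical names chosen for illustration, and the way headroom scales both limits is an assumption based only on the "20% above base limits" wording.

```go
package main

import "fmt"

// Endpoint is a hypothetical per-pod metrics snapshot. The field names are
// illustrative and are not taken from the PR.
type Endpoint struct {
	Name        string
	QueueDepth  int     // requests currently waiting on the pod
	KVCacheUtil float64 // fraction of KV cache in use, 0.0 to 1.0
}

// Thresholds holds assumed base limits plus a fractional headroom.
// Headroom of 0.20 means endpoints are tolerated up to 20% above each base limit.
type Thresholds struct {
	MaxQueueDepth  int
	MaxKVCacheUtil float64
	Headroom       float64
}

// Saturated reports whether an endpoint exceeds either base limit
// after the headroom allowance is applied.
func (t Thresholds) Saturated(e Endpoint) bool {
	queueLimit := float64(t.MaxQueueDepth) * (1 + t.Headroom)
	kvLimit := t.MaxKVCacheUtil * (1 + t.Headroom)
	return float64(e.QueueDepth) > queueLimit || e.KVCacheUtil > kvLimit
}

// FilterSaturated returns only the endpoints that are not saturated,
// preserving their input order.
func FilterSaturated(endpoints []Endpoint, t Thresholds) []Endpoint {
	kept := make([]Endpoint, 0, len(endpoints))
	for _, e := range endpoints {
		if !t.Saturated(e) {
			kept = append(kept, e)
		}
	}
	return kept
}

func main() {
	t := Thresholds{MaxQueueDepth: 10, MaxKVCacheUtil: 0.8, Headroom: 0.2}
	endpoints := []Endpoint{
		{Name: "pod-a", QueueDepth: 3, KVCacheUtil: 0.50},  // well under both limits
		{Name: "pod-b", QueueDepth: 11, KVCacheUtil: 0.60}, // over base queue limit, within headroom
		{Name: "pod-c", QueueDepth: 13, KVCacheUtil: 0.60}, // over queue limit even with headroom
		{Name: "pod-d", QueueDepth: 2, KVCacheUtil: 0.99},  // over KV-cache limit even with headroom
	}
	for _, e := range FilterSaturated(endpoints, t) {
		fmt.Println(e.Name)
	}
}
```

In this sketch a pod slightly above its base limit stays schedulable, which is the burst tolerance the commit message describes. Whether the real filter applies headroom to both limits, or clamps the KV-cache limit at 1.0, is not stated in the conversation.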
3970c2a to 14f040b
14f040b to c593dfa
PR needs rebase.
What type of PR is this?
/kind cleanup
/kind feature
/kind deprecation
What this PR does / why we need it:
This PR promotes the `SaturationDetector` to an EPP extension point. This is a breaking change as it removes the top-level config block for saturation detection.
Which issue(s) this PR fixes:
Fixes #1405
Does this PR introduce a user-facing change?: