feat: profile strategy-aware provider routing (019) #24
Open
john-zhh wants to merge 3 commits into feat/v3.0.1
Add per-profile load balancing strategy support with 4 strategies: least-latency (metrics-based), least-cost (pricing-based), round-robin (atomic counter), and weighted (random selection with configurable weights).
- Extend LoadBalancer.Select() with strategy switch and decision logging
- Add selectWeighted() with health-aware weight recalculation
- Wire ProfileProxy → ProxyServer → LoadBalancer strategy pipeline
- Add Provider.Weight and ProviderConfig.Weight fields
- Add ProviderMetrics and GetProviderLatencyMetrics() to LogDB
- Comprehensive TDD: 82.5% coverage, race-free, benchmarks <0.02ms
- Fix integration test compilation (config import + syntax)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
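To illustrate what "health-aware weight recalculation" could look like, here is a minimal sketch. It is not the PR's selectWeighted(): the Provider struct, its Healthy field, and pickWeighted() are hypothetical names, and the random draw is passed in as a parameter so the behavior is easy to check.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Provider is a hypothetical stand-in for the PR's provider type.
type Provider struct {
	Name    string
	Weight  int
	Healthy bool // assumption: health is available as a simple flag
}

// pickWeighted chooses a provider with probability proportional to its
// weight, counting only healthy providers with a positive weight.
// Because the total is computed over that subset, the remaining weights
// are implicitly renormalized when a provider is unhealthy.
// r is expected to be in [0, 1).
func pickWeighted(providers []Provider, r float64) (Provider, bool) {
	total := 0
	for _, p := range providers {
		if p.Healthy && p.Weight > 0 {
			total += p.Weight
		}
	}
	if total == 0 {
		return Provider{}, false // nothing eligible
	}

	target := r * float64(total)
	cum := 0
	var last Provider
	for _, p := range providers {
		if !p.Healthy || p.Weight <= 0 {
			continue
		}
		cum += p.Weight
		last = p
		if target < float64(cum) {
			return p, true
		}
	}
	return last, true // only reached if r is not strictly below 1
}

func main() {
	providers := []Provider{
		{Name: "a", Weight: 70, Healthy: true},
		{Name: "b", Weight: 20, Healthy: false},
		{Name: "c", Weight: 10, Healthy: true},
	}
	if p, ok := pickWeighted(providers, rand.Float64()); ok {
		fmt.Println("selected:", p.Name)
	}
}
```

With "b" unhealthy, the effective split in this sketch becomes 70:10 between "a" and "c" rather than 70:20:10.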
1. Weighted reads profile-level ProviderWeights (not just global Weight)
- buildProviders() now accepts profileWeights map, applies per-profile
weights with precedence over provider-level defaults
2. Round-robin uses per-profile counters (not shared global state)
- LoadBalancer.rrCounters map[string]*uint64 isolates rotation per profile
- getProfileRRCounter() with double-checked locking for thread safety
3. Least-cost uses scenario model overrides for cost calculation
- Select() accepts modelOverrides param, passed to selectLeastCost()
- Provider cost now reflects actual model used (scenario override > p.Model > request model)
4. Disabled providers filtered BEFORE strategy selection
- filterDisabledProviders() moved before LoadBalancer.Select() in ServeHTTP()
- Prevents disabled providers from polluting RR counters, weighted distribution, least-* rankings
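The model precedence described in item 3 (scenario override > p.Model > request model) can be sketched as a small resolver. This is an illustration only: effectiveModel() is a hypothetical helper, and keying the override map by provider name is an assumption, not something the commit message states.

```go
package main

import "fmt"

// effectiveModel resolves which model a provider would be priced against.
// Precedence: scenario override, then the provider's own model, then the
// model named in the request. Empty strings are treated as "not set".
func effectiveModel(overrides map[string]string, providerName, providerModel, requestModel string) string {
	if m, ok := overrides[providerName]; ok && m != "" {
		return m
	}
	if providerModel != "" {
		return providerModel
	}
	return requestModel
}

func main() {
	// Hypothetical provider and model names, for illustration only.
	overrides := map[string]string{"provider-1": "model-override"}
	fmt.Println(effectiveModel(overrides, "provider-1", "model-provider", "model-request"))
	fmt.Println(effectiveModel(overrides, "provider-2", "", "model-request"))
}
```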
Tests: 4 new targeted tests covering each fix, all passing, 82.5% coverage, race-free.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
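The per-profile round-robin counters from item 2 can be sketched roughly as follows. This is a hedged approximation, not the PR's code: the rrCounters type and its methods are hypothetical, and only the map[string]*uint64 shape and the double-checked locking idea come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// rrCounters keeps one rotation counter per profile so that traffic on
// one profile does not advance another profile's rotation.
type rrCounters struct {
	mu       sync.RWMutex
	counters map[string]*uint64
}

func newRRCounters() *rrCounters {
	return &rrCounters{counters: make(map[string]*uint64)}
}

// counterFor returns the counter for a profile, creating it on first use.
// Fast path: read lock only. Slow path: take the write lock and check
// again, since another goroutine may have created the counter in between.
func (c *rrCounters) counterFor(profile string) *uint64 {
	c.mu.RLock()
	ctr, ok := c.counters[profile]
	c.mu.RUnlock()
	if ok {
		return ctr
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	if ctr, ok = c.counters[profile]; ok {
		return ctr
	}
	ctr = new(uint64)
	c.counters[profile] = ctr
	return ctr
}

// next returns the index of the next provider for the profile, or -1 if
// there are no providers. The increment is atomic, so no lock is held
// while rotating.
func (c *rrCounters) next(profile string, n int) int {
	if n <= 0 {
		return -1
	}
	v := atomic.AddUint64(c.counterFor(profile), 1) - 1
	return int(v % uint64(n))
}

func main() {
	rr := newRRCounters()
	for i := 0; i < 4; i++ {
		fmt.Println("dev ->", rr.next("dev", 3))
	}
	fmt.Println("prod ->", rr.next("prod", 3))
}
```

Storing pointers in the map lets the hot path do a single atomic add after a read-locked lookup; the write lock is only taken the first time a profile is seen.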
Reset globalSessionCache (data + keyOrder) before setting maxSize=3, preventing stale entries from other parallel tests from making eviction order non-deterministic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Add Weight field to Provider/ProviderConfig and ProviderMetrics to LogDB for latency-based routing
Test plan
- go test -cover ./internal/proxy
- go test -race ./internal/proxy
- go test -short ./tests/integration
🤖 Generated with Claude Code