feat: profile strategy-aware provider routing (019) #24

Open
john-zhh wants to merge 3 commits into feat/v3.0.1 from 019-profile-strategy-routing

@john-zhh john-zhh commented Mar 9, 2026

Summary

  • Add per-profile load balancing strategy support with 4 strategies: least-latency, least-cost, round-robin, and weighted
  • Wire full pipeline: ProfileProxy → ProxyServer → LoadBalancer.Select() with strategy decision logging
  • Add Weight field to Provider/ProviderConfig, ProviderMetrics to LogDB for latency-based routing
  • Fix integration test compilation errors (missing config import + double closing parens)

Test plan

  • Unit tests: 82.5% coverage (go test -cover ./internal/proxy)
  • Race detector: clean (go test -race ./internal/proxy)
  • Benchmarks: all strategies <0.02ms (target <5ms)
  • Integration tests: all pass (go test -short ./tests/integration)
  • Backward compatibility: empty strategy defaults to failover
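
The backward-compatibility item above can be pictured as a small normalization step. This is only a sketch: the `Strategy` type, the constant names, and `normalizeStrategy` are hypothetical and not taken from the PR. The strategy strings come from the Summary. Falling back to failover for unknown values is an assumption; the PR only states that an empty strategy defaults to failover.

```go
package main

import "strings"

// Strategy names a per-profile load balancing strategy.
// The type and constant names are illustrative, not the PR's identifiers.
type Strategy string

const (
	StrategyFailover     Strategy = "failover"
	StrategyLeastLatency Strategy = "least-latency"
	StrategyLeastCost    Strategy = "least-cost"
	StrategyRoundRobin   Strategy = "round-robin"
	StrategyWeighted     Strategy = "weighted"
)

// normalizeStrategy maps a raw config value to a known Strategy.
// An empty value falls back to failover, as the PR describes.
// Unknown values also fall back to failover here (an assumption).
func normalizeStrategy(raw string) Strategy {
	s := Strategy(strings.ToLower(strings.TrimSpace(raw)))
	switch s {
	case StrategyLeastLatency, StrategyLeastCost, StrategyRoundRobin, StrategyWeighted:
		return s
	default:
		return StrategyFailover
	}
}
```

With this shape, a profile that predates the strategy field and a profile that sets `failover` explicitly take the same path.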

🤖 Generated with Claude Code

john-zhh and others added 3 commits March 9, 2026 22:35
Add per-profile load balancing strategy support with 4 strategies:
least-latency (metrics-based), least-cost (pricing-based), round-robin
(atomic counter), and weighted (random selection with configurable weights).

- Extend LoadBalancer.Select() with strategy switch and decision logging
- Add selectWeighted() with health-aware weight recalculation
- Wire ProfileProxy → ProxyServer → LoadBalancer strategy pipeline
- Add Provider.Weight and ProviderConfig.Weight fields
- Add ProviderMetrics and GetProviderLatencyMetrics() to LogDB
- Comprehensive TDD: 82.5% coverage, race-free, benchmarks <0.02ms
- Fix integration test compilation (config import + syntax)
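
One way to read "health-aware weight recalculation" is that the weight total is recomputed on every call from the providers that are currently healthy. The sketch below works under that assumption. The `Provider` struct is a pared-down stand-in, the `Healthy` field and both function names are hypothetical, and the PR's actual `selectWeighted()` may differ.

```go
package main

import "math/rand"

// Provider is a simplified stand-in for the PR's Provider type.
// Healthy is a hypothetical field used only for this sketch.
type Provider struct {
	Name    string
	Weight  int
	Healthy bool
}

// pickWeighted returns a healthy provider chosen with probability
// proportional to its weight, or nil when no healthy provider has a
// positive weight. intn must return a value in [0, n).
func pickWeighted(providers []*Provider, intn func(n int) int) *Provider {
	// Recompute the total each call so unhealthy providers add nothing.
	total := 0
	for _, p := range providers {
		if p.Healthy && p.Weight > 0 {
			total += p.Weight
		}
	}
	if total == 0 {
		return nil
	}
	n := intn(total)
	for _, p := range providers {
		if !p.Healthy || p.Weight <= 0 {
			continue
		}
		if n < p.Weight {
			return p
		}
		n -= p.Weight
	}
	return nil // not reached when intn honors its contract
}

// pickWeightedRandom wires in the standard library's random source.
func pickWeightedRandom(providers []*Provider) *Provider {
	return pickWeighted(providers, rand.Intn)
}
```

Passing the random source as a function keeps the selection deterministic under test while production code uses `rand.Intn`.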

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Weighted reads profile-level ProviderWeights (not just global Weight)
   - buildProviders() now accepts profileWeights map, applies per-profile
     weights with precedence over provider-level defaults

2. Round-robin uses per-profile counters (not shared global state)
   - LoadBalancer.rrCounters map[string]*uint64 isolates rotation per profile
   - getProfileRRCounter() with double-checked locking for thread safety

3. Least-cost uses scenario model overrides for cost calculation
   - Select() accepts modelOverrides param, passed to selectLeastCost()
   - Provider cost now reflects actual model used (scenario override > p.Model > request model)

4. Disabled providers filtered BEFORE strategy selection
   - filterDisabledProviders() moved before LoadBalancer.Select() in ServeHTTP()
   - Prevents disabled providers from polluting RR counters, weighted distribution, least-* rankings
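
Fix 2 above describes per-profile counters guarded by double-checked locking. A minimal sketch of that pattern follows. The `rrCounters` type and its methods are hypothetical names; the commit message only confirms a `map[string]*uint64` and a `getProfileRRCounter()` helper, so the actual fields and signatures may differ.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// rrCounters keeps one rotation counter per profile so that traffic on
// one profile does not advance another profile's rotation.
type rrCounters struct {
	mu       sync.RWMutex
	counters map[string]*uint64
}

func newRRCounters() *rrCounters {
	return &rrCounters{counters: make(map[string]*uint64)}
}

// counterFor returns the counter for a profile, creating it on first use.
func (r *rrCounters) counterFor(profile string) *uint64 {
	// First check: shared lock, the common case once the counter exists.
	r.mu.RLock()
	c, ok := r.counters[profile]
	r.mu.RUnlock()
	if ok {
		return c
	}
	// Second check: exclusive lock, in case another goroutine created
	// the counter between the two lock acquisitions.
	r.mu.Lock()
	defer r.mu.Unlock()
	if existing, ok := r.counters[profile]; ok {
		return existing
	}
	c = new(uint64)
	r.counters[profile] = c
	return c
}

// next returns the provider index for the profile's next request among
// n providers, or -1 when n is not positive.
func (r *rrCounters) next(profile string, n int) int {
	if n <= 0 {
		return -1
	}
	v := atomic.AddUint64(r.counterFor(profile), 1) - 1
	return int(v % uint64(n))
}
```

Because the increment is atomic, the locks are only needed for map access, not for each rotation step.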

Tests: 4 new targeted tests covering each fix, all passing, 82.5% coverage, race-free.
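
The model precedence stated in fix 3 (scenario override > p.Model > request model) can be sketched as a small resolver. The function name and the assumption that overrides are keyed by provider name are hypothetical; the commit message does not specify the shape of `modelOverrides`.

```go
package main

// effectiveModel resolves which model a provider would serve, using the
// precedence from the commit message: scenario override first, then the
// provider's own model, then the model named in the request.
// Keying overrides by provider name is an assumption of this sketch.
func effectiveModel(providerName, providerModel, requestModel string, overrides map[string]string) string {
	if m, ok := overrides[providerName]; ok && m != "" {
		return m
	}
	if providerModel != "" {
		return providerModel
	}
	return requestModel
}
```

A least-cost comparison would then price each provider by the model this resolver returns rather than by the request model alone.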

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reset globalSessionCache (data + keyOrder) before setting maxSize=3,
preventing stale entries from other parallel tests from making eviction
order non-deterministic.
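
To illustrate why the reset matters, here is a sketch of an insertion-ordered cache with a test-only reset. Only the names `data`, `keyOrder`, and `maxSize` come from the commit message; the `sessionCache` type, its methods, and the eviction policy shown (oldest inserted key first) are assumptions.

```go
package main

import "sync"

// sessionCache is a hypothetical stand-in for globalSessionCache.
type sessionCache struct {
	mu       sync.Mutex
	data     map[string]string
	keyOrder []string // insertion order, oldest first
	maxSize  int
}

// resetForTest clears both data and keyOrder before applying the new
// size limit, so entries left behind by other tests cannot affect
// which key is evicted first.
func (c *sessionCache) resetForTest(maxSize int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data = make(map[string]string)
	c.keyOrder = nil
	c.maxSize = maxSize
}

// put stores a value, evicting the oldest key when the cache is full.
func (c *sessionCache) put(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.data[key]; !exists {
		if c.maxSize > 0 && len(c.keyOrder) >= c.maxSize {
			oldest := c.keyOrder[0]
			c.keyOrder = c.keyOrder[1:]
			delete(c.data, oldest)
		}
		c.keyOrder = append(c.keyOrder, key)
	}
	c.data[key] = value
}
```

Without the reset, a stale entry would occupy the oldest slot, and the first eviction would remove it instead of the key the test expects.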

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>