Last updated: 2026-03-04
OmniRoute is a local AI routing gateway and dashboard built on Next.js.
It provides a single OpenAI-compatible endpoint (/v1/*) and routes traffic across multiple upstream providers with translation, fallback, token refresh, and usage tracking.
Core capabilities:
- OpenAI-compatible API surface for CLI/tools (28 providers)
- Request/response translation across provider formats
- Model combo fallback (multi-model sequence)
- Account-level fallback (multi-account per provider)
- OAuth + API-key provider connection management
- Embedding generation via `/v1/embeddings` (6 providers, 9 models)
- Image generation via `/v1/images/generations` (4 providers, 9 models)
- Think tag parsing (`<think>...</think>`) for reasoning models
- Response sanitization for strict OpenAI SDK compatibility
- Role normalization (developer→system, system→user) for cross-provider compatibility
- Structured output conversion (json_schema → Gemini responseSchema)
- Local persistence for providers, keys, aliases, combos, settings, pricing
- Usage/cost tracking and request logging
- Optional cloud sync for multi-device/state sync
- IP allowlist/blocklist for API access control
- Thinking budget management (passthrough/auto/custom/adaptive)
- Global system prompt injection
- Session tracking and fingerprinting
- Per-account enhanced rate limiting with provider-specific profiles
- Circuit breaker pattern for provider resilience
- Anti-thundering herd protection with mutex locking
- Signature-based request deduplication cache
- Domain layer: model availability, cost rules, fallback policy, lockout policy
- Domain state persistence (SQLite write-through cache for fallbacks, budgets, lockouts, circuit breakers)
- Policy engine for centralized request evaluation (lockout → budget → fallback)
- Request telemetry with p50/p95/p99 latency aggregation
- Correlation ID (X-Request-Id) for end-to-end tracing
- Compliance audit logging with opt-out per API key
- Eval framework for LLM quality assurance
- Resilience UI dashboard with real-time circuit breaker status
- Modular OAuth providers (12 individual modules under `src/lib/oauth/providers/`)
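The p50/p95/p99 latency aggregation named in the capability list can be illustrated with a nearest-rank percentile over recorded request durations. This is a minimal sketch, not the code in `src/lib/domain/requestTelemetry.ts`; the function names and the choice of the nearest-rank method are assumptions made for illustration.

```typescript
// Hypothetical helper: nearest-rank percentile over request latencies (ms).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length, Math.max(1, rank)) - 1;
  return sorted[index];
}

// Hypothetical summary shape; the real telemetry payload may differ.
function summarizeLatency(samples: number[]) {
  return {
    count: samples.length,
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

Sorting a copy on every call is fine for a dashboard summary; a high-volume path would more likely keep a bounded window of samples.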
Primary runtime model:
- Next.js app routes under `src/app/api/*` implement both dashboard APIs and compatibility APIs
- A shared SSE/routing core in `src/sse/*` + `open-sse/*` handles provider execution, translation, streaming, fallback, and usage
In scope:
- Local gateway runtime
- Dashboard management APIs
- Provider authentication and token refresh
- Request translation and SSE streaming
- Local state + usage persistence
- Optional cloud sync orchestration

Out of scope:
- Cloud service implementation behind `NEXT_PUBLIC_CLOUD_URL`
- Provider SLA/control plane outside local process
- External CLI binaries themselves (Claude CLI, Codex CLI, etc.)
flowchart LR
subgraph Clients[Developer Clients]
C1[Claude Code]
C2[Codex CLI]
C3[OpenClaw / Droid / Cline / Continue / Roo]
C4[Custom OpenAI-compatible clients]
BROWSER[Browser Dashboard]
end
subgraph Router[OmniRoute Local Process]
API[V1 Compatibility API\n/v1/*]
DASH[Dashboard + Management API\n/api/*]
CORE[SSE + Translation Core\nopen-sse + src/sse]
DB[(storage.sqlite)]
UDB[(usage tables + log artifacts)]
end
subgraph Upstreams[Upstream Providers]
P1[OAuth Providers\nClaude/Codex/Gemini/Qwen/iFlow/GitHub/Kiro/Cursor/Antigravity]
P2[API Key Providers\nOpenAI/Anthropic/OpenRouter/GLM/Kimi/MiniMax\nDeepSeek/Groq/xAI/Mistral/Perplexity\nTogether/Fireworks/Cerebras/Cohere/NVIDIA]
P3[Compatible Nodes\nOpenAI-compatible / Anthropic-compatible]
end
subgraph Cloud[Optional Cloud Sync]
CLOUD[Cloud Sync Endpoint\nNEXT_PUBLIC_CLOUD_URL]
end
C1 --> API
C2 --> API
C3 --> API
C4 --> API
BROWSER --> DASH
API --> CORE
DASH --> DB
CORE --> DB
CORE --> UDB
CORE --> P1
CORE --> P2
CORE --> P3
DASH --> CLOUD
Main directories:
- `src/app/api/v1/*` and `src/app/api/v1beta/*` for compatibility APIs
- `src/app/api/*` for management/configuration APIs
- Next rewrites in `next.config.mjs` map `/v1/*` to `/api/v1/*`
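The `/v1/*` to `/api/v1/*` mapping corresponds to a Next.js `rewrites()` rule. The sketch below shows what such a rule typically looks like; it is an assumption about the shape of the configuration, not a copy of the repository's `next.config.mjs`, which may contain additional rules.

```typescript
// Shape of a single Next.js rewrite rule (subset of the real type).
type RewriteRule = { source: string; destination: string };

// In next.config.mjs this function would be a property of the exported config.
async function rewrites(): Promise<RewriteRule[]> {
  return [
    // e.g. /v1/chat/completions is served by /api/v1/chat/completions
    { source: "/v1/:path*", destination: "/api/v1/:path*" },
  ];
}
```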
Important compatibility routes:
- `src/app/api/v1/chat/completions/route.ts`
- `src/app/api/v1/messages/route.ts`
- `src/app/api/v1/responses/route.ts`
- `src/app/api/v1/models/route.ts`: includes custom models with `custom: true`
- `src/app/api/v1/embeddings/route.ts`: embedding generation (6 providers)
- `src/app/api/v1/images/generations/route.ts`: image generation (4+ providers incl. Antigravity/Nebius)
- `src/app/api/v1/messages/count_tokens/route.ts`
- `src/app/api/v1/providers/[provider]/chat/completions/route.ts`: dedicated per-provider chat
- `src/app/api/v1/providers/[provider]/embeddings/route.ts`: dedicated per-provider embeddings
- `src/app/api/v1/providers/[provider]/images/generations/route.ts`: dedicated per-provider images
- `src/app/api/v1beta/models/route.ts`
- `src/app/api/v1beta/models/[...path]/route.ts`
Management domains:
- Auth/settings: `src/app/api/auth/*`, `src/app/api/settings/*`
- Providers/connections: `src/app/api/providers*`
- Provider nodes: `src/app/api/provider-nodes*`
- Custom models: `src/app/api/provider-models` (GET/POST/DELETE)
- Model catalog: `src/app/api/models/route.ts` (GET)
- Proxy config: `src/app/api/settings/proxy` (GET/PUT/DELETE) + `src/app/api/settings/proxy/test` (POST)
- OAuth: `src/app/api/oauth/*`
- Keys/aliases/combos/pricing: `src/app/api/keys*`, `src/app/api/models/alias`, `src/app/api/combos*`, `src/app/api/pricing`
- Usage: `src/app/api/usage/*`
- Sync/cloud: `src/app/api/sync/*`, `src/app/api/cloud/*`
- CLI tooling helpers: `src/app/api/cli-tools/*`
- IP filter: `src/app/api/settings/ip-filter` (GET/PUT)
- Thinking budget: `src/app/api/settings/thinking-budget` (GET/PUT)
- System prompt: `src/app/api/settings/system-prompt` (GET/PUT)
- Sessions: `src/app/api/sessions` (GET)
- Rate limits: `src/app/api/rate-limits` (GET)
- Resilience: `src/app/api/resilience` (GET/PATCH); provider profiles, circuit breaker, rate limit state
- Resilience reset: `src/app/api/resilience/reset` (POST); reset breakers + cooldowns
- Cache stats: `src/app/api/cache/stats` (GET/DELETE)
- Model availability: `src/app/api/models/availability` (GET/POST)
- Telemetry: `src/app/api/telemetry/summary` (GET)
- Budget: `src/app/api/usage/budget` (GET/POST)
- Fallback chains: `src/app/api/fallback/chains` (GET/POST/DELETE)
- Compliance audit: `src/app/api/compliance/audit-log` (GET)
- Evals: `src/app/api/evals` (GET/POST), `src/app/api/evals/[suiteId]` (GET)
- Policies: `src/app/api/policies` (GET/POST)
Main flow modules:
- Entry: `src/sse/handlers/chat.ts`
- Core orchestration: `open-sse/handlers/chatCore.ts`
- Provider execution adapters: `open-sse/executors/*`
- Format detection/provider config: `open-sse/services/provider.ts`
- Model parse/resolve: `src/sse/services/model.ts`, `open-sse/services/model.ts`
- Account fallback logic: `open-sse/services/accountFallback.ts`
- Translation registry: `open-sse/translator/index.ts`
- Stream transformations: `open-sse/utils/stream.ts`, `open-sse/utils/streamHandler.ts`
- Usage extraction/normalization: `open-sse/utils/usageTracking.ts`
- Think tag parser: `open-sse/utils/thinkTagParser.ts`
- Embedding handler: `open-sse/handlers/embeddings.ts`
- Embedding provider registry: `open-sse/config/embeddingRegistry.ts`
- Image generation handler: `open-sse/handlers/imageGeneration.ts`
- Image provider registry: `open-sse/config/imageRegistry.ts`
- Response sanitization: `open-sse/handlers/responseSanitizer.ts`
- Role normalization: `open-sse/services/roleNormalizer.ts`
Services (business logic):
- Account selection/scoring: `open-sse/services/accountSelector.ts`
- Context lifecycle management: `open-sse/services/contextManager.ts`
- IP filter enforcement: `open-sse/services/ipFilter.ts`
- Session tracking: `open-sse/services/sessionManager.ts`
- Request deduplication: `open-sse/services/signatureCache.ts`
- System prompt injection: `open-sse/services/systemPrompt.ts`
- Thinking budget management: `open-sse/services/thinkingBudget.ts`
- Wildcard model routing: `open-sse/services/wildcardRouter.ts`
- Rate limit management: `open-sse/services/rateLimitManager.ts`
- Circuit breaker: `open-sse/services/circuitBreaker.ts`
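As an illustration of the circuit breaker pattern named in the services list, the sketch below implements the usual closed → open → half-open cycle. It is not the contents of `open-sse/services/circuitBreaker.ts`; the class name, thresholds, and method names are assumptions, and the clock is injectable only to keep the example testable.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Minimal circuit breaker; all names and defaults here are illustrative.
class SimpleCircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 3, resetTimeoutMs = 30_000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  // Returns true when a request may be sent to the provider.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      // Timeout elapsed: let probe traffic through until an outcome is recorded.
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed probe, or too many consecutive failures, (re)opens the breaker.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A caller would check `canRequest()` before dispatching to a provider and report the outcome with `recordSuccess()` or `recordFailure()`.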
Domain layer modules:
- Model availability: `src/lib/domain/modelAvailability.ts`
- Cost rules/budgets: `src/lib/domain/costRules.ts`
- Fallback policy: `src/lib/domain/fallbackPolicy.ts`
- Combo resolver: `src/lib/domain/comboResolver.ts`
- Lockout policy: `src/lib/domain/lockoutPolicy.ts`
- Policy engine: `src/domain/policyEngine.ts` (centralized lockout → budget → fallback evaluation)
- Error codes catalog: `src/lib/domain/errorCodes.ts`
- Request ID: `src/lib/domain/requestId.ts`
- Fetch timeout: `src/lib/domain/fetchTimeout.ts`
- Request telemetry: `src/lib/domain/requestTelemetry.ts`
- Compliance/audit: `src/lib/domain/compliance/index.ts`
- Eval runner: `src/lib/domain/evalRunner.ts`
- Domain state persistence: `src/lib/db/domainState.ts` (SQLite CRUD for fallback chains, budgets, cost history, lockout state, circuit breakers)
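The lockout → budget → fallback ordering of the policy engine can be pictured as three checks run in sequence, where the first failing check short-circuits the rest. The sketch below is hypothetical: the input shape, what each check is keyed on (API key id here), and the decision type are assumptions, not the interface of `src/domain/policyEngine.ts`.

```typescript
type PolicyDecision =
  | { allowed: true; model: string }
  | { allowed: false; reason: "locked_out" | "budget_exceeded" | "no_available_model" };

// Hypothetical snapshot of domain state used for one evaluation.
interface PolicyInput {
  apiKeyId: string;
  model: string;
  lockedOut: Set<string>;
  spentUsd: Map<string, number>;
  budgetUsd: Map<string, number>;
  unavailableModels: Set<string>;
  fallbackChains: Map<string, string[]>;
}

function evaluatePolicy(input: PolicyInput): PolicyDecision {
  // 1) Lockout: reject immediately.
  if (input.lockedOut.has(input.apiKeyId)) {
    return { allowed: false, reason: "locked_out" };
  }
  // 2) Budget: reject when a configured limit has been reached.
  const limit = input.budgetUsd.get(input.apiKeyId);
  const spent = input.spentUsd.get(input.apiKeyId) ?? 0;
  if (limit !== undefined && spent >= limit) {
    return { allowed: false, reason: "budget_exceeded" };
  }
  // 3) Fallback: keep the requested model if available, else walk its chain.
  if (!input.unavailableModels.has(input.model)) {
    return { allowed: true, model: input.model };
  }
  const chain = input.fallbackChains.get(input.model) ?? [];
  const next = chain.find((m) => !input.unavailableModels.has(m));
  return next ? { allowed: true, model: next } : { allowed: false, reason: "no_available_model" };
}
```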
OAuth provider modules (12 individual files under src/lib/oauth/providers/):
- Registry index: `src/lib/oauth/providers/index.ts`
- Individual providers: `claude.ts`, `codex.ts`, `gemini.ts`, `antigravity.ts`, `iflow.ts`, `qwen.ts`, `kimi-coding.ts`, `github.ts`, `kiro.ts`, `cursor.ts`, `kilocode.ts`, `cline.ts`
- Thin wrapper: `src/lib/oauth/providers.ts` (re-exports from individual modules)
Primary state DB (SQLite):
- Core infra: `src/lib/db/core.ts` (better-sqlite3, migrations, WAL)
- Re-export facade: `src/lib/localDb.ts` (thin compatibility layer for callers)
- file: `${DATA_DIR}/storage.sqlite` (or `$XDG_CONFIG_HOME/omniroute/storage.sqlite` when set, else `~/.omniroute/storage.sqlite`)
- entities (tables + KV namespaces): providerConnections, providerNodes, modelAliases, combos, apiKeys, settings, pricing, customModels, proxyConfig, ipFilter, thinkingBudget, systemPrompt
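The storage path precedence described above (`DATA_DIR`, then `XDG_CONFIG_HOME`, then the home directory) can be written as a small pure function. This is a sketch under assumptions: the function name is hypothetical, the environment and home directory are passed in rather than read from the process, and any platform-specific handling is left out.

```typescript
// Hypothetical resolver mirroring the documented precedence:
// DATA_DIR -> $XDG_CONFIG_HOME/omniroute -> ~/.omniroute
function resolveStoragePath(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  const base =
    env.DATA_DIR ||
    (env.XDG_CONFIG_HOME ? `${env.XDG_CONFIG_HOME}/omniroute` : `${homeDir}/.omniroute`);
  return `${base}/storage.sqlite`;
}
```

In a Node.js process this would typically be called with `process.env` and `os.homedir()`.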
Usage persistence:
- facade: `src/lib/usageDb.ts` (decomposed modules in `src/lib/usage/*`)
- SQLite tables in `storage.sqlite`: `usage_history`, `call_logs`, `proxy_logs`
- optional file artifacts remain for compatibility/debug (`${DATA_DIR}/log.txt`, `${DATA_DIR}/call_logs/`, `<repo>/logs/...`)
- legacy JSON files are migrated to SQLite by startup migrations when present
Domain State DB (SQLite):
- `src/lib/db/domainState.ts`: CRUD operations for domain state
- Tables (created in `src/lib/db/core.ts`): `domain_fallback_chains`, `domain_budgets`, `domain_cost_history`, `domain_lockout_state`, `domain_circuit_breakers`
- Write-through cache pattern: in-memory Maps are authoritative at runtime; mutations are written synchronously to SQLite; state is restored from DB on cold start
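The write-through pattern described above can be sketched with an in-memory `Map` in front of a synchronous store. To keep the example self-contained, the store is an interface rather than a real SQLite handle; `SyncStore`, `WriteThroughMap`, and their methods are hypothetical names, not the API of `src/lib/db/domainState.ts`.

```typescript
// Stand-in for the persistence layer (SQLite in the real system).
interface SyncStore<V> {
  loadAll(): Array<[string, V]>;
  upsert(key: string, value: V): void;
  remove(key: string): void;
}

class WriteThroughMap<V> {
  private readonly cache = new Map<string, V>();
  private readonly store: SyncStore<V>;

  constructor(store: SyncStore<V>) {
    this.store = store;
    // Cold start: restore in-memory state from the persisted rows.
    for (const [key, value] of store.loadAll()) this.cache.set(key, value);
  }

  // Reads are served from memory only.
  get(key: string): V | undefined {
    return this.cache.get(key);
  }

  // Every mutation is persisted synchronously, then applied in memory.
  set(key: string, value: V): void {
    this.store.upsert(key, value);
    this.cache.set(key, value);
  }

  delete(key: string): boolean {
    this.store.remove(key);
    return this.cache.delete(key);
  }
}
```

Persisting before updating the map means a failed write leaves memory and disk in agreement; whether the real module orders the two steps this way is not stated in this document.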
Auth and security surfaces:
- Dashboard cookie auth: `src/proxy.ts`, `src/app/api/auth/login/route.ts`
- API key generation/verification: `src/shared/utils/apiKey.ts`
- Provider secrets persisted in `providerConnections` entries
- Outbound proxy support via `open-sse/utils/proxyFetch.ts` (env vars) and `open-sse/utils/networkProxy.ts` (configurable per-provider or global)
Cloud sync:
- Scheduler init: `src/lib/initCloudSync.ts`, `src/shared/services/initializeCloudSync.ts`
- Periodic task: `src/shared/services/cloudSyncScheduler.ts`
- Control route: `src/app/api/sync/cloud/route.ts`
sequenceDiagram
autonumber
participant Client as CLI/SDK Client
participant Route as /api/v1/chat/completions
participant Chat as src/sse/handlers/chat
participant Core as open-sse/handlers/chatCore
participant Model as Model Resolver
participant Auth as Credential Selector
participant Exec as Provider Executor
participant Prov as Upstream Provider
participant Stream as Stream Translator
participant Usage as usageDb
Client->>Route: POST /v1/chat/completions
Route->>Chat: handleChat(request)
Chat->>Model: parse/resolve model or combo
alt Combo model
Chat->>Chat: iterate combo models (handleComboChat)
end
Chat->>Auth: getProviderCredentials(provider)
Auth-->>Chat: active account + tokens/api key
Chat->>Core: handleChatCore(body, modelInfo, credentials)
Core->>Core: detect source format
Core->>Core: translate request to target format
Core->>Exec: execute(provider, transformedBody)
Exec->>Prov: upstream API call
Prov-->>Exec: SSE/JSON response
Exec-->>Core: response + metadata
alt 401/403
Core->>Exec: refreshCredentials()
Exec-->>Core: updated tokens
Core->>Exec: retry request
end
Core->>Stream: translate/normalize stream to client format
Stream-->>Client: SSE chunks / JSON response
Stream->>Usage: extract usage + persist history/log
flowchart TD
A[Incoming model string] --> B{Is combo name?}
B -- Yes --> C[Load combo models sequence]
B -- No --> D[Single model path]
C --> E[Try model N]
E --> F[Resolve provider/model]
D --> F
F --> G[Select account credentials]
G --> H{Credentials available?}
H -- No --> I[Return provider unavailable]
H -- Yes --> J[Execute request]
J --> K{Success?}
K -- Yes --> L[Return response]
K -- No --> M{Fallback-eligible error?}
M -- No --> N[Return error]
M -- Yes --> O[Mark account unavailable cooldown]
O --> P{Another account for provider?}
P -- Yes --> G
P -- No --> Q{In combo with next model?}
Q -- Yes --> E
Q -- No --> R[Return all unavailable]
Fallback decisions are driven by open-sse/services/accountFallback.ts using status codes and error-message heuristics.
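The account-then-model fallback order in the flowchart can be condensed into two nested loops. The sketch below is an assumption-laden illustration, not the logic in `open-sse/services/accountFallback.ts`: the status codes treated as fallback-eligible, the 60-second cooldown, the 503 returned on exhaustion, and all type and function names are hypothetical, and the real implementation also inspects error messages.

```typescript
interface Attempt { ok: boolean; status: number; body: string }
interface Account { id: string; cooldownUntil: number }

// Assumed set of statuses that justify trying another account or model.
function isFallbackEligible(status: number): boolean {
  return status === 401 || status === 403 || status === 429 || status >= 500;
}

async function routeWithFallback(
  models: string[], // one entry for a single model, several for a combo
  accountsFor: (model: string) => Account[],
  execute: (model: string, account: Account) => Promise<Attempt>,
  now: number = Date.now(),
  cooldownMs = 60_000,
): Promise<Attempt> {
  for (const model of models) {
    for (const account of accountsFor(model)) {
      if (account.cooldownUntil > now) continue; // still cooling down
      const result = await execute(model, account);
      if (result.ok) return result;
      if (!isFallbackEligible(result.status)) return result; // surface the error as-is
      account.cooldownUntil = now + cooldownMs; // mark unavailable, try the next account
    }
    // All accounts for this model are exhausted: move on to the next combo model.
  }
  return { ok: false, status: 503, body: "all accounts/models unavailable" };
}
```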
sequenceDiagram
autonumber
participant UI as Dashboard UI
participant OAuth as /api/oauth/[provider]/[action]
participant ProvAuth as Provider Auth Server
participant DB as localDb
participant Test as /api/providers/[id]/test
participant Exec as Provider Executor
UI->>OAuth: GET authorize or device-code
OAuth->>ProvAuth: create auth/device flow
ProvAuth-->>OAuth: auth URL or device code payload
OAuth-->>UI: flow data
UI->>OAuth: POST exchange or poll
OAuth->>ProvAuth: token exchange/poll
ProvAuth-->>OAuth: access/refresh tokens
OAuth->>DB: createProviderConnection(oauth data)
OAuth-->>UI: success + connection id
UI->>Test: POST /api/providers/[id]/test
Test->>Exec: validate credentials / optional refresh
Exec-->>Test: valid or refreshed token info
Test->>DB: update status/tokens/errors
Test-->>UI: validation result
Refresh during live traffic is executed inside open-sse/handlers/chatCore.ts via executor refreshCredentials().
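The refresh-and-retry step on a 401/403 can be sketched as a wrapper that retries exactly once with refreshed credentials. The types and the `executeWithRefresh` helper are hypothetical; only the `refreshCredentials()` name and the 401/403 trigger come from this document.

```typescript
interface Creds { accessToken: string; refreshToken?: string }
interface UpstreamResult { status: number; body: string }

async function executeWithRefresh(
  creds: Creds,
  call: (c: Creds) => Promise<UpstreamResult>,
  refreshCredentials: (c: Creds) => Promise<Creds | null>,
): Promise<{ result: UpstreamResult; creds: Creds }> {
  const first = await call(creds);
  if (first.status !== 401 && first.status !== 403) {
    return { result: first, creds };
  }
  // Auth failure: try to obtain fresh tokens, then retry a single time.
  const refreshed = await refreshCredentials(creds);
  if (!refreshed) return { result: first, creds }; // nothing to retry with
  return { result: await call(refreshed), creds: refreshed };
}
```

Returning the refreshed credentials alongside the result lets the caller persist the new tokens.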
sequenceDiagram
autonumber
participant UI as Endpoint Page UI
participant Sync as /api/sync/cloud
participant DB as localDb
participant Cloud as External Cloud Sync
participant Claude as ~/.claude/settings.json
UI->>Sync: POST action=enable
Sync->>DB: set cloudEnabled=true
Sync->>DB: ensure API key exists
Sync->>Cloud: POST /sync/{machineId} (providers/aliases/combos/keys)
Cloud-->>Sync: sync result
Sync->>Cloud: GET /{machineId}/v1/verify
Sync-->>UI: enabled + verification status
UI->>Sync: POST action=sync
Sync->>Cloud: POST /sync/{machineId}
Cloud-->>Sync: remote data
Sync->>DB: update newer local tokens/status
Sync-->>UI: synced
UI->>Sync: POST action=disable
Sync->>DB: set cloudEnabled=false
Sync->>Cloud: DELETE /sync/{machineId}
Sync->>Claude: switch ANTHROPIC_BASE_URL back to local (if needed)
Sync-->>UI: disabled
Periodic sync is triggered by CloudSyncScheduler when cloud is enabled.
erDiagram
SETTINGS ||--o{ PROVIDER_CONNECTION : controls
PROVIDER_NODE ||--o{ PROVIDER_CONNECTION : backs_compatible_provider
PROVIDER_CONNECTION ||--o{ USAGE_ENTRY : emits_usage
SETTINGS {
boolean cloudEnabled
number stickyRoundRobinLimit
boolean requireLogin
string password_hash
string fallbackStrategy
json rateLimitDefaults
json providerProfiles
}
PROVIDER_CONNECTION {
string id
string provider
string authType
string name
number priority
boolean isActive
string apiKey
string accessToken
string refreshToken
string expiresAt
string testStatus
string lastError
string rateLimitedUntil
json providerSpecificData
}
PROVIDER_NODE {
string id
string type
string name
string prefix
string apiType
string baseUrl
}
MODEL_ALIAS {
string alias
string targetModel
}
COMBO {
string id
string name
string[] models
}
API_KEY {
string id
string name
string key
string machineId
}
USAGE_ENTRY {
string provider
string model
number prompt_tokens
number completion_tokens
string connectionId
string timestamp
}
CUSTOM_MODEL {
string id
string name
string providerId
}
PROXY_CONFIG {
string global
json providers
}
IP_FILTER {
string mode
string[] allowlist
string[] blocklist
}
THINKING_BUDGET {
string mode
number customBudget
string effortLevel
}
SYSTEM_PROMPT {
boolean enabled
string prompt
string position
}
Physical storage files:
- primary runtime DB: `${DATA_DIR}/storage.sqlite`
- request log lines: `${DATA_DIR}/log.txt` (compat/debug artifact)
- structured call payload archives: `${DATA_DIR}/call_logs/`
- optional translator/request debug sessions: `<repo>/logs/...`
flowchart LR
subgraph LocalHost[Developer Host]
CLI[CLI Tools]
Browser[Dashboard Browser]
end
subgraph ContainerOrProcess[OmniRoute Runtime]
Next[Next.js Server\nPORT=20128]
Core[SSE Core + Executors]
MainDB[(storage.sqlite)]
UsageDB[(usage tables + log artifacts)]
end
subgraph External[External Services]
Providers[AI Providers]
SyncCloud[Cloud Sync Service]
end
CLI --> Next
Browser --> Next
Next --> Core
Next --> MainDB
Core --> MainDB
Core --> UsageDB
Core --> Providers
Next --> SyncCloud
- `src/app/api/v1/*`, `src/app/api/v1beta/*`: compatibility APIs
- `src/app/api/v1/providers/[provider]/*`: dedicated per-provider routes (chat, embeddings, images)
- `src/app/api/providers*`: provider CRUD, validation, testing
- `src/app/api/provider-nodes*`: custom compatible node management
- `src/app/api/provider-models`: custom model management (CRUD)
- `src/app/api/models/route.ts`: model catalog API (aliases + custom models)
- `src/app/api/oauth/*`: OAuth/device-code flows
- `src/app/api/keys*`: local API key lifecycle
- `src/app/api/models/alias`: alias management
- `src/app/api/combos*`: fallback combo management
- `src/app/api/pricing`: pricing overrides for cost calculation
- `src/app/api/settings/proxy`: proxy configuration (GET/PUT/DELETE)
- `src/app/api/settings/proxy/test`: outbound proxy connectivity test (POST)
- `src/app/api/usage/*`: usage and logs APIs
- `src/app/api/sync/*` + `src/app/api/cloud/*`: cloud sync and cloud-facing helpers
- `src/app/api/cli-tools/*`: local CLI config writers/checkers
- `src/app/api/settings/ip-filter`: IP allowlist/blocklist (GET/PUT)
- `src/app/api/settings/thinking-budget`: thinking token budget config (GET/PUT)
- `src/app/api/settings/system-prompt`: global system prompt (GET/PUT)
- `src/app/api/sessions`: active session listing (GET)
- `src/app/api/rate-limits`: per-account rate limit status (GET)

- `src/sse/handlers/chat.ts`: request parse, combo handling, account selection loop
- `open-sse/handlers/chatCore.ts`: translation, executor dispatch, retry/refresh handling, stream setup
- `open-sse/executors/*`: provider-specific network and format behavior

- `open-sse/translator/index.ts`: translator registry and orchestration
- Request translators: `open-sse/translator/request/*`
- Response translators: `open-sse/translator/response/*`
- Format constants: `open-sse/translator/formats.ts`

- `src/lib/db/*`: persistent config/state and domain persistence on SQLite
- `src/lib/localDb.ts`: compatibility re-export for DB modules
- `src/lib/usageDb.ts`: usage history/call logs facade on top of SQLite tables
Each provider has a specialized executor extending BaseExecutor (in open-sse/executors/base.ts), which provides URL building, header construction, retry with exponential backoff, credential refresh hooks, and the execute() orchestration method.
| Executor | Provider(s) | Special Handling |
|---|---|---|
| `DefaultExecutor` | OpenAI, Claude, Gemini, Qwen, iFlow, OpenRouter, GLM, Kimi, MiniMax, DeepSeek, Groq, xAI, Mistral, Perplexity, Together, Fireworks, Cerebras, Cohere, NVIDIA | Dynamic URL/header config per provider |
| `AntigravityExecutor` | Google Antigravity | Custom project/session IDs, Retry-After parsing |
| `CodexExecutor` | OpenAI Codex | Injects system instructions, forces reasoning effort |
| `CursorExecutor` | Cursor IDE | ConnectRPC protocol, Protobuf encoding, request signing via checksum |
| `GithubExecutor` | GitHub Copilot | Copilot token refresh, VSCode-mimicking headers |
| `KiroExecutor` | AWS CodeWhisperer/Kiro | AWS EventStream binary format → SSE conversion |
| `GeminiCLIExecutor` | Gemini CLI | Google OAuth token refresh cycle |
All other providers (including custom compatible nodes) use the DefaultExecutor.
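Retry with exponential backoff, one of the responsibilities attributed to `BaseExecutor`, generally follows the shape below. This is a generic sketch, not the code in `open-sse/executors/base.ts`; the helper names, the 500 ms base delay, the 8-second cap, and the retry count are assumptions.

```typescript
// Delay doubles with each attempt and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt; no wait after the final failure.
      if (attempt < maxRetries) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

The `sleep` parameter exists so the delay can be replaced in tests; production code would normally also add jitter and honor a provider's `Retry-After` header where one is sent.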
| Provider | Format | Auth | Stream | Non-Stream | Token Refresh | Usage API |
|---|---|---|---|---|---|---|
| Claude | claude | API Key / OAuth | ✅ | ✅ | ✅ | |
| Gemini | gemini | API Key / OAuth | ✅ | ✅ | ✅ | |
| Gemini CLI | gemini-cli | OAuth | ✅ | ✅ | ✅ | |
| Antigravity | antigravity | OAuth | ✅ | ✅ | ✅ | ✅ Full quota API |
| OpenAI | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Codex | openai-responses | OAuth | ✅ forced | ❌ | ✅ | ✅ Rate limits |
| GitHub Copilot | openai | OAuth + Copilot Token | ✅ | ✅ | ✅ | ✅ Quota snapshots |
| Cursor | cursor | Custom checksum | ✅ | ✅ | ❌ | ❌ |
| Kiro | kiro | AWS SSO OIDC | ✅ (EventStream) | ❌ | ✅ | ✅ Usage limits |
| Qwen | openai | OAuth | ✅ | ✅ | ✅ | |
| iFlow | openai | OAuth (Basic) | ✅ | ✅ | ✅ | |
| OpenRouter | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| GLM/Kimi/MiniMax | claude | API Key | ✅ | ✅ | ❌ | ❌ |
| DeepSeek | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Groq | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| xAI (Grok) | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Mistral | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Perplexity | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Together AI | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Fireworks AI | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Cerebras | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Cohere | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| NVIDIA NIM | openai | API Key | ✅ | ✅ | ❌ | ❌ |
Detected source formats include:
- `openai`
- `openai-responses`
- `claude`
- `gemini`
Target formats include:
- OpenAI chat/Responses
- Claude
- Gemini/Gemini-CLI/Antigravity envelope
- Kiro
- Cursor
Translations use OpenAI as the hub format — all conversions go through OpenAI as intermediate:
Source Format → OpenAI (hub) → Target Format
Translations are selected dynamically based on source payload shape and provider target format.
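The hub-format idea can be shown with a tiny registry in which each format only knows how to convert to and from the OpenAI shape. The sketch below is illustrative and far smaller than the real translators under `open-sse/translator/`: the adapter interface and `translate` function are hypothetical, and the only field handled is the system prompt, which Claude's Messages format carries as a top-level `system` field while OpenAI chat carries it as a message.

```typescript
type ChatMessage = { role: string; content: string };
type HubBody = { model: string; messages: ChatMessage[] }; // OpenAI-style hub shape
type ClaudeBody = { model: string; system?: string; messages: ChatMessage[] };

// Each format converts to/from the hub only, so N formats need N adapters.
interface FormatAdapter {
  toHub(body: any): HubBody;
  fromHub(hub: HubBody): any;
}

const adapters: Record<string, FormatAdapter> = {
  openai: { toHub: (b) => b as HubBody, fromHub: (h) => h },
  claude: {
    toHub: (b: ClaudeBody) => ({
      model: b.model,
      messages: b.system ? [{ role: "system", content: b.system }, ...b.messages] : b.messages,
    }),
    fromHub: (h): ClaudeBody => {
      const system = h.messages.filter((m) => m.role === "system").map((m) => m.content).join("\n");
      const messages = h.messages.filter((m) => m.role !== "system");
      return system ? { model: h.model, system, messages } : { model: h.model, messages };
    },
  },
};

function translate(source: string, target: string, body: unknown): unknown {
  const from = adapters[source];
  const to = adapters[target];
  if (!from || !to) throw new Error(`unsupported format pair: ${source} -> ${target}`);
  if (source === target) return body; // no conversion needed
  return to.fromHub(from.toHub(body)); // Source Format -> OpenAI (hub) -> Target Format
}
```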
Additional processing layers in the translation pipeline:
- Response sanitization: strips non-standard fields from OpenAI-format responses (both streaming and non-streaming) to ensure strict SDK compliance
- Role normalization: converts `developer` → `system` for non-OpenAI targets; merges `system` → `user` for models that reject the system role (GLM, ERNIE)
- Think tag extraction: parses `<think>...</think>` blocks from content into the `reasoning_content` field
- Structured output: converts OpenAI `response_format.json_schema` to Gemini's `responseMimeType` + `responseSchema`
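Two of these layers, role normalization and think tag extraction, are simple enough to sketch. The functions below are simplified assumptions rather than the code in `roleNormalizer.ts` or `thinkTagParser.ts`: the option names are hypothetical, system messages are relabeled as user messages instead of being merged, and only complete (non-streaming) content is handled.

```typescript
interface Msg { role: string; content: string }

// developer -> system for non-OpenAI targets; system -> user when unsupported.
function normalizeRoles(
  messages: Msg[],
  opts: { targetIsOpenAI: boolean; supportsSystem: boolean },
): Msg[] {
  return messages.map((m) => {
    let role = m.role;
    if (role === "developer" && !opts.targetIsOpenAI) role = "system";
    if (role === "system" && !opts.supportsSystem) role = "user";
    return { ...m, role };
  });
}

// Moves <think>...</think> blocks out of content into reasoning_content.
function extractThink(content: string): { content: string; reasoning_content?: string } {
  const parts: string[] = [];
  const stripped = content.replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
    parts.push(inner.trim());
    return "";
  });
  return parts.length > 0
    ? { content: stripped.trim(), reasoning_content: parts.join("\n") }
    : { content };
}
```

A streaming parser needs extra state, since a `<think>` tag can be split across SSE chunks; that case is not covered here.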
| Endpoint | Format | Handler |
|---|---|---|
| `POST /v1/chat/completions` | OpenAI Chat | `src/sse/handlers/chat.ts` |
| `POST /v1/messages` | Claude Messages | Same handler (auto-detected) |
| `POST /v1/responses` | OpenAI Responses | `open-sse/handlers/responsesHandler.ts` |
| `POST /v1/embeddings` | OpenAI Embeddings | `open-sse/handlers/embeddings.ts` |
| `GET /v1/embeddings` | Model listing | API route |
| `POST /v1/images/generations` | OpenAI Images | `open-sse/handlers/imageGeneration.ts` |
| `GET /v1/images/generations` | Model listing | API route |
| `POST /v1/providers/{provider}/chat/completions` | OpenAI Chat | Dedicated per-provider with model validation |
| `POST /v1/providers/{provider}/embeddings` | OpenAI Embeddings | Dedicated per-provider with model validation |
| `POST /v1/providers/{provider}/images/generations` | OpenAI Images | Dedicated per-provider with model validation |
| `POST /v1/messages/count_tokens` | Claude Token Count | API route |
| `GET /v1/models` | OpenAI Models list | API route (chat + embedding + image + custom models) |
| `GET /api/models/catalog` | Catalog | All models grouped by provider + type |
| `POST /v1beta/models/*:streamGenerateContent` | Gemini native | API route |
| `GET/PUT/DELETE /api/settings/proxy` | Proxy Config | Network proxy configuration |
| `POST /api/settings/proxy/test` | Proxy Connectivity | Proxy health/connectivity test endpoint |
| `GET/POST/DELETE /api/provider-models` | Custom Models | Custom model management per provider |
The bypass handler (open-sse/utils/bypassHandler.ts) intercepts known "throwaway" requests from Claude CLI — warmup pings, title extractions, and token counts — and returns a fake response without consuming upstream provider tokens. This is triggered only when User-Agent contains claude-cli.
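The bypass decision can be pictured as a User-Agent gate followed by request-shape heuristics. In the sketch below only the `claude-cli` User-Agent check comes from this document; the warmup and probe heuristics, the type names, and the shape of the fake response are hypothetical stand-ins for whatever `open-sse/utils/bypassHandler.ts` actually checks.

```typescript
interface IncomingChat {
  userAgent: string;
  body: { max_tokens?: number; messages: { role: string; content: string }[] };
}

function isThrowaway(req: IncomingChat): boolean {
  // Documented gate: only requests from Claude CLI are candidates.
  if (!req.userAgent.includes("claude-cli")) return false;
  const last = req.body.messages[req.body.messages.length - 1];
  const isWarmup = last?.content.trim() === "Warmup"; // hypothetical heuristic
  const isTinyProbe = req.body.max_tokens === 1; // hypothetical heuristic
  return isWarmup || isTinyProbe;
}

// Canned OpenAI-style reply so no upstream tokens are consumed.
function fakeResponse(model: string) {
  return {
    id: "bypass",
    object: "chat.completion",
    model,
    choices: [{ index: 0, message: { role: "assistant", content: "" }, finish_reason: "stop" }],
    usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
  };
}
```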
The request logger (open-sse/utils/requestLogger.ts) provides a 7-stage debug logging pipeline, disabled by default, enabled via ENABLE_REQUEST_LOGS=true:
1_req_client.json → 2_req_source.json → 3_req_openai.json → 4_req_target.json
→ 5_res_provider.txt → 6_res_openai.txt → 7_res_client.txt
Files are written to <repo>/logs/<session>/ for each request session.
- provider account cooldown on transient/rate/auth errors
- account fallback before failing request
- combo model fallback when current model/provider path is exhausted
- pre-check and refresh with retry for refreshable providers
- 401/403 retry after refresh attempt in core path
- disconnect-aware stream controller
- translation stream with end-of-stream flush and `[DONE]` handling
- usage estimation fallback when provider usage metadata is missing
- sync errors are surfaced but local runtime continues
- scheduler has retry-capable logic, but periodic execution currently calls single-attempt sync by default
- SQLite schema migrations and auto-upgrade hooks at startup
- legacy JSON → SQLite migration compatibility path
Runtime visibility sources:
- console logs from `src/sse/utils/logger.ts`
- per-request usage aggregates in SQLite (`usage_history`, `call_logs`, `proxy_logs`)
- textual request status log in `log.txt` (optional/compat)
- optional deep request/translation logs under `logs/` when `ENABLE_REQUEST_LOGS=true`
- dashboard usage endpoints (`/api/usage/*`) for UI consumption
- JWT secret (`JWT_SECRET`) secures dashboard session cookie verification/signing
- Initial password bootstrap (`INITIAL_PASSWORD`) should be explicitly configured for first-run provisioning
- API key HMAC secret (`API_KEY_SECRET`) secures generated local API key format
- Provider secrets (API keys/tokens) are persisted in local DB and should be protected at filesystem level
- Cloud sync endpoints rely on API key auth + machine id semantics
Environment variables actively used by code:
- App/auth: `JWT_SECRET`, `INITIAL_PASSWORD`
- Storage: `DATA_DIR`
- Compatible node behavior: `ALLOW_MULTI_CONNECTIONS_PER_COMPAT_NODE`
- Optional storage base override (Linux/macOS when `DATA_DIR` unset): `XDG_CONFIG_HOME`
- Security hashing: `API_KEY_SECRET`, `MACHINE_ID_SALT`
- Logging: `ENABLE_REQUEST_LOGS`
- Sync/cloud URLs: `NEXT_PUBLIC_BASE_URL`, `NEXT_PUBLIC_CLOUD_URL`
- Outbound proxy: `HTTP_PROXY`, `HTTPS_PROXY`, `ALL_PROXY`, `NO_PROXY` and lowercase variants
- SOCKS5 feature flags: `ENABLE_SOCKS5_PROXY`, `NEXT_PUBLIC_ENABLE_SOCKS5_PROXY`
- Platform/runtime helpers (not app-specific config): `APPDATA`, `NODE_ENV`, `PORT`, `HOSTNAME`
- `usageDb` and `localDb` share the same base directory policy (`DATA_DIR` -> `XDG_CONFIG_HOME/omniroute` -> `~/.omniroute`) with legacy file migration.
- `/api/v1/route.ts` delegates to the same unified catalog builder used by `/api/v1/models` (`src/app/api/v1/models/catalog.ts`) to avoid semantic drift.
- Request logger writes full headers/body when enabled; treat log directory as sensitive.
- Cloud behavior depends on correct `NEXT_PUBLIC_BASE_URL` and cloud endpoint reachability.
- The `open-sse/` directory is published as the `@omniroute/open-sse` npm workspace package. Source code imports it via `@omniroute/open-sse/...` (resolved by Next.js `transpilePackages`). File paths in this document still use the directory name `open-sse/` for consistency.
- Charts in the dashboard use Recharts (SVG-based) for accessible, interactive analytics visualizations (model usage bar charts, provider breakdown tables with success rates).
- E2E tests use Playwright (`tests/e2e/`), run via `npm run test:e2e`. Unit tests use Node.js test runner (`tests/unit/`), run via `npm run test:unit`. Source code under `src/` is TypeScript (`.ts`/`.tsx`); the `open-sse/` workspace remains JavaScript (`.js`).
- Settings page is organized into 5 tabs: Security, Routing (6 global strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized), Resilience (editable rate limits, circuit breaker, policies), AI (thinking budget, system prompt, prompt cache), Advanced (proxy).
- Build from source: `npm run build`
- Build Docker image: `docker build -t omniroute .`
- Start service and verify:
  - `GET /api/settings`
  - `GET /api/v1/models`
  - CLI target base URL should be `http://<host>:20128/v1` when `PORT=20128`