diegosouzapw/OmniRoute

🚀 OmniRoute — The Free AI Gateway

Never stop coding. Smart routing to FREE & low-cost AI models with automatic fallback.

Your universal API proxy — one endpoint, 36+ providers, zero downtime. Now with MCP & A2A agent orchestration.

Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • MCP Server • A2A Protocol • 100% TypeScript


🌐 Available in: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino


🖼️ Main Dashboard

📸 Dashboard Preview

Screenshots are available for each dashboard page: Providers, Combos, Analytics, Health, Translator, Settings, CLI Tools, Usage Logs, and Endpoints.

🤖 Free AI Provider for your favorite coding agents

Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding.

  • OpenClaw — ⭐ 205K
  • NanoBot — ⭐ 20.9K
  • PicoClaw — ⭐ 14.6K
  • ZeroClaw — ⭐ 9.9K
  • IronClaw — ⭐ 2.1K
  • OpenCode — ⭐ 106K
  • Codex CLI — ⭐ 60.8K
  • Claude Code — ⭐ 67.3K
  • Gemini CLI — ⭐ 94.7K
  • Kilo Code — ⭐ 15.5K

📡 All agents connect via http://localhost:20128/v1 or http://cloud.omniroute.online/v1 — one config, unlimited models and quota


🤔 Why OmniRoute?

Stop wasting money and hitting limits:

  • Subscription quota expires unused every month
  • Rate limits stop you mid-coding
  • Expensive APIs ($20-50/month per provider)
  • Manual switching between providers

OmniRoute solves this:

  • Maximize subscriptions - Track quota, use every bit before reset
  • Auto fallback - Subscription → API Key → Cheap → Free, zero downtime
  • Multi-account - Round-robin between accounts per provider
  • Universal - Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, any CLI tool

📧 Support

💬 Join our community! WhatsApp Group — Get help, share tips, and stay updated.


🔄 How It Works

┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│   Tool      │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌─────────────────────────────────────────┐
│           OmniRoute (Smart Router)        │
│  • Format translation (OpenAI ↔ Claude) │
│  • Quota tracking + Embeddings + Images │
│  • Auto token refresh                   │
└──────┬──────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │   ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │   ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │   ↓ budget limit
       └─→ [Tier 4: FREE] iFlow, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost
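The tier walk above can be sketched as a simple selection function. This is an illustrative sketch, not OmniRoute's actual internals; the `Candidate` shape and field names are assumptions:

```typescript
// Hypothetical sketch of 4-tier fallback selection (not OmniRoute source).
type Tier = "subscription" | "api-key" | "cheap" | "free";

interface Candidate {
  tier: Tier;
  model: string;            // e.g. "cc/claude-opus-4-6"
  quotaExhausted: boolean;  // subscription/API quota used up
  overBudget: boolean;      // tier budget ceiling reached
}

// Walk the chain in tier order and return the first usable candidate.
function pickCandidate(chain: Candidate[]): Candidate | undefined {
  const order: Tier[] = ["subscription", "api-key", "cheap", "free"];
  const sorted = [...chain].sort(
    (a, b) => order.indexOf(a.tier) - order.indexOf(b.tier),
  );
  return sorted.find((c) => !c.quotaExhausted && !c.overBudget);
}
```

If the subscription tier is exhausted, the same call transparently lands on the next tier, which is the "zero downtime" behavior described above.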

🎯 What OmniRoute Solves — 30 Real Pain Points & Use Cases

Every developer using AI tools faces these problems daily. OmniRoute was built to solve them all — from cost overruns to regional blocks, from broken OAuth flows to protocol operations and enterprise observability.

💸 1. "I pay for an expensive subscription but still get interrupted by limits"

Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even paying, quota has a ceiling — 5h of usage, weekly limits, or per-minute rate limits. Mid-coding session, the provider stops responding and the developer loses flow and productivity.

How OmniRoute solves it:

  • Smart 4-Tier Fallback — If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
  • Real-Time Quota Tracking — Shows token consumption in real-time with reset countdown (5h, daily, weekly)
  • Multi-Account Support — Multiple accounts per provider with auto round-robin — when one runs out, switches to the next
  • Custom Combos — Customizable fallback chains with 6 balancing strategies (fill-first, round-robin, P2C, random, least-used, cost-optimized)
  • Codex Business Quotas — Business/Team workspace quota monitoring directly in the dashboard
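The multi-account round-robin described above can be sketched as a small selector. This is a hypothetical helper for illustration, not OmniRoute code:

```typescript
// Round-robin across accounts of one provider; skips unavailable accounts.
function makeRoundRobin<T>(accounts: T[]) {
  let next = 0;
  return (isAvailable: (account: T) => boolean): T | undefined => {
    for (let i = 0; i < accounts.length; i++) {
      const candidate = accounts[(next + i) % accounts.length];
      if (isAvailable(candidate)) {
        next = (next + i + 1) % accounts.length; // advance past the pick
        return candidate;
      }
    }
    return undefined; // every account exhausted: fall through to the next tier
  };
}
```

When one account's quota runs out, `isAvailable` rejects it and the selector moves on, which is the "when one runs out, switches to the next" behavior.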
🔌 2. "I need to use multiple providers but each has a different API"

OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. If a dev wants to test models from different providers or fallback between them, they need to reconfigure SDKs, change endpoints, deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.

How OmniRoute solves it:

  • Unified Endpoint — A single http://localhost:20128/v1 serves as proxy for all 36+ providers
  • Format Translation — Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
  • Response Sanitization — Strips non-standard fields (x_groq, usage_breakdown, service_tier) that break OpenAI SDK v1.83+
  • Role Normalization — Converts developer → system for non-OpenAI providers; system → user for GLM/ERNIE
  • Think Tag Extraction — Extracts <think> blocks from models like DeepSeek R1 into standardized reasoning_content
  • Structured Output for Gemini — Automatic json_schema → responseMimeType/responseSchema conversion
  • stream defaults to false — Aligns with OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs
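Think-tag extraction, for example, amounts to pulling the <think> block out of the message body into a separate field. A minimal sketch (the reasoning_content field name comes from this README; the function itself is illustrative):

```typescript
// Extract a leading <think>...</think> block into reasoning_content
// (illustrative sketch; not OmniRoute's actual translation code).
function extractThink(content: string): { content: string; reasoning_content?: string } {
  const match = content.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) return { content };
  return {
    content: content.replace(match[0], "").trimStart(),
    reasoning_content: match[1].trim(),
  };
}
```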
🌐 3. "My AI provider blocks my region/country"

Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like unsupported_country_region_territory during OAuth and API connections. This is especially frustrating for developers from developing countries.

How OmniRoute solves it:

  • 3-Level Proxy Config — Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
  • Color-Coded Proxy Badges — Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
  • OAuth Token Exchange Through Proxy — OAuth flow also goes through the proxy, solving unsupported_country_region_territory
  • Connection Tests via Proxy — Connection tests use the configured proxy (no more direct bypass)
  • SOCKS5 Support — Full SOCKS5 proxy support for outbound routing
  • TLS Fingerprint Spoofing — Browser-like TLS fingerprint via wreq-js to bypass bot detection
🆓 4. "I want to use AI for coding but I have no money"

Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.

How OmniRoute solves it:

  • Free Tier Providers Built-in — Native support for 100% free providers: iFlow (8 unlimited models), Qwen (3 unlimited models), Kiro (Claude for free), Gemini CLI (180K/month free)
  • Free-Only Combos — Chain gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus = $0/month with zero downtime
  • NVIDIA NIM Free Credits — 1000 free credits integrated
  • Cost Optimized Strategy — Routing strategy that automatically chooses the cheapest available provider
🔒 5. "I need to protect my AI gateway from unauthorized access"

When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.

How OmniRoute solves it:

  • API Key Management — Generation, rotation, and scoping per provider with a dedicated /dashboard/api-manager page
  • Model-Level Permissions — Restrict API keys to specific models (openai/*, wildcard patterns), with Allow All/Restrict toggle
  • API Endpoint Protection — Require a key for /v1/models and block specific providers from the listing
  • Auth Guard + CSRF Protection — All dashboard routes protected with withAuth middleware + CSRF tokens
  • Rate Limiter — Per-IP rate limiting with configurable windows
  • IP Filtering — Allowlist/blocklist for access control
  • Prompt Injection Guard — Sanitization against malicious prompt patterns
  • AES-256-GCM Encryption — Credentials encrypted at rest
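Wildcard model permissions like openai/* can be matched in a few lines. The exact pattern semantics here are an assumption based on the README's examples:

```typescript
// Check an API key's allowed-model patterns against a requested model.
// Supports "*", "provider/*", and exact matches (assumed semantics).
function modelAllowed(patterns: string[], model: string): boolean {
  return patterns.some((pattern) => {
    if (pattern === "*") return true; // "Allow All" toggle
    if (pattern.endsWith("/*")) return model.startsWith(pattern.slice(0, -1)); // keep the "/"
    return pattern === model;
  });
}
```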
🛑 6. "My provider went down and I lost my coding flow"

AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.

How OmniRoute solves it:

  • Circuit Breaker per-model — Auto-open/close with configurable thresholds and cooldown (Closed/Open/Half-Open), scoped per-model to avoid cascading blocks
  • Exponential Backoff — Progressive retry delays
  • Anti-Thundering Herd — Mutex + semaphore protection against concurrent retry storms
  • Combo Fallback Chains — If the primary provider fails, automatically falls through the chain with no intervention
  • Combo Circuit Breaker — Auto-disables failing providers within a combo chain
  • Health Dashboard — Uptime monitoring, circuit breaker states, lockouts, cache stats, p50/p95/p99 latency
🔧 7. "Configuring each AI tool is tedious and repetitive"

Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model). Reconfiguring when switching providers or models is a waste of time.

How OmniRoute solves it:

  • CLI Tools Dashboard — Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
  • GitHub Copilot Config Generator — Generates chatLanguageModels.json for VS Code with bulk model selection
  • Onboarding Wizard — Guided 4-step setup for first-time users
  • One endpoint, all models — Configure http://localhost:20128/v1 once, access 36+ providers
🔑 8. "Managing OAuth tokens from multiple providers is hell"

Claude Code, Codex, Gemini CLI, Copilot — all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly, deal with client_secret is missing, redirect_uri_mismatch, and failures on remote servers. OAuth on LAN/VPS is particularly problematic.

How OmniRoute solves it:

  • Auto Token Refresh — OAuth tokens refresh in background before expiration
  • OAuth 2.0 (PKCE) Built-in — Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, iFlow
  • Multi-Account OAuth — Multiple accounts per provider via JWT/ID token extraction
  • OAuth LAN/Remote Fix — Private IP detection for redirect_uri + manual URL mode for remote servers
  • OAuth Behind Nginx — Uses window.location.origin for reverse proxy compatibility
  • Remote OAuth Guide — Step-by-step guide for Google Cloud credentials on VPS/Docker
📊 9. "I don't know how much I'm spending or where"

Developers use multiple paid providers but have no unified view of spending. Each provider has its own billing dashboard, but there's no consolidated view. Unexpected costs can pile up.

How OmniRoute solves it:

  • Cost Analytics Dashboard — Per-token cost tracking and budget management per provider
  • Budget Limits per Tier — Spending ceiling per tier that triggers automatic fallback
  • Per-Model Pricing Configuration — Configurable prices per model
  • Usage Statistics Per API Key — Request count and last-used timestamp per key
  • Analytics Dashboard — Stat cards, model usage chart, provider table with success rates and latency
🐛 10. "I can't diagnose errors and problems in AI calls"

When a call fails, the dev doesn't know if it was a rate limit, expired token, wrong format, or provider error. Fragmented logs across different terminals. Without observability, debugging is trial-and-error.

How OmniRoute solves it:

  • Unified Logs Dashboard — 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
  • Console Log Viewer — Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
  • SQLite Proxy Logs — Persistent logs that survive server restarts
  • Translator Playground — 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
  • Request Telemetry — p50/p95/p99 latency + X-Request-Id tracing
  • File-Based Logging with Rotation — Console interceptor captures everything to JSON log with size-based rotation
🏗️ 11. "Deploying and maintaining the gateway is complex"

Installing, configuring, and maintaining an AI proxy across different environments (local, VPS, Docker, cloud) is labor-intensive. Problems like hardcoded paths, EACCES on directories, port conflicts, and cross-platform builds add friction.

How OmniRoute solves it:

  • npm Global Install — npm install -g omniroute && omniroute — done
  • Docker Multi-Platform — AMD64 + ARM64 native (Apple Silicon, AWS Graviton, Raspberry Pi)
  • Docker Compose Profiles — base (no CLI tools) and cli (with Claude Code, Codex, OpenClaw)
  • Electron Desktop App — Native app for Windows/macOS/Linux with system tray, auto-start, offline mode
  • Split-Port Mode — API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
  • Cloud Sync — Config synchronization across devices via Cloudflare Workers
  • DB Backups — Automatic backup, restore, export and import of all settings
🌍 12. "The interface is English-only and my team doesn't speak English"

Teams in non-English-speaking countries, especially in Latin America, Asia, and Europe, struggle with English-only interfaces. Language barriers reduce adoption and increase configuration errors.

How OmniRoute solves it:

  • Dashboard i18n — 30 Languages — All 500+ keys translated including Arabic, Bulgarian, Danish, German, Spanish, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese (PT/BR), Romanian, Russian, Slovak, Swedish, Thai, Ukrainian, Vietnamese, Chinese, Filipino, English
  • RTL Support — Right-to-left support for Arabic and Hebrew
  • Multi-Language READMEs — 30 complete documentation translations
  • Language Selector — Globe icon in header for real-time switching
🔄 13. "I need more than chat — I need embeddings, images, audio"

AI isn't just chat completion. Devs need to generate images, transcribe audio, create embeddings for RAG, rerank documents, and moderate content. Each API has a different endpoint and format.

How OmniRoute solves it:

  • Embeddings — /v1/embeddings with 6 providers and 9+ models
  • Image Generation — /v1/images/generations with 10 providers and 20+ models (OpenAI, xAI, Together, Fireworks, Nebius, Hyperbolic, NanoBanana, Antigravity, SD WebUI, ComfyUI)
  • Text-to-Video — /v1/videos/generations — ComfyUI (AnimateDiff, SVD) and SD WebUI
  • Text-to-Music — /v1/music/generations — ComfyUI (Stable Audio Open, MusicGen)
  • Audio Transcription — /v1/audio/transcriptions — Whisper + Nvidia NIM, HuggingFace, Qwen3
  • Text-to-Speech — /v1/audio/speech — ElevenLabs, Nvidia NIM, HuggingFace, Coqui, Tortoise, Qwen3, + existing providers
  • Moderations — /v1/moderations — Content safety checks
  • Reranking — /v1/rerank — Document relevance reranking
  • Responses API — Full /v1/responses support for Codex
🧪 14. "I have no way to test and compare quality across models"

Developers want to know which model is best for their use case — code, translation, reasoning — but comparing manually is slow. No integrated eval tools exist.

How OmniRoute solves it:

  • LLM Evaluations — Golden set testing with 10 pre-loaded cases covering greetings, math, geography, code generation, JSON compliance, translation, markdown, safety refusal
  • 4 Match Strategies — exact, contains, regex, custom (JS function)
  • Translator Playground Test Bench — Batch testing with multiple inputs and expected outputs, cross-provider comparison
  • Chat Tester — Full round-trip with visual response rendering
  • Live Monitor — Real-time stream of all requests flowing through the proxy
📈 15. "I need to scale without losing performance"

As request volume grows, without caching the same questions generate duplicate costs. Without idempotency, duplicate requests waste processing. Per-provider rate limits must be respected.

How OmniRoute solves it:

  • Semantic Cache — Two-tier cache (signature + semantic) reduces cost and latency
  • Request Idempotency — 5s deduplication window for identical requests
  • Rate Limit Detection — Per-provider RPM, min gap, and max concurrent tracking
  • Editable Rate Limits — Configurable defaults in Settings → Resilience with persistence
  • API Key Validation Cache — 3-tier cache for production performance
  • Health Dashboard with Telemetry — p50/p95/p99 latency, cache stats, uptime
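The 5s idempotency window can be approximated by keying in-flight requests on a signature and reusing the pending result within the window. This is a map-based sketch with assumed naming, not the actual implementation:

```typescript
// Deduplicate identical requests inside a short window by sharing
// one in-flight promise per request signature (illustrative sketch).
function makeDeduper<T>(windowMs = 5_000) {
  const inflight = new Map<string, { at: number; value: Promise<T> }>();
  return (signature: string, now: number, run: () => Promise<T>): Promise<T> => {
    const hit = inflight.get(signature);
    if (hit && now - hit.at < windowMs) return hit.value; // duplicate: reuse
    const value = run();
    inflight.set(signature, { at: now, value });
    return value;
  };
}
```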
🤖 16. "I want to control model behavior globally"

Developers who want all responses in a specific language, with a specific tone, or want to limit reasoning tokens. Configuring this in every tool/request is impractical.

How OmniRoute solves it:

  • System Prompt Injection — Global prompt applied to all requests
  • Thinking Budget Validation — Reasoning token allocation control per request (passthrough, auto, custom, adaptive)
  • 6 Routing Strategies — Global strategies that determine how requests are distributed
  • Wildcard Router — provider/* patterns route dynamically to any provider
  • Combo Enable/Disable Toggle — Toggle combos directly from the dashboard
  • Provider Toggle — Enable/disable all connections for a provider with one click
  • Blocked Providers — Exclude specific providers from /v1/models listing
🧰 17. "I need MCP tools as first-class product capabilities"

Many AI gateways expose MCP only as a hidden implementation detail. Teams need a visible, manageable operation layer.

How OmniRoute solves it:

  • MCP appears in the dashboard navigation and endpoint protocol tab
  • Dedicated MCP management page with process, tools, scopes, and audit
  • Built-in quick-start for omniroute --mcp and client onboarding
🧠 18. "I need A2A orchestration with sync + stream task paths"

Agent workflows need both direct replies and long-running streamed execution with lifecycle control.

How OmniRoute solves it:

  • A2A JSON-RPC endpoint (POST /a2a) with message/send and message/stream
  • SSE streaming with terminal state propagation
  • Task lifecycle APIs for tasks/get and tasks/cancel
🛰️ 19. "I need real MCP process health, not guessed status"

Operational teams need to know if MCP is actually alive, not just whether an API is reachable.

How OmniRoute solves it:

  • Runtime heartbeat file with PID, timestamps, transport, tool count, and scope mode
  • MCP status API combining heartbeat + recent activity
  • UI status cards for process/uptime/heartbeat freshness
📋 20. "I need auditable MCP tool execution"

When tools mutate config or trigger ops actions, teams need forensic traceability.

How OmniRoute solves it:

  • SQLite-backed audit logging for MCP tool calls
  • Filters by tool, success/failure, API key, and pagination
  • Dashboard audit table + stats endpoints for automation
🔐 21. "I need scoped MCP permissions per integration"

Different clients should have least-privilege access to tool categories.

How OmniRoute solves it:

  • 9 granular MCP scopes for controlled tool access
  • Scope enforcement and visibility in MCP management UI
  • Safe default posture for operational tooling
⚙️ 22. "I need operational controls without redeploying"

Teams need quick runtime changes during incidents or cost events.

How OmniRoute solves it:

  • Switch combo activation directly from MCP dashboard
  • Apply resilience profiles from pre-defined policy packs
  • Reset circuit breaker state from the same operations panel
🔄 23. "I need live A2A task lifecycle visibility and cancellation"

Without lifecycle visibility, task incidents become hard to triage.

How OmniRoute solves it:

  • Task listing/filtering by state/skill with pagination
  • Drill-down on task metadata, events, and artifacts
  • Task cancellation endpoint and UI action with confirmation
🌊 24. "I need active stream metrics for A2A load"

Streaming workflows require operational insight into concurrency and live connections.

How OmniRoute solves it:

  • Active stream counters integrated into A2A status
  • Last task timestamp and per-state counts
  • A2A dashboard cards for real-time ops monitoring
🪪 25. "I need standard agent discovery for clients"

External clients and orchestrators need machine-readable metadata for onboarding.

How OmniRoute solves it:

  • Agent Card exposed at /.well-known/agent.json
  • Capabilities and skills shown in management UI
  • A2A status API includes discovery metadata for automation
🧭 26. "I need protocol discoverability in the product UX"

If users cannot discover protocol surfaces, adoption and support quality drop.

How OmniRoute solves it:

  • Consolidated Endpoints page with tabs for Proxy, MCP, A2A, and API Endpoints
  • Inline service status toggles (Online/Offline) for MCP and A2A
  • Links from overview to dedicated management tabs
🧪 27. "I need end-to-end protocol validation with real clients"

Mock tests are not enough to validate protocol compatibility before release.

How OmniRoute solves it:

  • E2E suite that boots app and uses real MCP SDK client transport
  • A2A client tests for discovery, send, stream, get, and cancel flows
  • Cross-check assertions against MCP audit and A2A tasks APIs
📡 28. "I need unified observability across all interfaces"

Splitting observability by protocol creates blind spots and longer MTTR.

How OmniRoute solves it:

  • Unified dashboards/logs/analytics in one product
  • Health + audit + request telemetry across OpenAI, MCP, and A2A layers
  • Operational APIs for status and automation
💼 29. "I need one runtime for proxy + tools + agent orchestration"

Running many separate services increases operational cost and failure modes.

How OmniRoute solves it:

  • OpenAI-compatible proxy, MCP server, and A2A server in one stack
  • Shared auth, resilience, data store, and observability
  • Consistent policy model across all interaction surfaces
🚀 30. "I need to ship agentic workflows without glue-code sprawl"

Teams lose velocity when stitching multiple ad-hoc services and scripts.

How OmniRoute solves it:

  • Unified endpoint strategy for clients and agents
  • Built-in protocol management UIs and smoke validation paths
  • Production-ready foundations (security, logging, resilience, backup)

Example Playbooks (Integrated Use Cases)

Playbook A: Maximize paid subscription + cheap backup

Combo: "maximize-claude"
  1. cc/claude-opus-4-6
  2. glm/glm-4.7
  3. if/kimi-k2-thinking

Monthly cost: $20 + small backup spend
Outcome: higher quality, near-zero interruption

Playbook B: Zero-cost coding stack

Combo: "free-forever"
  1. gc/gemini-3-flash
  2. if/kimi-k2-thinking
  3. qw/qwen3-coder-plus

Monthly cost: $0
Outcome: stable free coding workflow

Playbook C: 24/7 always-on fallback chain

Combo: "always-on"
  1. cc/claude-opus-4-6
  2. cx/gpt-5.2-codex
  3. glm/glm-4.7
  4. minimax/MiniMax-M2.1
  5. if/kimi-k2-thinking

Outcome: deep fallback depth for deadline-critical workloads

Playbook D: Agent ops with MCP + A2A

1) Start MCP transport (`omniroute --mcp`) for tool-driven operations
2) Run A2A tasks via `message/send` and `message/stream`
3) Observe via /dashboard/endpoint (MCP and A2A tabs)
4) Toggle services via inline status controls

⚡ Quick Start

1) Install and run

npm install -g omniroute
omniroute

Dashboard opens at http://localhost:20128 and API base URL is http://localhost:20128/v1.

| Command | Description |
| --- | --- |
| omniroute | Start server (PORT=20128, API and dashboard on the same port) |
| omniroute --port 3000 | Set canonical/API port to 3000 |
| omniroute --mcp | Start MCP server (stdio transport) |
| omniroute --no-open | Don't auto-open browser |
| omniroute --help | Show help |

Optional split-port mode:

PORT=20128 DASHBOARD_PORT=20129 omniroute
# API:       http://localhost:20128/v1
# Dashboard: http://localhost:20129

2) Connect providers and create your API key

  1. Open Dashboard → Providers and connect at least one provider (OAuth or API key).
  2. Open Dashboard → Endpoints and create an API key.
  3. (Optional) Open Dashboard → Combos and set your fallback chain.

3) Point your coding tool to OmniRoute

Base URL: http://localhost:20128/v1
API Key:  [copy from Endpoint page]
Model:    if/kimi-k2-thinking (or any provider/model prefix)

Works with Claude Code, Codex CLI, Gemini CLI, Cursor, Cline, OpenClaw, OpenCode, and OpenAI-compatible SDKs.
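For OpenAI-compatible SDKs or plain fetch, the request shape maps onto the base URL above like this. buildChatRequest is a hypothetical helper shown only to make the mapping concrete; it is not part of any SDK:

```typescript
// Build an OpenAI-style chat completion request aimed at OmniRoute
// (illustrative helper; pair it with fetch or any HTTP client).
function buildChatRequest(baseUrl: string, apiKey: string, model: string, prompt: string) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,                 // e.g. "if/kimi-k2-thinking"
        stream: false,         // OmniRoute defaults stream to false, per the spec
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Usage sketch:
//   const { url, init } = buildChatRequest("http://localhost:20128/v1", key, "if/kimi-k2-thinking", "hi");
//   const res = await fetch(url, init);
```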

4) Enable and validate protocols (v2.0)

MCP (for tool-driven operations):

omniroute --mcp

Then connect your MCP client over stdio and test tools like:

  • omniroute_get_health
  • omniroute_list_combos

A2A (for agent-to-agent workflows):

curl http://localhost:20128/.well-known/agent.json
curl -X POST http://localhost:20128/a2a \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":"quickstart","method":"message/send","params":{"skill":"quota-management","messages":[{"role":"user","content":"Give me a short quota summary."}]}}'

5) Validate everything end-to-end (recommended)

npm run test:protocols:e2e

This suite validates real MCP and A2A client flows against a running app.

Alternative: run from source

cp .env.example .env
npm install
PORT=20128 DASHBOARD_PORT=20129 NEXT_PUBLIC_BASE_URL=http://localhost:20129 npm run dev

🐳 Docker

OmniRoute is available as a public Docker image on Docker Hub.

Quick run:

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest

With environment file:

# Copy and edit .env first
cp .env.example .env

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --env-file .env \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest

Using Docker Compose:

# Base profile (no CLI tools)
docker compose --profile base up -d

# CLI profile (Claude Code, Codex, OpenClaw built-in)
docker compose --profile cli up -d

| Image | Tag | Size | Description |
| --- | --- | --- | --- |
| diegosouzapw/omniroute | latest | ~250MB | Latest stable release |
| diegosouzapw/omniroute | 1.0.3 | ~250MB | Current version |

🖥️ Desktop App — Offline & Always-On

🆕 NEW! OmniRoute is now available as a native desktop application for Windows, macOS, and Linux.

Run OmniRoute as a standalone desktop app — no terminal, no browser, no internet required for local models. The Electron-based app includes:

  • 🖥️ Native Window — Dedicated app window with system tray integration
  • 🔄 Auto-Start — Launch OmniRoute on system login
  • 🔔 Native Notifications — Get alerts for quota exhaustion or provider issues
  • One-Click Install — NSIS (Windows), DMG (macOS), AppImage (Linux)
  • 🌐 Offline Mode — Works fully offline with bundled server

Quick Start

# Development mode
npm run electron:dev

# Build for your platform
npm run electron:build         # Current platform
npm run electron:build:win     # Windows (.exe)
npm run electron:build:mac     # macOS (.dmg) — x64 & arm64
npm run electron:build:linux   # Linux (.AppImage)

System Tray

When minimized, OmniRoute lives in your system tray with quick actions:

  • Open dashboard
  • Change server port
  • Quit application

📖 Full documentation: electron/README.md


💰 Pricing at a Glance

| Tier | Provider | Cost | Quota Reset | Best For |
| --- | --- | --- | --- | --- |
| 💳 SUBSCRIPTION | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | FREE | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| 🔑 API KEY | NVIDIA NIM | FREE (1000 credits) | One-time | Free tier testing |
| | DeepSeek | Pay-per-use | None | Best price/quality |
| | Groq | Free tier + paid | Rate limited | Ultra-fast inference |
| | xAI (Grok) | Pay-per-use | None | Grok models |
| | Mistral | Free tier + paid | Rate limited | European AI |
| | OpenRouter | Pay-per-use | None | 100+ models |
| 💰 CHEAP | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| 🆓 FREE | iFlow | $0 | Unlimited | 8 models free |
| | Qwen | $0 | Unlimited | 3 models free |
| | Kiro | $0 | Unlimited | Claude free |

💡 Pro Tip: Start with Gemini CLI (180K free/month) + iFlow (unlimited free) combo = $0 cost!


💡 Key Features

OmniRoute v2.0 is built as an operational platform, not just a relay proxy.

🤖 Agent & Protocol Operations (v2.0)

| Feature | What It Does |
| --- | --- |
| 🔧 MCP Server (16 tools) | IDE/agent tools via 3 transports: stdio, SSE (/api/mcp/sse), Streamable HTTP (/api/mcp/stream) |
| 🤝 A2A Server (JSON-RPC + SSE) | Agent-to-agent task execution with sync and streaming flows |
| 🧭 Consolidated Endpoints Page | Tabbed management page with Endpoint Proxy, MCP, A2A, and API Endpoints tabs |
| 🎚️ Service Enable/Disable Toggles | ON/OFF switches for MCP and A2A with settings persistence (default: OFF) |
| 🛰️ MCP Runtime Heartbeat | Real process status (pid, uptime, heartbeat age, transport, scope mode) |
| 📋 MCP Audit Trail | Filterable audit logs with success/failure and key attribution |
| 🔐 MCP Scope Enforcement | 9 granular scope permissions for controlled tool access |
| 📡 A2A Task Lifecycle Management | List/filter tasks, inspect events/artifacts, cancel running tasks |
| 📋 Agent Card Discovery | /.well-known/agent.json for client auto-discovery |
| 🧪 Protocol E2E Test Harness | Real MCP SDK + A2A client flows in test:protocols:e2e |
| ⚙️ Operational Controls | Switch combo, apply resilience profiles, reset breakers from one control surface |

🧠 Routing & Intelligence

| Feature | What It Does |
| --- | --- |
| 🎯 Smart 4-Tier Fallback | Auto-route: Subscription → API Key → Cheap → Free |
| 📊 Real-Time Quota Tracking | Live token count + reset countdown per provider |
| 🔄 Format Translation | OpenAI ↔ Claude ↔ Gemini ↔ Responses with schema-safe conversions |
| 👥 Multi-Account Support | Multiple accounts per provider with intelligent selection |
| 🔄 Auto Token Refresh | OAuth tokens refresh automatically with retry |
| 🎨 Custom Combos | 6 balancing strategies + fallback chain control |
| 🌐 Wildcard Router | provider/* dynamic routing |
| 🧠 Thinking Budget Controls | Passthrough, auto, custom, and adaptive reasoning limits |
| 🔀 Model Aliases | Built-in + custom model aliasing and migration safety |
| Background Degradation | Route low-priority background tasks to cheaper models |
| 💬 System Prompt Injection | Global behavior controls applied consistently |
| 📄 Responses API Compatibility | Full /v1/responses support for Codex and advanced agentic workflows |

🎵 Multi-Modal APIs

| Feature | What It Does |
| --- | --- |
| 🖼️ Image Generation | /v1/images/generations with cloud and local backends |
| 📐 Embeddings | /v1/embeddings for search and RAG pipelines |
| 🎤 Audio Transcription | /v1/audio/transcriptions (Whisper and additional providers) |
| 🔊 Text-to-Speech | /v1/audio/speech (multiple engines/providers) |
| 🎬 Video Generation | /v1/videos/generations (ComfyUI + SD WebUI workflows) |
| 🎵 Music Generation | /v1/music/generations (ComfyUI workflows) |
| 🛡️ Moderations | /v1/moderations safety checks |
| 🔀 Reranking | /v1/rerank for relevance scoring |

🛡️ Resilience, Security & Governance

| Feature | What It Does |
| --- | --- |
| 🔌 Circuit Breakers | Per-model trip/recover with threshold controls |
| 🎯 Endpoint-Aware Models | Custom models declare supported endpoints + API format |
| 🛡️ Anti-Thundering Herd | Mutex + semaphore protections on retry/rate events |
| 🧠 Semantic + Signature Cache | Cost/latency reduction with two cache layers |
| Request Idempotency | Duplicate protection window |
| 🔒 TLS Fingerprint Spoofing | Better compatibility with anti-bot filtered providers |
| 🌐 IP Filtering | Allowlist/blocklist control for exposed deployments |
| 📊 Editable Rate Limits | Configurable global/provider-level limits with persistence |
| 🔑 API Key Management + Scoping | Secure key issuance/rotation and model/provider controls |
| 🛡️ Protected /models | Optional auth gating and provider hiding for model catalog |

📊 Observability & Analytics

Feature What It Does
📝 Request + Proxy Logging Full request/response and proxy logging
📋 Unified Logs Dashboard Request, proxy, audit, and console views in one page
🔍 Request Telemetry p50/p95/p99 latency and request tracing
🏥 Health Dashboard Uptime, breaker states, lockouts, cache stats
💰 Cost Tracking Budget controls and per-model pricing visibility
📈 Analytics Visualizations Model/provider usage insights and trend views
🧪 Evaluation Framework Golden set testing with configurable match strategies

☁️ Deployment & Platform

Feature What It Does
🌐 Deploy Anywhere Localhost, VPS, Docker, Cloud environments
💾 Cloud Sync Configuration sync via cloud worker
🔄 Backup/Restore Export/import and disaster recovery flows
🧙 Onboarding Wizard First-run guided setup
🔧 CLI Tools Dashboard One-click setup for popular coding tools
🌐 i18n (30 languages) Full dashboard + docs language support with RTL coverage
📂 Custom Data Directory DATA_DIR override for storage location

Feature Deep Dive

Smart fallback with practical cost control

Combo: "my-coding-stack"
  1. cc/claude-opus-4-6
  2. nvidia/llama-3.3-70b
  3. glm/glm-4.7
  4. if/kimi-k2-thinking

When a quota, rate limit, or health check fails, OmniRoute automatically moves to the next candidate without manual switching.
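The fallback behavior amounts to walking the combo's candidates in order until one succeeds. A simplified sketch (real routing also consults quotas, breakers, and health; `call_model` is a hypothetical stand-in for a provider call):

```python
def complete_with_fallback(combo: list[str], call_model):
    """Try each model in the combo in order; return (model, response)
    from the first that succeeds. `call_model` raises on failure."""
    last_error = None
    for model in combo:
        try:
            return model, call_model(model)
        except Exception as err:  # quota, rate limit, or health failure
            last_error = err
    raise RuntimeError(f"all candidates failed: {last_error}")
```

From the client's point of view nothing changes: the same request simply comes back from whichever candidate was healthy.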

Protocol management that is visible and operable

  • MCP + A2A are discoverable in UI and docs (not hidden)
  • Protocol status APIs expose live operational data (/api/mcp/*, /api/a2a/*)
  • Dashboards include actions for day-2 ops (combo toggles, breaker resets, task cancellation)

Translator + validation workflow

The Translator area includes:

  • Playground: request transformation checks
  • Chat Tester: full request/response round-trip
  • Test Bench: multiple cases in one run
  • Live Monitor: real-time traffic view

Plus protocol validation with real clients via npm run test:protocols:e2e.

📖 MCP Server README — Tool reference, IDE configs, and client examples

📖 A2A Server README — Skills, JSON-RPC methods, streaming, and task lifecycle

🧪 Evaluations (Evals)

OmniRoute includes a built-in evaluation framework to test LLM response quality against a golden set. Access it via Analytics → Evals in the dashboard.

Built-in Golden Set

The pre-loaded "OmniRoute Golden Set" contains test cases for:

  • Greetings, math, geography, code generation
  • JSON format compliance, translation, markdown generation
  • Safety refusal (harmful content), counting, boolean logic

Evaluation Strategies

Strategy Description Example
exact Output must match exactly "4"
contains Output must contain substring (case-insensitive) "Paris"
regex Output must match regex pattern "1.*2.*3"
custom Custom JS function returns true/false (output) => output.length > 10
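The four strategies map to simple predicates. A sketch for illustration (the dashboard's custom strategy takes a JS function; it is shown here as a Python callable, and `matches` is a hypothetical helper, not the framework's API):

```python
import re

def matches(strategy: str, expected, output: str) -> bool:
    """Evaluate a model output against a golden-set expectation."""
    if strategy == "exact":
        return output == expected
    if strategy == "contains":
        return expected.lower() in output.lower()  # case-insensitive substring
    if strategy == "regex":
        return re.search(expected, output) is not None
    if strategy == "custom":
        return bool(expected(output))  # expected is a predicate function
    raise ValueError(f"unknown strategy: {strategy}")
```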

📖 Setup Guide

Protocol Setup (MCP + A2A)

🧩 MCP Setup (Model Context Protocol)

Start MCP transport in stdio mode:

omniroute --mcp

Recommended validation flow:

  1. Connect your MCP client over stdio.
  2. Run omniroute_get_health.
  3. Run omniroute_list_combos.
  4. Open /dashboard/mcp to confirm heartbeat, activity, and audit.

Useful APIs for automation:

  • GET /api/mcp/status
  • GET /api/mcp/tools
  • GET /api/mcp/audit
  • GET /api/mcp/audit/stats
🤝 A2A Setup (Agent2Agent)

Discover the agent:

curl http://localhost:20128/.well-known/agent.json

Send a task:

curl -X POST http://localhost:20128/a2a \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":"setup-a2a","method":"message/send","params":{"skill":"quota-management","messages":[{"role":"user","content":"Summarize quota status."}]}}'

Manage lifecycle:

  • GET /api/a2a/status
  • GET /api/a2a/tasks
  • GET /api/a2a/tasks/:id
  • POST /api/a2a/tasks/:id/cancel

Operational UI:

  • /dashboard/a2a for task/state/stream observability and smoke actions
🧪 End-to-end protocol validation

Validate both protocols with real clients:

npm run test:protocols:e2e

This verifies:

  • MCP SDK client connect/list/call
  • A2A discovery/send/stream/get/cancel
  • Cross-check data in MCP audit and A2A task management APIs
💳 Subscription Providers

Claude Code (Pro/Max)

Dashboard → Providers → Connect Claude Code
→ OAuth login → Auto token refresh
→ 5-hour + weekly quota tracking

Models:
  cc/claude-opus-4-6
  cc/claude-sonnet-4-5-20250929
  cc/claude-haiku-4-5-20251001

Pro Tip: Use Opus for complex tasks, Sonnet for speed. OmniRoute tracks quota per model!

OpenAI Codex (Plus/Pro)

Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly reset

Models:
  cx/gpt-5.2-codex
  cx/gpt-5.1-codex-max

Gemini CLI (FREE 180K/month!)

Dashboard → Providers → Connect Gemini CLI
→ Google OAuth
→ 180K completions/month + 1K/day

Models:
  gc/gemini-3-flash-preview
  gc/gemini-2.5-pro

Best Value: Huge free tier! Use this before paid tiers.

GitHub Copilot

Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of month)

Models:
  gh/gpt-5
  gh/claude-4.5-sonnet
  gh/gemini-3-pro
🔑 API Key Providers

NVIDIA NIM (FREE 1000 credits!)

  1. Sign up: build.nvidia.com
  2. Get free API key (1000 inference credits included)
  3. Dashboard → Add Provider → NVIDIA NIM:
    • API Key: nvapi-your-key

Models: nvidia/llama-3.3-70b-instruct, nvidia/mistral-7b-instruct, and 50+ more

Pro Tip: OpenAI-compatible API — works seamlessly with OmniRoute's format translation!

DeepSeek

  1. Sign up: platform.deepseek.com
  2. Get API key
  3. Dashboard → Add Provider → DeepSeek

Models: deepseek/deepseek-chat, deepseek/deepseek-coder

Groq (Free Tier Available!)

  1. Sign up: console.groq.com
  2. Get API key (free tier included)
  3. Dashboard → Add Provider → Groq

Models: groq/llama-3.3-70b, groq/mixtral-8x7b

Pro Tip: Ultra-fast inference — best for real-time coding!

OpenRouter (100+ Models)

  1. Sign up: openrouter.ai
  2. Get API key
  3. Dashboard → Add Provider → OpenRouter

Models: Access 100+ models from all major providers through a single API key.

💰 Cheap Providers (Backup)

GLM-4.7 (Daily reset, $0.6/1M)

  1. Sign up: Zhipu AI
  2. Get API key from Coding Plan
  3. Dashboard → Add API Key:
    • Provider: glm
    • API Key: your-key

Use: glm/glm-4.7

Pro Tip: The Coding Plan offers 3× the quota at 1/7 the cost! Quota resets daily at 10:00 AM.

MiniMax M2.1 (5h reset, $0.20/1M)

  1. Sign up: MiniMax
  2. Get API key
  3. Dashboard → Add API Key

Use: minimax/MiniMax-M2.1

Pro Tip: Cheapest option for long context (1M tokens)!

Kimi K2 ($9/month flat)

  1. Subscribe: Moonshot AI
  2. Get API key
  3. Dashboard → Add API Key

Use: kimi/kimi-latest

Pro Tip: Fixed $9/month for 10M tokens = $0.90/1M effective cost!

🆓 FREE Providers (Emergency Backup)

iFlow (8 FREE models)

Dashboard → Connect iFlow
→ iFlow OAuth login
→ Unlimited usage

Models:
  if/kimi-k2-thinking
  if/qwen3-coder-plus
  if/glm-4.7
  if/minimax-m2
  if/deepseek-r1

Qwen (3 FREE models)

Dashboard → Connect Qwen
→ Device code authorization
→ Unlimited usage

Models:
  qw/qwen3-coder-plus
  qw/qwen3-coder-flash

Kiro (Claude FREE)

Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited usage

Models:
  kr/claude-sonnet-4.5
  kr/claude-haiku-4.5
🎨 Create Combos

Example 1: Maximize Subscription → Cheap Backup

Dashboard → Combos → Create New

Name: premium-coding
Models:
  1. cc/claude-opus-4-6 (Subscription primary)
  2. glm/glm-4.7 (Cheap backup, $0.6/1M)
  3. minimax/MiniMax-M2.1 (Cheapest fallback, $0.20/1M)

Use in CLI: premium-coding

Example 2: Free-Only (Zero Cost)

Name: free-combo
Models:
  1. gc/gemini-3-flash-preview (180K free/month)
  2. if/kimi-k2-thinking (unlimited)
  3. qw/qwen3-coder-plus (unlimited)

Cost: $0 forever!
🔧 CLI Integration

Cursor IDE

Settings → Models → Advanced:
  OpenAI API Base URL: http://localhost:20128/v1
  OpenAI API Key: [from OmniRoute dashboard]
  Model: cc/claude-opus-4-6

Claude Code

Use the CLI Tools page in the dashboard for one-click configuration, or edit ~/.claude/settings.json manually.

Codex CLI

export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-omniroute-api-key"

codex "your prompt"

OpenClaw

Option 1 — Dashboard (recommended):

Dashboard → CLI Tools → OpenClaw → Select Model → Apply

Option 2 — Manual: Edit ~/.openclaw/openclaw.json:

{
  "models": {
    "providers": {
      "omniroute": {
        "baseUrl": "http://127.0.0.1:20128/v1",
        "apiKey": "sk_omniroute",
        "api": "openai-completions"
      }
    }
  }
}

Note: OpenClaw only works with local OmniRoute. Use 127.0.0.1 instead of localhost to avoid IPv6 resolution issues.

Cline / Continue / RooCode

Settings → API Configuration:
  Provider: OpenAI Compatible
  Base URL: http://localhost:20128/v1
  API Key: [from OmniRoute dashboard]
  Model: if/kimi-k2-thinking

OpenCode

Step 1: Add OmniRoute as a custom provider:

opencode
/connect
# Select "Other" → Enter ID: "omniroute" → Enter your OmniRoute API key

Step 2: Create/edit opencode.json in your project root:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "omniroute": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "OmniRoute",
      "options": {
        "baseURL": "http://localhost:20128/v1"
      },
      "models": {
        "cc/claude-sonnet-4-20250514": { "name": "Claude Sonnet 4" },
        "gg/gemini-2.5-pro": { "name": "Gemini 2.5 Pro" },
        "if/kimi-k2-thinking": { "name": "Kimi K2 (Free)" }
      }
    }
  }
}

Step 3: Select the model in OpenCode:

/models
# Select any OmniRoute model from the list

Tip: Add any model available in your OmniRoute /v1/models endpoint to the models section. Use the format provider/model-id from your OmniRoute dashboard.


🐛 Troubleshooting

Click to expand troubleshooting guide

"Language model did not provide messages"

  • Provider quota exhausted → Check dashboard quota tracker
  • Solution: Use combo fallback or switch to cheaper tier

Rate limiting

  • Subscription quota out → Fallback to GLM/MiniMax
  • Add combo: cc/claude-opus-4-6 → glm/glm-4.7 → if/kimi-k2-thinking

OAuth token expired

  • Auto-refreshed by OmniRoute
  • If issues persist: Dashboard → Provider → Reconnect

High costs

  • Check usage stats in Dashboard → Costs
  • Switch primary model to GLM/MiniMax
  • Use free tier (Gemini CLI, iFlow) for non-critical tasks

Dashboard/API ports are wrong

  • PORT is the canonical base port (and API port by default)
  • API_PORT overrides only the OpenAI-compatible API listener
  • DASHBOARD_PORT overrides only the dashboard/Next.js listener
  • Set NEXT_PUBLIC_BASE_URL to your dashboard/public URL (for OAuth callbacks)
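As a concrete example, a split-port .env might look like this (all values are illustrative, not defaults you must use):

```shell
# .env — illustrative values only
PORT=20128                 # canonical base port (API by default)
API_PORT=20128             # OpenAI-compatible API listener
DASHBOARD_PORT=3000        # dashboard/Next.js listener
NEXT_PUBLIC_BASE_URL=https://omniroute.example.com   # public URL used for OAuth callbacks
```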

Cloud sync errors

  • Verify BASE_URL points to your running instance
  • Verify CLOUD_URL points to your expected cloud endpoint
  • Keep NEXT_PUBLIC_* values aligned with server-side values

First login not working

  • Check INITIAL_PASSWORD in .env
  • If it is unset, the fallback password is 123456

No request logs

  • Set ENABLE_REQUEST_LOGS=true in .env

Connection test shows "Invalid" for OpenAI-compatible providers

  • Many providers don't expose a /models endpoint
  • OmniRoute v1.0.6+ includes fallback validation via chat completions
  • Ensure base URL includes /v1 suffix

🔐 Remote OAuth Setup

⚠️ IMPORTANT for users running OmniRoute on a VPS, in Docker, or on any remote server

Why does Antigravity / Gemini CLI OAuth fail on remote servers?

The Antigravity and Gemini CLI providers use Google OAuth 2.0 for authentication. Google requires the redirect_uri used in the OAuth flow to match exactly one of the URIs pre-registered for the application in the Google Cloud Console.

The OAuth credentials bundled with OmniRoute are registered for localhost only. When you access OmniRoute on a remote server (e.g., https://omniroute.meuservidor.com), Google rejects the authentication with:

Error 400: redirect_uri_mismatch

Solution: configure your own OAuth credentials

You need to create an OAuth 2.0 Client ID in the Google Cloud Console with your server's URI.

Step by step

1. Open the Google Cloud Console

Go to: https://console.cloud.google.com/apis/credentials

2. Create a new OAuth 2.0 Client ID

  • Click "+ Create Credentials" → "OAuth client ID"
  • Application type: "Web application"
  • Name: any name you like (e.g., OmniRoute Remote)

3. Add the Authorized Redirect URIs

In the "Authorized redirect URIs" field, add:

https://seu-servidor.com/callback

Replace seu-servidor.com with your server's domain or IP (include the port if needed, e.g., http://45.33.32.156:20128/callback).

4. Save and copy the credentials

Once created, Google will display the Client ID and Client Secret.

5. Set the environment variables

In your .env (or in your Docker environment variables):

# For Antigravity:
ANTIGRAVITY_OAUTH_CLIENT_ID=seu-client-id.apps.googleusercontent.com
ANTIGRAVITY_OAUTH_CLIENT_SECRET=GOCSPX-seu-secret

# For Gemini CLI:
GEMINI_OAUTH_CLIENT_ID=seu-client-id.apps.googleusercontent.com
GEMINI_OAUTH_CLIENT_SECRET=GOCSPX-seu-secret
GEMINI_CLI_OAUTH_CLIENT_SECRET=GOCSPX-seu-secret

6. Restart OmniRoute

# If using npm:
npm run dev

# If using Docker:
docker restart omniroute

7. Try connecting again

Dashboard → Providers → Antigravity (or Gemini CLI) → OAuth

Google will now redirect correctly to https://seu-servidor.com/callback and authentication will work.


Temporary workaround (without configuring your own credentials)

If you'd rather not create your own credentials right now, you can still use the manual URL flow:

  1. OmniRoute opens Google's authorization URL
  2. After you authorize, Google tries to redirect to localhost (which fails on a remote server)
  3. Copy the full URL from your browser's address bar (even if the page never loads)
  4. Paste that URL into the field shown in OmniRoute's connection modal
  5. Click "Connect"

This workaround works because the authorization code in the URL is valid regardless of whether the redirect page loaded.


🛠️ Tech Stack

Click to expand tech stack details
  • Runtime: Node.js 18–22 LTS (⚠️ Node.js 24+ is not supported: better-sqlite3 native binaries are incompatible)
  • Language: TypeScript 5.9 — 100% TypeScript across src/ and open-sse/ (zero any in core modules since v2.0)
  • Framework: Next.js 16 + React 19 + Tailwind CSS 4
  • Database: LowDB (JSON) + SQLite (domain state + proxy logs + MCP audit + routing decisions)
  • Schemas: Zod (MCP tool I/O validation, API contracts)
  • Protocols: MCP (stdio/HTTP) + A2A v0.3 (JSON-RPC 2.0 + SSE)
  • Streaming: Server-Sent Events (SSE)
  • Auth: OAuth 2.0 (PKCE) + JWT + API Keys + MCP Scoped Authorization
  • Testing: Node.js test runner + Vitest (900+ tests including unit, integration, E2E)
  • CI/CD: GitHub Actions (auto npm publish + Docker Hub on release)
  • Website: omniroute.online
  • Package: npmjs.com/package/omniroute
  • Docker: hub.docker.com/r/diegosouzapw/omniroute
  • Resilience: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing, auto-combo self-healing

📖 Documentation

Document Description
User Guide Providers, combos, CLI integration, deployment
API Reference All endpoints with examples
MCP Server 16 MCP tools, IDE configs, Python/TS/Go clients
A2A Server JSON-RPC 2.0 protocol, skills, streaming, task mgmt
Auto-Combo Engine 6-factor scoring, mode packs, self-healing
Troubleshooting Common problems and solutions
Architecture System architecture and internals
Contributing Development setup and guidelines
OpenAPI Spec OpenAPI 3.0 specification
Security Policy Vulnerability reporting and security practices
VM Deployment Complete guide: VM + nginx + Cloudflare setup
Features Gallery Visual dashboard tour with screenshots
Release Checklist Pre-release validation steps

🗺️ Roadmap

OmniRoute has 210+ features planned across multiple development phases. Here are the key areas:

Category Planned Features Highlights
🧠 Routing & Intelligence 25+ Lowest-latency routing, tag-based routing, quota preflight, P2C account selection
🔒 Security & Compliance 20+ SSRF hardening, credential cloaking, rate-limit per endpoint, management key scoping
📊 Observability 15+ OpenTelemetry integration, real-time quota monitoring, cost tracking per model
🔄 Provider Integrations 20+ Dynamic model registry, provider cooldowns, multi-account Codex, Copilot quota parsing
Performance 15+ Dual cache layer, prompt cache, response cache, streaming keepalive, batch API
🌐 Ecosystem 10+ WebSocket API, config hot-reload, distributed config store, commercial mode

🔜 Coming Soon

  • 🔗 OpenCode Integration — Native provider support for the OpenCode AI coding IDE
  • 🔗 TRAE Integration — Full support for the TRAE AI development framework
  • 📦 Batch API — Asynchronous batch processing for bulk requests
  • 🎯 Tag-Based Routing — Route requests based on custom tags and metadata
  • 💰 Lowest-Cost Strategy — Automatically select the cheapest available provider

📝 Full feature specifications available in docs/new-features/ (217 detailed specs)


👥 Contributors

Contributors

How to Contribute

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.

Releasing a New Version

# Create a release — npm publish happens automatically
gh release create v2.0.0 --title "v2.0.0" --generate-notes

📊 Star History

Star History Chart

🙏 Acknowledgments

Special thanks to 9router by decolua — the original project that inspired this fork. OmniRoute builds upon that incredible foundation with additional features, multi-modal APIs, and a full TypeScript rewrite.

Special thanks to CLIProxyAPI — the original Go implementation that inspired this JavaScript port.


📄 License

MIT License - see LICENSE for details.


Built with ❤️ for developers who code 24/7
omniroute.online

About

OmniRoute is an AI gateway for multi-provider LLMs: an OpenAI-compatible endpoint with smart routing, load balancing, retries, and fallbacks. Add policies, rate limits, caching, and observability for reliable, cost-aware inference.
