Your universal API proxy — one endpoint, 36+ providers, zero downtime.
Chat Completions • Embeddings • Image Generation • Audio • Reranking • 100% TypeScript
Connect any AI-powered IDE or CLI tool through OmniRoute — a free API gateway for unlimited coding.
OpenClaw ⭐ 205K | NanoBot ⭐ 20.9K | PicoClaw ⭐ 14.6K | ZeroClaw ⭐ 9.9K | IronClaw ⭐ 2.1K
OpenCode ⭐ 106K | Codex CLI ⭐ 60.8K | Claude Code ⭐ 67.3K | Gemini CLI ⭐ 94.7K | Kilo Code ⭐ 15.5K
📡 All agents connect via http://localhost:20128/v1 or http://cloud.omniroute.online/v1 — one config, unlimited models and quotas
🌐 Website • 🚀 Quick Start • 💡 Features • 📖 Docs • 💰 Pricing
🌐 Available in: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino
Stop wasting money and hitting limits.
OmniRoute solves these problems:
- ✅ Maximize subscriptions — Track quotas and use every last bit before they reset
- ✅ Automatic failover — Subscription → API Key → Cheap → Free, with zero downtime
- ✅ Multi-account — Round-robin across accounts per provider
- ✅ Universal — Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, and any CLI tool
┌─────────────┐
│  Your CLI   │ (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│    tool     │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌─────────────────────────────────────────┐
│ OmniRoute (smart router)                │
│ • Format translation (OpenAI ↔ Claude)  │
│ • Quota tracking + Embeddings + Images  │
│ • Auto token refresh                    │
└──────┬──────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │      ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │      ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │      ↓ budget limit
       └─→ [Tier 4: FREE] iFlow, Qwen, Kiro (unlimited)
Result: never stop coding, at the lowest possible cost
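The tiered failover in the diagram can be sketched as a simple loop. This is an illustrative sketch only (the `Tier` shape and function names are assumptions, not OmniRoute's internals): each tier is tried in order, and a quota or provider error falls through to the next one.

```typescript
// Hypothetical sketch of 4-tier failover: try each tier until one succeeds.
type Tier = { name: string; call: (prompt: string) => Promise<string> };

async function withFallback(tiers: Tier[], prompt: string): Promise<string> {
  let lastErr: unknown;
  for (const tier of tiers) {
    try {
      return await tier.call(prompt); // first tier with quota left wins
    } catch (err) {
      lastErr = err; // quota exhausted / 5xx → fall through to the next tier
    }
  }
  throw lastErr; // every tier failed
}
```

The same loop works whether the tiers are subscriptions, API keys, or free providers — only the `call` implementation differs.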
Every developer using AI tools faces these problems daily. OmniRoute was built to solve them all — from cost overruns to regional blocks, from broken OAuth flows to zero observability.
💸 1. "I pay for an expensive subscription but still get interrupted by limits"
Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even when paying, quotas have ceilings — 5 hours of usage, weekly limits, or per-minute rate limits. Mid-coding session, the provider stops responding and the developer loses flow and productivity.
How OmniRoute solves it:
- Smart 4-Tier Fallback — If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
- Real-Time Quota Tracking — Shows token consumption in real-time with reset countdown (5h, daily, weekly)
- Multi-Account Support — Multiple accounts per provider with auto round-robin — when one runs out, switches to the next
- Custom Combos — Customizable fallback chains with 6 balancing strategies (fill-first, round-robin, P2C, random, least-used, cost-optimized)
- Codex Business Quotas — Business/Team workspace quota monitoring directly in the dashboard
🔌 2. "I need to use multiple providers but each has a different API"
OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. If a dev wants to test models from different providers or fallback between them, they need to reconfigure SDKs, change endpoints, deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.
How OmniRoute solves it:
- Unified Endpoint — A single `http://localhost:20128/v1` serves as proxy for all 36+ providers
- Format Translation — Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
- Response Sanitization — Strips non-standard fields (`x_groq`, `usage_breakdown`, `service_tier`) that break OpenAI SDK v1.83+
- Role Normalization — Converts `developer` → `system` for non-OpenAI providers; `system` → `user` for GLM/ERNIE
- Think Tag Extraction — Extracts `<think>` blocks from models like DeepSeek R1 into a standardized `reasoning_content` field
- Structured Output for Gemini — Automatic `json_schema` → `responseMimeType`/`responseSchema` conversion
- `stream` defaults to `false` — Aligns with the OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs
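Think-tag extraction can be sketched as a small pure function. This is an assumed shape for illustration, not OmniRoute's actual code: a leading `<think>…</think>` block in the model output is moved into a separate `reasoning_content` field, the way DeepSeek R1-style responses are normalized.

```typescript
// Sketch: split a leading <think>…</think> block out of model output.
function extractThink(raw: string): { content: string; reasoning_content?: string } {
  const match = raw.match(/^<think>([\s\S]*?)<\/think>\s*/);
  if (!match) return { content: raw }; // no think block → pass through unchanged
  return {
    content: raw.slice(match[0].length),   // visible answer
    reasoning_content: match[1].trim(),    // hidden chain of thought
  };
}
```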
🌐 3. "My AI provider blocks my region/country"
Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like unsupported_country_region_territory during OAuth and API connections. This is especially frustrating for developers from developing countries.
How OmniRoute solves it:
- 3-Level Proxy Config — Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
- Color-Coded Proxy Badges — Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
- OAuth Token Exchange Through Proxy — The OAuth flow also goes through the proxy, solving `unsupported_country_region_territory`
- Connection Tests via Proxy — Connection tests use the configured proxy (no more direct bypass)
- SOCKS5 Support — Full SOCKS5 proxy support for outbound routing
- TLS Fingerprint Spoofing — Browser-like TLS fingerprint via `wreq-js` to bypass bot detection
🆓 4. "I want to use AI for coding but I have no money"
Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.
How OmniRoute solves it:
- Free Tier Providers Built-in — Native support for 100% free providers: iFlow (8 unlimited models), Qwen (3 unlimited models), Kiro (Claude for free), Gemini CLI (180K/month free)
- Free-Only Combos — Chain `gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus` = $0/month with zero downtime
- NVIDIA NIM Free Credits — 1000 free credits integrated
- Cost Optimized Strategy — Routing strategy that automatically chooses the cheapest available provider
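A cost-optimized routing strategy can be sketched as picking the cheapest currently available provider. The `Provider` field names here are assumptions for illustration, not OmniRoute's actual data model.

```typescript
// Sketch of a "cost-optimized" strategy: cheapest available provider wins.
type Provider = { id: string; pricePer1M: number; available: boolean };

function pickCheapest(providers: Provider[]): Provider | undefined {
  return providers
    .filter((p) => p.available)                    // skip exhausted / disabled
    .sort((a, b) => a.pricePer1M - b.pricePer1M)[0]; // lowest $/1M tokens first
}
```

With a free provider in the pool, this strategy naturally converges on $0 until that provider is exhausted.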
🔒 5. "I need to protect my AI gateway from unauthorized access"
When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.
How OmniRoute solves it:
- API Key Management — Generation, rotation, and scoping per provider with a dedicated `/dashboard/api-manager` page
- Model-Level Permissions — Restrict API keys to specific models (`openai/*` wildcard patterns), with an Allow All/Restrict toggle
- API Endpoint Protection — Require a key for `/v1/models` and block specific providers from the listing
- Auth Guard + CSRF Protection — All dashboard routes protected with `withAuth` middleware + CSRF tokens
- Rate Limiter — Per-IP rate limiting with configurable windows
- IP Filtering — Allowlist/blocklist for access control
- Prompt Injection Guard — Sanitization against malicious prompt patterns
- AES-256-GCM Encryption — Credentials encrypted at rest
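Encrypting credentials at rest with AES-256-GCM looks roughly like this in Node.js. This is a minimal sketch using the standard `node:crypto` API; OmniRoute's actual key derivation and storage layout may differ.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypt a credential with AES-256-GCM; pack iv + auth tag + ciphertext.
function encrypt(plain: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plain, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), data]).toString("base64");
}

function decrypt(blob: string, key: Buffer): string {
  const buf = Buffer.from(blob, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28); // GCM auth tag is 16 bytes
  const data = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // decryption fails if the ciphertext was tampered with
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}
```

The GCM auth tag gives integrity as well as confidentiality — a flipped bit in the stored blob makes decryption throw instead of returning garbage.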
🛑 6. "My provider went down and I lost my coding flow"
AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.
How OmniRoute solves it:
- Circuit Breaker per-provider — Auto-open/close with configurable thresholds and cooldown (Closed/Open/Half-Open)
- Exponential Backoff — Progressive retry delays
- Anti-Thundering Herd — Mutex + semaphore protection against concurrent retry storms
- Combo Fallback Chains — If the primary provider fails, automatically falls through the chain with no intervention
- Combo Circuit Breaker — Auto-disables failing providers within a combo chain
- Health Dashboard — Uptime monitoring, circuit breaker states, lockouts, cache stats, p50/p95/p99 latency
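The exponential-backoff behavior above can be sketched as a delay function. This shows the common "full jitter" variant as an illustration; the base, cap, and jitter scheme OmniRoute actually uses are not specified here.

```typescript
// Sketch of exponential backoff with full jitter: delay grows as base * 2^attempt,
// capped, then a uniform random fraction of it is used to avoid thundering herds.
function backoffDelay(attempt: number, baseMs = 500, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * exp); // full jitter: 0 .. exp
}
```

Randomizing the delay is what prevents a fleet of retrying clients from hammering a recovering provider in lockstep — the same problem the mutex + semaphore guard addresses for concurrent retries.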
🔧 7. "Configuring each AI tool is tedious and repetitive"
Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model). Reconfiguring when switching providers or models is a waste of time.
How OmniRoute solves it:
- CLI Tools Dashboard — Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
- GitHub Copilot Config Generator — Generates `chatLanguageModels.json` for VS Code with bulk model selection
- Onboarding Wizard — Guided 4-step setup for first-time users
- One Endpoint, All Models — Configure `http://localhost:20128/v1` once, access 36+ providers
🔑 8. "Managing OAuth tokens from multiple providers is hell"
Claude Code, Codex, Gemini CLI, Copilot — all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly, deal with `client_secret is missing` and `redirect_uri_mismatch` errors, and handle failures on remote servers. OAuth on LAN/VPS is particularly problematic.
How OmniRoute solves it:
- Auto Token Refresh — OAuth tokens refresh in background before expiration
- OAuth 2.0 (PKCE) Built-in — Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, iFlow
- Multi-Account OAuth — Multiple accounts per provider via JWT/ID token extraction
- OAuth LAN/Remote Fix — Private IP detection for `redirect_uri` + manual URL mode for remote servers
- OAuth Behind Nginx — Uses `window.location.origin` for reverse proxy compatibility
- Remote OAuth Guide — Step-by-step guide for Google Cloud credentials on VPS/Docker
📊 9. "I don't know how much I'm spending or where"
Developers use multiple paid providers but have no unified view of spending. Each provider has its own billing dashboard, but there's no consolidated view. Unexpected costs can pile up.
How OmniRoute solves it:
- Cost Analytics Dashboard — Per-token cost tracking and budget management per provider
- Budget Limits per Tier — Spending ceiling per tier that triggers automatic fallback
- Per-Model Pricing Configuration — Configurable prices per model
- Usage Statistics Per API Key — Request count and last-used timestamp per key
- Analytics Dashboard — Stat cards, model usage chart, provider table with success rates and latency
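Per-token cost tracking boils down to simple arithmetic over usage counts and per-1M pricing. The sketch below uses assumed field names; the prices in the test are the GLM figure quoted elsewhere in this README.

```typescript
// Sketch: cost of one request from token usage and per-1M-token pricing.
type Pricing = { inputPer1M: number; outputPer1M: number };

function requestCostUSD(promptTokens: number, completionTokens: number, p: Pricing): number {
  // dollars = tokens * ($/1M tokens) / 1M
  return (promptTokens * p.inputPer1M + completionTokens * p.outputPer1M) / 1_000_000;
}
```

Summing this per provider is all a consolidated spend view needs; budget limits per tier are then just a threshold on the running total.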
🐛 10. "I can't diagnose errors and problems in AI calls"
When a call fails, the dev doesn't know if it was a rate limit, expired token, wrong format, or provider error. Fragmented logs across different terminals. Without observability, debugging is trial-and-error.
How OmniRoute solves it:
- Unified Logs Dashboard — 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
- Console Log Viewer — Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
- SQLite Proxy Logs — Persistent logs that survive server restarts
- Translator Playground — 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
- Request Telemetry — p50/p95/p99 latency + X-Request-Id tracing
- File-Based Logging with Rotation — Console interceptor captures everything to JSON log with size-based rotation
🏗️ 11. "Deploying and maintaining the gateway is complex"
Installing, configuring, and maintaining an AI proxy across different environments (local, VPS, Docker, cloud) is labor-intensive. Problems like hardcoded paths, EACCES on directories, port conflicts, and cross-platform builds add friction.
How OmniRoute solves it:
- npm Global Install — `npm install -g omniroute && omniroute` — done
- Docker Multi-Platform — AMD64 + ARM64 native (Apple Silicon, AWS Graviton, Raspberry Pi)
- Docker Compose Profiles — `base` (no CLI tools) and `cli` (with Claude Code, Codex, OpenClaw)
- Electron Desktop App — Native app for Windows/macOS/Linux with system tray, auto-start, offline mode
- Split-Port Mode — API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
- Cloud Sync — Config synchronization across devices via Cloudflare Workers
- DB Backups — Automatic backup, restore, export and import of all settings
🌍 12. "The interface is English-only and my team doesn't speak English"
Teams in non-English-speaking countries, especially in Latin America, Asia, and Europe, struggle with English-only interfaces. Language barriers reduce adoption and increase configuration errors.
How OmniRoute solves it:
- Dashboard i18n — 30 Languages — All 500+ keys translated including Arabic, Bulgarian, Danish, German, Spanish, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese (PT/BR), Romanian, Russian, Slovak, Swedish, Thai, Ukrainian, Vietnamese, Chinese, Filipino, English
- RTL Support — Right-to-left support for Arabic and Hebrew
- Multi-Language READMEs — 30 complete documentation translations
- Language Selector — Globe icon in header for real-time switching
🔄 13. "I need more than chat — I need embeddings, images, audio"
AI isn't just chat completion. Devs need to generate images, transcribe audio, create embeddings for RAG, rerank documents, and moderate content. Each API has a different endpoint and format.
How OmniRoute solves it:
- Embeddings —
/v1/embeddingswith 6 providers and 9+ models - Image Generation —
/v1/images/generationswith 4 providers and 9+ models - Audio Transcription —
/v1/audio/transcriptions— Whisper-compatible - Text-to-Speech —
/v1/audio/speech— Multi-provider audio synthesis - Moderations —
/v1/moderations— Content safety checks - Reranking —
/v1/rerank— Document relevance reranking - Responses API — Full
/v1/responsessupport for Codex
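A typical consumer of `/v1/embeddings` output is RAG-style ranking by cosine similarity. The sketch below is just the math over the returned vectors (the endpoint itself is not called here); it assumes OpenAI-style equal-length float arrays.

```typescript
// Sketch: cosine similarity between two embedding vectors, for RAG ranking.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb)); // 1 = same direction, 0 = orthogonal
}
```

Ranking candidate chunks by `cosine(queryEmbedding, chunkEmbedding)` and taking the top k is the core of most retrieval pipelines built on this endpoint.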
🧪 14. "I have no way to test and compare quality across models"
Developers want to know which model is best for their use case — code, translation, reasoning — but comparing manually is slow. No integrated eval tools exist.
How OmniRoute solves it:
- LLM Evaluations — Golden set testing with 10 pre-loaded cases covering greetings, math, geography, code generation, JSON compliance, translation, markdown, safety refusal
- 4 Match Strategies —
exact,contains,regex,custom(JS function) - Translator Playground Test Bench — Batch testing with multiple inputs and expected outputs, cross-provider comparison
- Chat Tester — Full round-trip with visual response rendering
- Live Monitor — Real-time stream of all requests flowing through the proxy
📈 15. "I need to scale without losing performance"
As request volume grows, without caching the same questions generate duplicate costs. Without idempotency, duplicate requests waste processing. Per-provider rate limits must be respected.
How OmniRoute solves it:
- Semantic Cache — Two-tier cache (signature + semantic) reduces cost and latency
- Request Idempotency — 5s deduplication window for identical requests
- Rate Limit Detection — Per-provider RPM, min gap, and max concurrent tracking
- Editable Rate Limits — Configurable defaults in Settings → Resilience with persistence
- API Key Validation Cache — 3-tier cache for production performance
- Health Dashboard with Telemetry — p50/p95/p99 latency, cache stats, uptime
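The 5-second idempotency window can be sketched as a promise cache. Keying, TTL handling, and eviction here are assumptions for illustration — the point is that identical requests inside the window share one in-flight upstream call.

```typescript
// Sketch of a 5s idempotency window: duplicate requests with the same key
// reuse the in-flight promise instead of hitting the provider again.
const inflight = new Map<string, { at: number; result: Promise<string> }>();

function dedupe(key: string, run: () => Promise<string>, windowMs = 5000): Promise<string> {
  const now = Date.now();
  const hit = inflight.get(key);
  if (hit && now - hit.at < windowMs) return hit.result; // reuse within the window
  const entry = { at: now, result: run() };
  inflight.set(key, entry);
  return entry.result;
}
```

A real implementation would also evict stale entries and hash the full request body into the key; this sketch keeps only the core reuse logic.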
🤖 16. "I want to control model behavior globally"
Developers who want all responses in a specific language, with a specific tone, or want to limit reasoning tokens. Configuring this in every tool/request is impractical.
How OmniRoute solves it:
- System Prompt Injection — Global prompt applied to all requests
- Thinking Budget Validation — Reasoning token allocation control per request (passthrough, auto, custom, adaptive)
- 6 Routing Strategies — Global strategies that determine how requests are distributed
- Wildcard Router —
provider/*patterns route dynamically to any provider - Combo Enable/Disable Toggle — Toggle combos directly from the dashboard
- Provider Toggle — Enable/disable all connections for a provider with one click
- Blocked Providers — Exclude specific providers from
/v1/modelslisting
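Wildcard routing like `provider/*` can be sketched as a tiny matcher. The semantics assumed here (a literal match, or `*` matching any trailing model name after the provider prefix) are an illustration, not a guarantee of OmniRoute's exact rules.

```typescript
// Sketch: match "provider/*" wildcard patterns against full model IDs.
function matchesPattern(pattern: string, modelId: string): boolean {
  if (pattern === modelId) return true; // literal match
  if (pattern.endsWith("/*")) {
    // "openai/*" → keep "openai/" and prefix-match the model ID
    return modelId.startsWith(pattern.slice(0, -1));
  }
  return false;
}
```

The same matcher serves both the wildcard router and model-level API-key permissions (`openai/*` patterns mentioned earlier).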
1. Install globally:
```shell
npm install -g omniroute
omniroute
```
🎉 The dashboard opens at http://localhost:20128
| Command | Description |
|---|---|
| `omniroute` | Start the server (default port 20128) |
| `omniroute --port 3000` | Use a custom port |
| `omniroute --no-open` | Don't open the browser automatically |
| `omniroute --help` | Show help |
2. Connect a free provider:
Dashboard → Providers → Connect Claude Code or Antigravity → OAuth login → Done!
3. Use it in your CLI tool:
Claude Code / Codex / Gemini CLI / OpenClaw / Cursor / Cline settings:
Endpoint: http://localhost:20128/v1
API Key: [copy from the dashboard]
Model: if/kimi-k2-thinking
Done! Start coding with free AI models.
Alternative — run from source:
```shell
cp .env.example .env
npm install
PORT=20128 NEXT_PUBLIC_BASE_URL=http://localhost:20128 npm run dev
```
OmniRoute is also available as a public Docker image on Docker Hub.
Quick run:
```shell
docker run -d \
  --name omniroute \
  --restart unless-stopped \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```
With an env file:
```shell
# Copy and edit .env first
cp .env.example .env
docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --env-file .env \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```
With Docker Compose:
```shell
# Base profile (no CLI tools)
docker compose --profile base up -d
# CLI profile (Claude Code, Codex, OpenClaw built in)
docker compose --profile cli up -d
```
| Image | Tag | Size | Description |
|---|---|---|---|
| `diegosouzapw/omniroute` | `latest` | ~250MB | Latest stable |
| `diegosouzapw/omniroute` | `1.0.6` | ~250MB | Current release |
🆕 New! OmniRoute now ships as a native desktop app for Windows, macOS, and Linux.
- 🖥️ Native window — Dedicated app window with system-tray integration
- 🔄 Auto-start — Launch OmniRoute at system login
- 🔔 Native notifications — Get alerted when quota runs out or a provider has issues
- ⚡ One-click install — NSIS (Windows), DMG (macOS), AppImage (Linux)
- 🌐 Offline mode — Built-in server works fully offline
```shell
npm run electron:dev          # development mode
npm run electron:build        # current platform
npm run electron:build:win    # Windows (.exe)
npm run electron:build:mac    # macOS (.dmg)
npm run electron:build:linux  # Linux (.AppImage)
```
📖 Full documentation: electron/README.md
| Tier | Provider | Cost | Quota reset | Best for |
|---|---|---|---|---|
| 💳 Subscription | Claude Code (Pro) | $20/mo | 5h + weekly | Existing subscribers |
| | Codex (Plus/Pro) | $20–200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | Free | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10–19/mo | Monthly | GitHub users |
| 🔑 API Key | NVIDIA NIM | Free (1000 credits) | One-time | Free testing |
| | DeepSeek | Pay-as-you-go | None | Best value |
| | Groq | Free tier + paid | Rate-limited | Ultra-fast inference |
| | xAI (Grok) | Pay-as-you-go | None | Grok models |
| | Mistral | Free tier + paid | Rate-limited | European AI |
| | OpenRouter | Pay-as-you-go | None | 100+ models |
| 💰 Cheap | GLM-4.7 | $0.6/1M | Daily at 10:00 | Budget fallback |
| | MiniMax M2.1 | $0.2/1M | 5h rolling | Cheapest option |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| 🆓 Free | iFlow | $0 | Unlimited | 8 free models |
| | Qwen | $0 | Unlimited | 3 free models |
| | Kiro | $0 | Unlimited | Free Claude |
💡 Pro tip: Start with Gemini CLI (180K free per month) + iFlow (unlimited, free) = $0 cost!
| Feature | Description |
|---|---|
| 🎯 Smart 4-tier failover | Automatic routing: Subscription → API Key → Cheap → Free |
| 📊 Real-time quota tracking | Live token counts + reset countdown per provider |
| 🔄 Format translation | Seamless OpenAI ↔ Claude ↔ Gemini ↔ Cursor ↔ Kiro |
| 👥 Multi-account support | Multiple accounts per provider with smart selection |
| 🔄 Auto token refresh | OAuth tokens refresh and retry automatically |
| 🎨 Custom combos | 6 strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized |
| 🧩 Custom models | Add any model ID for any provider |
| 🌐 Wildcard routing | Route provider/* patterns dynamically to any provider |
| 🧠 Thinking budget | passthrough, auto, custom, and adaptive modes for reasoning models |
| 🔀 Model aliases | Auto-forward deprecated model IDs to current replacements (built-in + custom) |
| ⚡ Background degradation | Auto-route background tasks (titles, summaries) to cheaper models |
| 💬 System prompt injection | Apply a global system prompt to all requests |
| 📄 Responses API | Full OpenAI Responses API support (/v1/responses) for Codex |
| Feature | Description |
|---|---|
| 🖼️ Image generation | /v1/images/generations — 4 providers, 9+ models |
| 📐 Embeddings | /v1/embeddings — 6 providers, 9+ models |
| 🎤 Audio transcription | /v1/audio/transcriptions — Whisper-compatible |
| 🔊 Text-to-speech | /v1/audio/speech — Multi-provider audio synthesis |
| 🛡️ Moderations | /v1/moderations — Content safety checks |
| 🔀 Reranking | /v1/rerank — Document relevance reranking |
| Feature | Description |
|---|---|
| 🔌 Circuit breaker | Per-provider auto open/close with configurable thresholds |
| 🛡️ Anti-thundering-herd | Mutex + semaphore rate limiting for API-key providers |
| 🧠 Semantic cache | Two-tier cache (signature + semantic) cuts cost and latency |
| ⚡ Request idempotency | 5-second dedup window against duplicate requests |
| 🔒 TLS fingerprint spoofing | Bypass TLS-based bot detection via wreq-js |
| 🌐 IP filtering | Allowlist/blocklist for API access control |
| 📊 Editable rate limits | Configurable RPM, min gap, and max concurrency |
| 💾 Rate limit persistence | Learned limits survive restarts via SQLite with 60s debounce + 24h staleness |
| 🔄 Token refresh resilience | Per-provider circuit breaker (5 fails → 30 min) + 30s timeout per attempt |
| Feature | Description |
|---|---|
| 📝 Request logs | Debug mode with full request/response logging |
| 💾 SQLite logs | Persistent proxy logs that survive server restarts |
| 📊 Analytics dashboard | Recharts: stat cards, usage charts, provider tables |
| 📈 Progress tracking | SSE progress events for streaming (optional) |
| 🧪 LLM evaluations | Golden-set testing with 4 match strategies |
| 🔍 Request telemetry | p50/p95/p99 latency aggregation + X-Request-Id tracing |
| 📋 Logs + quotas | Dedicated pages for log browsing and quota tracking |
| 🏥 Health dashboard | Uptime, circuit breaker states, lockouts, cache stats |
| 💰 Cost tracking | Budget management + per-model pricing configuration |
| Feature | Description |
|---|---|
| 💾 Cloud Sync | Sync config across devices via Cloudflare Workers |
| 🌐 Deploy anywhere | Localhost, VPS, Docker, Cloudflare Workers |
| 🔑 API key management | Generate, rotate, and scope API keys per provider |
| 🧙 Onboarding wizard | Guided 4-step setup for new users |
| 🔧 CLI tools dashboard | One-click config for Claude, Codex, Cline, OpenClaw, Kilo, Antigravity |
| 🔄 DB backups | Automatic backup and restore of all settings |
📖 Feature Details
Create combos with automatic failover:
Combo: "my-coding-stack"
1. cc/claude-opus-4-6 (your subscription)
2. nvidia/llama-3.3-70b (free NVIDIA API)
3. glm/glm-4.7 (cheap fallback, $0.6/1M)
4. if/kimi-k2-thinking (free last resort)
→ Switches automatically on quota exhaustion or errors
- Token consumption per provider
- Reset countdowns (5-hour, daily, weekly)
- Cost estimates for paid tiers
- Monthly spending reports
Seamless translation between formats:
- OpenAI ↔ Claude ↔ Gemini ↔ OpenAI Responses
- Your CLI sends OpenAI format → OmniRoute translates → the provider receives its native format
- Works with any tool that supports a custom OpenAI endpoint
- Add multiple accounts per provider
- Automatic round-robin or priority-based routing
- When one account hits its quota, switch to the next automatically
- OAuth tokens refresh automatically before they expire
- No manual re-authentication needed
- A seamless experience across all providers
- Create unlimited model combos
- 6 strategies: fill-first, round-robin, power-of-two-choices, random, least-used, cost-optimized
- Share combos across devices via Cloud Sync
- System status (uptime, version, memory usage)
- Circuit breaker state per provider (Closed/Open/Half-Open)
- Rate-limit status and active lockouts
- Signature cache statistics
- Latency telemetry (p50/p95/p99) + prompt caching
- One-click health reset
- Debug, test, and visualize API format translation
- Send a request and watch how OmniRoute translates between provider formats
- Invaluable for troubleshooting integration issues
- Sync providers, combos, and settings across devices
- Automatic background sync
- Secure encrypted storage
Problem: Quota expires unused, then you hit rate limits at peak coding hours
Combo: "maximize-claude"
1. cc/claude-opus-4-6 (use your subscription fully)
2. glm/glm-4.7 (cheap fallback when quota runs out)
3. if/kimi-k2-thinking (free emergency fallback)
Monthly cost: $20 (subscription) + ~$5 (fallback) = $25 total
Compare: $20 + hitting limits = frustration
Problem: Can't afford a subscription but need reliable AI coding
Combo: "free-forever"
1. gc/gemini-3-flash (180K free per month)
2. if/kimi-k2-thinking (unlimited, free)
3. qw/qwen3-coder-plus (unlimited, free)
Monthly cost: $0
Quality: production-grade models
Problem: Tight deadlines; downtime is not an option
Combo: "always-on"
1. cc/claude-opus-4-6 (best quality)
2. cx/gpt-5.2-codex (second subscription)
3. glm/glm-4.7 (cheap, daily reset)
4. minimax/MiniMax-M2.1 (cheapest, 5-hour reset)
5. if/kimi-k2-thinking (free, unlimited)
Result: 5 tiers of failover = zero downtime
Problem: Need an AI assistant inside messaging apps, completely free
Combo: "openclaw-free"
1. if/glm-4.7 (unlimited, free)
2. if/minimax-m2.1 (unlimited, free)
3. if/kimi-k2-thinking (unlimited, free)
Monthly cost: $0
Reach it from: WhatsApp, Telegram, Slack, Discord, iMessage, Signal...
💳 Subscription providers
Dashboard → Providers → Connect Claude Code
→ OAuth login → automatic token refresh
→ 5-hour + weekly quota tracking
Models:
cc/claude-opus-4-6
cc/claude-sonnet-4-5-20250929
cc/claude-haiku-4-5-20251001
Pro tip: Use Opus for complex tasks, Sonnet for speed. OmniRoute tracks quota per model!
Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly resets
Models:
cx/gpt-5.2-codex
cx/gpt-5.1-codex-max
Dashboard → Providers → Connect Gemini CLI
→ Google OAuth
→ 180K completions per month + 1K per day
Models:
gc/gemini-3-flash-preview
gc/gemini-2.5-pro
Best value: a huge free allowance! Use it before any paid tier.
Dashboard → Providers → Connect GitHub
→ GitHub OAuth
→ Monthly reset (1st of each month)
Models:
gh/gpt-5
gh/claude-4.5-sonnet
gh/gemini-3-pro
🔑 API-key providers
- Sign up: build.nvidia.com
- Get a free API key (includes 1000 inference credits)
- Dashboard → Add Provider → NVIDIA NIM:
  - API Key: `nvapi-your-key`
Models: nvidia/llama-3.3-70b-instruct, nvidia/mistral-7b-instruct, and 50+ more
Pro tip: OpenAI-compatible API — works perfectly with OmniRoute's format translation!
- Sign up: platform.deepseek.com
- Get an API key
- Dashboard → Add Provider → DeepSeek
Models: deepseek/deepseek-chat, deepseek/deepseek-coder
- Sign up: console.groq.com
- Get an API key (free tier included)
- Dashboard → Add Provider → Groq
Models: groq/llama-3.3-70b, groq/mixtral-8x7b
Pro tip: Ultra-fast inference — great for real-time coding!
- Sign up: openrouter.ai
- Get an API key
- Dashboard → Add Provider → OpenRouter
Models: 100+ models from every major provider through a single API key.
💰 Cheap providers (fallback)
- Sign up: Zhipu AI
- Get an API key from the Coding Plan
- Dashboard → Add API Key:
  - Provider: `glm`
  - API Key: `your-key`
Use: glm/glm-4.7
Pro tip: The Coding Plan gives 3× the quota at 1/7 the price! Resets daily at 10:00 AM.
- Sign up: MiniMax
- Get an API key
- Dashboard → Add API Key
Use: minimax/MiniMax-M2.1
Pro tip: The cheapest option for long context (1M tokens)!
- Subscribe: Moonshot AI
- Get an API key
- Dashboard → Add API Key
Use: kimi/kimi-latest
Pro tip: A flat $9/month for 10M tokens = $0.90/1M effective cost!
🆓 Free providers (emergency fallback)
Dashboard → Connect iFlow
→ iFlow OAuth login
→ Unlimited usage
Models:
if/kimi-k2-thinking
if/qwen3-coder-plus
if/glm-4.7
if/minimax-m2
if/deepseek-r1
Dashboard → Connect Qwen
→ Device-code authorization
→ Unlimited usage
Models:
qw/qwen3-coder-plus
qw/qwen3-coder-flash
Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited usage
Models:
kr/claude-sonnet-4.5
kr/claude-haiku-4.5
🎨 Creating combos
Dashboard → Combos → Create New
Name: premium-coding
Models:
1. cc/claude-opus-4-6 (subscription workhorse)
2. glm/glm-4.7 (cheap fallback, $0.6/1M)
3. minimax/MiniMax-M2.1 (cheapest last resort, $0.20/1M)
Use in your CLI: premium-coding
Name: free-combo
Models:
1. gc/gemini-3-flash-preview (180K free per month)
2. if/kimi-k2-thinking (unlimited)
3. qw/qwen3-coder-plus (unlimited)
Cost: $0, forever!
🔧 CLI integration
Settings → Models → Advanced:
OpenAI API Base URL: http://localhost:20128/v1
OpenAI API Key: [from the OmniRoute dashboard]
Model: cc/claude-opus-4-6
Use the CLI Tools page in the dashboard for one-click setup, or edit ~/.claude/settings.json manually.
```shell
export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-omniroute-api-key"
codex "your prompt"
```
Option 1 — Dashboard (recommended):
Dashboard → CLI Tools → OpenClaw → select models → Apply
Option 2 — Manual: edit ~/.openclaw/openclaw.json:
```json
{
  "models": {
    "providers": {
      "omniroute": {
        "baseUrl": "http://127.0.0.1:20128/v1",
        "apiKey": "sk_omniroute",
        "api": "openai-completions"
      }
    }
  }
}
```
Note: OpenClaw only supports a local OmniRoute. Use `127.0.0.1` instead of `localhost` to avoid IPv6 resolution issues.
Settings → API Configuration:
Provider: OpenAI Compatible
Base URL: http://localhost:20128/v1
API Key: [from the OmniRoute dashboard]
Model: if/kimi-k2-thinking
OmniRoute ships with a built-in evaluation framework for testing LLM response quality against golden sets. Access it via Analytics → Evals in the dashboard.
The preloaded "OmniRoute Golden Set" contains 10 test cases:
- Greetings, math, geography, code generation
- JSON format compliance, translation, markdown
- Safety refusal (harmful content), counting, boolean logic
| Strategy | Description | Example |
|---|---|---|
| `exact` | Output must match exactly | `"4"` |
| `contains` | Output must contain the substring (case-insensitive) | `"Paris"` |
| `regex` | Output must match the regex pattern | `"1.*2.*3"` |
| `custom` | Custom JS function returning true/false | `(output) => output.length > 10` |
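The four strategies in the table can be sketched as one dispatch function. The exact semantics below (case-insensitive `contains`, unanchored `regex`) are read from the table; anything beyond that is an assumption.

```typescript
// Sketch of the four eval match strategies: exact, contains, regex, custom.
type Matcher =
  | { kind: "exact"; expect: string }
  | { kind: "contains"; expect: string }
  | { kind: "regex"; expect: string }
  | { kind: "custom"; fn: (output: string) => boolean };

function matches(output: string, m: Matcher): boolean {
  switch (m.kind) {
    case "exact":
      return output === m.expect;
    case "contains":
      return output.toLowerCase().includes(m.expect.toLowerCase());
    case "regex":
      return new RegExp(m.expect).test(output);
    case "custom":
      return m.fn(output);
  }
}
```

Running each golden-set case through `matches` against the model's output yields the pass/fail result the eval dashboard reports.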
Click to expand the troubleshooting guide
"Language model did not provide messages"
- Provider quota exhausted → check the quota tracker in the dashboard
- Fix: use combo failover or switch to a cheaper tier
Rate limits
- Subscription quota exhausted → falls back to GLM/MiniMax
- Add a combo: cc/claude-opus-4-6 → glm/glm-4.7 → if/kimi-k2-thinking
Expired OAuth tokens
- OmniRoute refreshes them automatically
- If the problem persists: Dashboard → Providers → Reconnect
High costs
- Check usage stats under Dashboard → Costs
- Switch your primary model to GLM/MiniMax
- Use free tiers (Gemini CLI, iFlow) for non-critical tasks
Dashboard opens on the wrong port
- Set `PORT=20128` and `NEXT_PUBLIC_BASE_URL=http://localhost:20128`
Cloud sync errors
- Verify `BASE_URL` points to your running instance
- Verify `CLOUD_URL` points to the intended cloud endpoint
- Keep `NEXT_PUBLIC_*` values consistent with their server-side counterparts
First login doesn't work
- Check `INITIAL_PASSWORD` in `.env`
- If unset, the default password is `123456`
No request logs
- Set `ENABLE_REQUEST_LOGS=true` in `.env`
OpenAI-compatible provider connection test shows "Invalid"
- Many providers don't expose a `/models` endpoint
- OmniRoute v1.0.6+ falls back to validating via chat completions
- Make sure the base URL includes the `/v1` suffix
- Runtime: Node.js 20+
- Language: TypeScript 5.9 — 100% TypeScript in `src/` and `open-sse/` (v1.0.6)
- Framework: Next.js 16 + React 19 + Tailwind CSS 4
- Database: LowDB (JSON) + SQLite (domain state + proxy logs)
- Streaming: Server-Sent Events (SSE)
- Auth: OAuth 2.0 (PKCE) + JWT + API keys
- Testing: Node.js test runner (368+ unit tests)
- CI/CD: GitHub Actions (automatic npm publish + Docker Hub on release)
- Website: omniroute.online
- Package: npmjs.com/package/omniroute
- Docker: hub.docker.com/r/diegosouzapw/omniroute
- Resilience: circuit breakers, exponential backoff, anti-thundering-herd, TLS spoofing
| Document | Description |
|---|---|
| User Guide | Providers, combos, CLI integration, deployment |
| API Reference | All endpoints, with examples |
| Troubleshooting | Common problems and solutions |
| Architecture | System architecture and internals |
| Contributing Guide | Development setup and guidelines |
| OpenAPI Spec | OpenAPI 3.0 specification |
| Security Policy | Vulnerability reporting and security practices |
💬 Join our community! WhatsApp group — get help, share tips, stay up to date.
- Website: omniroute.online
- GitHub: github.com/diegosouzapw/OmniRoute
- Issues: github.com/diegosouzapw/OmniRoute/issues
- WhatsApp: community group
- Original project: 9router by decolua
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
See CONTRIBUTING.md for the detailed guide.
```shell
# Create a release — npm publish happens automatically
gh release create v1.0.6 --title "v1.0.6" --generate-notes
```
Special thanks to 9router by decolua — the original project that inspired this fork. OmniRoute builds on that incredible foundation with extra features, multimodal APIs, and a complete TypeScript rewrite.
Special thanks to CLIProxyAPI — the original Go implementation that inspired the JavaScript port.
MIT License — see LICENSE.
omniroute.online