🚀 OmniRoute: The Free AI Gateway

Never stop coding. Smart routing to FREE and low-cost AI models with automatic fallback.

Your universal API proxy: one endpoint, 36+ providers, zero downtime.

Chat Completions • Embeddings • Image Generation • Audio • Reranking • 100% TypeScript

🤖 Free AI Provider for Your Favorite Coding Agents

Connect any AI-powered IDE or CLI tool through OmniRoute, a free API gateway for unlimited coding.

OpenClaw (⭐ 205K)	NanoBot (⭐ 20.9K)	PicoClaw (⭐ 14.6K)	ZeroClaw (⭐ 9.9K)	IronClaw (⭐ 2.1K)
OpenCode (⭐ 106K)	Codex CLI (⭐ 60.8K)	Claude Code (⭐ 67.3K)	Gemini CLI (⭐ 94.7K)	Kilo Code (⭐ 15.5K)

📡 All agents connect via http://localhost:20128/v1 or http://cloud.omniroute.online/v1: one configuration, unlimited models and quota

🌐 Website • 🚀 Quick Start • 💡 Features • 📖 Docs • 💰 Pricing

🤔 Why OmniRoute?

Stop wasting money and hitting limits:

Subscription quota expires unused every month
Rate limits stop you in the middle of coding
Expensive APIs ($20-50/month per provider)
Manual switching between providers

OmniRoute solves this:

✅ Maximize subscriptions: track quotas and use everything before the reset
✅ Automatic fallback: Subscription → API Key → Cheap → Free, zero downtime
✅ Multi-account: round-robin between accounts per provider
✅ Universal: works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, and any CLI tool

🔄 How It Works

┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│   Tool      │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌─────────────────────────────────────────┐
│        OmniRoute (Smart Router)         │
│  • Format translation (OpenAI ↔ Claude) │
│  • Quota tracking + embeddings + images │
│  • Automatic token refresh              │
└──────┬──────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │   ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │   ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │   ↓ budget limit
       └─→ [Tier 4: FREE] iFlow, Qwen, Kiro (unlimited)

Result: never stop coding, at minimal cost

🎯 What OmniRoute Solves — 16 Real Pain Points

Every developer using AI tools faces these problems daily. OmniRoute was built to solve them all — from cost overruns to regional blocks, from broken OAuth flows to zero observability.

💸 1. "I pay for an expensive subscription but still get interrupted by limits"

Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even on a paid plan, quota has a ceiling: 5h usage windows, weekly limits, or per-minute rate limits. In the middle of a coding session, the provider stops responding and the developer loses flow and productivity.

How OmniRoute solves it:

Smart 4-Tier Fallback — If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
Real-Time Quota Tracking — Shows token consumption in real-time with reset countdown (5h, daily, weekly)
Multi-Account Support — Multiple accounts per provider with auto round-robin — when one runs out, switches to the next
Custom Combos — Customizable fallback chains with 6 balancing strategies (fill-first, round-robin, P2C, random, least-used, cost-optimized)
Codex Business Quotas — Business/Team workspace quota monitoring directly in the dashboard

🔌 2. "I need to use multiple providers but each has a different API"

OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. A developer who wants to test models from different providers, or fall back between them, has to reconfigure SDKs, change endpoints, and deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.

How OmniRoute solves it:

Unified Endpoint — A single http://localhost:20128/v1 serves as proxy for all 36+ providers
Format Translation — Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
Response Sanitization — Strips non-standard fields (x_groq, usage_breakdown, service_tier) that break OpenAI SDK v1.83+
Role Normalization — Converts developer → system for non-OpenAI providers; system → user for GLM/ERNIE
Think Tag Extraction — Extracts <think> blocks from models like DeepSeek R1 into standardized reasoning_content
Structured Output for Gemini — json_schema → responseMimeType/responseSchema automatic conversion
stream defaults to false — Aligns with OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs
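
To make the think-tag step concrete, here is a minimal sketch of pulling <think> blocks out of a completion into a reasoning_content field. Only the field name comes from the list above; the extractThinkTags helper and its return shape are assumptions for illustration, and streaming responses would need incremental handling that is not shown.

```typescript
// Hypothetical result shape; only `reasoning_content` is named in the README.
interface ExtractedMessage {
  content: string;
  reasoning_content: string | null;
}

// Moves every <think>...</think> block out of the visible content.
function extractThinkTags(raw: string): ExtractedMessage {
  const reasoning: string[] = [];
  const content = raw
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      reasoning.push(inner.trim());
      return ""; // drop the block from the user-facing text
    })
    .trim();
  return {
    content,
    reasoning_content: reasoning.length > 0 ? reasoning.join("\n") : null,
  };
}

const extracted = extractThinkTags("<think>2 + 2 is 4.</think>The answer is 4.");
// extracted.content holds the visible answer, extracted.reasoning_content the reasoning.
```

Returning null when no block is present keeps plain responses distinguishable from responses with empty reasoning.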

🌐 3. "My AI provider blocks my region/country"

Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like unsupported_country_region_territory during OAuth and API connections. This is especially frustrating for developers from developing countries.

How OmniRoute solves it:

3-Level Proxy Config — Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
Color-Coded Proxy Badges — Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
OAuth Token Exchange Through Proxy — OAuth flow also goes through the proxy, solving unsupported_country_region_territory
Connection Tests via Proxy — Connection tests use the configured proxy (no more direct bypass)
SOCKS5 Support — Full SOCKS5 proxy support for outbound routing
TLS Fingerprint Spoofing — Browser-like TLS fingerprint via wreq-js to bypass bot detection

🆓 4. "I want to use AI for coding but I have no money"

Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.

How OmniRoute solves it:

Free Tier Providers Built-in — Native support for 100% free providers: iFlow (8 unlimited models), Qwen (3 unlimited models), Kiro (Claude for free), Gemini CLI (180K/month free)
Free-Only Combos — Chain gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus = $0/month with zero downtime
NVIDIA NIM Free Credits — 1000 free credits integrated
Cost Optimized Strategy — Routing strategy that automatically chooses the cheapest available provider

🔒 5. "I need to protect my AI gateway from unauthorized access"

When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.

How OmniRoute solves it:

API Key Management — Generation, rotation, and scoping per provider with a dedicated /dashboard/api-manager page
Model-Level Permissions — Restrict API keys to specific models (openai/*, wildcard patterns), with Allow All/Restrict toggle
API Endpoint Protection — Require a key for /v1/models and block specific providers from the listing
Auth Guard + CSRF Protection — All dashboard routes protected with withAuth middleware + CSRF tokens
Rate Limiter — Per-IP rate limiting with configurable windows
IP Filtering — Allowlist/blocklist for access control
Prompt Injection Guard — Sanitization against malicious prompt patterns
AES-256-GCM Encryption — Credentials encrypted at rest
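
Model-level permissions with wildcard patterns come down to a small matching step. The sketch below works under stated assumptions: the README does not specify the pattern grammar, so * is treated here as "any sequence of characters", and matchesPattern and isModelAllowed are hypothetical names.

```typescript
// Converts a pattern such as "openai/*" into an anchored regular expression.
function matchesPattern(pattern: string, modelId: string): boolean {
  // Escape every regex metacharacter except "*", which becomes ".*" below.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
  return regex.test(modelId);
}

// A key is allowed to use a model if any of its patterns matches.
function isModelAllowed(allowedPatterns: string[], modelId: string): boolean {
  return allowedPatterns.some((p) => matchesPattern(p, modelId));
}

const keyPatterns = ["openai/*", "glm/glm-4.7"];
const canUseGpt = isModelAllowed(keyPatterns, "openai/gpt-5");
const canUseKimi = isModelAllowed(keyPatterns, "if/kimi-k2-thinking");
```

Escaping the pattern first matters because model IDs contain dots, which would otherwise match any character.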

🛑 6. "My provider went down and I lost my coding flow"

AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.

How OmniRoute solves it:

Circuit Breaker per-provider — Auto-open/close with configurable thresholds and cooldown (Closed/Open/Half-Open)
Exponential Backoff — Progressive retry delays
Anti-Thundering Herd — Mutex + semaphore protection against concurrent retry storms
Combo Fallback Chains — If the primary provider fails, automatically falls through the chain with no intervention
Combo Circuit Breaker — Auto-disables failing providers within a combo chain
Health Dashboard — Uptime monitoring, circuit breaker states, lockouts, cache stats, p50/p95/p99 latency
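
The Closed/Open/Half-Open cycle mentioned above can be sketched as a small state machine. This is an illustrative version, not OmniRoute's implementation: the class name, the consecutive-failure threshold, and the injected clock are assumptions.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Reports the state, moving open -> half-open once the cooldown has elapsed.
  getState(): BreakerState {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  // Requests are blocked only while the breaker is open.
  canRequest(): boolean {
    return this.getState() !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    // A failed probe in half-open re-opens the breaker immediately.
    if (this.getState() === "half-open") {
      this.trip();
      return;
    }
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.trip();
    }
  }

  private trip(): void {
    this.state = "open";
    this.openedAt = this.now();
    this.failures = 0;
  }
}
```

Injecting the clock keeps the cooldown logic testable without real waiting; a gateway would hold one breaker instance per provider.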

🔧 7. "Configuring each AI tool is tedious and repetitive"

Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model). Reconfiguring when switching providers or models is a waste of time.

How OmniRoute solves it:

CLI Tools Dashboard — Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
GitHub Copilot Config Generator — Generates chatLanguageModels.json for VS Code with bulk model selection
Onboarding Wizard — Guided 4-step setup for first-time users
One endpoint, all models — Configure http://localhost:20128/v1 once, access 36+ providers

🔑 8. "Managing OAuth tokens from multiple providers is hell"

Claude Code, Codex, Gemini CLI, Copilot — all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly, deal with client_secret is missing, redirect_uri_mismatch, and failures on remote servers. OAuth on LAN/VPS is particularly problematic.

How OmniRoute solves it:

Auto Token Refresh — OAuth tokens refresh in background before expiration
OAuth 2.0 (PKCE) Built-in — Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, iFlow
Multi-Account OAuth — Multiple accounts per provider via JWT/ID token extraction
OAuth LAN/Remote Fix — Private IP detection for redirect_uri + manual URL mode for remote servers
OAuth Behind Nginx — Uses window.location.origin for reverse proxy compatibility
Remote OAuth Guide — Step-by-step guide for Google Cloud credentials on VPS/Docker

📊 9. "I don't know how much I'm spending or where"

Developers use multiple paid providers but have no unified view of spending. Each provider has its own billing dashboard, but there's no consolidated view. Unexpected costs can pile up.

How OmniRoute solves it:

Cost Analytics Dashboard — Per-token cost tracking and budget management per provider
Budget Limits per Tier — Spending ceiling per tier that triggers automatic fallback
Per-Model Pricing Configuration — Configurable prices per model
Usage Statistics Per API Key — Request count and last-used timestamp per key
Analytics Dashboard — Stat cards, model usage chart, provider table with success rates and latency

🐛 10. "I can't diagnose errors and problems in AI calls"

When a call fails, the developer can't tell whether it was a rate limit, an expired token, a wrong format, or a provider error. Logs are fragmented across different terminals. Without observability, debugging is trial and error.

How OmniRoute solves it:

Unified Logs Dashboard — 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
Console Log Viewer — Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
SQLite Proxy Logs — Persistent logs that survive server restarts
Translator Playground — 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
Request Telemetry — p50/p95/p99 latency + X-Request-Id tracing
File-Based Logging with Rotation — Console interceptor captures everything to JSON log with size-based rotation

🏗️ 11. "Deploying and maintaining the gateway is complex"

Installing, configuring, and maintaining an AI proxy across different environments (local, VPS, Docker, cloud) is labor-intensive. Problems like hardcoded paths, EACCES on directories, port conflicts, and cross-platform builds add friction.

How OmniRoute solves it:

npm global install — npm install -g omniroute && omniroute — done
Docker Multi-Platform — AMD64 + ARM64 native (Apple Silicon, AWS Graviton, Raspberry Pi)
Docker Compose Profiles — base (no CLI tools) and cli (with Claude Code, Codex, OpenClaw)
Electron Desktop App — Native app for Windows/macOS/Linux with system tray, auto-start, offline mode
Split-Port Mode — API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
Cloud Sync — Config synchronization across devices via Cloudflare Workers
DB Backups — Automatic backup, restore, export and import of all settings

🌍 12. "The interface is English-only and my team doesn't speak English"

Teams in non-English-speaking countries, especially in Latin America, Asia, and Europe, struggle with English-only interfaces. Language barriers reduce adoption and increase configuration errors.

How OmniRoute solves it:

Dashboard i18n — 30 Languages — All 500+ keys translated including Arabic, Bulgarian, Danish, German, Spanish, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese (PT/BR), Romanian, Russian, Slovak, Swedish, Thai, Ukrainian, Vietnamese, Chinese, Filipino, English
RTL Support — Right-to-left support for Arabic and Hebrew
Multi-Language READMEs — 30 complete documentation translations
Language Selector — Globe icon in header for real-time switching

🔄 13. "I need more than chat — I need embeddings, images, audio"

AI isn't just chat completion. Devs need to generate images, transcribe audio, create embeddings for RAG, rerank documents, and moderate content. Each API has a different endpoint and format.

How OmniRoute solves it:

Embeddings — /v1/embeddings with 6 providers and 9+ models
Image Generation — /v1/images/generations with 4 providers and 9+ models
Audio Transcription — /v1/audio/transcriptions — Whisper-compatible
Text-to-Speech — /v1/audio/speech — Multi-provider audio synthesis
Moderations — /v1/moderations — Content safety checks
Reranking — /v1/rerank — Document relevance reranking
Responses API — Full /v1/responses support for Codex

🧪 14. "I have no way to test and compare quality across models"

Developers want to know which model is best for their use case — code, translation, reasoning — but comparing manually is slow. No integrated eval tools exist.

How OmniRoute solves it:

LLM Evaluations — Golden set testing with 10 pre-loaded cases covering greetings, math, geography, code generation, JSON compliance, translation, markdown, safety refusal
4 Match Strategies — exact, contains, regex, custom (JS function)
Translator Playground Test Bench — Batch testing with multiple inputs and expected outputs, cross-provider comparison
Chat Tester — Full round-trip with visual response rendering
Live Monitor — Real-time stream of all requests flowing through the proxy

📈 15. "I need to scale without losing performance"

As request volume grows, repeated questions incur duplicate costs unless responses are cached. Without idempotency, duplicate requests waste processing. Per-provider rate limits must also be respected.

How OmniRoute solves it:

Semantic Cache — Two-tier cache (signature + semantic) reduces cost and latency
Request Idempotency — 5s deduplication window for identical requests
Rate Limit Detection — Per-provider RPM, min gap, and max concurrent tracking
Editable Rate Limits — Configurable defaults in Settings → Resilience with persistence
API Key Validation Cache — 3-tier cache for production performance
Health Dashboard with Telemetry — p50/p95/p99 latency, cache stats, uptime
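
The 5s deduplication window can be pictured as a map from a request key to the time it was last accepted. The following is a minimal sketch under assumptions: DedupWindow and requestKey are hypothetical names, the key is simply the serialized model and messages, and eviction of old entries is left out.

```typescript
class DedupWindow {
  private readonly seen = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(windowMs: number = 5000, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true if an identical key was accepted less than windowMs ago.
  isDuplicate(key: string): boolean {
    const current = this.now();
    const last = this.seen.get(key);
    if (last !== undefined && current - last < this.windowMs) {
      return true;
    }
    this.seen.set(key, current); // first sighting, or the window has expired
    return false;
  }
}

// Hypothetical key: identical model + messages produce an identical string.
function requestKey(model: string, messages: unknown): string {
  return JSON.stringify([model, messages]);
}
```

A production version would also need to prune expired keys and decide what a duplicate receives, for example the in-flight response of the original request.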

🤖 16. "I want to control model behavior globally"

Some developers want all responses in a specific language or tone, or want to cap reasoning tokens. Configuring this in every tool and request is impractical.

How OmniRoute solves it:

System Prompt Injection — Global prompt applied to all requests
Thinking Budget Validation — Reasoning token allocation control per request (passthrough, auto, custom, adaptive)
6 Routing Strategies — Global strategies that determine how requests are distributed
Wildcard Router — provider/* patterns route dynamically to any provider
Combo Enable/Disable Toggle — Toggle combos directly from the dashboard
Provider Toggle — Enable/disable all connections for a provider with one click
Blocked Providers — Exclude specific providers from /v1/models listing

⚡ Quick Start

1. Install globally:

npm install -g omniroute
omniroute

🎉 The dashboard opens at http://localhost:20128

Command	Description
`omniroute`	Start the server (default port 20128)
`omniroute --port 3000`	Use a custom port
`omniroute --no-open`	Don't open the browser automatically
`omniroute --help`	Show help

2. Connect a FREE provider:

Dashboard → Providers → Connect Claude Code or Antigravity → OAuth login → Done!

3. Use it in your CLI tool:

Claude Code/Codex/Gemini CLI/OpenClaw/Cursor/Cline settings:
  Endpoint: http://localhost:20128/v1
  API Key: [copy from the dashboard]
  Model: if/kimi-k2-thinking

That's it! Start coding with FREE AI models.
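
As a rough illustration of step 3, the settings above map onto an OpenAI-style chat completion request. The sketch below only builds the request and does not send it. The /chat/completions path is an assumption based on the OpenAI convention the /v1 endpoint is described as compatible with, and buildChatRequest and the placeholder key are hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: prepares a chat request for a local OmniRoute instance.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    // Assumes the OpenAI-compatible path under the /v1 endpoint.
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + apiKey,
    },
    // stream: false asks for a single JSON response instead of SSE chunks.
    body: JSON.stringify({ model, messages, stream: false }),
  };
}

const chatRequest = buildChatRequest(
  "http://localhost:20128/v1",
  "your-omniroute-api-key", // placeholder: copy the real key from the dashboard
  "if/kimi-k2-thinking",
  [{ role: "user", content: "Write a hello world in TypeScript." }],
);
```

The resulting url, method, headers, and body can be handed to any HTTP client, for example the built-in fetch in Node.js 20+.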

Alternative: run from source:

cp .env.example .env
npm install
PORT=20128 NEXT_PUBLIC_BASE_URL=http://localhost:20128 npm run dev

🐳 Docker

OmniRoute is available as a public Docker image on Docker Hub.

Quick start:

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest

With an environment file:

# Copy .env and edit it
cp .env.example .env

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --env-file .env \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest

With Docker Compose:

# Base profile (no CLI tools)
docker compose --profile base up -d

# CLI profile (Claude Code, Codex, OpenClaw included)
docker compose --profile cli up -d

Image	Tag	Size	Description
`diegosouzapw/omniroute`	`latest`	~250MB	Latest stable release
`diegosouzapw/omniroute`	`1.0.6`	~250MB	Current version

🖥️ Desktop App: Offline & Always On

🆕 NEW! OmniRoute is now available as a native desktop application for Windows, macOS, and Linux.

🖥️ Native window: dedicated app window with system tray integration
🔄 Auto-start: launch OmniRoute at system startup
🔔 Native notifications: alerts when quota is exhausted
⚡ One-click install: NSIS (Windows), DMG (macOS), AppImage (Linux)
🌐 Offline mode: works fully offline with a built-in server

npm run electron:dev           # Development mode
npm run electron:build         # Current platform
npm run electron:build:win     # Windows (.exe)
npm run electron:build:mac     # macOS (.dmg)
npm run electron:build:linux   # Linux (.AppImage)

📖 Full documentation: electron/README.md

💰 Pricing Overview

Tier	Provider	Cost	Quota Reset	Best For
💳 SUBSCRIPTION	Claude Code (Pro)	$20/month	5h + weekly	Existing subscribers
	Codex (Plus/Pro)	$20-200/month	5h + weekly	OpenAI users
	Gemini CLI	FREE	180K/month + 1K/day	Everyone!
	GitHub Copilot	$10-19/month	Monthly	GitHub users
🔑 API KEY	NVIDIA NIM	FREE (1000 credits)	One-time	Free testing
	DeepSeek	Pay-per-use	None	Best price-performance
	Groq	Free tier + paid	Limited	Ultra-fast inference
	xAI (Grok)	Pay-per-use	None	Grok models
	Mistral	Free tier + paid	Limited	European AI
	OpenRouter	Pay-per-use	None	100+ models
💰 CHEAP	GLM-4.7	$0.6/1M	Daily at 10:00	Budget backup
	MiniMax M2.1	$0.2/1M	Rolling 5h	Cheapest option
	Kimi K2	$9/month flat	10M tokens/month	Predictable costs
🆓 FREE	iFlow	$0	Unlimited	8 free models
	Qwen	$0	Unlimited	3 free models
	Kiro	$0	Unlimited	Free Claude

💡 Pro tip: Start with Gemini CLI (180K free/month) + iFlow (unlimited free) = $0 cost!

💡 Key Features

🧠 Routing & Intelligence

Feature	What It Does
🎯 Smart 4-Tier Fallback	Auto-routing: Subscription → API Key → Cheap → Free
📊 Real-Time Quota Tracking	Live token count + reset countdown per provider
🔄 Format Translation	OpenAI ↔ Claude ↔ Gemini ↔ Cursor ↔ Kiro, seamlessly
👥 Multi-Account Support	Multiple accounts per provider with smart selection
🔄 Auto Token Refresh	OAuth tokens are refreshed automatically, with retries
🎨 Custom Combos	6 strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized
🧩 Custom Models	Add any model ID to any provider
🌐 Wildcard Router	Route `provider/*` patterns dynamically to any provider
🧠 Reasoning Budget	Passthrough, auto, custom, and adaptive modes for reasoning models
🔀 Model Aliases	Auto-forward deprecated model IDs to current replacements (built-in + custom)
⚡ Background Degradation	Auto-route background tasks (titles, summaries) to cheaper models
💬 System Prompt Injection	Global system prompt for all requests
📄 Responses API	Full support for the OpenAI Responses API (`/v1/responses`) for Codex

🎵 Multi-Modal APIs

Feature	What It Does
🖼️ Image Generation	`/v1/images/generations`: 4 providers, 9+ models
📐 Embeddings	`/v1/embeddings`: 6 providers, 9+ models
🎤 Audio Transcription	`/v1/audio/transcriptions`: Whisper-compatible
🔊 Text-to-Speech	`/v1/audio/speech`: multi-provider audio synthesis
🛡️ Moderations	`/v1/moderations`: content safety checks
🔀 Reranking	`/v1/rerank`: document relevance reranking

🛡️ Resilience & Security

Feature	What It Does
🔌 Circuit Breaker	Auto-open/close per provider with configurable thresholds
🛡️ Anti-Thundering Herd	Mutex + semaphore rate limiting for API-key providers
🧠 Semantic Cache	Two-tier cache (signature + semantic) cuts cost and latency
⚡ Request Idempotency	5s dedup window for duplicate requests
🔒 TLS Fingerprint Spoofing	Bypass bot detection via wreq-js
🌐 IP Filtering	Allowlist/blocklist for API access control
📊 Editable Rate Limits	Configurable RPM, minimum gap, and max concurrency
💾 Rate Limit Persistence	Learned limits survive restarts via SQLite with 60s debounce + 24h staleness
🔄 Token Refresh Resilience	Per-provider circuit breaker (5 fails→30min) + 30s timeout per attempt

📊 Observability & Analytics

Feature	What It Does
📝 Request Logs	Debug mode with full request/response logs
💾 SQLite Logs	Persistent proxy logs survive restarts
📊 Analytics Dashboard	Recharts: stat cards, usage chart, provider table
📈 Progress Tracking	Opt-in SSE progress events for streaming
🧪 LLM Evaluations	Golden-set testing with 4 match strategies
🔍 Request Telemetry	p50/p95/p99 latency aggregation + X-Request-Id tracing
📋 Logs + Quotas	Dedicated pages for log browsing and quota tracking
🏥 Health Dashboard	Uptime, circuit breaker states, lockouts, cache stats
💰 Cost Tracking	Budget management + per-model pricing configuration

☁️ Deployment & Sync

Feature	What It Does
💾 Cloud Sync	Sync settings across devices via Cloudflare Workers
🌐 Deploy Anywhere	Localhost, VPS, Docker, Cloudflare Workers
🔑 API Key Management	Generate, rotate, and scope API keys per provider
🧙 Setup Wizard	Guided 4-step setup for new users
🔧 CLI Tools Dashboard	One-click configuration for Claude, Codex, Cline, OpenClaw, Kilo, Antigravity
🔄 DB Backups	Automatic backup and restore of all settings

📖 Feature Details

🎯 Smart 4-Tier Fallback

Create combos with automatic fallback:

Combo: "my-coding-stack"
  1. cc/claude-opus-4-6        (your subscription)
  2. nvidia/llama-3.3-70b      (free NVIDIA API)
  3. glm/glm-4.7               (cheap backup, $0.6/1M)
  4. if/kimi-k2-thinking       (free fallback)

→ Switches automatically when quota is exhausted or errors occur
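
The fallback behaviour of a combo like this can be pictured as trying each target in order until one succeeds. The following is a minimal sketch of that idea, not OmniRoute's actual implementation: runCombo and CallModel are hypothetical names, and the call is synchronous only to keep the example short.

```typescript
// Hypothetical provider call: returns the completion text or throws
// (for example on exhausted quota or a provider error).
type CallModel = (modelId: string, prompt: string) => string;

interface ComboResult {
  model: string;
  output: string;
}

// Tries each combo target in order and returns the first successful result.
function runCombo(models: string[], prompt: string, call: CallModel): ComboResult {
  const errors: string[] = [];
  for (const model of models) {
    try {
      return { model, output: call(model, prompt) };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(model + ": " + reason);
    }
  }
  throw new Error("All combo targets failed: " + errors.join("; "));
}

// Stand-in for real provider calls: the first two targets are "exhausted".
const fakeCall: CallModel = (modelId, prompt) => {
  if (modelId === "cc/claude-opus-4-6" || modelId === "nvidia/llama-3.3-70b") {
    throw new Error("quota exhausted");
  }
  return "[" + modelId + "] reply to: " + prompt;
};

const comboResult = runCombo(
  ["cc/claude-opus-4-6", "nvidia/llama-3.3-70b", "glm/glm-4.7", "if/kimi-k2-thinking"],
  "hello",
  fakeCall,
);
// comboResult.model is the first target that did not throw.
```

Collecting the per-target errors makes the final failure message useful when every tier is unavailable.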

📊 Real-Time Quota Tracking

Token consumption per provider
Reset countdown (5 hours, daily, weekly)
Cost estimates for paid tiers
Monthly spending reports

🔄 Format Translation

Seamless translation between formats:

OpenAI ↔ Claude ↔ Gemini ↔ OpenAI Responses
Your CLI sends OpenAI format → OmniRoute translates → the provider receives its native format
Works with any tool that supports custom OpenAI endpoints

👥 Multi-Account Support

Add multiple accounts per provider
Automatic round-robin or priority-based routing
Fallback to the next account when quota is exhausted

🔄 Auto Token Refresh

OAuth tokens are refreshed automatically before they expire
No manual re-authentication needed
Seamless experience across all providers

🎨 Custom Combos

Create unlimited model combinations
6 strategies: fill-first, round-robin, power-of-two-choices, random, least-used, cost-optimized
Share combos across devices with Cloud Sync
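
Of the six strategies, power-of-two-choices (P2C) is probably the least self-explanatory: it samples two candidates at random and routes to the less loaded one. A minimal sketch follows, assuming in-flight request count as the load metric. The README does not say which metric OmniRoute uses, and Target and pickP2C are hypothetical names.

```typescript
interface Target {
  id: string;
  inFlight: number; // assumed load metric: requests currently in progress
}

// Picks two distinct random targets and returns the less loaded one.
function pickP2C(targets: Target[], rand: () => number = Math.random): Target {
  if (targets.length === 0) {
    throw new Error("no targets available");
  }
  if (targets.length === 1) {
    return targets[0];
  }
  const i = Math.floor(rand() * targets.length);
  // Draw the second index from the remaining targets so that j !== i.
  let j = Math.floor(rand() * (targets.length - 1));
  if (j >= i) {
    j += 1;
  }
  const a = targets[i];
  const b = targets[j];
  return a.inFlight <= b.inFlight ? a : b;
}

const pool: Target[] = [
  { id: "cc/claude-opus-4-6", inFlight: 5 },
  { id: "glm/glm-4.7", inFlight: 0 },
  { id: "if/kimi-k2-thinking", inFlight: 1 },
];
const chosen = pickP2C(pool);
```

Compared with always picking the least-used target, P2C avoids scanning the full pool on every request and avoids many concurrent requests all piling onto the same momentarily idle target.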

🏥 Health Dashboard

System status (uptime, version, memory usage)
Circuit breaker state per provider (Closed/Open/Half-Open)
Rate limit status and active lockouts
Signature cache statistics
Latency telemetry (p50/p95/p99) + prompt cache
Reset health status with one click

🔧 Translator Playground

Debug, test, and visualize API format translations
Send requests and see how OmniRoute translates between provider formats
Invaluable for troubleshooting integration issues

💾 Cloud Sync

Sync providers, combos, and settings across devices
Automatic background sync
Secure encrypted storage

🎯 Use Cases

Case 1: "I have a Claude Pro subscription"

Problem: Quota expires unused, and rate limits hit during intensive coding

Combo: "maximize-claude"
  1. cc/claude-opus-4-6        (use the subscription fully)
  2. glm/glm-4.7               (cheap backup when quota is exhausted)
  3. if/kimi-k2-thinking       (free emergency fallback)

Monthly cost: $20 (subscription) + ~$5 (backup) = $25 total
vs. $20 + hitting limits = frustration

Case 2: "I want zero cost"

Problem: Can't afford subscriptions but needs reliable AI for coding

Combo: "free-forever"
  1. gc/gemini-3-flash         (180K free/month)
  2. if/kimi-k2-thinking       (unlimited free)
  3. qw/qwen3-coder-plus       (unlimited free)

Monthly cost: $0
Quality: production-ready models

Case 3: "I need to code 24/7 without interruptions"

Problem: Tight deadlines, can't afford downtime

Combo: "always-on"
  1. cc/claude-opus-4-6        (best quality)
  2. cx/gpt-5.2-codex          (second subscription)
  3. glm/glm-4.7               (cheap, daily reset)
  4. minimax/MiniMax-M2.1      (cheapest, 5h reset)
  5. if/kimi-k2-thinking       (unlimited free)

Result: 5 fallback levels = zero downtime

Case 4: "I want FREE AI in OpenClaw"

Problem: Needs AI assistance in messaging apps, completely free

Combo: "openclaw-free"
  1. if/glm-4.7                (unlimited free)
  2. if/minimax-m2.1           (unlimited free)
  3. if/kimi-k2-thinking       (unlimited free)

Monthly cost: $0
Access via: WhatsApp, Telegram, Slack, Discord, iMessage, Signal...

📖 Setup Guide

💳 Subscription Providers

Claude Code (Pro/Max)

Dashboard → Providers → Connect Claude Code
→ OAuth login → Automatic token refresh
→ 5h + weekly quota tracking

Models:
  cc/claude-opus-4-6
  cc/claude-sonnet-4-5-20250929
  cc/claude-haiku-4-5-20251001

Pro tip: Use Opus for complex tasks and Sonnet for speed. OmniRoute tracks quota per model!

OpenAI Codex (Plus/Pro)

Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5h + weekly reset

Models:
  cx/gpt-5.2-codex
  cx/gpt-5.1-codex-max

Gemini CLI (FREE 180K/month!)

Dashboard → Providers → Connect Gemini CLI
→ Google OAuth
→ 180K completions/month + 1K/day

Models:
  gc/gemini-3-flash-preview
  gc/gemini-2.5-pro

Best value: Huge free tier! Use it before paid tiers.

GitHub Copilot

Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of the month)

Models:
  gh/gpt-5
  gh/claude-4.5-sonnet
  gh/gemini-3-pro

🔑 API Key Providers

NVIDIA NIM (FREE 1000 credits!)

Sign up: build.nvidia.com
Get a free API key (1000 inference credits included)
Dashboard → Add Provider → NVIDIA NIM:
- API Key: nvapi-your-key

Models: nvidia/llama-3.3-70b-instruct, nvidia/mistral-7b-instruct, and 50+ more

Pro tip: The API is OpenAI-compatible, so it works seamlessly with OmniRoute's format translation!

DeepSeek

Sign up: platform.deepseek.com
Get an API key
Dashboard → Add Provider → DeepSeek

Models: deepseek/deepseek-chat, deepseek/deepseek-coder

Groq (free tier available!)

Sign up: console.groq.com
Get an API key (free tier included)
Dashboard → Add Provider → Groq

Models: groq/llama-3.3-70b, groq/mixtral-8x7b

Pro tip: Ultra-fast inference, best for real-time coding!

OpenRouter (100+ models)

Sign up: openrouter.ai
Get an API key
Dashboard → Add Provider → OpenRouter

Models: Access 100+ models from all major providers through a single API key.

💰 Cheap Providers (Backup)

GLM-4.7 (daily reset, $0.6/1M)

Sign up: Zhipu AI
Get an API key from the Coding Plan
Dashboard → Add API Key:
- Provider: glm
- API Key: your-key

Use: glm/glm-4.7

Pro tip: The Coding Plan offers 3× the quota at 1/7 of the cost! Daily reset at 10:00.

MiniMax M2.1 (5h reset, $0.20/1M)

Sign up: MiniMax
Get an API key
Dashboard → Add API Key

Use: minimax/MiniMax-M2.1

Pro tip: Cheapest option for long context (1M tokens)!

Kimi K2 ($9/month flat)

Subscribe: Moonshot AI
Get an API key
Dashboard → Add API Key

Use: kimi/kimi-latest

Pro tip: A flat $9/month for 10M tokens works out to an effective $0.90/1M!

🆓 FREE Providers (Emergency Backup)

iFlow (8 FREE models)

Dashboard → Connect iFlow
→ iFlow OAuth login
→ Unlimited usage

Models:
  if/kimi-k2-thinking
  if/qwen3-coder-plus
  if/glm-4.7
  if/minimax-m2
  if/deepseek-r1

Qwen (3 FREE models)

Dashboard → Connect Qwen
→ Device code authorization
→ Unlimited usage

Models:
  qw/qwen3-coder-plus
  qw/qwen3-coder-flash

Kiro (Free Claude)

Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited usage

Models:
  kr/claude-sonnet-4.5
  kr/claude-haiku-4.5

🎨 Creating Combos

Example 1: Maximize subscription → cheap backup

Dashboard → Combos → Create New

Name: premium-coding
Models:
  1. cc/claude-opus-4-6 (primary subscription)
  2. glm/glm-4.7 (cheap backup, $0.6/1M)
  3. minimax/MiniMax-M2.1 (cheapest fallback, $0.20/1M)

Use in your CLI: premium-coding

Example 2: Free only (zero cost)

Name: free-combo
Models:
  1. gc/gemini-3-flash-preview (180K free/month)
  2. if/kimi-k2-thinking (unlimited)
  3. qw/qwen3-coder-plus (unlimited)

Cost: $0 forever!

🔧 CLI Integration

Cursor IDE

Settings → Models → Advanced:
  OpenAI API Base URL: http://localhost:20128/v1
  OpenAI API Key: [from the OmniRoute dashboard]
  Model: cc/claude-opus-4-6

Claude Code

Use the CLI Tools page in the dashboard for one-click configuration, or edit ~/.claude/settings.json manually.

Codex CLI

export OPENAI_BASE_URL="http://localhost:20128/v1"
export OPENAI_API_KEY="your-omniroute-api-key"

codex "your prompt"

OpenClaw

Option 1: Dashboard (recommended):

Dashboard → CLI Tools → OpenClaw → Select model → Apply

Option 2: Manual. Edit ~/.openclaw/openclaw.json:

{
  "models": {
    "providers": {
      "omniroute": {
        "baseUrl": "http://127.0.0.1:20128/v1",
        "apiKey": "sk_omniroute",
        "api": "openai-completions"
      }
    }
  }
}

Note: OpenClaw only works with a local OmniRoute instance. Use 127.0.0.1 instead of localhost to avoid IPv6 resolution issues.

Cline / Continue / RooCode

Settings → API Configuration:
  Provider: OpenAI Compatible
  Base URL: http://localhost:20128/v1
  API Key: [from the OmniRoute dashboard]
  Model: if/kimi-k2-thinking

🧪 Evaluations (Evals)

OmniRoute includes a built-in evaluation framework for testing LLM response quality against a golden set. Access it via Analytics → Evals in the dashboard.

Built-in Golden Set

The preloaded "OmniRoute Golden Set" contains 10 test cases:

Greetings, math, geography, code generation
JSON format compliance, translation, Markdown
Safety refusal (harmful content), counting, Boolean logic

Evaluation Strategies

Strategy	Description	Example
`exact`	Output must match exactly	`"4"`
`contains`	Output must contain the substring (case-insensitive)	`"Paris"`
`regex`	Output must match the regex pattern	`"1.2.3"`
`custom`	Custom JS function returns true/false	`(output) => output.length > 10`
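The four strategies in the table can be read as one small dispatch function. The following is a minimal sketch under assumptions: the `Expectation` type and the `evaluate` function are hypothetical and do not reflect OmniRoute's actual eval schema, and details such as whitespace trimming are not specified in this README.

```typescript
// Hypothetical types; OmniRoute's real eval schema is not shown in this README.
type Expectation =
  | { strategy: "exact"; expected: string }
  | { strategy: "contains"; expected: string }
  | { strategy: "regex"; pattern: string }
  | { strategy: "custom"; check: (output: string) => boolean };

// Return true when the model output satisfies the expectation.
function evaluate(output: string, expectation: Expectation): boolean {
  switch (expectation.strategy) {
    case "exact":
      // Strict equality, no trimming.
      return output === expectation.expected;
    case "contains":
      // Case-insensitive substring match, as described in the table.
      return output.toLowerCase().includes(expectation.expected.toLowerCase());
    case "regex":
      return new RegExp(expectation.pattern).test(output);
    case "custom":
      return expectation.check(output);
  }
}
```

For example, `evaluate("The capital is PARIS.", { strategy: "contains", expected: "Paris" })` passes because the comparison ignores case, while the `exact` strategy would reject the same output.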

🐛 Troubleshooting

"Language model did not provide messages"

Provider quota exhausted → check the quota tracker in the dashboard
Solution: use a combo with fallback or switch to a cheaper tier

Rate limiting

Subscription quota exhausted → fall back to GLM/MiniMax
Add a combo: cc/claude-opus-4-6 → glm/glm-4.7 → if/kimi-k2-thinking

OAuth token expired

Renewed automatically by OmniRoute
If the problem persists: Dashboard → Providers → Reconnect

High costs

Check usage statistics under Dashboard → Costs
Switch the primary model to GLM/MiniMax
Use the free tier (Gemini CLI, iFlow) for non-critical tasks

Dashboard opens on the wrong port

Set PORT=20128 and NEXT_PUBLIC_BASE_URL=http://localhost:20128

Cloud sync errors

Check that BASE_URL points to your running instance
Check that CLOUD_URL points to the expected cloud endpoint
Keep NEXT_PUBLIC_* values in sync with the server values

First login does not work

Check INITIAL_PASSWORD in .env
If it is not set, the default password is 123456

No request logs

Set ENABLE_REQUEST_LOGS=true in .env

Connection test shows "Invalid" for OpenAI-compatible providers

Many providers do not expose the /models endpoint
OmniRoute v1.0.6+ includes fallback validation via Chat Completions
Make sure the base URL includes the /v1 suffix
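A quick way to catch the missing-suffix case before saving a provider is a small check like the one below. `hasV1Suffix` is a hypothetical helper written for this README, not part of OmniRoute, and it only inspects the string. It does not contact the provider.

```typescript
// Hypothetical helper: true when the base URL ends in "/v1"
// (trailing slashes are tolerated).
function hasV1Suffix(baseUrl: string): boolean {
  return /\/v1\/*$/.test(baseUrl.trim());
}
```

`hasV1Suffix("http://localhost:20128/v1")` returns true, while `hasV1Suffix("http://localhost:20128")` returns false and signals that the suffix still needs to be added.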

🛠️ Technology Stack

Runtime: Node.js 20+
Language: TypeScript 5.9, 100% TypeScript in src/ and open-sse/ (v1.0.6)
Framework: Next.js 16 + React 19 + Tailwind CSS 4
Database: LowDB (JSON) + SQLite (domain state + proxy logs)
Streaming: Server-Sent Events (SSE)
Auth: OAuth 2.0 (PKCE) + JWT + API keys
Testing: Node.js Test Runner (368+ unit tests)
CI/CD: GitHub Actions (automatic npm + Docker Hub publishing on release)
Website: omniroute.online
Package: npmjs.com/package/omniroute
Docker: hub.docker.com/r/diegosouzapw/omniroute
Resilience: circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing
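To make the resilience terms concrete, here is a generic sketch of exponential backoff with random jitter, a common way to avoid a thundering herd of simultaneous retries. It is not OmniRoute's implementation: the `backoffDelayMs` name and the default values (500 ms base, 30 s cap) are assumptions chosen for illustration.

```typescript
// Illustrative defaults; OmniRoute's real retry parameters are not stated here.
function backoffDelayMs(
  attempt: number, // 0 for the first retry, 1 for the second, ...
  baseMs = 500,
  capMs = 30_000,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  // Double the delay on every attempt, but never exceed the cap.
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  // "Full jitter": pick a random delay in [0, exponential) so that many
  // clients hitting the same failure do not all retry at the same instant.
  return Math.floor(random() * exponential);
}
```

Passing a fixed `random` function makes the schedule reproducible, which helps when checking that the cap is applied on late attempts.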

📖 Documentation

Document	Description
User Guide	Providers, combos, CLI integration, deployment
API Reference	All endpoints with examples
Troubleshooting	Common problems and solutions
Architecture	System architecture and internals
Contributing	Development setup and guidelines
OpenAPI Specification	OpenAPI 3.0 specification
Security Policy	Reporting vulnerabilities and security practices

📧 Support

💬 Join our community! WhatsApp group: get help, share tips, and stay up to date.

Website: omniroute.online
GitHub: github.com/diegosouzapw/OmniRoute
Issues: github.com/diegosouzapw/OmniRoute/issues
WhatsApp: Community group
Original project: 9router by decolua

👥 Contributors

How to contribute

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push the branch (git push origin feature/amazing-feature)
Open a pull request

See CONTRIBUTING.md for detailed guidelines.

Publishing a new version

# Create a release; npm publishing happens automatically
gh release create v1.0.6 --title "v1.0.6" --generate-notes

📊 Star History

🙏 Acknowledgments

Special thanks to 9router by decolua, the original project that inspired this fork. OmniRoute builds on that incredible foundation with additional features, multi-modal APIs, and a full TypeScript rewrite.

Special thanks to CLIProxyAPI, the original Go implementation that inspired this JavaScript port.

📄 License

MIT License. See LICENSE for details.

_{Made with ❤️ for developers who code 24/7}
_{omniroute.online}
