ModelGate is a production-grade LLM API gateway written in Go. It exposes OpenAI-compatible endpoints backed by Ollama, vLLM, llama.cpp, OpenAI, and API3, and provides authentication, rate limiting, quota enforcement, and an admin panel for key management.
- Clean Architecture: single responsibility per layer
- Adapter Pattern: backend services decoupled via service.Adapter interface
- OpenAI Compatibility: all external APIs match OpenAI SDK exactly
- HTTP Forwarding: all backends via HTTP, no direct model code coupling
- Security First: SHA256 hashed API keys, 5MB body limits, 300s timeouts, quota enforcement
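The "SHA256 hashed API keys" principle can be illustrated with a minimal sketch. The function names `HashAPIKey` and `KeyMatches`, the hex encoding, and the example key string are assumptions made for illustration; the project's actual `auth` package may differ.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// HashAPIKey returns the hex-encoded SHA-256 digest of a raw API key.
// Only this digest would be stored; the raw key is never persisted.
func HashAPIKey(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// KeyMatches reports whether raw hashes to storedHash.
// The comparison runs in constant time to avoid leaking timing information.
func KeyMatches(raw, storedHash string) bool {
	got := HashAPIKey(raw)
	return subtle.ConstantTimeCompare([]byte(got), []byte(storedHash)) == 1
}

func main() {
	// "sk-example-key" is a hypothetical key value.
	stored := HashAPIKey("sk-example-key")
	fmt.Println(len(stored))
	fmt.Println(KeyMatches("sk-example-key", stored))
	fmt.Println(KeyMatches("sk-wrong-key", stored))
}
```

Storing only the digest means a leaked database does not directly reveal usable keys, at the cost of not being able to show a key again after creation.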
Go 1.23+, Gin (release mode), SQLite+GORM, Redis (go-redis/v8), zerolog, Viper (MG_ prefix), Docker (multi-stage Alpine <60MB)
modelgate/
├── cmd/server/main.go       # Entry point
├── internal/
│   ├── adapters/            # Backend adapters
│   ├── admin/               # Admin API endpoints
│   ├── auth/                # API key hashing/validation
│   ├── config/              # Config with hot-reload
│   ├── database/            # GORM setup/migrations
│   ├── limiter/             # Rate limiter logic
│   ├── middleware/          # Auth, rate limit, logging, quota
│   ├── models/              # Database models
│   ├── service/             # Business logic (GatewayService)
│   ├── usage/               # Token usage tracking
│   └── utils/               # Helper functions
├── configs/config.yaml      # Configuration
├── admin/index.html         # Admin panel UI
├── Dockerfile
├── docker-compose.yml
└── Makefile
Make targets: run, build, test, test-cover, docker-up, docker-down, test-ollama (for example, make run)
go test -v ./internal/{auth,middleware,limiter,utils,config,adapters} -run 'TestHashAPIKey|TestAuthMiddleware|TestRateLimiter|TestErrorResponse|TestLoadConfig|TestOllamaAdapter'
go test -coverprofile=coverage.out ./... && go tool cover -html=coverage.out
- Use gofmt for formatting
- Imports: stdlib, external, internal
- File naming: snake_case.go
- Function names: PascalCase for public, camelCase for private
- Error handling: always check errors, wrap with %w
type Adapter interface {
	ChatCompletion(ctx context.Context, req OpenAIRequest, model Model) (*OpenAIResponse, error)
	Completion(ctx context.Context, req OpenAIRequest, model Model) (*OpenAIResponse, error)
	Models(ctx context.Context, model Model) (*OpenAIModelsResponse, error)
}
// Wrap an underlying error:
return nil, fmt.Errorf("failed: %w", err)
// Or return a typed API error:
return nil, &APIError{Message: "..", Type: "..", Code: 401}
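The two error-return styles above can be combined: wrapping with `%w` keeps the underlying error inspectable, and a typed `*APIError` can be recovered with `errors.As`. The sketch below assumes an `APIError` with the three fields shown above; the `Error` method, `errBackendDown`, `callBackend`, `authenticate`, `StatusFor`, and the `invalid_request_error` type string are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// APIError mirrors the fields used in the error example above.
type APIError struct {
	Message string
	Type    string
	Code    int
}

// Error makes *APIError satisfy the error interface.
func (e *APIError) Error() string {
	return fmt.Sprintf("%s (%s, code %d)", e.Message, e.Type, e.Code)
}

// errBackendDown is a hypothetical sentinel for a low-level failure.
var errBackendDown = errors.New("backend unreachable")

// callBackend wraps the low-level error with %w so callers can still inspect it.
func callBackend() error {
	return fmt.Errorf("failed: %w", errBackendDown)
}

// authenticate returns a typed *APIError when the key is empty.
func authenticate(key string) error {
	if key == "" {
		return &APIError{Message: "missing API key", Type: "invalid_request_error", Code: 401}
	}
	return nil
}

// StatusFor picks an HTTP status: the APIError code if one is in the chain, else 500.
func StatusFor(err error) int {
	var apiErr *APIError
	if errors.As(err, &apiErr) {
		return apiErr.Code
	}
	return 500
}

func main() {
	err := callBackend()
	fmt.Println(errors.Is(err, errBackendDown), StatusFor(err))

	// Wrapping a typed error still lets errors.As find it.
	wrapped := fmt.Errorf("auth: %w", authenticate(""))
	fmt.Println(StatusFor(wrapped))
}
```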
Use zerolog with structured fields: request ID, user ID, model name
log.Debug().Str("model", name).Int("user_id", id).Msg("Processing")
All request/response bodies must match OpenAI API spec.
POST /v1/chat/completions, POST /v1/completions, GET /v1/models
{"error": {"message": "..", "type": "..", "code": 401}}
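As a rough sketch, the error envelope above could be produced with `encoding/json` and two small structs. The type names `errorDetail` and `errorEnvelope`, the helper `ErrorBody`, and the example message and type strings are assumptions for illustration, not names taken from the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errorDetail holds the inner fields of the error envelope.
type errorDetail struct {
	Message string `json:"message"`
	Type    string `json:"type"`
	Code    int    `json:"code"`
}

// errorEnvelope nests the detail under the top-level "error" key.
type errorEnvelope struct {
	Error errorDetail `json:"error"`
}

// ErrorBody serialises an error into the JSON envelope shown above.
func ErrorBody(message, errType string, code int) (string, error) {
	b, err := json.Marshal(errorEnvelope{
		Error: errorDetail{Message: message, Type: errType, Code: code},
	})
	if err != nil {
		return "", fmt.Errorf("marshal error body: %w", err)
	}
	return string(b), nil
}

func main() {
	body, err := ErrorBody("invalid API key", "authentication_error", 401)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(body)
}
```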
- Auth: Bearer token
- Rate Limiter: Redis INCR+EXPIRE (RPM)
- Quota Check: token quota
- Logging: method, path, IP, user ID, duration
- Body Size: 5MB limit
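The INCR+EXPIRE rate-limiting pattern is a fixed-window counter: the first request in a window creates the counter with a time-to-live, and later requests increment it until the limit is reached. The sketch below is an in-memory stand-in for that logic, not a Redis client; `WindowLimiter`, `NewWindowLimiter`, and `Allow` are hypothetical names, and the current time is passed in as Unix seconds so the behaviour is deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// window is one counter plus the time at which it would expire.
type window struct {
	count     int
	expiresAt int64 // Unix seconds
}

// WindowLimiter is an in-memory stand-in for the Redis INCR+EXPIRE pattern.
type WindowLimiter struct {
	mu      sync.Mutex
	limit   int
	ttl     int64
	windows map[string]*window
}

// NewWindowLimiter allows up to limit requests per key in each ttlSeconds window.
func NewWindowLimiter(limit int, ttlSeconds int64) *WindowLimiter {
	return &WindowLimiter{limit: limit, ttl: ttlSeconds, windows: make(map[string]*window)}
}

// Allow increments the counter for key and reports whether it is within the limit.
func (l *WindowLimiter) Allow(key string, now int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	w, ok := l.windows[key]
	if !ok || now >= w.expiresAt {
		// Mirrors INCR creating the key, followed by EXPIRE setting its TTL.
		w = &window{expiresAt: now + l.ttl}
		l.windows[key] = w
	}
	w.count++
	return w.count <= l.limit
}

func main() {
	l := NewWindowLimiter(2, 60)
	fmt.Println(l.Allow("key-1", 0), l.Allow("key-1", 1), l.Allow("key-1", 2))
	// A request at or after the expiry time starts a fresh window.
	fmt.Println(l.Allow("key-1", 60))
}
```

With Redis itself, the increment is atomic on the server, so no client-side mutex is needed; the mutex here only protects the in-memory map.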
All models embed BaseModel:
type BaseModel struct {
	ID        uint `gorm:"primarykey"`
	CreatedAt time.Time
	UpdatedAt time.Time
	Status    string
}
Use GORM auto-migration for development.
Route based on Model.BackendType: Ollama, vLLM, llama.cpp, OpenAI, API3
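Routing on `Model.BackendType` can be sketched as a lookup table from backend type to adapter. The types below are simplified stand-ins: the `Adapter` interface is trimmed to one method for brevity, and `Registry`, `stubAdapter`, `demoRegistry`, `RouteID`, the backend type strings, and the model name are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// Simplified stand-ins for the gateway's request, response, and model types.
type OpenAIRequest struct{ Prompt string }
type OpenAIResponse struct{ ID string }
type Model struct {
	Name        string
	BackendType string
}

// Adapter is trimmed to a single method for this sketch.
type Adapter interface {
	ChatCompletion(ctx context.Context, req OpenAIRequest, model Model) (*OpenAIResponse, error)
}

// stubAdapter stands in for a real HTTP-forwarding adapter.
type stubAdapter struct{ backend string }

func (s stubAdapter) ChatCompletion(ctx context.Context, req OpenAIRequest, model Model) (*OpenAIResponse, error) {
	return &OpenAIResponse{ID: s.backend + ":" + model.Name}, nil
}

// Registry maps a backend type string to its adapter.
type Registry map[string]Adapter

// Route checks that an adapter exists for the model before dispatching to it.
func (r Registry) Route(ctx context.Context, req OpenAIRequest, m Model) (*OpenAIResponse, error) {
	a, ok := r[m.BackendType]
	if !ok {
		return nil, fmt.Errorf("no adapter for backend type %q", m.BackendType)
	}
	return a.ChatCompletion(ctx, req, m)
}

// demoRegistry builds a registry with two stub backends.
func demoRegistry() Registry {
	return Registry{
		"ollama": stubAdapter{backend: "ollama"},
		"vllm":   stubAdapter{backend: "vllm"},
	}
}

// RouteID is a demo helper: it returns the response ID, or "" if routing fails.
func RouteID(r Registry, m Model) string {
	resp, err := r.Route(context.Background(), OpenAIRequest{Prompt: "hi"}, m)
	if err != nil {
		return ""
	}
	return resp.ID
}

func main() {
	r := demoRegistry()
	fmt.Println(RouteID(r, Model{Name: "llama3", BackendType: "ollama"}))
	_, err := r.Route(context.Background(), OpenAIRequest{}, Model{Name: "x", BackendType: "unknown"})
	fmt.Println(err)
}
```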
All adapters handle HTTP errors and convert to OpenAI format.
Unit: test each layer, mock dependencies. Integration: full request flow. Coverage: 80%+
go test -v ./... -run TestAuthMiddleware
go test -v ./internal/adapters -count=1
make test-cover
Never log API keys (even hashed), never expose backend URLs, validate user input, 300s timeout, 5MB body limit, use context.WithTimeout
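One way to enforce the body limit is `http.MaxBytesReader`, sketched here with plain `net/http` rather than Gin so the example stays self-contained. `LimitBody`, `readAll`, and `Probe` are hypothetical names; `Probe` runs the handler in-process through `httptest` purely for demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// LimitBody caps the request body at max bytes before calling next.
func LimitBody(max int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, max)
		next.ServeHTTP(w, r)
	})
}

// readAll reads the whole body and answers 413 if the cap was exceeded.
func readAll(w http.ResponseWriter, r *http.Request) {
	data, err := io.ReadAll(r.Body)
	if err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "%d", len(data))
}

// Probe sends payload through the limited handler in-process and returns the status code.
func Probe(payload string, max int64) int {
	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(payload))
	rec := httptest.NewRecorder()
	LimitBody(max, http.HandlerFunc(readAll)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	// 5 MB, the limit stated above.
	const maxBody = 5 << 20
	fmt.Println(Probe("small body", maxBody))
	fmt.Println(Probe(strings.Repeat("x", maxBody+1), maxBody))
}
```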
FROM golang:1.23-alpine AS builder
FROM alpine:latest
RUN adduser -D app
USER app
COPY modelgate .
EXPOSE 18080
ENTRYPOINT ["./modelgate"]
| Setting | Default |
|---|---|
| Port | 18080 |
| DB Path | ./data/modelgate.db |
| Redis Addr | localhost:6379 |
| Timeout | 300s |
| Max Body | 5 MB |
| Rate Limit | 60 RPM |
- Always use %w for error wrapping
- Pass context through all layers
- Close DB connections on shutdown
- Use shared HTTP client with timeout
- Use DB transactions for quota updates
- Rate limiter uses atomic operations
- Validate model exists before routing
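The practices of passing context through all layers and bounding calls with a timeout can be sketched together. Here `fakeBackend` simulates a backend call that honours cancellation; it, `CallWithTimeout`, and `TimedOut` are hypothetical names, and the short durations stand in for the 300s production timeout.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fakeBackend simulates a backend call that takes work to finish
// and returns early if the context is cancelled or times out.
func fakeBackend(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// CallWithTimeout derives a bounded context from parent and passes it down.
func CallWithTimeout(parent context.Context, timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel() // always release the timer

	if err := fakeBackend(ctx, work); err != nil {
		return fmt.Errorf("backend call: %w", err)
	}
	return nil
}

// TimedOut is a demo helper taking milliseconds; it reports whether the call hit the deadline.
func TimedOut(timeoutMs, workMs int) bool {
	err := CallWithTimeout(
		context.Background(),
		time.Duration(timeoutMs)*time.Millisecond,
		time.Duration(workMs)*time.Millisecond,
	)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(TimedOut(10, 500))
	fmt.Println(TimedOut(500, 1))
}
```

Because the error is wrapped with `%w`, callers higher up can still detect the deadline with `errors.Is` and map it to a suitable response.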
The Admin API provides key management endpoints (create, list, update, delete), all of which require authentication. The admin panel at /admin/ offers self-service key management. Token quotas and rate limits are enforced per API key.
Standard Go conventions and TypeScript/JavaScript patterns for admin UI. No specific rules defined.