A drop-in FastAPI-compatible web framework with a Rust core (PyO3 + hyper) — same Python API, 5× single-process and 12×+ multi-process throughput.

```python
from hyperfastapi import FastAPI

app = FastAPI()


@app.get("/")
def hello() -> dict:
    return {"hello": "world"}


# Run via the built-in Rust hyper server (no uvicorn needed):
if __name__ == "__main__":
    app.run_native(host="127.0.0.1", port=8000, workers=4)
```

```bash
pip install hyperfastapi
python app.py   # 188,000 requests/sec on a 4-core box
```

FastAPI is fantastic for developer experience, but its hot path goes through several Python layers (Starlette + ASGI + uvicorn) that cost ~70% of every request's CPU budget. hyperfastapi keeps the API identical (your existing routes, dependencies, Pydantic models, and OpenAPI docs all work) but rewrites the dispatch path in Rust:

- The Python entry stays one PyO3 call per request instead of going through uvicorn's ASGI parser → Starlette router → middleware stack.
- The JSON encoder is native Rust (`ryu`/`itoa` + manual escape), skipping `json.dumps` for the common `dict`/`list`/scalar payloads.
- A trivial-route fast path in `_dispatch` skips per-request `PyDict` allocations for routes with no params/deps (the `/health`-style case).
- The optional `run_native()` mode boots a Rust hyper HTTP/1.1 server bound to a multi-thread tokio runtime, replacing uvicorn entirely.

You keep all of FastAPI's ergonomics. You get most of actix-web's throughput.
Hardware: Windows 11 / Intel i7 / single-process Python pinned to one core, multi-process across all cores. Load generator: bombardier at concurrency=100, 5 seconds per scenario.

Single process (requests/sec):

| Scenario | FastAPI + uvicorn | hyperfastapi + hyper | Speedup |
|---|---|---|---|
| `GET /async` | 9,455 | 122,435 | 12.9 × |
| `GET /with-middleware` | 3,489 | 106,702 | 30.6 × |
| `GET /plain` | 3,692 | 103,175 | 27.9 × |
| `GET /with-query` | 3,366 | 38,960 | 11.6 × |
| `POST /post-validated` | 2,790 | 33,348 | 11.9 × |
| `GET /with-chain` | 2,026 | 28,213 | 13.9 × |

Multi-process, 4 workers (requests/sec):

| Scenario | FastAPI + uvicorn (workers=4) | hyperfastapi + hyper (4 procs) | Speedup |
|---|---|---|---|
| `GET /plain` | 21,518 | 249,391 | 11.6 × |
| `GET /with-middleware` | 19,053 | 247,792 | 13.0 × |
| `GET /async` | 66,099 | 229,734 | 3.5 × |
| `GET /with-query` | 17,466 | 99,415 | 5.7 × |
| `POST /post-validated` | 13,187 | 91,696 | 7.0 × |
| `GET /with-chain` | 8,646 | 76,854 | 8.9 × |

In the 4-process run on a Windows machine, `/plain`, `/with-middleware`, and `/async` each exceed 200,000 RPS, and the remaining three scenarios land between roughly 77,000 and 99,000 RPS.

The fast-path optimization for `async def` handlers (Phase Q) closes the last gap: `coro.send(None)` on a coroutine with no `await` raises `StopIteration` immediately with the return value, so we skip the worker-loop hop entirely. `/async-io` (with a real `await`) still takes the slower event-loop path.
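
The `coro.send(None)` behaviour described above can be reproduced in plain Python. The following is a minimal sketch of the idea only, not hyperfastapi's actual code; `try_complete_sync` and `Suspend` are hypothetical names introduced for this example.

```python
def try_complete_sync(coro):
    """Drive a coroutine one step without an event loop.

    Returns (True, result) if the coroutine finished without suspending,
    or (False, yielded) if it reached a real suspension point.
    """
    try:
        yielded = coro.send(None)
    except StopIteration as stop:
        # No suspension point was hit: the return value rides on the exception.
        return True, stop.value
    return False, yielded


class Suspend:
    """Hypothetical awaitable that suspends once, standing in for real I/O."""

    def __await__(self):
        yield


async def no_await():
    return {"hello": "world"}


async def real_await():
    await Suspend()
    return {"hello": "io"}


done, value = try_complete_sync(no_await())
print(done, value)   # True {'hello': 'world'}

pending = real_await()
done, _ = try_complete_sync(pending)
print(done)          # False
pending.close()      # sketch only: a real server would hand this to an event loop
```

The second case is why a handler with a genuine `await` cannot use the shortcut: after the first step the coroutine is parked and still needs something to resume it.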
WebSocket echo round-trip throughput (Rust client, 64-byte payload):
| Connections | hyperfastapi (run_native) | FastAPI + uvicorn | Speedup | Max latency (hyper / uvicorn) |
|---|---|---|---|---|
| 1 | 20,588 | 15,233 | 1.35 × | 0.19 ms / 0.32 ms |
| 4 | 29,451 | 19,913 | 1.48 × | 0.41 ms / 0.54 ms |
| 8 | 34,377 | 16,108 | 2.13 × | 0.82 ms / 3.06 ms |
| 16 | 44,734 | 13,964 | 3.20 × | 0.90 ms / 6.54 ms |
| 32 | 40,875 | 22,281 | 1.83 × | 2.48 ms / 4.55 ms |
| 64 | 41,805 | 18,218 | 2.30 × | 5.10 ms / 32.69 ms |
uvicorn's WebSocket throughput peaks at ~22k msg/s (at 32 connections) and is erratic under concurrency (14k at 16 connections, 18k at 64). hyperfastapi holds above 40k msg/s from 16 connections up, with max latency roughly 6× lower at 64 connections (5.10 ms vs 32.69 ms).
Reproduce these numbers: see Benchmarking below. Each run prints raw RPS so you can verify on your own hardware.

- Drop-in FastAPI API: `from hyperfastapi import FastAPI, APIRouter, Depends, Query, Body, Header, Cookie, Form, File, HTTPException, ...`
- Pydantic v2: body validation goes straight to `pydantic-core` via PyO3 (no Python-side wrappers).
- Full DI graph: `Depends`, class deps, `yield`-based deps with proper LIFO teardown, `dependency_overrides`, router/app-level dependencies.
- OpenAPI 3.1: `/openapi.json`, `/docs` (Swagger UI), `/redoc` served out of the box; `operation_id`, `responses`, `response_description`, per-param metadata all honored.
- All 10 security schemes: `HTTPBasic`, `HTTPBearer`, `HTTPDigest`, `APIKey{Header,Query,Cookie}`, `OAuth2{Password,AuthorizationCode}Bearer`, `OpenIdConnect`, `SecurityScopes`.
- WebSockets: `@app.websocket("/ws")` via Starlette's `WebSocket` wrapper.
- Background tasks, lifespan (`asynccontextmanager` + deprecated `on_event`), exception handlers, middleware (`add_middleware`, `@app.middleware("http")`).
- `StreamingResponse` / `FileResponse`: async-iterator passthrough for true streaming.
- `StaticFiles` mounting + Jinja2 templates.
- Two runtimes: uvicorn (full ASGI compat) or `app.run_native()` (Rust hyper, max throughput).
- abi3 wheels: single Linux/macOS/Windows wheel covers Python 3.10..latest.
- 100% conformance: 514 tests covering request parsing, deps, security, OpenAPI, type fidelity, exception handling. Run them yourself with `pytest tests/conformance`.
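
To make "LIFO teardown" for `yield`-based dependencies concrete, here is a minimal stdlib-only sketch. It does not use hyperfastapi and makes no claim about how the library implements teardown; `run_with_deps`, `db`, and `session` are hypothetical names for illustration.

```python
def run_with_deps(deps, handler, log):
    """Enter generator-based deps in order, call the handler, tear down in reverse."""
    started = []
    try:
        values = []
        for dep in deps:
            gen = dep(log)
            values.append(next(gen))   # run the dep up to its `yield`
            started.append(gen)
        return handler(*values)
    finally:
        # Last started, first torn down (LIFO).
        for gen in reversed(started):
            try:
                next(gen)              # resume past the `yield` to run cleanup
            except StopIteration:
                pass


def db(log):
    log.append("open db")
    try:
        yield "db"
    finally:
        log.append("close db")


def session(log):
    log.append("open session")
    try:
        yield "session"
    finally:
        log.append("close session")


log = []
result = run_with_deps([db, session], lambda d, s: f"{d}+{s}", log)
print(result)  # db+session
print(log)     # ['open db', 'open session', 'close session', 'close db']
```

The session is closed before the database it depends on, which is the ordering guarantee the feature list refers to.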

```bash
pip install hyperfastapi
```

Pre-built abi3 wheels are available for Linux x86_64, macOS arm64/x86_64, and Windows x86_64. One wheel works on Python 3.10, 3.11, 3.12, and 3.13.

To build from source, you need a Rust toolchain (rustup) and Python 3.10+. The build uses maturin.

On Linux:

```bash
# 1. Install Rust if you don't have it
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

# 2. Build + install
git clone https://github.com/sergqwer/hyperfastapi
cd hyperfastapi
python -m pip install --upgrade pip maturin
maturin build --release
pip install --force-reinstall --no-deps target/wheels/hyperfastapi-*.whl
```

On macOS:

```bash
# 1. Install Rust + Python 3
brew install rustup-init python@3.12
rustup-init -y
source $HOME/.cargo/env

# 2. Build + install
git clone https://github.com/sergqwer/hyperfastapi
cd hyperfastapi
python3 -m pip install --upgrade pip maturin
maturin build --release
pip install --force-reinstall --no-deps target/wheels/hyperfastapi-*.whl
```

For Apple Silicon, the wheel is named `*macosx_11_0_arm64.whl`. For Intel Macs it's `*macosx_10_12_x86_64.whl`.

On Windows (PowerShell):

```powershell
# 1. Install Rust (rustup-init.exe from https://rustup.rs/)
#    Choose default toolchain: stable-x86_64-pc-windows-msvc

# 2. Build + install
git clone https://github.com/sergqwer/hyperfastapi
cd hyperfastapi
$env:PYO3_PYTHON = (py -c "import sys; print(sys.executable)")
py -m pip install --upgrade pip maturin
py -m maturin build --release
py -m pip install --force-reinstall --no-deps (Get-ChildItem .\target\wheels\hyperfastapi-*.whl).FullName
```

If you get `error: Microsoft Visual C++ 14.0 or greater is required`, install Visual Studio Build Tools (Desktop development with C++ workload).

To verify the install:

```bash
python -c "from hyperfastapi import FastAPI; print(FastAPI.__module__)"
# → hyperfastapi.applications
```

A small example app (`app.py`):

```python
from typing import Annotated

from pydantic import BaseModel

# `Header` is needed by the `auth` dependency below.
from hyperfastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI(title="Demo")


class Item(BaseModel):
    name: str
    price: float
    qty: int = 1


def auth(token: Annotated[str, Header()]) -> str:
    # Reject requests whose `token` header is not the expected value.
    if token != "secret":
        raise HTTPException(status_code=401, detail="bad token")
    return token


@app.get("/items/{item_id}")
def read_item(item_id: int, q: str | None = None) -> dict:
    return {"item_id": item_id, "q": q}


@app.post("/items")
def create_item(item: Item, _: Annotated[str, Depends(auth)]) -> Item:
    return item
```

Run it under uvicorn:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000 --workers 4
```

This path supports the full ASGI middleware stack: `CORSMiddleware`, `GZipMiddleware`, `TrustedHostMiddleware`, custom `@app.middleware("http")`, etc.

To use the native Rust server instead:

```python
if __name__ == "__main__":
    app.run_native(host="0.0.0.0", port=8000, workers=4)
```

Or from the CLI:

```bash
python -c "from app import app; app.run_native(host='0.0.0.0', port=8000, workers=4)"
```

`run_native()` skips the ASGI middleware stack: handlers, deps, validation, response models, and exception handlers all run, but `add_middleware()` calls are bypassed. Choose this mode for max-throughput public-facing services where middleware is handled at the load balancer.

Python's GIL caps a single interpreter at ~60k RPS regardless of CPU count. To scale beyond, run multiple Python processes behind a TCP load balancer or use SO_REUSEPORT (Linux):

```bash
# Linux: 4 procs sharing the same port via SO_REUSEPORT
python -c "from app import app; app.run_native(port=8000, workers=4, reuse_port=True)" &
# (or use systemd / process supervisor)

# Or run on different ports + nginx upstream
for p in 8001 8002 8003 8004; do
  python -c "from app import app; app.run_native(port=$p)" &
done
```

`run_native()` speaks every modern HTTP flavor over a single command:

| Protocol | Transport | Status | How |
|---|---|---|---|
| HTTP/1.0 | TCP | ✅ Supported | Default; Connection: close |
| HTTP/1.1 + keep-alive | TCP | ✅ Supported | Default |
| HTTP/2 cleartext (h2c) | TCP | ✅ Supported | Auto-detected from client preface |
| HTTPS (TLS 1.2/1.3) | TLS / TCP | ✅ Supported | tls_cert=, tls_key= |
| HTTP/2 + TLS | TLS / TCP | ✅ Supported | ALPN-negotiated (h2 / http/1.1) |
| HTTP/3 (QUIC) | UDP | ✅ Supported | http3=True (requires TLS) |

```python
# HTTP/1.1 + h2c plaintext (no TLS)
app.run_native(host="0.0.0.0", port=8000)

# HTTPS = HTTP/1.1 + HTTP/2 over TLS (ALPN)
app.run_native(host="0.0.0.0", port=8443,
               tls_cert="/etc/cert.pem", tls_key="/etc/key.pem")

# Full stack: HTTP/1.1 + HTTP/2 + HTTP/3 (QUIC) over the same port
app.run_native(host="0.0.0.0", port=8443,
               tls_cert="/etc/cert.pem", tls_key="/etc/key.pem",
               http3=True)
```

When `http3=True`, HTTPS responses include `alt-svc: h3=":<port>"; ma=86400` so HTTP/3-aware clients automatically upgrade.

See docs/protocols.md for the protocol cheat sheet plus end-to-end smoke-test instructions.
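
For readers unfamiliar with the `alt-svc` header, the sketch below shows how a client might read the value quoted above. It is an illustrative parser for a single-entry value only, not a full RFC 7838 implementation; `parse_alt_svc` is a hypothetical helper and the port `8443` is just the example port used in this README.

```python
def parse_alt_svc(value: str) -> dict:
    """Parse a single-entry Alt-Svc value such as 'h3=":8443"; ma=86400'."""
    entry, *params = [part.strip() for part in value.split(";")]
    protocol, _, authority = entry.partition("=")
    authority = authority.strip('"')
    host, _, port = authority.rpartition(":")
    # An empty host means "same host as the origin that sent the header".
    result = {"protocol": protocol, "host": host or None, "port": int(port)}
    for param in params:
        key, _, val = param.partition("=")
        if key == "ma":
            result["max_age"] = int(val)   # seconds the hint may be cached
    return result


print(parse_alt_svc('h3=":8443"; ma=86400'))
# {'protocol': 'h3', 'host': None, 'port': 8443, 'max_age': 86400}
```

In other words, the header tells the client that HTTP/3 is available on the same host at the given UDP port, and that it may remember this for 86,400 seconds (one day).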
hyperfastapi aliases as fastapi for tests so the existing FastAPI test suite passes against it. To use it as a drop-in replacement in an existing codebase:

```python
import sys
import hyperfastapi

sys.modules.setdefault("fastapi", hyperfastapi)
# ... now `from fastapi import FastAPI` resolves to hyperfastapi.FastAPI
```

Or set `HYPERFASTAPI_AS_FASTAPI=1` and the patched `tests/conftest.py` does it automatically.

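
The aliasing mechanism is ordinary `sys.modules` behaviour and can be tried without either package installed. The sketch below uses stand-in modules; `fast_impl`, `other_impl`, and `demo_framework` are hypothetical names, not part of hyperfastapi.

```python
import sys
import types

# Hypothetical stand-in for the replacement package.
replacement = types.ModuleType("fast_impl")
replacement.FastAPI = type("FastAPI", (), {})

# Register the alias before anything imports the target name.
sys.modules.setdefault("demo_framework", replacement)

from demo_framework import FastAPI   # served from sys.modules, no file lookup

print(FastAPI is replacement.FastAPI)   # True

# setdefault() is a no-op when the name is already registered:
other = types.ModuleType("other_impl")
sys.modules.setdefault("demo_framework", other)
print(sys.modules["demo_framework"] is replacement)   # True
```

The second half shows the main caveat: because `setdefault` never overwrites an existing entry, the alias has to be installed before the first `import fastapi` anywhere in the process, otherwise the real package stays in place.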

Known limitations:

- ASGI middleware (`CORSMiddleware`, `GZipMiddleware`, custom `@app.middleware`). When using `run_native()`, these are no-ops. Use uvicorn if your app needs them.

```
             ┌────────────────────────────────────────┐
             │ User code (Python)                     │
             │ @app.get / @app.post / Depends / ...   │
             └─────────────────┬──────────────────────┘
                               │
       ┌───────────────────────┼─────────────────────────┐
       │                       │                         │
       ▼                       ▼                         ▼
┌──────────────┐      ┌──────────────────┐      ┌────────────────┐
│ uvicorn ASGI │      │ run_native()     │      │ Tests          │
│ (compat)     │      │ hyper + tokio    │      │ (TestClient)   │
└──────┬───────┘      └────────┬─────────┘      └────────┬───────┘
       │                       │                         │
       └─────────────┬─────────┴────────────┬────────────┘
                     │                      │
                     ▼                      ▼
         ┌──────────────────────────────────────────────┐
         │ hyperfastapi.applications.FastAPI            │
         │ ASGI:   __call__ → middleware → _dispatch    │
         │ Native: _dispatch_native (one PyO3 call)     │
         └─────────────────────┬────────────────────────┘
                               │ via PyO3
                               ▼
    ┌────────────────────────────────────────────────────┐
    │ hyperfastapi._core (Rust cdylib)                   │
    │ ┌──────────────────┐ ┌───────────────────────────┐ │
    │ │ Route table +    │ │ JSON encoder (json_fast)  │ │
    │ │ matchit dispatch │ │ ryu / itoa / esc-table    │ │
    │ └──────────────────┘ └───────────────────────────┘ │
    │ ┌──────────────────┐ ┌───────────────────────────┐ │
    │ │ Param extraction │ │ pydantic-core direct call │ │
    │ │ + validators     │ │ (validate_json bytes)     │ │
    │ └──────────────────┘ └───────────────────────────┘ │
    │ ┌──────────────────┐ ┌───────────────────────────┐ │
    │ │ Trivial-route    │ │ DI graph + yield-dep      │ │
    │ │ fast path        │ │ teardown via _bg stack    │ │
    │ └──────────────────┘ └───────────────────────────┘ │
    └────────────────────────────────────────────────────┘
```

Highlights:

- Decorator-time route compilation (`compile_route_plan`) walks `inspect.signature` → builds a flat plan of `(name, source, type, default, validators)` entries. Dispatch never re-introspects.
- Side-channel via `_bg` (`_current_tasks` / `_current_request` / `_current_yield_gens`) for per-request state that doesn't fit in the `(status, headers, body)` tuple.
- Persistent worker loop for async coroutines submitted from sync dispatch, which avoids a per-request thread spawn (~50µs/req vs ~500µs).

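
The decorator-time compilation idea from the first highlight can be sketched in pure Python. This is a simplified illustration, not the real `compile_route_plan`: `ParamPlan`, `compile_plan`, and `dispatch` are hypothetical names, the plan omits the `validators` field, only path and query sources are handled, and annotations are assumed to be plain callables such as `int` or `str`.

```python
import inspect
from typing import Any, Callable, NamedTuple


class ParamPlan(NamedTuple):
    name: str
    source: str      # "path" or "query" in this sketch
    type: Any        # callable used to convert the raw string
    default: Any     # inspect.Parameter.empty means "required"


def compile_plan(path: str, func: Callable) -> list[ParamPlan]:
    """Inspect the handler once, when the route is registered."""
    plan = []
    for name, param in inspect.signature(func).parameters.items():
        source = "path" if "{" + name + "}" in path else "query"
        conv = param.annotation if param.annotation is not param.empty else str
        plan.append(ParamPlan(name, source, conv, param.default))
    return plan


def dispatch(plan, func, path_params: dict, query: dict):
    """Per-request work: walk the precompiled plan, no introspection."""
    kwargs = {}
    for p in plan:
        raw = (path_params if p.source == "path" else query).get(p.name)
        if raw is None:
            if p.default is inspect.Parameter.empty:
                raise ValueError(f"missing required parameter: {p.name}")
            kwargs[p.name] = p.default
        else:
            kwargs[p.name] = p.type(raw)
    return func(**kwargs)


def read_item(item_id: int, q: str = "none"):
    return {"item_id": item_id, "q": q}


plan = compile_plan("/items/{item_id}", read_item)
print(dispatch(plan, read_item, {"item_id": "42"}, {}))
# {'item_id': 42, 'q': 'none'}
```

The point of the split is that `inspect.signature` runs once per route rather than once per request; the hot path only iterates a flat list.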
See docs/architecture.md (TBD) for the full breakdown.
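
Until that document exists, the persistent worker loop mentioned in the highlights can be approximated with the standard library. The sketch below is an assumption about the general pattern (one long-lived event loop on a background thread), not hyperfastapi's implementation; `WorkerLoop` and `handler` are hypothetical names, and the sketch does not reproduce the timing figures quoted above.

```python
import asyncio
import threading


class WorkerLoop:
    """One long-lived event loop on a daemon thread, shared by all requests."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout=None):
        """Submit a coroutine from sync code and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout)

    def close(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def handler(n):
    await asyncio.sleep(0)   # a real suspension point
    return n * 2


worker = WorkerLoop()
print([worker.run(handler(i)) for i in range(3)])   # [0, 2, 4]
worker.close()
```

The thread and loop are created once and reused for every submission, which is what avoids the per-request spawn cost.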
The full benchmark suite lives in tests/perf/ and uses bombardier.

```bash
# Cross-backend HTTP comparison (vanilla fastapi+uvicorn vs hyperfastapi+hyper)
HYPERFASTAPI_AS_FASTAPI=1 python tests/perf/compare_backends.py --duration 5

# Multi-process aggregate (4 separate Python procs)
HYPERFASTAPI_AS_FASTAPI=1 python tests/perf/bench_hyper_multiproc.py --workers 4 --duration 5

# WebSocket throughput (Rust client); bypasses Python asyncio.gather pathology
cargo build --release -p ws-bench
./target/release/ws-bench --url ws://127.0.0.1:8765/echo --connections 16 --messages 1000

# Render charts
python docs/perf/render_charts.py
python docs/perf/render_ws_chart.py
```

Results land in `docs/perf/results.json` + `docs/perf/multiproc.json`; charts in `docs/img/`.

```bash
# Run the same FastAPI test suite against hyperfastapi
HYPERFASTAPI_AS_FASTAPI=1 PYTHONPATH=tests python -m pytest tests/conformance -q
```

Expected output:

```
514 passed in 1.5s
```

Coverage by area (514 total):
- Request params (path/query/header/cookie/body/form/file) — 112
- Responses (JSONResponse, HTMLResponse, StreamingResponse, FileResponse, status_code, response_class, response_model) — 72
- Dependencies (Depends, class deps, yield deps, overrides, router-level) — 47
- Security (10 schemes + scopes + misuse) — 80
- OpenAPI / Swagger UI / ReDoc — 50
- WebSockets — 6
- Exceptions / middleware / background tasks / lifespan — 40
- StaticFiles / templating / encoders — 25
- Type fidelity (JSON booleans, Unicode, status code semantics) — 35
- Routing / mount / include_router / trailing slash — 47

PRs welcome. Please run before opening one:

```bash
cargo fmt --all
cargo clippy --workspace --all-targets -- -A warnings
HYPERFASTAPI_AS_FASTAPI=1 PYTHONPATH=tests python -m pytest tests/conformance -q
```

Or install the optional pre-commit hook so `cargo fmt` runs automatically on every commit:

```bash
pip install pre-commit
pre-commit install
```

CI runs the full matrix on every PR (Linux/macOS/Windows × Python 3.10–3.13).

MIT — see LICENSE.
This project depends on PyO3, hyper, tokio, pydantic-core, and the upstream FastAPI Python API surface (Apache 2.0). Many thanks to those projects.



