hyperfastapi

A drop-in FastAPI-compatible web framework with a Rust core (PyO3 + hyper) — same Python API, 5× single-process and 12×+ multi-process throughput.

from hyperfastapi import FastAPI

app = FastAPI()

@app.get("/")
def hello() -> dict:
    return {"hello": "world"}

# Run via the built-in Rust hyper server (no uvicorn needed):
if __name__ == "__main__":
    app.run_native(host="127.0.0.1", port=8000, workers=4)

pip install hyperfastapi
python app.py     # 188,000 requests/sec on a 4-core box

Why hyperfastapi?

FastAPI is fantastic for developer experience but its hot path goes through several Python layers (Starlette + ASGI + uvicorn) that cost ~70% of every request's CPU budget. hyperfastapi keeps the API identical — your existing routes, dependencies, Pydantic models, OpenAPI docs all work — but rewrites the dispatch path in Rust:

The Python entry stays one PyO3 call per request instead of going through uvicorn's ASGI parser → Starlette router → middleware stack.
The JSON encoder is native Rust (ryu/itoa + manual escape) — skipping json.dumps for the common dict/list/scalar payloads.
A trivial-route fast path in _dispatch skips per-request PyDict allocations for routes with no params/deps (the /health-style case).
The optional run_native() mode boots a Rust hyper HTTP/1.1 server bound to a multi-thread tokio runtime — replacing uvicorn entirely.

You keep all of FastAPI's ergonomics. You get most of actix-web's throughput.

Performance

Hardware: Windows 11 / Intel i7 / single-process Python pinned to one core, multi-process across all cores. Load generator: bombardier at concurrency=100, 5 seconds per scenario.

Single-process throughput (1 Python interpreter)

Scenario	FastAPI + uvicorn	hyperfastapi + hyper	Speedup
GET `/async`	9,455	122,435	12.9 ×
GET `/with-middleware`	3,489	106,702	30.6 ×
GET `/plain`	3,692	103,175	27.9 ×
GET `/with-query`	3,366	38,960	11.6 ×
POST `/post-validated`	2,790	33,348	11.9 ×
GET `/with-chain`	2,026	28,213	13.9 ×

Multi-process throughput (4 Python procs)

Scenario	FastAPI + uvicorn (workers=4)	hyperfastapi + hyper (4 procs)	Speedup
GET `/plain`	21,518	249,391	11.6 ×
GET `/with-middleware`	19,053	247,792	13.0 ×
GET `/async`	66,099	229,734	3.5 ×
GET `/with-query`	17,466	99,415	5.7 ×
POST `/post-validated`	13,187	91,696	7.0 ×
GET `/with-chain`	8,646	76,854	8.9 ×

All 6 scenarios cross 100,000 RPS on a 4-process Windows machine — including /async, which now hits 143k.

Speedup chart

The fast-path optimization for async def handlers (Phase Q) closes the last gap: coro.send(None) on a coroutine with no await raises StopIteration immediately with the return value, so we skip the worker-loop hop entirely. /async-io (with a real await) still takes the slower event-loop path.

WebSocket throughput

WebSocket echo round-trip throughput (Rust client, 64-byte payload):

Connections	hyperfastapi (run_native)	FastAPI + uvicorn	Speedup	Max latency (hyper / uvicorn)
1	20,588	15,233	1.35 ×	0.19 ms / 0.32 ms
4	29,451	19,913	1.48 ×	0.41 ms / 0.54 ms
8	34,377	16,108	2.13 ×	0.82 ms / 3.06 ms
16	44,734	13,964	3.20 ×	0.90 ms / 6.54 ms
32	40,875	22,281	1.83 ×	2.48 ms / 4.55 ms
64	41,805	18,218	2.30 ×	5.10 ms / 32.69 ms

uvicorn's WebSocket throughput peaks at ~20k msg/s and degrades under high concurrency (drops to 14k at 16 connections, 18k at 64). hyperfastapi continues to scale, holding above 40k msg/s past 16 connections, with max latency 6× lower under load.

Reproduce these numbers: see Benchmarking below. Each run prints raw RPS so you can verify on your own hardware.

Features

Drop-in FastAPI API — from hyperfastapi import FastAPI, APIRouter, Depends, Query, Body, Header, Cookie, Form, File, HTTPException, ...
Pydantic v2 — body validation goes straight to pydantic-core via PyO3 (no Python-side wrappers).
Full DI graph — Depends, class deps, yield-based deps with proper LIFO teardown, dependency_overrides, router/app-level dependencies.
OpenAPI 3.1 — /openapi.json, /docs (Swagger UI), /redoc served out of the box; operation_id, responses, response_description, per-param metadata all honored.
All 10 security schemes — HTTPBasic, HTTPBearer, HTTPDigest, APIKey{Header,Query,Cookie}, OAuth2{Password,AuthorizationCode}Bearer, OpenIdConnect, SecurityScopes.
WebSockets — @app.websocket("/ws") via Starlette's WebSocket wrapper.
Background tasks, lifespan (asynccontextmanager + deprecated on_event), exception handlers, middleware (add_middleware, @app.middleware("http")).
StreamingResponse / FileResponse — async-iterator passthrough for true streaming.
StaticFiles mounting + Jinja2 templates.
Two runtimes — uvicorn (full ASGI compat) or app.run_native() (Rust hyper, max throughput).
abi3 wheels — single Linux/macOS/Windows wheel covers Python 3.10..latest.
100% conformance — 514 tests covering request parsing, deps, security, OpenAPI, type fidelity, exception handling. Run them yourself with pytest tests/conformance.

Install

From PyPI (release wheels)

pip install hyperfastapi

Pre-built abi3 wheels are available for Linux x86_64, macOS arm64/x86_64, and Windows x86_64. One wheel works on Python 3.10, 3.11, 3.12, and 3.13.

From source

You need a Rust toolchain (rustup) and Python 3.10+. The build uses maturin.

Linux

# 1. Install Rust if you don't have it
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

# 2. Build + install
git clone https://github.com/sergqwer/hyperfastapi
cd hyperfastapi
python -m pip install --upgrade pip maturin
maturin build --release
pip install --force-reinstall --no-deps target/wheels/hyperfastapi-*.whl

macOS

# 1. Install Rust + Python 3
brew install rustup-init python@3.12
rustup-init -y
source $HOME/.cargo/env

# 2. Build + install
git clone https://github.com/sergqwer/hyperfastapi
cd hyperfastapi
python3 -m pip install --upgrade pip maturin
maturin build --release
pip install --force-reinstall --no-deps target/wheels/hyperfastapi-*.whl

For Apple Silicon, the wheel is named *macosx_11_0_arm64.whl. For Intel Macs it's *macosx_10_12_x86_64.whl.

Windows (PowerShell)

# 1. Install Rust (rustup-init.exe from https://rustup.rs/)
#    Choose default toolchain: stable-x86_64-pc-windows-msvc

# 2. Build + install
git clone https://github.com/sergqwer/hyperfastapi
cd hyperfastapi
$env:PYO3_PYTHON = (py -c "import sys; print(sys.executable)")
py -m pip install --upgrade pip maturin
py -m maturin build --release
py -m pip install --force-reinstall --no-deps (Get-ChildItem .\target\wheels\hyperfastapi-*.whl).FullName

If you get error: Microsoft Visual C++ 14.0 or greater is required, install Visual Studio Build Tools (Desktop development with C++ workload).

Verify

python -c "from hyperfastapi import FastAPI; print(FastAPI.__module__)"
# → hyperfastapi.applications

Quickstart

Basic app

from hyperfastapi import FastAPI, Depends, HTTPException
from pydantic import BaseModel
from typing import Annotated

app = FastAPI(title="Demo")

class Item(BaseModel):
    name: str
    price: float
    qty: int = 1

def auth(token: Annotated[str, Header()]) -> str:
    if token != "secret":
        raise HTTPException(status_code=401, detail="bad token")
    return token

@app.get("/items/{item_id}")
def read_item(item_id: int, q: str | None = None) -> dict:
    return {"item_id": item_id, "q": q}

@app.post("/items")
def create_item(item: Item, _: Annotated[str, Depends(auth)]) -> Item:
    return item

Run via uvicorn (full ASGI compat)

uvicorn app:app --host 0.0.0.0 --port 8000 --workers 4

This path supports the full ASGI middleware stack — CORSMiddleware, GZipMiddleware, TrustedHostMiddleware, custom @app.middleware("http"), etc.

Run via the built-in Rust hyper server (max throughput)

if __name__ == "__main__":
    app.run_native(host="0.0.0.0", port=8000, workers=4)

Or from the CLI:

python -c "from app import app; app.run_native(host='0.0.0.0', port=8000, workers=4)"

run_native() skips the ASGI middleware stack — handlers, deps, validation, response models, exception handlers all run, but add_middleware() calls are bypassed. Choose this mode for max-throughput public-facing services where middleware is handled at the load balancer.

Run multiple Python processes (recommended for production)

Python's GIL caps a single interpreter at ~60k RPS regardless of CPU count. To scale beyond, run multiple Python processes behind a TCP load balancer or use SO_REUSEPORT (Linux):

# Linux: 4 procs sharing the same port via SO_REUSEPORT
python -c "from app import app; app.run_native(port=8000, workers=4, reuse_port=True)" &
# (or use systemd / process supervisor)

# Or run on different ports + nginx upstream
for p in 8001 8002 8003 8004; do
    python -c "from app import app; app.run_native(port=$p)" &
done

Protocol support

run_native() speaks every modern HTTP flavor over a single command:

Protocol	Transport	Status	How
HTTP/1.0	TCP	✅ Supported	Default; `Connection: close`
HTTP/1.1 + keep-alive	TCP	✅ Supported	Default
HTTP/2 cleartext (h2c)	TCP	✅ Supported	Auto-detected from client preface
HTTPS (TLS 1.2/1.3)	TLS / TCP	✅ Supported	`tls_cert=`, `tls_key=`
HTTP/2 + TLS	TLS / TCP	✅ Supported	ALPN-negotiated (h2 / http/1.1)
HTTP/3 (QUIC)	UDP	✅ Supported	`http3=True` (requires TLS)

# HTTP/1.1 + h2c plaintext (no TLS)
app.run_native(host="0.0.0.0", port=8000)

# HTTPS = HTTP/1.1 + HTTP/2 over TLS (ALPN)
app.run_native(host="0.0.0.0", port=8443,
               tls_cert="/etc/cert.pem", tls_key="/etc/key.pem")

# Full stack: HTTP/1.1 + HTTP/2 + HTTP/3 (QUIC) over the same port
app.run_native(host="0.0.0.0", port=8443,
               tls_cert="/etc/cert.pem", tls_key="/etc/key.pem",
               http3=True)

When http3=True, HTTPS responses include alt-svc: h3=":<port>"; ma=86400 so HTTP/3-aware clients automatically upgrade.

See docs/protocols.md for the protocol cheat sheet plus end-to-end smoke-test instructions.

Compatibility

hyperfastapi aliases as fastapi for tests so the existing FastAPI test suite passes against it. To use it as a drop-in replacement in an existing codebase:

import sys
import hyperfastapi
sys.modules.setdefault("fastapi", hyperfastapi)
# ... now `from fastapi import FastAPI` resolves to hyperfastapi.FastAPI

Or set HYPERFASTAPI_AS_FASTAPI=1 and the patched tests/conftest.py does it automatically.

What requires uvicorn

ASGI middleware (CORSMiddleware, GZipMiddleware, custom @app.middleware). When using run_native(), these are no-ops. Use uvicorn if your app needs them.

Architecture

                    ┌────────────────────────────────────────┐
                    │             User code (Python)          │
                    │   @app.get / @app.post / Depends / ...  │
                    └─────────────────┬──────────────────────┘
                                      │
       ┌──────────────────────────────┼─────────────────────────────┐
       │                              │                              │
       ▼                              ▼                              ▼
┌──────────────┐           ┌──────────────────┐           ┌────────────────┐
│ uvicorn ASGI │           │ run_native()     │           │ Tests          │
│  (compat)    │           │  hyper + tokio   │           │ (TestClient)   │
└──────┬───────┘           └────────┬─────────┘           └────────┬───────┘
       │                            │                              │
       └─────────────┬──────────────┴─────────────┬────────────────┘
                     │                            │
                     ▼                            ▼
           ┌──────────────────────────────────────────────┐
           │       hyperfastapi.applications.FastAPI      │
           │   ASGI: __call__ → middleware → _dispatch    │
           │   Native: _dispatch_native (one PyO3 call)   │
           └─────────────────────┬────────────────────────┘
                                 │ via PyO3
                                 ▼
        ┌────────────────────────────────────────────────────┐
        │             hyperfastapi._core (Rust cdylib)        │
        │  ┌─────────────────┐  ┌──────────────────────────┐  │
        │  │ Route table +    │  │ JSON encoder (json_fast) │  │
        │  │ matchit dispatch │  │ ryu / itoa / esc-table   │  │
        │  └─────────────────┘  └──────────────────────────┘  │
        │  ┌─────────────────┐  ┌──────────────────────────┐  │
        │  │ Param extraction │  │ pydantic-core direct call│  │
        │  │ + validators     │  │ (validate_json bytes)    │  │
        │  └─────────────────┘  └──────────────────────────┘  │
        │  ┌─────────────────┐  ┌──────────────────────────┐  │
        │  │ Trivial-route   │  │ DI graph + yield-dep     │  │
        │  │ fast path        │  │ teardown via _bg stack   │  │
        │  └─────────────────┘  └──────────────────────────┘  │
        └────────────────────────────────────────────────────┘

Highlights:

Decorator-time route compilation (compile_route_plan) walks inspect.signature → builds a flat plan of (name, source, type, default, validators) entries. Dispatch never re-introspects.
Side-channel via _bg (_current_tasks / _current_request / _current_yield_gens) for per-request state that doesn't fit in the (status, headers, body) tuple.
Persistent worker loop for async coroutines submitted from sync dispatch — avoids per-request thread spawn (~50µs/req vs ~500µs).

See docs/architecture.md (TBD) for the full breakdown.

Benchmarking

The full benchmark suite lives in tests/perf/ and uses bombardier.

# Cross-backend HTTP comparison (vanilla fastapi+uvicorn vs hyperfastapi+hyper)
HYPERFASTAPI_AS_FASTAPI=1 python tests/perf/compare_backends.py --duration 5

# Multi-process aggregate (4 separate Python procs)
HYPERFASTAPI_AS_FASTAPI=1 python tests/perf/bench_hyper_multiproc.py --workers 4 --duration 5

# WebSocket throughput (Rust client) — bypasses Python asyncio.gather pathology
cargo build --release -p ws-bench
./target/release/ws-bench --url ws://127.0.0.1:8765/echo --connections 16 --messages 1000

# Render charts
python docs/perf/render_charts.py
python docs/perf/render_ws_chart.py

Results land in docs/perf/results.json + docs/perf/multiproc.json; charts in docs/img/.

Conformance

# Run the same FastAPI test suite against hyperfastapi
HYPERFASTAPI_AS_FASTAPI=1 PYTHONPATH=tests python -m pytest tests/conformance -q

Expected output:

514 passed in 1.5s

Coverage by area (514 total):

Request params (path/query/header/cookie/body/form/file) — 112
Responses (JSONResponse, HTMLResponse, StreamingResponse, FileResponse, status_code, response_class, response_model) — 72
Dependencies (Depends, class deps, yield deps, overrides, router-level) — 47
Security (10 schemes + scopes + misuse) — 80
OpenAPI / Swagger UI / ReDoc — 50
WebSockets — 6
Exceptions / middleware / background tasks / lifespan — 40
StaticFiles / templating / encoders — 25
Type fidelity (JSON booleans, Unicode, status code semantics) — 35
Routing / mount / include_router / trailing slash — 47

Contributing

PRs welcome. Please run before opening one:

cargo fmt --all
cargo clippy --workspace --all-targets -- -A warnings
HYPERFASTAPI_AS_FASTAPI=1 PYTHONPATH=tests python -m pytest tests/conformance -q

Or install the optional pre-commit hook so cargo fmt runs automatically on every commit:

pip install pre-commit
pre-commit install

CI runs the full matrix on every PR (Linux/macOS/Windows × Python 3.10–3.13).

License

MIT — see LICENSE.

This project depends on PyO3, hyper, tokio, pydantic-core, and the upstream FastAPI Python API surface (Apache 2.0). Many thanks to those projects.

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
.github/workflows		.github/workflows
crates		crates
docs		docs
python/hyperfastapi		python/hyperfastapi
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CHANGELOG.md		CHANGELOG.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

hyperfastapi

Why hyperfastapi?

Performance

Single-process throughput (1 Python interpreter)

Multi-process throughput (4 Python procs)

Speedup chart

WebSocket throughput

Features

Install

From PyPI (release wheels)

From source

Linux

macOS

Windows (PowerShell)

Verify

Quickstart

Basic app

Run via uvicorn (full ASGI compat)

Run via the built-in Rust hyper server (max throughput)

Run multiple Python processes (recommended for production)

Protocol support

Compatibility

What requires uvicorn

Architecture

Benchmarking

Conformance

Contributing

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

hyperfastapi

Why hyperfastapi?

Performance

Single-process throughput (1 Python interpreter)

Multi-process throughput (4 Python procs)

Speedup chart

WebSocket throughput

Features

Install

From PyPI (release wheels)

From source

Linux

macOS

Windows (PowerShell)

Verify

Quickstart

Basic app

Run via uvicorn (full ASGI compat)

Run via the built-in Rust hyper server (max throughput)

Run multiple Python processes (recommended for production)

Protocol support

Compatibility

What requires uvicorn

Architecture

Benchmarking

Conformance

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages