Simple example demonstrating CPU-based serverless workers with automatic scaling on Runpod's infrastructure.
```
uv sync
uv run flash login
```

Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
```
python cpu_worker.py
```

The function executes on a Runpod CPU worker and prints the result directly:
```
Testing CPU worker with payload: {'name': 'Testing CPU worker'}
Result: {'status': 'success', 'message': 'Hello, Testing CPU worker!', 'worker_type': 'CPU', ...}
```
First run takes 30-60 seconds (provisioning). Subsequent runs take 2-3 seconds.
To test via HTTP endpoints instead:
```
uv run flash run
```

Visit http://localhost:8888/docs for interactive API documentation.
```
curl -X POST http://localhost:8888/cpu_worker/runsync \
  -H "Content-Type: application/json" \
  -d '{"name": "Flash User"}'
```

Simple CPU-based serverless function that:
- Processes requests without GPU overhead
- Returns system and platform information
- Scales from 0-3 workers automatically
- Runs on general-purpose CPU instances
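The curl request shown above can also be sent from Python using only the standard library. This is an illustrative sketch (the endpoint path and port match the curl example; `build_request` is a hypothetical helper, not part of the project):

```python
import json
from urllib.request import Request, urlopen

def build_request(name: str) -> Request:
    # Same URL, payload, and header as the curl example.
    return Request(
        "http://localhost:8888/cpu_worker/runsync",
        data=json.dumps({"name": name}).encode(),
        headers={"Content-Type": "application/json"},
    )

# Send it while `uv run flash run` is serving locally:
# with urlopen(build_request("Flash User")) as resp:
#     print(json.loads(resp.read()))
```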
The worker demonstrates:
- Remote execution with the `@Endpoint` decorator
- CPU resource configuration via the `cpu=` parameter
- Automatic scaling via the `workers=` parameter
- Lightweight API request handling
QB (queue-based) endpoints are auto-generated from @Endpoint functions. Visit /docs for the full API schema.
Executes a simple CPU worker and returns a greeting with system information.
Request:

```json
{
  "name": "Flash User"
}
```

Response:
```json
{
  "status": "success",
  "message": "Hello, Flash User!",
  "worker_type": "CPU",
  "timestamp": "2024-01-24T10:30:45.123456",
  "platform": "Linux",
  "python_version": "3.11.0"
}
```

```
02_cpu_worker/
├── cpu_worker.py      # CPU worker with @Endpoint decorator
├── pyproject.toml     # Project metadata
├── requirements.txt   # Dependencies
├── .env.example       # Environment variables template
└── README.md          # This file
```
The @Endpoint decorator transparently executes functions on serverless infrastructure:
- Code runs locally during development
- Automatically deploys to Runpod when configured
- Handles serialization and resource management
```python
from runpod_flash import Endpoint

@Endpoint(name="my-worker", cpu="cpu3c-1-2", workers=(0, 3))
async def my_function(data: dict) -> dict:
    return {"result": "processed"}
```

Available CPU configurations:
- `CpuInstanceType.CPU3G_2_8`: 2 vCPU, 8GB RAM (General Purpose)
- `CpuInstanceType.CPU3C_4_8`: 4 vCPU, 8GB RAM (Compute Optimized)
- `CpuInstanceType.CPU5G_4_16`: 4 vCPU, 16GB RAM (Latest Gen)
CPU type can be specified as an enum or a string shorthand:
```python
# enum
@Endpoint(name="worker", cpu=CpuInstanceType.CPU3C_1_2)

# string shorthand
@Endpoint(name="worker", cpu="cpu3c-1-2")
```

The CPU worker scales to zero when idle:
- `workers=(0, 3)`: scale from 0 to 3 workers
- `idle_timeout=5`: 5 minutes before scaling down
Run the worker directly:

```
python cpu_worker.py
```

Or serve the HTTP endpoints:

```
uv run flash run
```

Choose CPU workers for:
- API request handling
- Data processing and transformation
- Lightweight compute tasks
- Cost-sensitive workloads
- No GPU requirements
Compare with GPU workers when you need:
- Machine learning inference
- Image/video processing
- CUDA acceleration
- GPU-specific libraries (PyTorch, TensorFlow)
- Customize CPU type: change `"cpu3c-1-2"` to a different instance type
- Add request validation and error handling
- Integrate with databases or external APIs
- Deploy to production with `flash deploy`