diff --git a/README.md b/README.md index 2c1f1eda..d442dbf4 100644 --- a/README.md +++ b/README.md @@ -340,6 +340,39 @@ results = await asyncio.gather( ) ``` +### Load-Balanced Endpoints with HTTP Routing + +For API endpoints requiring low-latency HTTP access with direct routing, use load-balanced endpoints: + +```python +from tetra_rp import LiveLoadBalancer, remote + +api = LiveLoadBalancer(name="api-service") + +@remote(api, method="POST", path="/api/process") +async def process_data(x: int, y: int): + return {"result": x + y} + +@remote(api, method="GET", path="/api/health") +def health_check(): + return {"status": "ok"} + +# Call functions directly +result = await process_data(5, 3) # → {"result": 8} +``` + +**Key differences from queue-based endpoints:** +- **Direct HTTP routing** - Requests routed directly to workers, no queue +- **Lower latency** - No queuing overhead +- **Custom HTTP methods** - GET, POST, PUT, DELETE, PATCH support +- **No automatic retries** - Users handle errors directly + +Load-balanced endpoints are ideal for REST APIs, webhooks, and real-time services. Queue-based endpoints are better for batch processing and fault-tolerant workflows. + +For detailed information: +- **User guide:** [Using @remote with Load-Balanced Endpoints](docs/Using_Remote_With_LoadBalancer.md) +- **Runtime architecture:** [LoadBalancer Runtime Architecture](docs/LoadBalancer_Runtime_Architecture.md) - details on deployment, request flows, and execution + ## How it works Flash orchestrates workflow execution through a sophisticated multi-step process: @@ -460,6 +493,8 @@ After `flash build` completes: For more details on the handler architecture, see [docs/Runtime_Generic_Handler.md](docs/Runtime_Generic_Handler.md). +For information on load-balanced endpoints (required for Mothership and HTTP services), see [docs/Load_Balancer_Endpoints.md](docs/Load_Balancer_Endpoints.md). + #### Troubleshooting Build Issues **No @remote functions found:** diff --git a/docs/LoadBalancer_Runtime_Architecture.md b/docs/LoadBalancer_Runtime_Architecture.md new file mode 100644 index 00000000..b55d80f2 --- /dev/null +++ b/docs/LoadBalancer_Runtime_Architecture.md @@ -0,0 +1,708 @@ +# Load-Balanced Endpoint Runtime Architecture + +## Overview + +This document explains what happens after a load-balanced endpoint is deployed on RunPod and is actively running. It covers the deployment architecture, request flows, and execution patterns for both direct HTTP requests and @remote function calls. + +## Deployment Architecture + +### Container Image and Startup + +When you deploy a `LoadBalancerSlsResource` endpoint with `flash build` and `flash deploy`: + +```mermaid +graph TD + A["User Code"] -->|flash build| B["Generate handler_service.py"] + B -->|FastAPI App| C["handler_service.py"] + C -->|flash deploy| D["Push to RunPod"] + D -->|Create Container| E["RunPod Container
tetra-rp-lb image"] + E --> F["FastAPI Server
uvicorn on port 8000"] + F --> G["Load your handler"] + G --> H["Endpoint Ready"] + + style A fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style B fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style C fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style D fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style E fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style F fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style G fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style H fill:#2e7d32,stroke:#1b5e20,stroke-width:3px,color:#fff +``` + +**Important:** `endpoint_url` is auto-generated by RunPod after deployment +- Cannot be specified by users +- Generated as: `https:///` +- Automatically populated in the resource after `deploy()` completes +- Available via `resource.endpoint_url` property (read-only) + +### What Gets Deployed + +The generated handler file contains: + +```python +# handler_service.py (auto-generated) +from fastapi import FastAPI +from tetra_rp.runtime.lb_handler import create_lb_handler + +# User functions imported +from api.endpoints import process_data +from api.health import health_check + +# Route registry +ROUTE_REGISTRY = { + ("POST", "/api/process"): process_data, + ("GET", "/api/health"): health_check, +} + +# FastAPI app created +app = create_lb_handler(ROUTE_REGISTRY) + +if __name__ == "__main__": + import uvicorn + uvicorn.run(app, host="0.0.0.0", port=8000) +``` + +**Container Setup:** +- Base image: `runpod/tetra-rp-lb:latest` (contains FastAPI, uvicorn, dependencies) +- Entrypoint: Runs `python handler_service.py` +- Port: 8000 (internal) +- RunPod exposes this via HTTPS endpoint URL +- Health check: Polls `/ping` endpoint every 30 seconds with 15 second timeout per check +- All HTTP requests to the endpoint include authentication via `RUNPOD_API_KEY` environment variable (if set) + +### Deployment Lifecycle + +```mermaid +graph TD + A["LoadBalancerSlsResource created"] -->|flash build| B["Generate handler file"] + B -->|flash deploy| C["Push to RunPod"] + C --> D["RunPod creates container"] + D --> E["Container starts uvicorn"] + E --> F["FastAPI app loads"] + F --> G["Import user functions"] + G --> H["Register routes"] + H --> I["Endpoint ready"] + I --> J["Health checks pass"] + J --> K["Endpoint active"] +``` + +## Request Flow + +### Direct HTTP Request (User Routes) + +When a client makes an HTTP request to your deployed endpoint: + +```mermaid +sequenceDiagram + participant Client + participant RunPod as RunPod Router + participant Container as Endpoint Container + participant FastAPI + participant UserFunc as User Function + + Client->>RunPod: HTTPS POST /api/process + RunPod->>Container: Forward to port 8000 + Container->>FastAPI: HTTP POST /api/process + FastAPI->>FastAPI: Match (POST, /api/process)
in ROUTE_REGISTRY + FastAPI->>UserFunc: Call process_data(x=5, y=3) + UserFunc->>UserFunc: Execute function code + UserFunc-->>FastAPI: Return {"result": 8} + FastAPI->>FastAPI: Serialize to JSON + FastAPI-->>Container: HTTP 200 response + Container-->>RunPod: Response body + RunPod-->>Client: HTTPS response +``` + +**Example Flow:** + +```python +# User code +@remote(api, method="POST", path="/api/process") +async def process_data(x: int, y: int): + return {"result": x + y} + +# Client request +POST https://my-endpoint.runpod.ai/api/process +Content-Type: application/json +{"x": 5, "y": 3} + +# On RunPod: +# 1. Request arrives at container port 8000 +# 2. FastAPI receives POST /api/process +# 3. FastAPI parses JSON body: {"x": 5, "y": 3} +# 4. FastAPI calls process_data(x=5, y=3) +# 5. Function executes: returns {"result": 8} +# 6. FastAPI serializes response +# 7. Returns HTTP 200 with body {"result": 8} +# 8. RunPod wraps in HTTPS response +# 9. Client receives response +``` + +### @remote Function Call (Framework Endpoint) + +When you call an `@remote` decorated function from your local code: + +```mermaid +sequenceDiagram + participant Local as Local Code + participant Stub as LoadBalancerSlsStub + participant RunPod as RunPod Router + participant Container as Endpoint Container + participant Execute as /execute Handler + + Local->>Stub: await process_data(5, 3) + Stub->>Stub: Extract function source code
via AST inspection + Stub->>Stub: Serialize args with cloudpickle
+ base64 encode + Stub->>RunPod: POST /execute + RunPod->>Container: Forward to port 8000 + Container->>Execute: HTTP POST /execute + Execute->>Execute: Parse JSON body + Execute->>Execute: Deserialize arguments
(base64 decode + cloudpickle loads) + Execute->>Execute: Extract function code string + Execute->>Execute: exec(code) in isolated namespace + Execute->>Execute: Call func(5, 3) + Execute->>Execute: Get result: {"result": 8} + Execute->>Execute: Serialize result with cloudpickle
+ base64 encode + Execute-->>Container: HTTP 200 {success: true, result: base64} + Container-->>RunPod: Response body + RunPod-->>Stub: Response body + Stub->>Stub: Deserialize result
(base64 decode + cloudpickle loads) + Stub-->>Local: Return {"result": 8} +``` + +**Example Flow:** + +```python +# Local code - after deployment +api = LoadBalancerSlsResource(name="user-service", + imageName="runpod/tetra-rp-lb:latest") + +# Deploy the endpoint (generates endpoint_url automatically) +await api.deploy() +# After deploy, api.endpoint_url is populated by RunPod +# Example: "https://xxx-yyy-zzz.runpod.io" + +@remote(api, method="POST", path="/api/process") +async def process_data(x: int, y: int): + return {"result": x + y} + +# Call the function locally +result = await process_data(5, 3) + +# What happens: +# 1. Decorator finds LoadBalancerSlsStub in registry +# 2. Stub extracts function source code via AST +# 3. Stub serializes arguments: cloudpickle.dumps([5, 3]) +# 4. Stub POST to https://my-endpoint.runpod.ai/execute +# 5. Container receives request at /execute endpoint +# 6. create_lb_handler's execute_remote_function handles it: +# a. Parses JSON body +# b. Deserializes arguments: [5, 3] +# c. Executes: exec(function_code) in isolated namespace +# d. Calls func(5, 3) +# e. Gets result: {"result": 8} +# f. Serializes result via cloudpickle +# g. Returns {success: true, result: base64_string} +# 7. Stub deserializes result +# 8. Returns {"result": 8} to caller +``` + +## Deployment Execution Model + +### Local Development (LiveLoadBalancer) + +When using `LiveLoadBalancer` for local testing, endpoints expose two types of routes: + +1. **User-Defined Routes** (e.g., `/api/health`, `/api/users`) + - Called via direct HTTP requests + - Called via `@remote` decorator (uses /execute internally) + +2. **Framework Endpoints** + - `/execute` - Accepts serialized function code for @remote execution + - `/ping` - Health check endpoint + +### Deployed Endpoints (LoadBalancerSlsResource) + +When deployed to production, endpoints **only expose user-defined routes** for security: + +1. **User-Defined Routes** (e.g., `/api/health`, `/api/users`) + - Called via direct HTTP requests from clients + - Called via `@remote` decorator (stub translates to HTTP requests to user routes) + - `/execute` endpoint NOT exposed (removed for security) + +2. 
**Framework Endpoints** + - `/ping` - Health check endpoint only + +### Request Handling by Execution Type + +#### Direct HTTP Requests (Always Works) + +``` +GET /health +POST /api/users +PUT /api/users/{user_id} +DELETE /api/users/{user_id} +``` + +**Characteristics:** +- Called by external HTTP clients +- FastAPI handles routing automatically +- Standard HTTP request/response +- No serialization/deserialization +- Direct function execution +- Errors return HTTP error codes + +**Example:** +```python +@remote(api, method="GET", path="/health") +def health_check(): + return {"status": "ok"} + +# Client can call: +GET https://my-endpoint.runpod.ai/health +# Response: 200 OK {"status": "ok"} +``` + +#### @remote Function Calls (Different Local vs Deployed) + +**Local (LiveLoadBalancer):** +```python +@remote(api, method="POST", path="/api/process") +async def process_data(x: int, y: int): + return {"result": x + y} + +# Called via @remote: +result = await process_data(5, 3) # Uses /execute internally (local only) +``` + +**Deployed (LoadBalancerSlsResource):** +```python +@remote(api, method="POST", path="/api/process") +async def process_data(x: int, y: int): + return {"result": x + y} + +# Called via @remote: +result = await process_data(5, 3) +# Stub automatically translates to: POST /api/process {"x": 5, "y": 3} +# No /execute endpoint involved (security) +``` + +**Key Differences:** +- Local: Serializes function code, POSTs to /execute +- Deployed: Maps arguments to JSON, POSTs to user-defined route +- No code changes needed - stub handles both automatically + +**Important Implementation Detail: Stub Decision Logic** + +The stub determines which execution path to use by checking: +1. Is this a `LiveLoadBalancer`? → Always use `/execute` for local development +2. Does the function have `method` and `path` metadata from `@remote` decorator? → If yes, use user-defined route +3. If routing metadata is incomplete or missing → Falls back to `/execute` (will fail on deployed endpoints) + +This means if you decorate a function for `LoadBalancerSlsResource` without specifying both `method` and `path`, the stub will attempt to use `/execute`, which doesn't exist in production. Always provide complete routing metadata for deployed endpoints. + +**Important Implementation Detail: Parameter Mapping** + +When using user-defined routes (deployed endpoints), the stub inspects the function signature and maps positional and keyword arguments to the HTTP request JSON body: + +```python +@remote(api, method="POST", path="/api/process") +async def process_data(x: int, y: int): + return {"result": x + y} + +# Local call: +result = await process_data(5, 3) + +# Gets translated to: +POST /api/process +{ + "x": 5, + "y": 3 +} +``` + +The stub uses Python's `inspect.signature()` to map positional args to parameter names. This requires that: +- Function parameters are JSON-serializable types (int, str, bool, list, dict, None) +- Function signature is available (defined at module level, not dynamically created) +- No complex types (custom classes, Request objects, etc.) are used as parameters + +## Execution Flow Diagram + +```mermaid +graph TD + A["HTTP Request arrives at
RunPod Endpoint"] -->|HTTPS| B["RunPod Router
Domain stripping"] + B -->|Strips domain
Forwards to container| C["Container Port 8000
uvicorn/FastAPI"] + + C -->|Route decision| D{Is it /execute?} + + D -->|Yes: Framework| E["Framework Handler
execute_remote_function"] + D -->|No: User Route| F["FastAPI Router
Match method + path in
ROUTE_REGISTRY"] + + E --> E1["1. Parse JSON body"] + E1 --> E2["2. Deserialize args/kwargs
base64 + cloudpickle"] + E2 --> E3["3. exec function_code
in isolated namespace"] + E3 --> E4["4. Call func with args"] + E4 --> E5["5. Serialize result
cloudpickle + base64"] + E5 --> G["Build Response
success: true/false"] + + F --> F1["1. Find handler function
from ROUTE_REGISTRY"] + F1 --> F2["2. Parse request parameters"] + F2 --> F3["3. Call function
with parameters"] + F3 --> F4["4. Get result"] + F4 --> G + + G -->|Serialize response| H["FastAPI Response Obj
JSON or {success, result}"] + H -->|Wrap in HTTPS| I["RunPod Router
Wraps response"] + I -->|Send back| J["HTTP Response to Client"] + + style A fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style B fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style C fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style D fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style E fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style F fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style E1 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style E2 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style E3 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style E4 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style E5 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style F1 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style F2 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style F3 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style F4 fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style G fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style H fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style I fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style J fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff +``` + +## Security Model at Runtime + +### /execute Endpoint + +The `/execute` endpoint is an internal framework endpoint that: + +1. **Accepts arbitrary Python code** (serialized as string) +2. **Executes it** in an isolated namespace using Python's `exec()` +3. **Returns results** back to caller + +**Critical Security Model:** + +The `/execute` endpoint is **only exposed on `LiveLoadBalancer` for local development**. It is **explicitly removed from deployed `LoadBalancerSlsResource` endpoints** for security reasons. + +**Why This Design Is Necessary:** + +The `/execute` endpoint accepts and executes arbitrary Python code sent in HTTP requests. 
An unauthorized user with access to this endpoint could: +- Execute system commands (e.g., `os.system()`) +- Access file system data (e.g., read environment variables, credentials) +- Modify application state or data +- Use your infrastructure for malicious purposes + +**Why This Is Secure When Used Correctly:** + +- In `LiveLoadBalancer` (local development): Code originates from your own `@remote` decorator +- You control what function code is serialized and sent +- Only accessible during local testing, never exposed publicly +- Same trusted-client model as queue-based serverless endpoints + +**What Happens When Deployed:** + +``` +LiveLoadBalancer (local): +- /execute endpoint: INCLUDED (for @remote function execution) +- User routes: Included +- Safe because: Only you can run your code locally + +LoadBalancerSlsResource (deployed): +- /execute endpoint: REMOVED for security +- User routes: Included +- Safe because: No arbitrary code execution possible +``` + +**If /execute Was Exposed (Don't Do This):** + +```python +# Attacker's request +POST https://my-endpoint.runpod.ai/execute +{ + "function_name": "malicious", + "function_code": "import os; os.system('rm -rf /')", + "args": [], + "kwargs": {} +} + +# This would execute arbitrary system commands on your infrastructure +``` + +**Best Practices:** + +- Never manually add `/execute` to deployed endpoints +- Use the default `create_lb_handler()` behavior (removes `/execute`) +- Always use `LoadBalancerSlsResource` for production (not `LiveLoadBalancer`) +- Test locally with `LiveLoadBalancer` first +- For debugging deployed endpoints, use container logs, not code injection + +## Concurrency and Scaling + +### How RunPod Handles Concurrent Requests + +```mermaid +graph TD + A["Request 1
POST /api/process"] -->|→ Worker 1| B["Container [Worker 1]
Executes Request 1"] + C["Request 2
POST /api/users"] -->|→ Worker 1| D["Queued in Worker 1"] + D -->|Worker available| E["Container [Worker 1]
Executes Request 2
Concurrently"] + F["Request 3
POST /api/health"] -->|→ Worker 2| G["Container [Worker 2]
Executes Request 3"] + + H["RunPod Scaler
REQUEST_COUNT"] -->|Queue grows| I["Monitor Queue Depth"] + I -->|Q ≥ 3| J["Spin up Worker 3"] + I -->|Q ≥ 6| K["Spin up Worker 4"] + I -->|Q empty| L["Wind down Workers"] + + style A fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style B fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style C fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style D fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style E fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style F fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style G fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style H fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style I fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style J fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style K fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff + style L fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff +``` + +### Function Execution + +- Each request executes in isolated context +- async functions execute with asyncio +- Multiple requests can process concurrently (with async) +- Synchronous functions block worker thread + +**Example Concurrency:** + +```python +@remote(api, method="POST", path="/api/process") +async def process_data(x: int): + import time + await asyncio.sleep(10) # Simulate work + return {"result": x} + +# If 5 requests come in simultaneously: +# - Request 1: await asyncio.sleep(10) → Worker 1 +# - Request 2: await asyncio.sleep(10) → Worker 1 (concurrent) +# - Request 3: await asyncio.sleep(10) → Worker 1 (concurrent) +# - Request 4: await asyncio.sleep(10) → Worker 2 (new worker) +# - Request 5: await asyncio.sleep(10) → Worker 2 (concurrent) +# +# All 5 complete in ~10s (concurrent within workers) +``` + +## Error Handling at Runtime + +### Client Errors + +``` +POST https://endpoint.runpod.ai/api/users +{"invalid": "json" + +# Response: 422 Unprocessable Entity +{ + "detail": [ + { + "type": "json_error", + "loc": ["body"], + "msg": "JSON decode error" + } + ] +} +``` + +### Function Errors + +``` +@remote(api, method="POST", path="/api/users") +async def create_user(name: str): + if not name: + raise ValueError("Name required") + return {"id": 1, "name": name} + +# Call with invalid data: +POST https://endpoint.runpod.ai/api/users +{"name": ""} + +# Response: 422 Validation Error or 500 Internal Error +# (depending on where error occurs) +``` + +### @remote Execution Errors + +```python +# Local code +@remote(api, method="POST", path="/api/process") +async def process_data(x: int): + raise RuntimeError("Processing failed") + +result = await process_data(5) +# Raises RuntimeError: "Remote execution failed: Processing failed" +``` + +## Performance Characteristics + +### Request Latency (approximate) + +``` +Direct HTTP Request: +- Request → RunPod Router: 10-50ms +- FastAPI routing: 1-5ms +- Function execution: Variable +- Serialization: Variable +- Response: 10-50ms +Total (no-op function): 30-110ms + +@remote Function Call: +- Function serialization: 1-10ms +- HTTP request to /execute: 10-50ms +- Deserialization: 1-10ms +- Function execution: Variable +- Result serialization: 1-10ms +- Result deserialization: 1-10ms +- Response: 10-50ms +Total (no-op function): 40-150ms +``` + +### Memory Usage + +- FastAPI app baseline: ~50-100MB +- Per function in namespace: ~0.5-5MB +- Serialized args/result: Variable (depends on data size) +- RunPod allocates: Depends on pod type + +### Request Size Limits + +- RunPod has limits 
on request body size +- Serialized data (via cloudpickle) increases size +- Large arguments may hit limits +- Consider streaming for large payloads + +## Monitoring and Debugging at Runtime + +### Logs Available on RunPod + +``` +Container logs (uvicorn/FastAPI): +- Request arrival +- Route matching +- Function execution +- Errors and exceptions +- Response generation + +Environment: +- Pod ID +- Worker ID +- GPU allocation +- Memory usage +``` + +### Health Checks + +``` +GET https://endpoint.runpod.ai/ping +Response: 200 OK {"status": "healthy"} + +RunPod polls /ping every 30 seconds +- 200 OK → Worker healthy +- Non-200 → Worker unhealthy +- No response → Worker down +- Unhealthy workers replaced +``` + +### Common Issues at Runtime + +**"Connection refused"** +- Container not running +- Uvicorn failed to start +- Check container logs + +**"Timeout after 30s"** +- Function took >30s +- Network issue +- Increase timeout if needed + +**"500 Internal Server Error"** +- Function raised exception +- Check container logs +- Verify function code + +## Deployment Considerations + +### Image Selection + +``` +tetra-rp-lb:latest (default) +- FastAPI + uvicorn pre-installed +- Tetra runtime dependencies +- Optimized for LB endpoints + +Custom image: +- Must have FastAPI, uvicorn +- Must expose port 8000 +- /ping endpoint should work +``` + +### Pod Configuration + +```python +LoadBalancerSlsResource( + name="my-api", + imageName="runpod/tetra-rp-lb:latest", + gpus=[GpuGroup.AMPERE_80], # Optional: if compute needed + instanceIds=[...], # Or specify CPU instances + workersMax=5, # Max concurrent workers + template=PodTemplate(...) # Storage, env vars, etc. +) +``` + +### Network + +``` +Incoming: +- HTTPS endpoint provided by RunPod +- Auto-scaled based on REQUEST_COUNT +- Health checks ensure availability + +Outgoing: +- Your functions can make HTTP requests +- Can access external APIs +- Can access other RunPod endpoints +``` + +## Summary + +**What Happens at Runtime:** + +1. **Deployment** - FastAPI app runs in RunPod container +2. **Request Arrival** - HTTP request reaches container +3. **Routing** - FastAPI matches method/path to function +4. **Execution** - Function code runs with parameters +5. **Response** - Result serialized and returned + +**Two Execution Paths:** + +- **User Routes** - Direct HTTP from clients +- **Framework Routes** - @remote calls from local code via /execute + +**Key Characteristics:** + +- ✅ Low latency (direct HTTP) +- ✅ No queuing overhead +- ✅ Concurrent request handling +- ✅ FastAPI routing +- ✅ Serialized function execution via @remote + +**Security:** + +- Protect `/execute` endpoint with authentication +- Only allow @remote calls from trusted sources +- Monitor endpoint usage diff --git a/docs/Load_Balancer_Endpoints.md b/docs/Load_Balancer_Endpoints.md new file mode 100644 index 00000000..ea551884 --- /dev/null +++ b/docs/Load_Balancer_Endpoints.md @@ -0,0 +1,406 @@ +# Load-Balanced Serverless Endpoints + +## Overview + +The `LoadBalancerSlsResource` class enables provisioning and management of RunPod load-balanced serverless endpoints. Unlike queue-based endpoints that process requests sequentially, load-balanced endpoints expose HTTP servers directly to clients, enabling REST APIs, webhooks, and real-time communication patterns. + +This resource type forms the foundation for the Mothership architecture, which requires HTTP-based endpoint discovery and cross-endpoint communication. 
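+
+As a brief, hedged illustration of this "direct HTTP" model (the route, payload, and client library below are assumptions made for the sketch; see the Usage section for the actual provisioning API), a deployed load-balanced endpoint can be called like any other HTTP service:
+
+```python
+import httpx  # any HTTP client works; httpx is only an example choice
+
+async def call_process(endpoint_url: str) -> dict:
+    # endpoint_url is the value RunPod populates on the resource after deploy().
+    # Depending on endpoint configuration, an Authorization header may also be required.
+    async with httpx.AsyncClient(timeout=30) as client:
+        resp = await client.post(f"{endpoint_url}/api/process", json={"x": 5, "y": 3})
+        resp.raise_for_status()
+        return resp.json()  # e.g. {"result": 8}
+
+# Run with: asyncio.run(call_process(resource.endpoint_url))
+```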
+ +## Design Context + +### Problem Statement + +RunPod supports two serverless endpoint models: + +1. **Queue-Based (QB)**: Sequential processing with automatic retry logic + - Requests queued and processed one-at-a-time + - Built-in error recovery + - Higher latency but fault-tolerant + - Fixed request/response format (JSON) + +2. **Load-Balanced (LB)**: Direct HTTP routing to healthy workers + - Requests routed directly to available workers + - No automatic retries + - Lower latency but less fault tolerance + - Custom HTTP endpoints and protocols + +### Design Decision + +Load-balanced endpoints require different provisioning and health check logic than queue-based endpoints. `LoadBalancerSlsResource` extends `ServerlessResource` with LB-specific behavior: + +- **Type enforcement**: Always deploys as LB (never QB) +- **Scaler validation**: Requires REQUEST_COUNT scaler (not QUEUE_DELAY) +- **Health checks**: Polls `/ping` endpoint to verify worker availability +- **Post-deployment verification**: Waits for endpoint readiness before returning + +### Why This Matters + +The Mothership needs to serve as a directory server for child endpoints. This requires: +- HTTP-based service discovery (not queue-based) +- Ability to expose custom endpoints (`/directory`, `/ping`) +- Health checking to verify children are ready before routing traffic + +## Architecture + +### High-Level Flow + +```mermaid +graph TD + A["LoadBalancerSlsResource
instance created"] --> B["Validate LB config
Type=LB, REQUEST_COUNT scaler"] + B --> C["Check if already
deployed"] + C -->|Already deployed| D["Return existing
endpoint"] + C -->|New deployment| E["Call parent _do_deploy
Create via RunPod API"] + E --> F["Poll /ping endpoint
until healthy"] + F -->|Health check fails| G["Raise TimeoutError
Deployment failed"] + F -->|Health check passes| H["Return deployed
endpoint"] + + style A fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style B fill:#ff6b35,stroke:#c41e0f,stroke-width:3px,color:#fff + style C fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style E fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff + style F fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff + style G fill:#c41e0f,stroke:#a41100,stroke-width:3px,color:#fff + style H fill:#0d7f1f,stroke:#0d4f1f,stroke-width:3px,color:#fff +``` + +### Configuration Hierarchy + +``` +ServerlessResource (base class) +├── type: ServerlessType = QB (queue-based) +├── scalerType: ServerlessScalerType = QUEUE_DELAY +├── Standard provisioning flow +└── Standard health checks (RunPod SDK) + +LoadBalancerSlsResource (LB-specific subclass) +├── type: ServerlessType = LB (always, cannot override) +├── scalerType: ServerlessScalerType = REQUEST_COUNT (required) +├── Enhanced provisioning flow +│ ├── Validation before deploy +│ └── Post-deployment health check polling +├── Async health check (_check_ping_endpoint) +├── Sync health check (is_deployed) +└── Health check polling (_wait_for_health) +``` + +### Health Check Mechanism + +Load-balanced endpoints require a `/ping` endpoint that responds with: +- **200 OK**: Worker is healthy and ready +- **204 No Content**: Worker is initializing (transient) +- **Other status**: Worker is unhealthy + +```mermaid +sequenceDiagram + participant Deploy as LoadBalancerSlsResource + participant RunPod as RunPod API + participant Worker as LB Endpoint + participant Ping as /ping Handler + + Deploy->>RunPod: saveEndpoint (type=LB) + RunPod->>Worker: Create endpoint + Worker->>Ping: Initialize + + loop Health Check Polling + Deploy->>Worker: GET /ping + alt Healthy + Worker->>Ping: Handle request + Ping->>Worker: Return 200 + Worker->>Deploy: 200 OK + Deploy->>Deploy: Deployment complete ✓ + else Initializing + Worker->>Ping: Still starting + Ping->>Worker: Return 204 + Worker->>Deploy: 204 No Content + Deploy->>Deploy: Wait and retry + else Unhealthy + Worker->>Worker: Error occurred + Worker->>Deploy: 500+ Error + Deploy->>Deploy: Retry with backoff + end + end +``` + +## Using @remote with LoadBalancer Endpoints + +This document focuses on the `LoadBalancerSlsResource` class implementation and architecture. 
+ +**Related documentation:** +- [Using @remote with Load-Balanced Endpoints](Using_Remote_With_LoadBalancer.md) - User guide for writing and testing load-balanced endpoints +- [LoadBalancer Runtime Architecture](LoadBalancer_Runtime_Architecture.md) - Technical details on what happens when deployed on RunPod, request flows, and execution patterns + +**In the user guide, you'll learn:** +- Quick start with `LiveLoadBalancer` for local development +- HTTP routing with `method` and `path` parameters +- Building and deploying load-balanced endpoints +- Complete working examples +- Troubleshooting common issues + +**In the runtime architecture guide, you'll learn:** +- Deployment architecture and container setup +- Request flow for both direct HTTP and @remote calls +- Dual endpoint model (/execute vs user routes) +- Security considerations +- Performance characteristics and monitoring + +## Usage + +### Basic Provisioning + +```python +from tetra_rp import LoadBalancerSlsResource + +# Create a load-balanced endpoint +mothership = LoadBalancerSlsResource( + name="mothership", + imageName="my-mothership-app:latest", + workersMin=1, + workersMax=3, + env={ + "FLASH_APP": "my_app", + "LOG_LEVEL": "INFO", + } +) + +# Deploy endpoint +deployed = await mothership.deploy() + +# Endpoint is now deployed and healthy +print(f"Endpoint ID: {deployed.id}") +print(f"Endpoint URL: {deployed.endpoint_url}") +``` + +### Configuration Options + +```python +LoadBalancerSlsResource( + # Required fields + name="my-endpoint", + imageName="my-image:latest", + + # Worker scaling (for LB, these control max concurrent requests) + workersMin=1, # Min number of workers to keep warm + workersMax=5, # Max workers to spin up + scalerValue=10, # Target concurrent requests per worker + + # Environment configuration + env={ + "ENV_VAR": "value", + }, + + # Network and storage + networkVolume=NetworkVolume(...), # Optional: persistent storage + + # Deployment location + datacenter=DataCenter.EU_RO_1, # Or US_EAST_1, etc + + # Timeouts + executionTimeoutMs=600000, # 10 minute timeout + idleTimeout=5, # Seconds before scaling down +) +``` + +### Health Checks + +```python +# Synchronous health check (for compatibility with RunPod SDK) +is_healthy = endpoint.is_deployed() + +# Asynchronous health check (for deployment flow) +is_healthy = await endpoint.is_deployed_async() + +# Direct ping check (for debugging) +is_responding = await endpoint._check_ping_endpoint() + +# Health check polling with custom parameters +healthy = await endpoint._wait_for_health( + max_retries=20, + retry_interval=3, # seconds +) +``` + +## Validation and Error Handling + +### Configuration Validation + +The resource validates LB-specific constraints at creation and deployment time: + +```python +# This will fail at validation time +try: + bad_endpoint = LoadBalancerSlsResource( + name="test", + imageName="test:latest", + scalerType=ServerlessScalerType.QUEUE_DELAY, # Not allowed for LB! + ) + await bad_endpoint.deploy() +except ValueError as e: + # Error: LoadBalancerSlsResource requires REQUEST_COUNT scaler, + # not QUEUE_DELAY. Load-balanced endpoints don't support queue-based scaling. 
+ print(f"Validation failed: {e}") +``` + +### Deployment Errors + +```python +try: + endpoint = LoadBalancerSlsResource( + name="mothership", + imageName="my-image:latest", + ) + deployed = await endpoint.deploy() +except TimeoutError as e: + # Health check failed after max retries + # Error: LB endpoint mothership (endpoint-id) failed to become + # healthy within 60s + print(f"Deployment failed: {e}") +except ValueError as e: + # RunPod API error or configuration issue + print(f"Deployment error: {e}") +``` + +## Type Safety + +`LoadBalancerSlsResource` enforces LB type at the class level: + +```python +# Type is always LB, cannot be changed +endpoint = LoadBalancerSlsResource( + name="test", + imageName="image", + type=ServerlessType.QB, # This gets overridden! +) + +assert endpoint.type == ServerlessType.LB # Always LB +``` + +## Performance Characteristics + +### Deployment Timeline + +| Phase | Duration | Notes | +|-------|----------|-------| +| API call | < 1s | RunPod endpoint creation | +| Worker initialization | 30-60s | Endpoint starts up | +| Health check polling | 5-50s | Depends on app startup time (10 retries × 5s = 50s max) | +| **Total** | **35-110s** | Typical: 60-90s | + +### Health Check Polling + +``` +Attempt 1: GET /ping → No response (endpoint starting) + Wait 5s +Attempt 2: GET /ping → 204 No Content (initializing) + Wait 5s +Attempt 3: GET /ping → 200 OK (healthy) ✓ + Deployment complete +``` + +Default configuration: +- Max retries: 10 +- Retry interval: 5 seconds +- Timeout per request: 5 seconds +- Total timeout: ~50 seconds + +## Comparison with Standard Endpoints + +| Feature | Queue-Based (QB) | Load-Balanced (LB) | +|---------|------------------|-------------------| +| Request model | Sequential queue | Direct HTTP routing | +| Retries | Automatic | Manual (client) | +| Latency | Higher (queuing) | Lower (direct) | +| Custom endpoints | Limited | Full HTTP support | +| Scalability | Per-function | Per-worker | +| Health checks | RunPod SDK | `/ping` endpoint | +| Use cases | Batch processing | APIs, webhooks, real-time | +| Suitable for | Workers | Mothership, services | + +## Implementation Details + +### Code Structure + +``` +LoadBalancerSlsResource (class) +├── __init__(...) 
+│ └── Enforce type=LB, scalerType=REQUEST_COUNT +├── _validate_lb_configuration() +│ └── Check scaler type, type field +├── is_deployed_async() +│ ├── Check endpoint ID +│ └── Call _check_ping_endpoint() +├── _check_ping_endpoint() +│ ├── GET /ping endpoint +│ └── Check status 200 or 204 +├── _wait_for_health(max_retries, retry_interval) +│ ├── Loop polling +│ ├── Exponential backoff +│ └── Return after success or timeout +├── _do_deploy() +│ ├── Call _validate_lb_configuration() +│ ├── Call parent _do_deploy() +│ ├── Call _wait_for_health() +│ └── Return deployed resource or raise TimeoutError +└── is_deployed() + └── Sync wrapper using RunPod SDK +``` + +### Thread Safety + +- `is_deployed()` is thread-safe (uses RunPod SDK) +- Async methods are safe for concurrent use +- Health check polling handles multiple concurrent calls + +## Troubleshooting + +### Health Check Timeout + +**Problem**: Deployment times out at health check step + +**Causes**: +- Endpoint failed to start (wrong image, runtime error) +- `/ping` endpoint not implemented +- `/ping` endpoint not responding within timeout +- Firewall/network blocking requests + +**Solution**: +- Verify image exists and runs correctly: `docker run my-image:latest` +- Implement `/ping` endpoint that returns 200 OK +- Check logs: `runpod-cli logs ` +- Increase timeout: `await endpoint._wait_for_health(max_retries=20)` + +### Configuration Validation Errors + +**Problem**: `ValueError: LoadBalancerSlsResource requires REQUEST_COUNT scaler` + +**Cause**: Scaler type set to QUEUE_DELAY + +**Solution**: +```python +# Remove scalerType specification (defaults to REQUEST_COUNT) +endpoint = LoadBalancerSlsResource( + name="test", + imageName="image", + # scalerType NOT specified, defaults to REQUEST_COUNT +) +``` + +### API Errors (401, 403, 429) + +**Problem**: RunPod GraphQL errors during deployment + +**Causes**: +- Missing or invalid RUNPOD_API_KEY +- Insufficient permissions +- Rate limiting + +**Solution**: +- Verify API key: `echo $RUNPOD_API_KEY` +- Check RunPod dashboard permissions +- Retry after delay for rate limits + +## Next Steps + +- **Mothership integration**: Use LoadBalancerSlsResource for Mothership endpoints +- **Service discovery**: Implement `/directory` endpoint for child endpoint discovery +- **Auto-provisioning**: Automatic child endpoint deployment on Mothership startup +- **Cross-endpoint routing**: Route requests between endpoints using service discovery diff --git a/docs/Using_Remote_With_LoadBalancer.md b/docs/Using_Remote_With_LoadBalancer.md new file mode 100644 index 00000000..952b805a --- /dev/null +++ b/docs/Using_Remote_With_LoadBalancer.md @@ -0,0 +1,548 @@ +# Using @remote with Load-Balanced Endpoints + +## Introduction + +Flash provides two ways to execute remote functions on serverless endpoints: queue-based (QB) and load-balanced (LB) endpoints. This guide covers using the `@remote` decorator with load-balanced endpoints for HTTP-based function execution. 
+ +### Queue-Based vs Load-Balanced Endpoints + +**Queue-Based Endpoints** (ServerlessEndpoint, LiveServerless) +- Requests queued and processed sequentially +- Automatic retry logic on failure +- Built-in fault tolerance +- Higher latency (queuing + processing) +- Fixed request/response format + +**Load-Balanced Endpoints** (LoadBalancerSlsResource, LiveLoadBalancer) +- Requests routed directly to available workers +- Direct HTTP execution, no queue +- No automatic retries +- Lower latency (direct HTTP) +- Custom HTTP routes and methods + +### When to Use Each Type + +Use **Load-Balanced** when you need: +- Low latency API endpoints +- Custom HTTP routing (GET, POST, PUT, DELETE) +- Direct HTTP response handling +- Handling multiple routes on single endpoint + +Use **Queue-Based** when you need: +- Automatic retry logic on failures +- Sequential, fault-tolerant processing +- Tolerance for higher latency +- Simple request/response pattern + +## Quick Start + +### Basic Example with LiveLoadBalancer + +For local development, use `LiveLoadBalancer`: + +```python +from tetra_rp import LiveLoadBalancer, remote + +# Create load-balanced endpoint +api = LiveLoadBalancer(name="example-api") + +# Define HTTP-routed function +@remote(api, method="POST", path="/api/greet") +async def greet_user(name: str): + return {"message": f"Hello, {name}!"} + +# Call the function locally +async def main(): + result = await greet_user("Alice") + print(result) # {"message": "Hello, Alice!"} + +if __name__ == "__main__": + import asyncio + asyncio.run(main()) +``` + +Key points: +- `method` parameter specifies HTTP method (GET, POST, PUT, DELETE, PATCH) +- `path` parameter specifies URL route (must start with `/`) +- Functions execute directly without deployment during development + +## HTTP Routing + +Load-balanced endpoints require explicit HTTP routing metadata in the `@remote` decorator. 
+ +### Parameters + +**method** (required for LoadBalancerSlsResource) +- Must be one of: GET, POST, PUT, DELETE, PATCH +- Case-insensitive (POST, post, Post all work) + +**path** (required for LoadBalancerSlsResource) +- Must start with `/` (e.g., `/api/process`, `/health`) +- Can include path parameters (e.g., `/api/users/{user_id}`) +- Cannot use reserved paths: `/execute`, `/ping` + +### Single Endpoint with Multiple Routes + +Multiple functions can share a single LoadBalancerSlsResource with different routes: + +```python +from tetra_rp import LiveLoadBalancer, remote + +api = LiveLoadBalancer(name="user-service") + +@remote(api, method="GET", path="/users") +def list_users(): + return {"users": []} + +@remote(api, method="POST", path="/users") +async def create_user(name: str, email: str): + return {"id": 1, "name": name, "email": email} + +@remote(api, method="GET", path="/users/{user_id}") +def get_user(user_id: int): + return {"id": user_id, "name": "Alice"} + +@remote(api, method="DELETE", path="/users/{user_id}") +async def delete_user(user_id: int): + return {"deleted": True} +``` + +When deployed: +- Single `user-service` endpoint created +- Four HTTP routes registered automatically +- FastAPI handles routing to correct function + +### Reserved Paths + +The following paths are reserved by Flash and cannot be used as user-defined routes: + +- `/ping` - Health check endpoint (required, returns 200 OK) + +**Important Security Note:** +- `/execute` - Framework endpoint for @remote stub execution (**only available with LiveLoadBalancer for local development**) + - Deployed `LoadBalancerSlsResource` endpoints **deliberately do NOT expose `/execute`** for security reasons + - The `/execute` endpoint accepts and executes arbitrary Python code - exposing it would allow remote code execution + - When using deployed endpoints, @remote calls are safely translated to HTTP requests to your user-defined routes + - Never manually add `/execute` to deployed endpoints + +Attempting to use `/ping` or `/execute` as user-defined routes will raise a validation error at build time. 
+ +## Local Development + +### Using LiveLoadBalancer + +For local development and testing, use `LiveLoadBalancer` instead of `LoadBalancerSlsResource`: + +```python +from tetra_rp import LiveLoadBalancer, remote + +api = LiveLoadBalancer(name="my-api") + +@remote(api, method="POST", path="/api/process") +async def process_data(x: int, y: int): + return {"result": x + y} + +# In tests or scripts, call directly +async def test(): + result = await process_data(5, 3) + assert result == {"result": 8} +``` + +**Key differences:** +- `LiveLoadBalancer` locks image to Tetra LB runtime (tetra-rp-lb) +- Functions execute directly without deployment +- Ideal for development and CI/CD testing +- Same `@remote` decorator interface as production + +### Testing Patterns + +```python +import pytest +from tetra_rp import LiveLoadBalancer, remote + +api = LiveLoadBalancer(name="test-api") + +@remote(api, method="POST", path="/api/calculate") +async def calculate(operation: str, a: int, b: int): + if operation == "add": + return a + b + elif operation == "multiply": + return a * b + else: + raise ValueError(f"Unknown operation: {operation}") + +@pytest.mark.asyncio +async def test_calculate_add(): + result = await calculate("add", 5, 3) + assert result == 8 + +@pytest.mark.asyncio +async def test_calculate_multiply(): + result = await calculate("multiply", 5, 3) + assert result == 15 + +@pytest.mark.asyncio +async def test_calculate_invalid(): + with pytest.raises(ValueError): + await calculate("unknown", 5, 3) +``` + +## Local vs Deployed Execution + +The behavior of `@remote` decorated functions differs between local development and deployed endpoints: + +### Local Development (LiveLoadBalancer) + +When using `LiveLoadBalancer` for local testing: +- Functions decorated with `@remote` serialize their code and POST to `/execute` endpoint +- The `/execute` endpoint accepts and executes the serialized function code +- Useful for development and CI/CD testing before deployment + +### Deployed Endpoints (LoadBalancerSlsResource) + +When deployed to production: +- Generated handlers do NOT expose `/execute` endpoint (security) +- Functions decorated with `@remote` are called via HTTP requests to their user-defined routes +- The stub automatically translates `@remote` calls into HTTP requests with mapped parameters +- Example: `await process_data(5, 3)` becomes `POST /api/process {"x": 5, "y": 3}` + +### Migration from Local to Deployed + +When migrating code from local testing to production: +- **No code changes needed** - `@remote` decorated functions work the same way +- The stub automatically detects whether it's `LiveLoadBalancer` (local) or `LoadBalancerSlsResource` (deployed) +- User-defined routes must be compatible with JSON serialization for parameters + +**Parameter Type Constraints on Deployed Endpoints:** + +When using deployed `LoadBalancerSlsResource` endpoints, function parameters are serialized to JSON in the HTTP request body. This means: + +**Supported types:** +- Primitive types: `int`, `str`, `bool`, `float` +- Collections: `list`, `dict`, `tuple`, `set` +- Nested structures: `list[dict[str, int]]`, etc. 
+- Optional types: `Optional[str]`, `Optional[int]` +- Special: `None` + +**Unsupported types:** +- Custom classes and dataclasses +- Request objects (FastAPI Request, Starlette Request) +- File/binary objects +- Complex Python objects that can't serialize to JSON +- Datetime objects (without custom serialization) + +**Example of parameter mapping:** + +```python +# Local call: +result = await process_data(5, "hello", [1, 2, 3]) + +# Gets translated to deployed endpoint call: +POST /api/process +{ + "x": 5, + "name": "hello", + "items": [1, 2, 3] +} +``` + +If you need to use complex types (e.g., File uploads, custom objects), use direct HTTP calls instead of the `@remote` decorator for deployed endpoints. For local development with `LiveLoadBalancer`, complex types work because the entire function is serialized and executed. + +## Building and Deploying + +### Build Process + +When you run `flash build`, the system: + +1. **Scans** your code for `@remote` decorated functions +2. **Extracts** HTTP routing metadata (method, path) +3. **Generates** FastAPI application with routes +4. **Creates** one handler file per LoadBalancerSlsResource +5. **Validates** routes for conflicts and reserved paths + +Example generated handler: + +```python +from fastapi import FastAPI +from tetra_rp.runtime.lb_handler import create_lb_handler + +# Imported from user code +from api.endpoints import process_data, health_check + +# Route registry built automatically +ROUTE_REGISTRY = { + ("POST", "/api/process"): process_data, + ("GET", "/api/health"): health_check, +} + +# FastAPI app created with routes +app = create_lb_handler(ROUTE_REGISTRY) + +if __name__ == "__main__": + import uvicorn + uvicorn.run(app, host="0.0.0.0", port=8000) +``` + +### Deployment Workflow + +```bash +# 1. Define functions with @remote decorator in your code +# 2. Test locally with LiveLoadBalancer +# 3. Build for production +flash build + +# 4. Configure your endpoint (optional) +# Edit flash.toml if needed to set image, GPU, etc. + +# 5. Deploy +flash deploy + +# 6. 
Check deployment status +flash status +``` + +### Verifying Deployment + +Once deployed, verify your endpoint: + +```bash +# Check endpoint is healthy +curl https:///ping +# Expected response: {"status": "healthy"} + +# Call your function via HTTP +curl -X POST https:///api/process \ + -H "Content-Type: application/json" \ + -d '{"x": 5, "y": 3}' +``` + +## Complete Working Example + +Here's a full example with multiple routes, error handling, and testing: + +```python +""" +user_service.py - Example load-balanced API service +""" + +from tetra_rp import LoadBalancerSlsResource, remote +from typing import Optional + +# For production, use LoadBalancerSlsResource +# For local development, use LiveLoadBalancer +api = LoadBalancerSlsResource( + name="user-service", + imageName="runpod/tetra-rp-lb:latest" +) + +class UserNotFound(Exception): + pass + +# In-memory database for example +users_db = { + 1: {"id": 1, "name": "Alice", "email": "alice@example.com"}, + 2: {"id": 2, "name": "Bob", "email": "bob@example.com"}, +} + +@remote(api, method="GET", path="/health") +def health_check(): + """Health check endpoint.""" + return {"status": "healthy"} + +@remote(api, method="GET", path="/users") +def list_users(): + """List all users.""" + return {"users": list(users_db.values())} + +@remote(api, method="POST", path="/users") +async def create_user(name: str, email: str): + """Create a new user.""" + user_id = max(users_db.keys() or [0]) + 1 + user = {"id": user_id, "name": name, "email": email} + users_db[user_id] = user + return user + +@remote(api, method="GET", path="/users/{user_id}") +def get_user(user_id: int): + """Get a specific user.""" + if user_id not in users_db: + raise UserNotFound(f"User {user_id} not found") + return users_db[user_id] + +@remote(api, method="PUT", path="/users/{user_id}") +async def update_user(user_id: int, name: Optional[str] = None, + email: Optional[str] = None): + """Update a user.""" + if user_id not in users_db: + raise UserNotFound(f"User {user_id} not found") + + user = users_db[user_id] + if name is not None: + user["name"] = name + if email is not None: + user["email"] = email + return user + +@remote(api, method="DELETE", path="/users/{user_id}") +async def delete_user(user_id: int): + """Delete a user.""" + if user_id not in users_db: + raise UserNotFound(f"User {user_id} not found") + + del users_db[user_id] + return {"deleted": True} +``` + +### Testing the Example + +```python +""" +test_user_service.py +""" + +import pytest +from tetra_rp import LiveLoadBalancer, remote +from typing import Optional + +# Use LiveLoadBalancer for testing +api = LiveLoadBalancer(name="user-service-test") + +# Define functions (same as above but use test endpoint) +# ... (function definitions) ... 
+ +@pytest.mark.asyncio +async def test_list_users(): + users = list_users() + assert "users" in users + assert isinstance(users["users"], list) + +@pytest.mark.asyncio +async def test_create_and_get_user(): + # Create a user + new_user = await create_user("Charlie", "charlie@example.com") + assert new_user["name"] == "Charlie" + assert new_user["id"] > 0 + + # Get the user + user = get_user(new_user["id"]) + assert user["name"] == "Charlie" + +@pytest.mark.asyncio +async def test_update_user(): + new_user = await create_user("Diana", "diana@example.com") + updated = await update_user(new_user["id"], name="Diana Updated") + assert updated["name"] == "Diana Updated" + +@pytest.mark.asyncio +async def test_delete_user(): + new_user = await create_user("Eve", "eve@example.com") + result = await delete_user(new_user["id"]) + assert result["deleted"] is True + + # Should raise error when trying to get deleted user + with pytest.raises(Exception): # UserNotFound + get_user(new_user["id"]) +``` + +## Troubleshooting + +### Validation Errors + +**"requires both 'method' and 'path'"** +- Problem: Using `@remote(lb_resource)` without method/path +- Solution: Add both parameters: `@remote(lb, method="POST", path="/api/endpoint")` + +**"Invalid HTTP method 'PATCH' must be one of: GET, POST, PUT, DELETE, PATCH"** +- Problem: Typo in HTTP method (e.g., `PTACH` instead of `PATCH`) +- Solution: Verify method spelling matches valid HTTP verbs + +**"path must start with '/'"** +- Problem: Path doesn't start with forward slash +- Solution: Use absolute paths: `/api/endpoint` not `api/endpoint` + +**"Route conflict detected: POST /api/process defined twice"** +- Problem: Two functions with same method and path on same endpoint +- Solution: Change path or method to make each route unique + +### Runtime Errors + +**"Endpoint URL not available - endpoint may not be deployed"** +- Problem: Using LoadBalancerSlsResource before calling `await resource.deploy()` +- Solution: Deploy the endpoint first (`await resource.deploy()`) which auto-populates endpoint_url, or use LiveLoadBalancer for local testing +- Note: endpoint_url is auto-generated by RunPod after deployment and cannot be manually specified + +**"HTTP error from endpoint: 500"** +- Problem: Function raised an error during execution +- Solution: Check function code for exceptions, view endpoint logs + +**"Execution timeout on user-service after 30s"** +- Problem: Function took longer than 30 seconds to complete +- Solution: Optimize function, consider increasing timeout in LoadBalancerSlsStub + +**"404 Not Found" or "404 error" when calling @remote on deployed endpoint** +- Problem: Function decorated with @remote but missing `method` and/or `path` parameters +- Solution: Always provide complete routing metadata: `@remote(api, method="POST", path="/api/endpoint")` +- Note: On `LoadBalancerSlsResource`, the stub will try to use the non-existent `/execute` endpoint if routing metadata is missing + +**"JSON serialization error" or "unexpected keyword argument" on deployed endpoint** +- Problem: Deployed endpoint receiving malformed parameters from @remote call +- Solution: This should not happen automatically (stub handles parameter mapping). 
Check: + - Function parameters are JSON-serializable (not custom classes or Request objects) + - Function signature matches expected parameter names + - For complex types, make direct HTTP calls instead of using @remote + +### Build Errors + +**"Cannot import module 'user_service'"** +- Problem: Function module not found during handler generation +- Solution: Ensure module is in Python path, check import statements + +**"Function 'process_data' not found in executed code"** +- Problem: Function source extraction failed +- Solution: Ensure function is defined at module level (not inside another function) + +## API Reference + +### @remote Decorator with LoadBalancerSlsResource + +```python +@remote( + resource_config: LoadBalancerSlsResource | LiveLoadBalancer, + method: str = None, # Required: GET, POST, PUT, DELETE, PATCH + path: str = None, # Required: /api/route + dependencies: List[str] = None, # Python packages to install + system_dependencies: List[str] = None, # System packages to install + accelerate_downloads: bool = True # Use download acceleration +) +def your_function(...): + pass +``` + +### LoadBalancerSlsResource + +See `docs/Load_Balancer_Endpoints.md` for detailed architecture and configuration options. + +### LiveLoadBalancer + +A test/development variant of LoadBalancerSlsResource: +- Locks to Tetra LB image +- Enables direct function calls without deployment +- Same decorator interface as production + +## Best Practices + +1. **Use LiveLoadBalancer for testing** - No deployment needed for development +2. **Test locally before deploying** - Catch routing/logic errors early +3. **Use descriptive paths** - `/api/users/{user_id}` is clearer than `/api/u` +4. **Group related routes** - Keep similar endpoints on same service +5. **Handle errors gracefully** - Return meaningful error messages to clients +6. **Verify health checks** - Ensure `/ping` endpoint works after deployment +7. 
**Document your API** - Add docstrings explaining what each route does + +## Next Steps + +- Review `docs/Load_Balancer_Endpoints.md` for LoadBalancerSlsResource class architecture +- Review `docs/LoadBalancer_Runtime_Architecture.md` for runtime execution and request flows +- Check examples in `flash-examples/` repository for more patterns +- Use `flash build --help` to see build options +- Use `flash run --help` to see local testing options diff --git a/src/tetra_rp/__init__.py b/src/tetra_rp/__init__.py index f7f21130..e72b89d7 100644 --- a/src/tetra_rp/__init__.py +++ b/src/tetra_rp/__init__.py @@ -15,12 +15,16 @@ from .client import remote from .core.resources import ( CpuInstanceType, + CpuLiveLoadBalancer, CpuLiveServerless, + CpuLoadBalancerSlsResource, CpuServerlessEndpoint, CudaVersion, DataCenter, GpuGroup, + LiveLoadBalancer, LiveServerless, + LoadBalancerSlsResource, NetworkVolume, PodTemplate, ResourceManager, @@ -36,47 +40,59 @@ def __getattr__(name): return remote elif name in ( - "CpuServerlessEndpoint", "CpuInstanceType", + "CpuLiveLoadBalancer", "CpuLiveServerless", + "CpuLoadBalancerSlsResource", + "CpuServerlessEndpoint", "CudaVersion", "DataCenter", "GpuGroup", + "LiveLoadBalancer", "LiveServerless", + "LoadBalancerSlsResource", + "NetworkVolume", "PodTemplate", "ResourceManager", "ServerlessEndpoint", "ServerlessType", - "NetworkVolume", ): from .core.resources import ( - CpuServerlessEndpoint, CpuInstanceType, + CpuLiveLoadBalancer, CpuLiveServerless, + CpuLoadBalancerSlsResource, + CpuServerlessEndpoint, CudaVersion, DataCenter, GpuGroup, + LiveLoadBalancer, LiveServerless, + LoadBalancerSlsResource, + NetworkVolume, PodTemplate, ResourceManager, ServerlessEndpoint, ServerlessType, - NetworkVolume, ) attrs = { - "CpuServerlessEndpoint": CpuServerlessEndpoint, "CpuInstanceType": CpuInstanceType, + "CpuLiveLoadBalancer": CpuLiveLoadBalancer, "CpuLiveServerless": CpuLiveServerless, + "CpuLoadBalancerSlsResource": CpuLoadBalancerSlsResource, + "CpuServerlessEndpoint": CpuServerlessEndpoint, "CudaVersion": CudaVersion, "DataCenter": DataCenter, "GpuGroup": GpuGroup, + "LiveLoadBalancer": LiveLoadBalancer, "LiveServerless": LiveServerless, + "LoadBalancerSlsResource": LoadBalancerSlsResource, + "NetworkVolume": NetworkVolume, "PodTemplate": PodTemplate, "ResourceManager": ResourceManager, "ServerlessEndpoint": ServerlessEndpoint, "ServerlessType": ServerlessType, - "NetworkVolume": NetworkVolume, } return attrs[name] raise AttributeError(f"module {__name__!r} has no attribute {name!r}") @@ -84,16 +100,20 @@ def __getattr__(name): __all__ = [ "remote", - "CpuServerlessEndpoint", "CpuInstanceType", + "CpuLiveLoadBalancer", "CpuLiveServerless", + "CpuLoadBalancerSlsResource", + "CpuServerlessEndpoint", "CudaVersion", "DataCenter", "GpuGroup", + "LiveLoadBalancer", "LiveServerless", + "LoadBalancerSlsResource", + "NetworkVolume", "PodTemplate", "ResourceManager", "ServerlessEndpoint", "ServerlessType", - "NetworkVolume", ] diff --git a/src/tetra_rp/cli/commands/build.py b/src/tetra_rp/cli/commands/build.py index b8d909d2..e4a53075 100644 --- a/src/tetra_rp/cli/commands/build.py +++ b/src/tetra_rp/cli/commands/build.py @@ -17,6 +17,7 @@ from ..utils.ignore import get_file_tree, load_ignore_patterns from .build_utils.handler_generator import HandlerGenerator +from .build_utils.lb_handler_generator import LBHandlerGenerator from .build_utils.manifest import ManifestBuilder from .build_utils.scanner import RemoteDecoratorScanner @@ -115,9 +116,30 @@ def build_command( 
manifest_path = build_dir / "flash_manifest.json" manifest_path.write_text(json.dumps(manifest, indent=2)) - # Generate handler files - handler_gen = HandlerGenerator(manifest, build_dir) - handler_paths = handler_gen.generate_handlers() + # Generate handler files based on resource type + handler_paths = [] + + # Separate resources by type + lb_resources = { + name: data + for name, data in manifest.get("resources", {}).items() + if data.get("resource_type") == "LoadBalancerSlsResource" + } + qb_resources = { + name: data + for name, data in manifest.get("resources", {}).items() + if data.get("resource_type") != "LoadBalancerSlsResource" + } + + # Generate LB handlers + if lb_resources: + lb_gen = LBHandlerGenerator(manifest, build_dir) + handler_paths.extend(lb_gen.generate_handlers()) + + # Generate QB handlers + if qb_resources: + qb_gen = HandlerGenerator(manifest, build_dir) + handler_paths.extend(qb_gen.generate_handlers()) progress.update( manifest_task, diff --git a/src/tetra_rp/cli/commands/build_utils/handler_generator.py b/src/tetra_rp/cli/commands/build_utils/handler_generator.py index f019d2d2..c1e5d616 100644 --- a/src/tetra_rp/cli/commands/build_utils/handler_generator.py +++ b/src/tetra_rp/cli/commands/build_utils/handler_generator.py @@ -43,10 +43,14 @@ def __init__(self, manifest: Dict[str, Any], build_dir: Path): self.build_dir = build_dir def generate_handlers(self) -> List[Path]: - """Generate all handler files.""" + """Generate all handler files for queue-based (non-LB) resources.""" handler_paths = [] for resource_name, resource_data in self.manifest.get("resources", {}).items(): + # Skip load-balanced resources (handled by LBHandlerGenerator) + if resource_data.get("resource_type") == "LoadBalancerSlsResource": + continue + handler_path = self._generate_handler(resource_name, resource_data) handler_paths.append(handler_path) diff --git a/src/tetra_rp/cli/commands/build_utils/lb_handler_generator.py b/src/tetra_rp/cli/commands/build_utils/lb_handler_generator.py new file mode 100644 index 00000000..c7bf0ba7 --- /dev/null +++ b/src/tetra_rp/cli/commands/build_utils/lb_handler_generator.py @@ -0,0 +1,200 @@ +"""Generator for FastAPI handlers for LoadBalancerSlsResource endpoints.""" + +import importlib.util +import logging +from pathlib import Path +from typing import Any, Dict, List + +logger = logging.getLogger(__name__) + +LB_HANDLER_TEMPLATE = '''""" +Auto-generated FastAPI handler for LoadBalancerSlsResource: {resource_name} +Generated at: {timestamp} + +This file is generated by the Flash build process. Do not edit manually. + +Load-balanced endpoints expose HTTP servers directly to clients, enabling: +- REST APIs with custom HTTP routing +- WebSocket servers +- Real-time communication patterns +""" + +from fastapi import FastAPI, Request +from tetra_rp.runtime.lb_handler import create_lb_handler + +# Import all functions/classes that belong to this resource +{imports} + +# Route registry: (method, path) -> function +ROUTE_REGISTRY = {{ +{registry} +}} + +# Create FastAPI app with routes +# Note: include_execute={include_execute} for this endpoint type +# - LiveLoadBalancer (local): include_execute=True for /execute endpoint +# - LoadBalancerSlsResource (deployed): include_execute=False (security) +app = create_lb_handler(ROUTE_REGISTRY, include_execute={include_execute}) + + +# Health check endpoint (required for RunPod load-balancer endpoints) +@app.get("/ping") +def ping(): + """Health check endpoint for RunPod load-balancer. 
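+    RunPod's load balancer polls this route to decide whether the worker can
+    receive traffic; a 200 or 204 response is treated as healthy.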
+ + Returns: + dict: Status response + """ + return {{"status": "healthy"}} + + +if __name__ == "__main__": + import uvicorn + # Local development server for testing + uvicorn.run(app, host="0.0.0.0", port=8000) +''' + + +class LBHandlerGenerator: + """Generates FastAPI handlers for LoadBalancerSlsResource endpoints.""" + + def __init__(self, manifest: Dict[str, Any], build_dir: Path): + self.manifest = manifest + self.build_dir = build_dir + + def generate_handlers(self) -> List[Path]: + """Generate all LB handler files.""" + handler_paths = [] + + for resource_name, resource_data in self.manifest.get("resources", {}).items(): + # Generate for both LiveLoadBalancer (local dev) and LoadBalancerSlsResource (deployed) + resource_type = resource_data.get("resource_type") + if resource_type not in ["LoadBalancerSlsResource", "LiveLoadBalancer"]: + continue + + handler_path = self._generate_handler(resource_name, resource_data) + handler_paths.append(handler_path) + + return handler_paths + + def _generate_handler( + self, resource_name: str, resource_data: Dict[str, Any] + ) -> Path: + """Generate a single FastAPI handler file.""" + handler_filename = f"handler_{resource_name}.py" + handler_path = self.build_dir / handler_filename + + # Get timestamp from manifest + timestamp = self.manifest.get("generated_at", "") + + # Determine if /execute endpoint should be included + # LiveLoadBalancer (local dev) includes /execute, deployed LoadBalancerSlsResource does not + resource_type = resource_data.get("resource_type", "LoadBalancerSlsResource") + include_execute = resource_type == "LiveLoadBalancer" + + # Generate imports section + imports = self._generate_imports(resource_data.get("functions", [])) + + # Generate route registry + registry = self._generate_route_registry(resource_data.get("functions", [])) + + # Format template + handler_code = LB_HANDLER_TEMPLATE.format( + resource_name=resource_name, + timestamp=timestamp, + imports=imports, + registry=registry, + include_execute=str(include_execute), + ) + + handler_path.write_text(handler_code) + + # Validate that generated handler can be imported + self._validate_handler_imports(handler_path) + + return handler_path + + def _generate_imports(self, functions: List[Dict[str, Any]]) -> str: + """Generate import statements for functions. + + Args: + functions: List of function metadata dicts + + Returns: + Import statements as string + """ + imports = [] + + for func in functions: + module = func.get("module") + name = func.get("name") + + if module and name: + imports.append(f"from {module} import {name}") + + return "\n".join(imports) if imports else "# No functions to import" + + def _generate_route_registry(self, functions: List[Dict[str, Any]]) -> str: + """Generate route registry for FastAPI app. + + Creates mapping of (method, path) tuples to function names. + + Args: + functions: List of function metadata dicts with http_method and http_path + + Returns: + Registry dictionary as string + """ + if not functions: + return " # No functions registered" + + registry_lines = [] + + for func in functions: + name = func.get("name") + method = func.get("http_method") + path = func.get("http_path") + + if name and method and path: + # Create tuple key: ("GET", "/api/process") + registry_lines.append(f' ("{method}", "{path}"): {name},') + elif name: + # Skip if method or path missing (shouldn't happen with validation) + logger.warning( + f"Function '{name}' missing http_method or http_path. Skipping." 
+ ) + + return ( + "\n".join(registry_lines) + if registry_lines + else " # No routes registered" + ) + + def _validate_handler_imports(self, handler_path: Path) -> None: + """Validate that generated handler has valid Python syntax. + + Attempts to load the handler module to catch syntax errors. + ImportErrors for missing worker modules are logged but not fatal, + as those imports may not be available at build time. + + Args: + handler_path: Path to generated handler file + + Raises: + ValueError: If handler has syntax errors or cannot be parsed + """ + try: + spec = importlib.util.spec_from_file_location("handler", handler_path) + if spec and spec.loader: + module = importlib.util.module_from_spec(spec) + spec.loader.exec_module(module) + else: + raise ValueError("Failed to create module spec") + except SyntaxError as e: + raise ValueError(f"Handler has syntax errors: {e}") from e + except ImportError as e: + # Log but don't fail - imports might not be available at build time + logger.debug(f"Handler import validation: {e}") + except Exception as e: + # Only raise for truly unexpected errors + logger.warning(f"Handler validation warning: {e}") diff --git a/src/tetra_rp/cli/commands/build_utils/manifest.py b/src/tetra_rp/cli/commands/build_utils/manifest.py index 6df594d6..2664150f 100644 --- a/src/tetra_rp/cli/commands/build_utils/manifest.py +++ b/src/tetra_rp/cli/commands/build_utils/manifest.py @@ -2,12 +2,14 @@ import json from dataclasses import dataclass -from datetime import datetime +from datetime import datetime, timezone from pathlib import Path -from typing import Any, Dict, List +from typing import Any, Dict, List, Optional from .scanner import RemoteFunctionMetadata +RESERVED_PATHS = ["/execute", "/ping"] + @dataclass class ManifestFunction: @@ -17,6 +19,8 @@ class ManifestFunction: module: str is_async: bool is_class: bool + http_method: Optional[str] = None # HTTP method for LB endpoints (GET, POST, etc.) + http_path: Optional[str] = None # HTTP path for LB endpoints (/api/process) @dataclass @@ -50,31 +54,75 @@ def build(self) -> Dict[str, Any]: # Build manifest structure resources_dict: Dict[str, Dict[str, Any]] = {} function_registry: Dict[str, str] = {} + routes_dict: Dict[ + str, Dict[str, str] + ] = {} # resource_name -> {route_key -> function_name} for resource_name, functions in sorted(resources.items()): handler_file = f"handler_{resource_name}.py" + # Use actual resource type from first function in group + resource_type = ( + functions[0].resource_type if functions else "LiveServerless" + ) + + # Validate and collect routing for LB endpoints + resource_routes = {} + is_load_balanced = resource_type in [ + "LoadBalancerSlsResource", + "LiveLoadBalancer", + ] + if is_load_balanced: + for f in functions: + if not f.http_method or not f.http_path: + raise ValueError( + f"{resource_type} endpoint '{resource_name}' requires " + f"method and path for function '{f.function_name}'. " + f"Got method={f.http_method}, path={f.http_path}" + ) + + # Check for route conflicts (same method + path) + route_key = f"{f.http_method} {f.http_path}" + if route_key in resource_routes: + raise ValueError( + f"Duplicate route '{route_key}' in resource '{resource_name}': " + f"both '{resource_routes[route_key]}' and '{f.function_name}' " + f"are mapped to the same route" + ) + resource_routes[route_key] = f.function_name + + # Check for reserved paths + if f.http_path in RESERVED_PATHS: + raise ValueError( + f"Function '{f.function_name}' cannot use reserved path '{f.http_path}'. 
" + f"Reserved paths: {', '.join(RESERVED_PATHS)}" + ) + functions_list = [ { "name": f.function_name, "module": f.module_path, "is_async": f.is_async, "is_class": f.is_class, + **( + {"http_method": f.http_method, "http_path": f.http_path} + if is_load_balanced + else {} + ), } for f in functions ] - # Use actual resource type from first function in group - resource_type = ( - functions[0].resource_type if functions else "LiveServerless" - ) - resources_dict[resource_name] = { "resource_type": resource_type, "handler_file": handler_file, "functions": functions_list, } + # Store routes for LB endpoints + if resource_routes: + routes_dict[resource_name] = resource_routes + # Build function registry for quick lookup for f in functions: if f.function_name in function_registry: @@ -84,14 +132,22 @@ def build(self) -> Dict[str, Any]: ) function_registry[f.function_name] = resource_name - return { + manifest = { "version": "1.0", - "generated_at": datetime.utcnow().isoformat() + "Z", + "generated_at": datetime.now(timezone.utc) + .isoformat() + .replace("+00:00", "Z"), "project_name": self.project_name, "resources": resources_dict, "function_registry": function_registry, } + # Add routes section if there are LB endpoints with routing + if routes_dict: + manifest["routes"] = routes_dict + + return manifest + def write_to_file(self, output_path: Path) -> Path: """Write manifest to file.""" manifest = self.build() diff --git a/src/tetra_rp/cli/commands/build_utils/scanner.py b/src/tetra_rp/cli/commands/build_utils/scanner.py index b6f1ecf4..92f80fc2 100644 --- a/src/tetra_rp/cli/commands/build_utils/scanner.py +++ b/src/tetra_rp/cli/commands/build_utils/scanner.py @@ -22,6 +22,8 @@ class RemoteFunctionMetadata: is_async: bool is_class: bool file_path: Path + http_method: Optional[str] = None # HTTP method for LB endpoints: GET, POST, etc. + http_path: Optional[str] = None # HTTP path for LB endpoints: /api/process class RemoteDecoratorScanner: @@ -86,7 +88,7 @@ def _extract_resource_configs(self, tree: ast.AST, py_file: Path) -> None: for node in ast.walk(tree): if isinstance(node, ast.Assign): - # Look for assignments like: gpu_config = LiveServerless(...) + # Look for assignments like: gpu_config = LiveServerless(...) or api = LiveLoadBalancer(...) for target in node.targets: if isinstance(target, ast.Name): variable_name = target.id @@ -134,6 +136,11 @@ def _extract_remote_functions( # Get resource type for this config resource_type = self._get_resource_type(resource_config_name) + # Extract HTTP routing metadata (for LB endpoints) + http_method, http_path = self._extract_http_routing( + remote_decorator + ) + metadata = RemoteFunctionMetadata( function_name=node.name, module_path=module_path, @@ -142,6 +149,8 @@ def _extract_remote_functions( is_async=is_async, is_class=is_class, file_path=py_file, + http_method=http_method, + http_path=http_path, ) functions.append(metadata) @@ -299,3 +308,40 @@ def _get_module_path(self, py_file: Path) -> str: except ValueError: # If relative_to fails, just use filename return py_file.stem + + def _extract_http_routing( + self, decorator: ast.expr + ) -> tuple[Optional[str], Optional[str]]: + """Extract HTTP method and path from @remote decorator. + + Returns: + Tuple of (method, path) or (None, None) if not found. 
+ method: GET, POST, PUT, DELETE, PATCH + path: /api/endpoint routes + + Raises: + ValueError: If method is not a valid HTTP verb + """ + if not isinstance(decorator, ast.Call): + return None, None + + http_method = None + http_path = None + + # Extract keyword arguments: method="POST", path="/api/process" + for keyword in decorator.keywords: + if keyword.arg == "method": + if isinstance(keyword.value, ast.Constant): + http_method = keyword.value.value + elif keyword.arg == "path": + if isinstance(keyword.value, ast.Constant): + http_path = keyword.value.value + + # Validate HTTP method if provided + valid_methods = {"GET", "POST", "PUT", "DELETE", "PATCH"} + if http_method is not None and http_method.upper() not in valid_methods: + raise ValueError( + f"Invalid HTTP method '{http_method}'. Must be one of: {', '.join(valid_methods)}" + ) + + return http_method, http_path diff --git a/src/tetra_rp/client.py b/src/tetra_rp/client.py index 0fa1826f..9dc8a019 100644 --- a/src/tetra_rp/client.py +++ b/src/tetra_rp/client.py @@ -4,7 +4,7 @@ from functools import wraps from typing import List, Optional -from .core.resources import ResourceManager, ServerlessResource +from .core.resources import LoadBalancerSlsResource, ResourceManager, ServerlessResource from .execute_class import create_remote_class from .stubs import stub_resource @@ -17,6 +17,8 @@ def remote( system_dependencies: Optional[List[str]] = None, accelerate_downloads: bool = True, local: bool = False, + method: Optional[str] = None, + path: Optional[str] = None, **extra, ): """ @@ -44,6 +46,12 @@ def remote( local (bool, optional): Execute function/class locally instead of provisioning remote servers. Returns the unwrapped function/class for direct local execution. Users must ensure all required dependencies are already installed in their local environment. Defaults to False. + method (str, optional): HTTP method for load-balanced endpoints (LoadBalancerSlsResource). + Required for LoadBalancerSlsResource: "GET", "POST", "PUT", "DELETE", "PATCH". + Ignored for queue-based endpoints. Defaults to None. + path (str, optional): HTTP path for load-balanced endpoints (LoadBalancerSlsResource). + Required for LoadBalancerSlsResource. Must start with "/". Example: "/api/process". + Ignored for queue-based endpoints. Defaults to None. extra (dict, optional): Additional parameters for the execution of the resource. Defaults to an empty dict. 
Returns: @@ -52,9 +60,9 @@ def remote( Example: ```python - # Async function (recommended style) + # Queue-based endpoint (recommended for reliability) @remote( - resource_config=my_resource_config, + resource_config=LiveServerless(name="gpu_worker"), dependencies=["torch>=2.0.0"], ) async def gpu_task(data: dict) -> dict: @@ -62,20 +70,19 @@ async def gpu_task(data: dict) -> dict: # GPU processing here return {"result": "processed"} - # Sync function (also supported) + # Load-balanced endpoint (for low-latency APIs) @remote( - resource_config=my_resource_config, - dependencies=["pandas>=2.0.0"], + resource_config=LoadBalancerSlsResource(name="api-service"), + method="POST", + path="/api/process", ) - def cpu_task(data: dict) -> dict: - import pandas as pd - # CPU processing here - return {"result": "processed"} + async def api_endpoint(x: int, y: int) -> dict: + return {"result": x + y} # Local execution (testing/development) @remote( resource_config=my_resource_config, - dependencies=["numpy", "pandas"], # Only used for remote execution + dependencies=["numpy", "pandas"], local=True, ) async def my_test_function(data): @@ -85,18 +92,53 @@ async def my_test_function(data): """ def decorator(func_or_class): + # Validate HTTP routing parameters for LoadBalancerSlsResource + is_lb_resource = isinstance(resource_config, LoadBalancerSlsResource) + + if is_lb_resource: + if not method or not path: + raise ValueError( + f"LoadBalancerSlsResource requires both 'method' and 'path' parameters. " + f"Got method={method}, path={path}. " + f"Example: @remote(resource_config, method='POST', path='/api/process')" + ) + if not path.startswith("/"): + raise ValueError(f"path must start with '/'. Got: {path}") + valid_methods = {"GET", "POST", "PUT", "DELETE", "PATCH"} + if method not in valid_methods: + raise ValueError( + f"method must be one of {valid_methods}. Got: {method}" + ) + elif method or path: + log.warning( + f"HTTP routing parameters (method={method}, path={path}) are only used " + f"with LoadBalancerSlsResource, but resource_config is {type(resource_config).__name__}. " + f"They will be ignored." 
+ ) + + # Store routing metadata for scanner and build system + routing_config = { + "resource_config": resource_config, + "method": method, + "path": path, + "dependencies": dependencies, + "system_dependencies": system_dependencies, + } + if os.getenv("RUNPOD_POD_ID") or os.getenv("RUNPOD_ENDPOINT_ID"): # Worker mode when running on RunPod platform + func_or_class.__remote_config__ = routing_config return func_or_class # Local execution mode - execute without provisioning remote servers if local: + func_or_class.__remote_config__ = routing_config return func_or_class # Remote execution mode if inspect.isclass(func_or_class): # Handle class decoration - return create_remote_class( + wrapped_class = create_remote_class( func_or_class, resource_config, dependencies, @@ -104,6 +146,8 @@ def decorator(func_or_class): accelerate_downloads, extra, ) + wrapped_class.__remote_config__ = routing_config + return wrapped_class else: # Handle function decoration @wraps(func_or_class) @@ -123,6 +167,8 @@ async def wrapper(*args, **kwargs): **kwargs, ) + # Store routing metadata on wrapper for scanner + wrapper.__remote_config__ = routing_config return wrapper return decorator diff --git a/src/tetra_rp/core/resources/__init__.py b/src/tetra_rp/core/resources/__init__.py index 60ab6b71..b47b50d9 100644 --- a/src/tetra_rp/core/resources/__init__.py +++ b/src/tetra_rp/core/resources/__init__.py @@ -2,23 +2,35 @@ from .cpu import CpuInstanceType from .gpu import GpuGroup, GpuType, GpuTypeDetail from .resource_manager import ResourceManager -from .live_serverless import LiveServerless, CpuLiveServerless +from .live_serverless import ( + CpuLiveLoadBalancer, + CpuLiveServerless, + LiveLoadBalancer, + LiveServerless, +) from .serverless import ( ServerlessResource, ServerlessEndpoint, JobOutput, CudaVersion, ServerlessType, + ServerlessScalerType, ) from .serverless_cpu import CpuServerlessEndpoint from .template import PodTemplate from .network_volume import NetworkVolume, DataCenter +from .load_balancer_sls_resource import ( + CpuLoadBalancerSlsResource, + LoadBalancerSlsResource, +) __all__ = [ "BaseResource", "CpuInstanceType", + "CpuLiveLoadBalancer", "CpuLiveServerless", + "CpuLoadBalancerSlsResource", "CpuServerlessEndpoint", "CudaVersion", "DataCenter", @@ -27,11 +39,14 @@ "GpuType", "GpuTypeDetail", "JobOutput", + "LiveLoadBalancer", "LiveServerless", + "LoadBalancerSlsResource", + "NetworkVolume", + "PodTemplate", "ResourceManager", - "ServerlessResource", "ServerlessEndpoint", + "ServerlessResource", + "ServerlessScalerType", "ServerlessType", - "PodTemplate", - "NetworkVolume", ] diff --git a/src/tetra_rp/core/resources/live_serverless.py b/src/tetra_rp/core/resources/live_serverless.py index 193810b0..7064189b 100644 --- a/src/tetra_rp/core/resources/live_serverless.py +++ b/src/tetra_rp/core/resources/live_serverless.py @@ -1,6 +1,12 @@ # Ship serverless code as you write it. No builds, no deploys — just run. 
import os + from pydantic import model_validator + +from .load_balancer_sls_resource import ( + CpuLoadBalancerSlsResource, + LoadBalancerSlsResource, +) from .serverless import ServerlessEndpoint from .serverless_cpu import CpuServerlessEndpoint @@ -11,6 +17,12 @@ TETRA_CPU_IMAGE = os.environ.get( "TETRA_CPU_IMAGE", f"runpod/tetra-rp-cpu:{TETRA_IMAGE_TAG}" ) +TETRA_LB_IMAGE = os.environ.get( + "TETRA_LB_IMAGE", f"runpod/tetra-rp-lb:{TETRA_IMAGE_TAG}" +) +TETRA_CPU_LB_IMAGE = os.environ.get( + "TETRA_CPU_LB_IMAGE", f"runpod/tetra-rp-lb-cpu:{TETRA_IMAGE_TAG}" +) class LiveServerlessMixin: @@ -60,3 +72,98 @@ def set_live_serverless_template(cls, data: dict): """Set default CPU image for Live Serverless.""" data["imageName"] = TETRA_CPU_IMAGE return data + + +class LiveLoadBalancer(LiveServerlessMixin, LoadBalancerSlsResource): + """Live load-balanced endpoint for local development and testing. + + Similar to LiveServerless but for HTTP-based load-balanced endpoints. + Enables local testing of @remote decorated functions with LB endpoints + before deploying to production. + + Features: + - Locks to Tetra LB image (tetra-rp-lb) + - Direct HTTP execution (not queue-based) + - Local development with flash run + - Same @remote decorator pattern as LoadBalancerSlsResource + + Usage: + from tetra_rp import LiveLoadBalancer, remote + + api = LiveLoadBalancer(name="api-service") + + @remote(api, method="POST", path="/api/process") + async def process_data(x: int, y: int): + return {"result": x + y} + + # Test locally + result = await process_data(5, 3) + + Local Development Flow: + 1. Create LiveLoadBalancer with routing + 2. Decorate functions with @remote(lb_resource, method=..., path=...) + 3. Run with `flash run` to start local endpoint + 4. Call functions directly in tests or scripts + 5. Deploy to production with `flash build` and `flash deploy` + + Note: + The endpoint_url is configured by the Flash runtime when the + endpoint is deployed locally. For true local testing without + deployment, use the functions directly or mock the HTTP layer. + """ + + @property + def _live_image(self) -> str: + return TETRA_LB_IMAGE + + @model_validator(mode="before") + @classmethod + def set_live_lb_template(cls, data: dict): + """Set default image for Live Load-Balanced endpoint.""" + data["imageName"] = TETRA_LB_IMAGE + return data + + +class CpuLiveLoadBalancer(LiveServerlessMixin, CpuLoadBalancerSlsResource): + """CPU-only live load-balanced endpoint for local development and testing. + + Similar to LiveLoadBalancer but configured for CPU instances with + automatic disk sizing and validation. + + Features: + - Locks to CPU Tetra LB image (tetra-rp-lb-cpu) + - CPU instance support with automatic disk sizing + - Direct HTTP execution (not queue-based) + - Local development with flash run + - Same @remote decorator pattern as CpuLoadBalancerSlsResource + + Usage: + from tetra_rp import CpuLiveLoadBalancer, remote + + api = CpuLiveLoadBalancer(name="api-service") + + @remote(api, method="POST", path="/api/process") + async def process_data(x: int, y: int): + return {"result": x + y} + + # Test locally + result = await process_data(5, 3) + + Local Development Flow: + 1. Create CpuLiveLoadBalancer with routing + 2. Decorate functions with @remote(lb_resource, method=..., path=...) + 3. Run with `flash run` to start local endpoint + 4. Call functions directly in tests or scripts + 5. 
Deploy to production with `flash build` and `flash deploy` + """ + + @property + def _live_image(self) -> str: + return TETRA_CPU_LB_IMAGE + + @model_validator(mode="before") + @classmethod + def set_live_cpu_lb_template(cls, data: dict): + """Set default CPU image for Live Load-Balanced endpoint.""" + data["imageName"] = TETRA_CPU_LB_IMAGE + return data diff --git a/src/tetra_rp/core/resources/load_balancer_sls_resource.py b/src/tetra_rp/core/resources/load_balancer_sls_resource.py new file mode 100644 index 00000000..1ea5085b --- /dev/null +++ b/src/tetra_rp/core/resources/load_balancer_sls_resource.py @@ -0,0 +1,374 @@ +""" +LoadBalancerSlsResource - Resource type for RunPod Load-Balanced Serverless endpoints. + +Load-balanced endpoints expose HTTP servers directly to clients without the queue-based +processing model of standard serverless endpoints. They're ideal for REST APIs, webhooks, +and real-time communication patterns. + +Key differences from standard serverless (QB): +- Requests route directly to healthy workers via HTTP +- No automatic retries (client responsible) +- Lower latency but less fault tolerance +- Requires HTTP application, not a function handler +- Health checks via /ping endpoint +""" + +import asyncio +import logging +from typing import List, Optional + +from pydantic import model_validator + +from tetra_rp.core.utils.http import get_authenticated_httpx_client +from .cpu import CpuInstanceType +from .serverless import ServerlessResource, ServerlessType, ServerlessScalerType +from .serverless_cpu import CpuEndpointMixin + +log = logging.getLogger(__name__) + +# Configuration constants +DEFAULT_HEALTH_CHECK_RETRIES = 10 +DEFAULT_HEALTH_CHECK_INTERVAL = 5 # seconds between retries +DEFAULT_PING_REQUEST_TIMEOUT = ( + 15.0 # seconds (load-balanced workers need time for cold starts) +) +HEALTHY_STATUS_CODES = (200, 204) + + +class LoadBalancerSlsResource(ServerlessResource): + """ + Resource configuration for RunPod Load-Balanced Serverless endpoints. + + Load-balanced endpoints expose HTTP servers directly, making them suitable for: + - REST APIs + - WebSocket servers + - Real-time streaming + - Custom HTTP protocols + + Configuration example: + mothership = LoadBalancerSlsResource( + name="mothership", + imageName="my-mothership:latest", + env={"FLASH_APP": "my_app"}, + workersMin=1, + workersMax=3, + ) + await mothership.deploy() + """ + + # Override default type to LB + type: Optional[ServerlessType] = ServerlessType.LB + + def __init__(self, **data): + """Initialize LoadBalancerSlsResource with LB-specific defaults.""" + # Ensure type is always LB + data["type"] = ServerlessType.LB + + # LB endpoints shouldn't use queue-based scaling + if "scalerType" not in data: + data["scalerType"] = ServerlessScalerType.REQUEST_COUNT + + super().__init__(**data) + + @model_validator(mode="after") + def set_serverless_template(self): + """Create template from imageName if not provided. + + Must run after sync_input_fields to ensure all input fields are synced. + """ + if not any([self.imageName, self.template, self.templateId]): + raise ValueError( + "Either imageName, template, or templateId must be provided" + ) + + if not self.templateId and not self.template: + self.template = self._create_new_template() + elif self.template: + self._configure_existing_template() + + return self + + @property + def endpoint_url(self) -> str: + """Get the endpoint URL for load-balanced endpoints. + + Load-balanced endpoints use a different URL format than standard + serverless endpoints. 
They use: https://{endpoint_id}.api.runpod.ai + + Returns: + The endpoint URL for health checks and direct HTTP requests + + Raises: + ValueError: If endpoint ID not set + """ + if not self.id: + raise ValueError("Endpoint ID not set. Cannot determine endpoint URL.") + return f"https://{self.id}.api.runpod.ai" + + def _validate_lb_configuration(self) -> None: + """ + Validate LB-specific configuration constraints. + + Raises: + ValueError: If configuration violates LB requirements + """ + # LB must use REQUEST_COUNT scaler, not QUEUE_DELAY + if self.scalerType == ServerlessScalerType.QUEUE_DELAY: + raise ValueError( + f"LoadBalancerSlsResource requires REQUEST_COUNT scaler, " + f"not {self.scalerType.value}. " + "Load-balanced endpoints don't support queue-based scaling." + ) + + # Type must always be LB + if self.type != ServerlessType.LB: + raise ValueError( + f"LoadBalancerSlsResource type must be LB, got {self.type.value}" + ) + + async def is_deployed_async(self) -> bool: + """ + Check if LB endpoint is deployed and /ping endpoint is responding. + + For LB endpoints, we verify: + 1. Endpoint ID exists (created in RunPod) + 2. /ping endpoint returns 200 or 204 + 3. Endpoint is in healthy state + + Returns: + True if endpoint is deployed and healthy, False otherwise + """ + try: + if not self.id: + return False + + # Use async health check for LB endpoints + return await self._check_ping_endpoint() + + except Exception as e: + log.debug(f"Error checking {self}: {e}") + return False + + async def _check_ping_endpoint(self) -> bool: + """ + Check if /ping endpoint is accessible and healthy. + + RunPod load-balancer endpoints require a /ping endpoint that returns: + - 200 OK: Worker is healthy and ready + - 204 No Content: Worker is initializing + - Other status: Worker is unhealthy + + Returns: + True if /ping endpoint responds with 200 or 204 + """ + try: + if not self.id: + return False + + ping_url = f"{self.endpoint_url}/ping" + + async with get_authenticated_httpx_client( + timeout=DEFAULT_PING_REQUEST_TIMEOUT + ) as client: + response = await client.get(ping_url) + return response.status_code in HEALTHY_STATUS_CODES + except Exception as e: + log.debug(f"Ping check failed for {self.name}: {e}") + return False + + async def _wait_for_health( + self, + max_retries: int = DEFAULT_HEALTH_CHECK_RETRIES, + retry_interval: int = DEFAULT_HEALTH_CHECK_INTERVAL, + ) -> bool: + """ + Poll /ping endpoint until endpoint is healthy or timeout. + + Args: + max_retries: Number of health check attempts + retry_interval: Seconds between health check attempts + + Returns: + True if endpoint became healthy, False if timeout + + Raises: + ValueError: If endpoint ID not set + """ + if not self.id: + raise ValueError("Cannot wait for health: endpoint not deployed") + + log.info( + f"Waiting for LB endpoint {self.name} ({self.id}) to become healthy... 
" + f"(max {max_retries} retries, {retry_interval}s interval)" + ) + + for attempt in range(max_retries): + try: + if await self._check_ping_endpoint(): + log.info( + f"LB endpoint {self.name} is healthy (attempt {attempt + 1})" + ) + return True + + log.debug( + f"Health check attempt {attempt + 1}/{max_retries} - " + f"endpoint not ready yet" + ) + + except Exception as e: + log.debug(f"Health check attempt {attempt + 1} failed: {e}") + + # Wait before next attempt (except on last attempt) + if attempt < max_retries - 1: + await asyncio.sleep(retry_interval) + + log.error( + f"LB endpoint {self.name} failed to become healthy after " + f"{max_retries} attempts" + ) + return False + + async def _do_deploy(self) -> "LoadBalancerSlsResource": + """ + Deploy LB endpoint and wait for health. + + Deployment flow: + 1. Validate LB configuration + 2. Call parent deploy (creates endpoint in RunPod) + 3. Poll /ping endpoint until healthy or timeout + 4. Return deployed resource + + Returns: + Deployed LoadBalancerSlsResource instance + + Raises: + ValueError: If LB configuration invalid or deployment fails + TimeoutError: If /ping endpoint doesn't respond in time + """ + # Validate before deploying + self._validate_lb_configuration() + + # Check if already deployed + if self.is_deployed(): + log.debug(f"{self} already deployed") + return self + + try: + # Call parent deploy (creates endpoint via RunPod API) + log.info(f"Deploying LB endpoint {self.name}...") + deployed = await super()._do_deploy() + + # Wait for /ping endpoint to become available + timeout_seconds = ( + DEFAULT_HEALTH_CHECK_RETRIES * DEFAULT_HEALTH_CHECK_INTERVAL + ) + log.info( + f"Endpoint created, waiting for /ping to respond " + f"({timeout_seconds}s timeout)..." + ) + + healthy = await self._wait_for_health( + max_retries=DEFAULT_HEALTH_CHECK_RETRIES, + retry_interval=DEFAULT_HEALTH_CHECK_INTERVAL, + ) + + if not healthy: + raise TimeoutError( + f"LB endpoint {self.name} ({deployed.id}) failed to become " + f"healthy within {timeout_seconds}s" + ) + + log.info(f"LB endpoint {self.name} ({deployed.id}) deployed and healthy") + return deployed + + except Exception as e: + log.error(f"Failed to deploy LB endpoint {self.name}: {e}") + raise + + def is_deployed(self) -> bool: + """ + Override is_deployed to use async health check. + + Note: This is a synchronous wrapper around the async health check. + Prefer is_deployed_async() in async contexts. + + Returns: + True if endpoint is deployed and /ping responds + """ + if not self.id: + return False + + try: + # Try the RunPod SDK health check (works for basic connectivity) + response = self.endpoint.health() + return response is not None + except Exception as e: + log.debug(f"RunPod health check failed for {self.name}: {e}") + return False + + +class CpuLoadBalancerSlsResource(CpuEndpointMixin, LoadBalancerSlsResource): + """CPU-only load-balanced endpoint with automatic disk sizing. + + Similar to LoadBalancerSlsResource but configured for CPU instances + instead of GPUs. Inherits CPU-specific functionality from CpuEndpointMixin + for automatic disk sizing and validation. + + Defaults to CPU_ANY instance type if not specified. 
+ + Configuration example: + mothership = CpuLoadBalancerSlsResource( + name="mothership", + imageName="my-mothership:latest", + env={"FLASH_APP": "my_app"}, + instanceIds=[CpuInstanceType.CPU3G_1_4], + workersMin=1, + workersMax=3, + ) + await mothership.deploy() + """ + + instanceIds: Optional[List[CpuInstanceType]] = [CpuInstanceType.ANY] + + # CPU endpoints exclude GPU-specific fields from API payload + # This prevents the RunPod GraphQL API from rejecting CPU endpoints with GPU-specific fields + _input_only = { + "id", + "cudaVersions", + "datacenter", + "env", + "flashboot", + "gpus", + "gpuIds", + "gpuCount", + "allowedCudaVersions", + "imageName", + "networkVolume", + } + + def _setup_cpu_template(self) -> None: + """Setup template, validating and creating/configuring as needed.""" + if not any([self.imageName, self.template, self.templateId]): + raise ValueError( + "Either imageName, template, or templateId must be provided" + ) + + if not self.templateId and not self.template: + self.template = self._create_new_template() + elif self.template: + self._configure_existing_template() + + @model_validator(mode="after") + def set_serverless_template(self): + """Create template from imageName if not provided. + + Overrides parent to call _sync_cpu_fields first to ensure GPU defaults + are overridden for CPU endpoints. + """ + # Sync CPU-specific fields first (override GPU defaults) + self._sync_cpu_fields() + + # Setup template with validation and creation + self._setup_cpu_template() + + return self diff --git a/src/tetra_rp/core/resources/serverless.py b/src/tetra_rp/core/resources/serverless.py index 4fc33907..8a7e650a 100644 --- a/src/tetra_rp/core/resources/serverless.py +++ b/src/tetra_rp/core/resources/serverless.py @@ -251,6 +251,26 @@ def sync_input_fields(self): return self + def _create_new_template(self) -> PodTemplate: + """Create a new PodTemplate with standard configuration.""" + return PodTemplate( + name=self.resource_id, + imageName=self.imageName, + env=KeyValuePair.from_dict(self.env or get_env_vars()), + ) + + def _configure_existing_template(self) -> None: + """Configure an existing template with necessary overrides.""" + if self.template is None: + return + + self.template.name = f"{self.resource_id}__{self.template.resource_id}" + + if self.imageName: + self.template.imageName = self.imageName + if self.env: + self.template.env = KeyValuePair.from_dict(self.env) + async def _sync_graphql_object_with_inputs( self, returned_endpoint: "ServerlessResource" ): @@ -587,28 +607,12 @@ class ServerlessEndpoint(ServerlessResource): Inherits from ServerlessResource. """ - def _create_new_template(self) -> PodTemplate: - """Create a new PodTemplate with standard configuration.""" - return PodTemplate( - name=self.resource_id, - imageName=self.imageName, - env=KeyValuePair.from_dict(self.env or get_env_vars()), - ) - - def _configure_existing_template(self) -> None: - """Configure an existing template with necessary overrides.""" - if self.template is None: - return - - self.template.name = f"{self.resource_id}__{self.template.resource_id}" - - if self.imageName: - self.template.imageName = self.imageName - if self.env: - self.template.env = KeyValuePair.from_dict(self.env) - @model_validator(mode="after") def set_serverless_template(self): + """Create template from imageName if not provided. + + Must run after sync_input_fields to ensure all input fields are synced. 
+ """ if not any([self.imageName, self.template, self.templateId]): raise ValueError( "Either imageName, template, or templateId must be provided" diff --git a/src/tetra_rp/core/resources/serverless_cpu.py b/src/tetra_rp/core/resources/serverless_cpu.py index 86835655..c4e6fd3b 100644 --- a/src/tetra_rp/core/resources/serverless_cpu.py +++ b/src/tetra_rp/core/resources/serverless_cpu.py @@ -24,6 +24,16 @@ class CpuEndpointMixin: instanceIds: Optional[List[CpuInstanceType]] + @field_validator("instanceIds") + @classmethod + def validate_instance_ids( + cls, value: List[CpuInstanceType] + ) -> List[CpuInstanceType]: + """Expand ANY to all available CPU instance types.""" + if value == [CpuInstanceType.ANY]: + return CpuInstanceType.all() + return value + def _is_cpu_endpoint(self) -> bool: """Check if this is a CPU endpoint (has instanceIds).""" return ( @@ -178,14 +188,6 @@ def _configure_existing_template(self) -> None: # Apply CPU-specific disk sizing self._apply_cpu_disk_sizing(self.template) - @field_validator("instanceIds") - @classmethod - def validate_cpus(cls, value: List[CpuInstanceType]) -> List[CpuInstanceType]: - """Expand ANY to all GPU groups""" - if value == [CpuInstanceType.ANY]: - return CpuInstanceType.all() - return value - @model_validator(mode="after") def set_serverless_template(self): # Sync CPU-specific fields first diff --git a/src/tetra_rp/core/resources/template.py b/src/tetra_rp/core/resources/template.py index a4c0a254..80a74c7c 100644 --- a/src/tetra_rp/core/resources/template.py +++ b/src/tetra_rp/core/resources/template.py @@ -1,6 +1,7 @@ -import requests +import warnings from typing import Dict, List, Optional, Any from pydantic import BaseModel, model_validator +from tetra_rp.core.utils.http import get_authenticated_requests_session from .base import BaseResource @@ -38,7 +39,7 @@ def sync_input_fields(self): def update_system_dependencies( - template_id, token, system_dependencies, base_entry_cmd=None + template_id, token=None, system_dependencies=None, base_entry_cmd=None ): """ Updates Runpod template with system dependencies installed via apt-get, @@ -46,12 +47,20 @@ def update_system_dependencies( Args: template_id (str): Runpod template ID. - token (str): Runpod API token. + token (str): [DEPRECATED] Runpod API token. Ignored; uses RUNPOD_API_KEY env var instead. system_dependencies (List[str]): List of apt packages to install. base_entry_cmd (List[str]): The default command to run the app, e.g. ["uv", "run", "handler.py"] Returns: dict: API response JSON or error info. """ + # Warn if deprecated token parameter is used + if token is not None: + warnings.warn( + "The 'token' parameter is deprecated and ignored. 
" + "Authentication now uses RUNPOD_API_KEY environment variable.", + DeprecationWarning, + stacklevel=2, + ) # Compose apt-get install command if any packages specified apt_cmd = "" @@ -83,12 +92,16 @@ def update_system_dependencies( "volumeMountPath": "/workspace", } - headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"} - url = f"https://rest.runpod.io/v1/templates/{template_id}/update" - response = requests.post(url, json=payload, headers=headers) + # Use centralized auth utility instead of manual header setup + # Note: token parameter is deprecated; uses RUNPOD_API_KEY environment variable + session = get_authenticated_requests_session() try: + response = session.post(url, json=payload) + response.raise_for_status() return response.json() - except Exception: - return {"error": "Invalid JSON response", "text": response.text} + except Exception as e: + return {"error": "Failed to update template", "details": str(e)} + finally: + session.close() diff --git a/src/tetra_rp/core/utils/http.py b/src/tetra_rp/core/utils/http.py new file mode 100644 index 00000000..ac6ac01e --- /dev/null +++ b/src/tetra_rp/core/utils/http.py @@ -0,0 +1,67 @@ +"""HTTP utilities for RunPod API communication.""" + +import os +from typing import Optional + +import httpx +import requests + + +def get_authenticated_httpx_client( + timeout: Optional[float] = None, +) -> httpx.AsyncClient: + """Create httpx AsyncClient with RunPod authentication. + + Automatically includes Authorization header if RUNPOD_API_KEY is set. + This provides a centralized place to manage authentication headers for + all RunPod HTTP requests, avoiding repetitive manual header addition. + + Args: + timeout: Request timeout in seconds. Defaults to 30.0. + + Returns: + Configured httpx.AsyncClient with Authorization header + + Example: + async with get_authenticated_httpx_client() as client: + response = await client.post(url, json=data) + + # With custom timeout + async with get_authenticated_httpx_client(timeout=60.0) as client: + response = await client.get(url) + """ + headers = {} + api_key = os.environ.get("RUNPOD_API_KEY") + if api_key: + headers["Authorization"] = f"Bearer {api_key}" + + timeout_config = timeout if timeout is not None else 30.0 + return httpx.AsyncClient(timeout=timeout_config, headers=headers) + + +def get_authenticated_requests_session() -> requests.Session: + """Create requests Session with RunPod authentication. + + Automatically includes Authorization header if RUNPOD_API_KEY is set. + Provides a centralized place to manage authentication headers for + synchronous RunPod HTTP requests. + + Returns: + Configured requests.Session with Authorization header + + Example: + session = get_authenticated_requests_session() + response = session.post(url, json=data, timeout=30.0) + # Remember to close: session.close() + + # Or use as context manager + import contextlib + with contextlib.closing(get_authenticated_requests_session()) as session: + response = session.post(url, json=data) + """ + session = requests.Session() + api_key = os.environ.get("RUNPOD_API_KEY") + if api_key: + session.headers["Authorization"] = f"Bearer {api_key}" + + return session diff --git a/src/tetra_rp/runtime/lb_handler.py b/src/tetra_rp/runtime/lb_handler.py new file mode 100644 index 00000000..6f7c198c --- /dev/null +++ b/src/tetra_rp/runtime/lb_handler.py @@ -0,0 +1,197 @@ +"""Factory for creating FastAPI load-balanced handlers. 
+ +This module provides the factory function for generating FastAPI applications +that handle load-balanced serverless endpoints. It supports both user-defined +HTTP routes and the framework's /execute endpoint for @remote function execution. + +Security Model: + The /execute endpoint accepts and executes serialized function code. This is + secure because: + 1. The function code originates from the client's @remote decorator + 2. The client (user) controls what function gets sent + 3. This mirrors the trusted client model of LiveServerlessStub + 4. In production, API authentication should protect the /execute endpoint + + Users should NOT expose the /execute endpoint to untrusted clients. +""" + +import base64 +import inspect +import logging +from typing import Any, Callable, Dict + +import cloudpickle +from fastapi import FastAPI, Request + +logger = logging.getLogger(__name__) + + +def create_lb_handler( + route_registry: Dict[tuple[str, str], Callable], include_execute: bool = False +) -> FastAPI: + """Create FastAPI app with routes from registry. + + Args: + route_registry: Mapping of (HTTP_METHOD, path) -> handler_function + Example: {("GET", "/api/health"): health_check} + include_execute: Whether to register /execute endpoint for @remote execution. + Only used for LiveLoadBalancer (local development). + Deployed endpoints should not expose /execute for security. + + Returns: + Configured FastAPI application with routes registered. + """ + app = FastAPI(title="Flash Load-Balanced Handler") + + # Register /execute endpoint for @remote stub execution (if enabled) + if include_execute: + + @app.post("/execute") + async def execute_remote_function(request: Request) -> Dict[str, Any]: + """Framework endpoint for @remote decorator execution. + + WARNING: This endpoint is INTERNAL to the Flash framework. It should only be + called by the @remote stub from tetra_rp.stubs.load_balancer_sls. Exposing + this endpoint to untrusted clients could allow arbitrary code execution. + + Accepts serialized function code and arguments, executes them, + and returns serialized result. 
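+        The request payload matches what LoadBalancerSlsStub._prepare_request
+        builds on the client side.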
+ + Request body: + { + "function_name": "process_data", + "function_code": "def process_data(x, y): return x + y", + "args": [base64_encoded_arg1, base64_encoded_arg2], + "kwargs": {"key": base64_encoded_value} + } + + Returns: + { + "success": true, + "result": base64_encoded_result + } + or + { + "success": false, + "error": "error message" + } + """ + try: + body = await request.json() + except Exception as e: + logger.error(f"Failed to parse request body: {e}") + return {"success": False, "error": f"Invalid request body: {e}"} + + try: + # Extract function metadata + function_name = body.get("function_name") + function_code = body.get("function_code") + + if not function_name or not function_code: + return { + "success": False, + "error": "Missing function_name or function_code in request", + } + + # Deserialize arguments + args = [] + for arg_b64 in body.get("args", []): + try: + arg = cloudpickle.loads(base64.b64decode(arg_b64)) + args.append(arg) + except Exception as e: + logger.error(f"Failed to deserialize argument: {e}") + return { + "success": False, + "error": f"Failed to deserialize argument: {e}", + } + + kwargs = {} + for key, val_b64 in body.get("kwargs", {}).items(): + try: + val = cloudpickle.loads(base64.b64decode(val_b64)) + kwargs[key] = val + except Exception as e: + logger.error(f"Failed to deserialize kwarg '{key}': {e}") + return { + "success": False, + "error": f"Failed to deserialize kwarg '{key}': {e}", + } + + # Execute function in isolated namespace + namespace: Dict[str, Any] = {} + try: + exec(function_code, namespace) + except SyntaxError as e: + logger.error(f"Syntax error in function code: {e}") + return { + "success": False, + "error": f"Syntax error in function code: {e}", + } + except Exception as e: + logger.error(f"Error executing function code: {e}") + return { + "success": False, + "error": f"Error executing function code: {e}", + } + + # Get function from namespace + if function_name not in namespace: + return { + "success": False, + "error": f"Function '{function_name}' not found in executed code", + } + + func = namespace[function_name] + + # Execute function + try: + result = func(*args, **kwargs) + + # Handle async functions + if inspect.iscoroutine(result): + result = await result + except Exception as e: + logger.error(f"Function execution failed: {e}") + return { + "success": False, + "error": f"Function execution failed: {e}", + } + + # Serialize result + try: + result_b64 = base64.b64encode(cloudpickle.dumps(result)).decode( + "utf-8" + ) + return {"success": True, "result": result_b64} + except Exception as e: + logger.error(f"Failed to serialize result: {e}") + return { + "success": False, + "error": f"Failed to serialize result: {e}", + } + + except Exception as e: + logger.error(f"Unexpected error in /execute endpoint: {e}") + return {"success": False, "error": f"Unexpected error: {e}"} + + # Register user-defined routes from registry + for (method, path), handler in route_registry.items(): + method_upper = method.upper() + + if method_upper == "GET": + app.get(path)(handler) + elif method_upper == "POST": + app.post(path)(handler) + elif method_upper == "PUT": + app.put(path)(handler) + elif method_upper == "DELETE": + app.delete(path)(handler) + elif method_upper == "PATCH": + app.patch(path)(handler) + else: + logger.warning( + f"Unsupported HTTP method '{method}' for path '{path}'. Skipping." 
+ ) + + return app diff --git a/src/tetra_rp/stubs/load_balancer_sls.py b/src/tetra_rp/stubs/load_balancer_sls.py new file mode 100644 index 00000000..1edd1a90 --- /dev/null +++ b/src/tetra_rp/stubs/load_balancer_sls.py @@ -0,0 +1,361 @@ +"""LoadBalancerSlsStub - Stub for load-balanced serverless execution. + +Enables @remote decorator to work with LoadBalancerSlsResource endpoints +via direct HTTP calls instead of queue-based job submission. +""" + +import base64 +import inspect +import logging +from typing import Any, Callable, Dict, List, Optional + +import httpx +import cloudpickle + +from tetra_rp.core.utils.http import get_authenticated_httpx_client +from .live_serverless import get_function_source + +log = logging.getLogger(__name__) + + +class LoadBalancerSlsStub: + """HTTP-based stub for load-balanced serverless endpoint execution. + + Implements the stub interface for @remote decorator with LoadBalancerSlsResource, + providing direct HTTP-based function execution instead of queue-based processing. + + Key differences from LiveServerlessStub: + - Direct HTTP POST to /execute endpoint (not queue-based) + - No job ID polling - synchronous HTTP response + - Same function serialization pattern (cloudpickle + base64) + - Lower latency but no automatic retries + + Architecture: + 1. User calls @remote decorated function + 2. Decorator dispatches to this stub via singledispatch + 3. Stub serializes function code and arguments + 4. Stub POSTs to endpoint /execute with serialized data + 5. Endpoint deserializes, executes, and returns result + 6. Stub deserializes result and returns to user + + Example: + stub = LoadBalancerSlsStub(lb_resource) + result = await stub(my_func, deps, sys_deps, accel, arg1, arg2) + """ + + DEFAULT_TIMEOUT = 30.0 # Default timeout in seconds + + def __init__(self, server: Any, timeout: Optional[float] = None) -> None: + """Initialize stub with LoadBalancerSlsResource server. + + Args: + server: LoadBalancerSlsResource instance with endpoint_url configured + timeout: Request timeout in seconds (default: 30.0) + """ + self.server = server + self.timeout = timeout if timeout is not None else self.DEFAULT_TIMEOUT + + def _should_use_execute_endpoint(self, func: Callable[..., Any]) -> bool: + """Determine if /execute endpoint should be used for this function. + + The /execute endpoint (which accepts arbitrary function code) is only used for: + - LiveLoadBalancer (local development) + - Functions without routing metadata (backward compatibility) + + For deployed LoadBalancerSlsResource endpoints with routing metadata, + the stub translates @remote calls into HTTP requests to user-defined routes. 
+ + Args: + func: Function being called + + Returns: + True if /execute should be used, False if user route should be used + """ + from ..core.resources.live_serverless import LiveLoadBalancer + + # Always use /execute for LiveLoadBalancer (local development) + if isinstance(self.server, LiveLoadBalancer): + log.debug(f"Using /execute endpoint for LiveLoadBalancer: {func.__name__}") + return True + + # Check if function has routing metadata + routing_config = getattr(func, "__remote_config__", None) + if not routing_config: + log.debug(f"No routing config for {func.__name__}, using /execute fallback") + return True + + # Check if routing metadata is complete + if not routing_config.get("method") or not routing_config.get("path"): + log.debug( + f"Incomplete routing config for {func.__name__}, using /execute fallback" + ) + return True + + # Use user-defined route for deployed endpoints with complete routing metadata + log.debug( + f"Using user route for deployed endpoint: {func.__name__} " + f"{routing_config['method']} {routing_config['path']}" + ) + return False + + async def __call__( + self, + func: Callable[..., Any], + dependencies: Optional[List[str]], + system_dependencies: Optional[List[str]], + accelerate_downloads: bool, + *args: Any, + **kwargs: Any, + ) -> Any: + """Execute function on load-balanced endpoint. + + Behavior depends on endpoint type: + - LiveLoadBalancer: Uses /execute endpoint (local development) + - Deployed LoadBalancerSlsResource: Uses user-defined route via HTTP + + Args: + func: Function to execute + dependencies: Pip dependencies required + system_dependencies: System dependencies required + accelerate_downloads: Whether to accelerate downloads + *args: Function positional arguments + **kwargs: Function keyword arguments + + Returns: + Function result + + Raises: + Exception: If endpoint returns error or HTTP call fails + """ + # Determine execution path based on resource type and routing metadata + if self._should_use_execute_endpoint(func): + # Local development or backward compatibility: use /execute endpoint + request = self._prepare_request( + func, + dependencies, + system_dependencies, + accelerate_downloads, + *args, + **kwargs, + ) + response = await self._execute_function(request) + return self._handle_response(response) + else: + # Deployed endpoint: use user-defined route + routing_config = func.__remote_config__ + return await self._execute_via_user_route( + func, + routing_config["method"], + routing_config["path"], + *args, + **kwargs, + ) + + def _prepare_request( + self, + func: Callable[..., Any], + dependencies: Optional[List[str]], + system_dependencies: Optional[List[str]], + accelerate_downloads: bool, + *args: Any, + **kwargs: Any, + ) -> Dict[str, Any]: + """Prepare HTTP request payload. + + Extracts function source code and serializes arguments using cloudpickle. 
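+        Arguments are cloudpickle-dumped and base64-encoded because raw bytes
+        are not JSON-serializable; the /execute handler reverses both steps.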
+ + Args: + func: Function to serialize + dependencies: Pip dependencies + system_dependencies: System dependencies + accelerate_downloads: Download acceleration flag + *args: Function arguments + **kwargs: Function keyword arguments + + Returns: + Request dictionary with serialized function and arguments + """ + source, _ = get_function_source(func) + log.debug(f"Extracted source for {func.__name__} ({len(source)} bytes)") + + request = { + "function_name": func.__name__, + "function_code": source, + "dependencies": dependencies or [], + "system_dependencies": system_dependencies or [], + "accelerate_downloads": accelerate_downloads, + } + + # Serialize arguments using cloudpickle + base64 + if args: + serialized_args = [ + base64.b64encode(cloudpickle.dumps(arg)).decode("utf-8") for arg in args + ] + request["args"] = serialized_args + log.debug(f"Serialized {len(args)} positional args for {func.__name__}") + + if kwargs: + serialized_kwargs = { + k: base64.b64encode(cloudpickle.dumps(v)).decode("utf-8") + for k, v in kwargs.items() + } + request["kwargs"] = serialized_kwargs + log.debug(f"Serialized {len(kwargs)} keyword args for {func.__name__}") + + return request + + async def _execute_function(self, request: Dict[str, Any]) -> Dict[str, Any]: + """Execute function via direct HTTP POST to endpoint. + + Posts serialized function and arguments to /execute endpoint. + No job ID polling - waits for synchronous HTTP response. + + Args: + request: Request dictionary with function_code, args, kwargs + + Returns: + Response dictionary with success flag and result + + Raises: + httpx.HTTPError: If HTTP request fails + ValueError: If endpoint_url not available + """ + if not self.server.endpoint_url: + raise ValueError( + "Endpoint URL not available - endpoint may not be deployed" + ) + + execute_url = f"{self.server.endpoint_url}/execute" + + try: + async with get_authenticated_httpx_client(timeout=self.timeout) as client: + response = await client.post(execute_url, json=request) + response.raise_for_status() + return response.json() + except httpx.TimeoutException as e: + raise TimeoutError( + f"Execution timeout on {self.server.name} after {self.timeout}s: {e}" + ) from e + except httpx.HTTPStatusError as e: + # Truncate response body to prevent huge error messages + response_text = e.response.text + if len(response_text) > 500: + response_text = response_text[:500] + "... (truncated)" + raise RuntimeError( + f"HTTP error from endpoint {self.server.name}: " + f"{e.response.status_code} - {response_text}" + ) from e + except httpx.RequestError as e: + raise ConnectionError( + f"Failed to connect to endpoint {self.server.name} ({execute_url}): {e}" + ) from e + + async def _execute_via_user_route( + self, + func: Callable[..., Any], + method: str, + path: str, + *args: Any, + **kwargs: Any, + ) -> Any: + """Execute function by calling user-defined HTTP route. + + Maps function arguments to JSON request body and makes HTTP request + to the user-defined route. The response is parsed as JSON and returned directly. 
+ + Args: + func: Function being called (used for signature inspection) + method: HTTP method (GET, POST, PUT, DELETE, PATCH) + path: URL path (e.g., /api/process) + *args: Function positional arguments + **kwargs: Function keyword arguments + + Returns: + Function result (parsed from JSON response) + + Raises: + ValueError: If endpoint_url not available + TimeoutError: If request times out + RuntimeError: If HTTP error occurs + ConnectionError: If connection fails + """ + if not self.server.endpoint_url: + raise ValueError( + "Endpoint URL not available - endpoint may not be deployed" + ) + + # Get function signature to map args to parameter names + sig = inspect.signature(func) + params = list(sig.parameters.keys()) + + # Map positional args to parameter names + body = {} + for i, arg in enumerate(args): + if i < len(params): + body[params[i]] = arg + body.update(kwargs) + + # Construct full URL + url = f"{self.server.endpoint_url}{path}" + log.debug(f"Executing via user route: {method} {url}") + + try: + async with get_authenticated_httpx_client(timeout=self.timeout) as client: + response = await client.request(method, url, json=body) + response.raise_for_status() + result = response.json() + log.debug( + f"User route execution successful (type={type(result).__name__})" + ) + return result + except httpx.TimeoutException as e: + raise TimeoutError( + f"Execution timeout on {self.server.name} after {self.timeout}s: {e}" + ) from e + except httpx.HTTPStatusError as e: + # Truncate response body to prevent huge error messages + response_text = e.response.text + if len(response_text) > 500: + response_text = response_text[:500] + "... (truncated)" + raise RuntimeError( + f"HTTP error from endpoint {self.server.name}: " + f"{e.response.status_code} - {response_text}" + ) from e + except httpx.RequestError as e: + raise ConnectionError( + f"Failed to connect to endpoint {self.server.name} ({url}): {e}" + ) from e + + def _handle_response(self, response: Dict[str, Any]) -> Any: + """Deserialize and validate response. 
+ + Args: + response: Response dictionary from endpoint + + Returns: + Deserialized function result + + Raises: + ValueError: If response format is invalid + Exception: If response indicates error + """ + if not isinstance(response, dict): + raise ValueError(f"Invalid response type: {type(response)}") + + if response.get("success"): + result_b64 = response.get("result") + if result_b64 is None: + raise ValueError("Response marked success but result is None") + + try: + result = cloudpickle.loads(base64.b64decode(result_b64)) + log.debug( + f"Successfully deserialized response result (type={type(result).__name__})" + ) + return result + except Exception as e: + raise ValueError(f"Failed to deserialize result: {e}") from e + else: + error = response.get("error", "Unknown error") + log.warning(f"Remote execution failed: {error}") + raise Exception(f"Remote execution failed: {error}") diff --git a/src/tetra_rp/stubs/registry.py b/src/tetra_rp/stubs/registry.py index 9ea94f45..c6363726 100644 --- a/src/tetra_rp/stubs/registry.py +++ b/src/tetra_rp/stubs/registry.py @@ -4,10 +4,13 @@ from ..core.resources import ( CpuLiveServerless, CpuServerlessEndpoint, + LiveLoadBalancer, LiveServerless, + LoadBalancerSlsResource, ServerlessEndpoint, ) from .live_serverless import LiveServerlessStub +from .load_balancer_sls import LoadBalancerSlsStub from .serverless import ServerlessEndpointStub log = logging.getLogger(__name__) @@ -115,3 +118,53 @@ async def stubbed_resource( return stub.handle_response(response) return stubbed_resource + + +@stub_resource.register(LoadBalancerSlsResource) +def _(resource, **extra): + """Create stub for LoadBalancerSlsResource (HTTP-based execution).""" + stub = LoadBalancerSlsStub(resource) + + async def stubbed_resource( + func, + dependencies, + system_dependencies, + accelerate_downloads, + *args, + **kwargs, + ) -> dict: + return await stub( + func, + dependencies, + system_dependencies, + accelerate_downloads, + *args, + **kwargs, + ) + + return stubbed_resource + + +@stub_resource.register(LiveLoadBalancer) +def _(resource, **extra): + """Create stub for LiveLoadBalancer (HTTP-based execution, local testing).""" + stub = LoadBalancerSlsStub(resource) + + async def stubbed_resource( + func, + dependencies, + system_dependencies, + accelerate_downloads, + *args, + **kwargs, + ) -> dict: + return await stub( + func, + dependencies, + system_dependencies, + accelerate_downloads, + *args, + **kwargs, + ) + + return stubbed_resource diff --git a/tests/integration/test_lb_remote_execution.py b/tests/integration/test_lb_remote_execution.py new file mode 100644 index 00000000..4d34abf3 --- /dev/null +++ b/tests/integration/test_lb_remote_execution.py @@ -0,0 +1,306 @@ +"""Integration tests for @remote with LoadBalancerSlsResource. + +These tests verify the full flow of using @remote with load-balanced endpoints, +including local development with LiveLoadBalancer and HTTP execution. 
+""" + +import base64 +import pytest +from unittest.mock import MagicMock + +import cloudpickle + +from tetra_rp import remote, LiveLoadBalancer, LoadBalancerSlsResource + + +class TestRemoteWithLoadBalancerIntegration: + """Integration tests for @remote decorator with LB endpoints.""" + + def test_decorator_accepts_lb_resource_with_routing(self): + """Test that @remote accepts LoadBalancerSlsResource with method/path.""" + lb = LoadBalancerSlsResource(name="test-api", imageName="test:latest") + + @remote(lb, method="POST", path="/api/process") + async def process_data(x: int, y: int): + return {"result": x + y} + + # Should not raise - decorator accepts the parameters + assert hasattr(process_data, "__remote_config__") + assert process_data.__remote_config__["method"] == "POST" + assert process_data.__remote_config__["path"] == "/api/process" + + def test_decorator_validates_method_and_path_required(self): + """Test that @remote requires both method and path for LB resources.""" + lb = LoadBalancerSlsResource(name="test-api", imageName="test:latest") + + with pytest.raises(ValueError, match="requires both 'method' and 'path'"): + + @remote(lb) + async def missing_routing(): + pass + + def test_decorator_validates_invalid_http_method(self): + """Test that @remote rejects invalid HTTP methods.""" + lb = LoadBalancerSlsResource(name="test-api", imageName="test:latest") + + with pytest.raises(ValueError, match="must be one of"): + + @remote(lb, method="INVALID", path="/api/test") + async def bad_method(): + pass + + def test_decorator_validates_path_starts_with_slash(self): + """Test that @remote requires path to start with /.""" + lb = LoadBalancerSlsResource(name="test-api", imageName="test:latest") + + with pytest.raises(ValueError, match="must start with '/'"): + + @remote(lb, method="GET", path="api/test") + async def bad_path(): + pass + + @pytest.mark.asyncio + async def test_remote_function_serialization_roundtrip(self): + """Test that function code and args serialize/deserialize correctly.""" + from tetra_rp.stubs.load_balancer_sls import LoadBalancerSlsStub + + mock_resource = MagicMock() + stub = LoadBalancerSlsStub(mock_resource) + + def add(x: int, y: int) -> int: + """Simple add function.""" + return x + y + + # Prepare request + request = stub._prepare_request(add, None, None, True, 5, 3) + + # Verify request structure + assert request["function_name"] == "add" + assert "def add" in request["function_code"] + assert len(request["args"]) == 2 + + # Deserialize and verify arguments + arg0 = cloudpickle.loads(base64.b64decode(request["args"][0])) + arg1 = cloudpickle.loads(base64.b64decode(request["args"][1])) + assert arg0 == 5 + assert arg1 == 3 + + @pytest.mark.asyncio + async def test_stub_response_deserialization(self): + """Test that response deserialization works correctly.""" + from tetra_rp.stubs.load_balancer_sls import LoadBalancerSlsStub + + mock_resource = MagicMock() + stub = LoadBalancerSlsStub(mock_resource) + + result_value = {"status": "success", "count": 42} + result_b64 = base64.b64encode(cloudpickle.dumps(result_value)).decode("utf-8") + + response = {"success": True, "result": result_b64} + + # Handle response + result = stub._handle_response(response) + + assert result == result_value + + def test_live_load_balancer_creation(self): + """Test that LiveLoadBalancer can be created and used with @remote.""" + lb = LiveLoadBalancer(name="test-live-api") + + @remote(lb, method="POST", path="/api/echo") + async def echo(message: str): + return {"echo": message} + + 
# Verify resource is correctly configured + # Note: name may have "-fb" appended by flash boot validator + assert "test-live-api" in lb.name + assert "tetra-rp-lb" in lb.imageName + assert echo.__remote_config__["method"] == "POST" + + def test_live_load_balancer_image_locked(self): + """Test that LiveLoadBalancer locks the image to Tetra LB image.""" + lb = LiveLoadBalancer(name="test-api") + + # Verify image is locked and cannot be overridden + original_image = lb.imageName + assert "tetra-rp-lb" in original_image + + # Try to set a different image (should be ignored due to property) + lb.imageName = "custom-image:latest" + + # Image should still be locked to Tetra + assert lb.imageName == original_image + + def test_load_balancer_vs_queue_based_endpoints(self): + """Test that LB and QB endpoints have different characteristics.""" + from tetra_rp import ServerlessEndpoint + + lb = LoadBalancerSlsResource(name="lb-api", imageName="test:latest") + qb = ServerlessEndpoint(name="qb-api", imageName="test:latest") + + @remote(lb, method="POST", path="/api/echo") + async def lb_func(): + return "lb" + + @remote(qb) + async def qb_func(): + return "qb" + + # Both should have __remote_config__ + assert hasattr(lb_func, "__remote_config__") + assert hasattr(qb_func, "__remote_config__") + + # LB should have routing config + assert lb_func.__remote_config__["method"] == "POST" + assert lb_func.__remote_config__["path"] == "/api/echo" + + # QB should have None values for routing (not LB-specific) + assert qb_func.__remote_config__["method"] is None + assert qb_func.__remote_config__["path"] is None + + def test_live_load_balancer_handler_includes_execute_endpoint(self): + """Test that generated handler for LiveLoadBalancer includes /execute endpoint.""" + from tetra_rp.cli.commands.build_utils.lb_handler_generator import ( + LBHandlerGenerator, + ) + from datetime import datetime, timezone + from pathlib import Path + import tempfile + + # Create a manifest for LiveLoadBalancer + manifest = { + "version": "1.0", + "generated_at": datetime.now(timezone.utc) + .isoformat() + .replace("+00:00", "Z"), + "project_name": "test-project", + "resources": { + "test-api": { + "resource_type": "LiveLoadBalancer", + "handler_file": "handler_test_api.py", + "functions": [ + { + "name": "process_data", + "module": "api.endpoints", + "is_async": True, + "is_class": False, + "http_method": "POST", + "http_path": "/api/process", + } + ], + } + }, + } + + with tempfile.TemporaryDirectory() as tmpdir: + build_dir = Path(tmpdir) + generator = LBHandlerGenerator(manifest, build_dir) + handlers = generator.generate_handlers() + + assert len(handlers) == 1 + handler_path = handlers[0] + handler_code = handler_path.read_text() + + # Verify the handler includes include_execute=True for LiveLoadBalancer + assert "include_execute=True" in handler_code + assert ( + "create_lb_handler(ROUTE_REGISTRY, include_execute=True)" + in handler_code + ) + + def test_deployed_load_balancer_handler_excludes_execute_endpoint(self): + """Test that generated handler for deployed LoadBalancerSlsResource excludes /execute endpoint.""" + from tetra_rp.cli.commands.build_utils.lb_handler_generator import ( + LBHandlerGenerator, + ) + from datetime import datetime, timezone + from pathlib import Path + import tempfile + + # Create a manifest for deployed LoadBalancerSlsResource + manifest = { + "version": "1.0", + "generated_at": datetime.now(timezone.utc) + .isoformat() + .replace("+00:00", "Z"), + "project_name": "test-project", + "resources": { 
+ "api-service": { + "resource_type": "LoadBalancerSlsResource", + "handler_file": "handler_api_service.py", + "functions": [ + { + "name": "process_data", + "module": "api.endpoints", + "is_async": True, + "is_class": False, + "http_method": "POST", + "http_path": "/api/process", + } + ], + } + }, + } + + with tempfile.TemporaryDirectory() as tmpdir: + build_dir = Path(tmpdir) + generator = LBHandlerGenerator(manifest, build_dir) + handlers = generator.generate_handlers() + + assert len(handlers) == 1 + handler_path = handlers[0] + handler_code = handler_path.read_text() + + # Verify the handler includes include_execute=False for deployed endpoints + assert "include_execute=False" in handler_code + assert ( + "create_lb_handler(ROUTE_REGISTRY, include_execute=False)" + in handler_code + ) + + def test_scanner_discovers_load_balancer_resources(self): + """Test that scanner can discover LiveLoadBalancer and LoadBalancerSlsResource.""" + from tetra_rp.cli.commands.build_utils.scanner import RemoteDecoratorScanner + from pathlib import Path + import tempfile + + # Create temporary Python file with LoadBalancer resource + code = """ +from tetra_rp import LiveLoadBalancer, LoadBalancerSlsResource, remote + +# Test LiveLoadBalancer discovery +api = LiveLoadBalancer(name="test-api") + +@remote(api, method="POST", path="/api/process") +async def process_data(x: int): + return {"result": x} + +# Test LoadBalancerSlsResource discovery +deployed = LoadBalancerSlsResource(name="deployed-api", imageName="test:latest") + +@remote(deployed, method="GET", path="/api/status") +def get_status(): + return {"status": "ok"} +""" + + with tempfile.TemporaryDirectory() as tmpdir: + project_dir = Path(tmpdir) + py_file = project_dir / "test_api.py" + py_file.write_text(code) + + scanner = RemoteDecoratorScanner(project_dir) + functions = scanner.discover_remote_functions() + + # Verify both resources were discovered + assert len(functions) == 2 + + # Verify resource types are correctly identified + resource_types = {f.resource_type for f in functions} + assert "LiveLoadBalancer" in resource_types + assert "LoadBalancerSlsResource" in resource_types + + # Verify resource configs were extracted + assert "test-api" in scanner.resource_types + assert scanner.resource_types["test-api"] == "LiveLoadBalancer" + assert "deployed-api" in scanner.resource_types + assert scanner.resource_types["deployed-api"] == "LoadBalancerSlsResource" diff --git a/tests/unit/core/utils/test_http.py b/tests/unit/core/utils/test_http.py new file mode 100644 index 00000000..d26c0954 --- /dev/null +++ b/tests/unit/core/utils/test_http.py @@ -0,0 +1,125 @@ +"""Tests for HTTP utilities for RunPod API communication.""" + +import requests +from tetra_rp.core.utils.http import ( + get_authenticated_httpx_client, + get_authenticated_requests_session, +) + + +class TestGetAuthenticatedHttpxClient: + """Test the get_authenticated_httpx_client utility function.""" + + def test_get_authenticated_httpx_client_with_api_key(self, monkeypatch): + """Test client includes auth header when API key is set.""" + monkeypatch.setenv("RUNPOD_API_KEY", "test-api-key-123") + + client = get_authenticated_httpx_client() + + assert client is not None + assert "Authorization" in client.headers + assert client.headers["Authorization"] == "Bearer test-api-key-123" + + def test_get_authenticated_httpx_client_without_api_key(self, monkeypatch): + """Test client works without API key (no auth header).""" + monkeypatch.delenv("RUNPOD_API_KEY", raising=False) + + client = 
get_authenticated_httpx_client() + + assert client is not None + assert "Authorization" not in client.headers + + def test_get_authenticated_httpx_client_custom_timeout(self, monkeypatch): + """Test client respects custom timeout.""" + monkeypatch.setenv("RUNPOD_API_KEY", "test-key") + + client = get_authenticated_httpx_client(timeout=60.0) + + assert client is not None + assert client.timeout.read == 60.0 + + def test_get_authenticated_httpx_client_default_timeout(self, monkeypatch): + """Test client uses default timeout when not specified.""" + monkeypatch.setenv("RUNPOD_API_KEY", "test-key") + + client = get_authenticated_httpx_client() + + assert client is not None + assert client.timeout.read == 30.0 + + def test_get_authenticated_httpx_client_timeout_none_uses_default( + self, monkeypatch + ): + """Test client uses default timeout when explicitly passed None.""" + monkeypatch.setenv("RUNPOD_API_KEY", "test-key") + + client = get_authenticated_httpx_client(timeout=None) + + assert client is not None + assert client.timeout.read == 30.0 + + def test_get_authenticated_httpx_client_empty_api_key_no_header(self, monkeypatch): + """Test that empty API key doesn't add Authorization header.""" + monkeypatch.setenv("RUNPOD_API_KEY", "") + + client = get_authenticated_httpx_client() + + assert client is not None + # Empty string is falsy, so no auth header should be added + assert "Authorization" not in client.headers + + def test_get_authenticated_httpx_client_zero_timeout(self, monkeypatch): + """Test client handles zero timeout correctly.""" + monkeypatch.setenv("RUNPOD_API_KEY", "test-key") + + client = get_authenticated_httpx_client(timeout=0.0) + + assert client is not None + assert client.timeout.read == 0.0 + + +class TestGetAuthenticatedRequestsSession: + """Test the get_authenticated_requests_session utility function.""" + + def test_get_authenticated_requests_session_with_api_key(self, monkeypatch): + """Test session includes auth header when API key is set.""" + monkeypatch.setenv("RUNPOD_API_KEY", "test-api-key-123") + + session = get_authenticated_requests_session() + + assert session is not None + assert "Authorization" in session.headers + assert session.headers["Authorization"] == "Bearer test-api-key-123" + session.close() + + def test_get_authenticated_requests_session_without_api_key(self, monkeypatch): + """Test session works without API key (no auth header).""" + monkeypatch.delenv("RUNPOD_API_KEY", raising=False) + + session = get_authenticated_requests_session() + + assert session is not None + assert "Authorization" not in session.headers + session.close() + + def test_get_authenticated_requests_session_empty_api_key_no_header( + self, monkeypatch + ): + """Test that empty API key doesn't add Authorization header.""" + monkeypatch.setenv("RUNPOD_API_KEY", "") + + session = get_authenticated_requests_session() + + assert session is not None + # Empty string is falsy, so no auth header should be added + assert "Authorization" not in session.headers + session.close() + + def test_get_authenticated_requests_session_is_valid_session(self, monkeypatch): + """Test returned object is a valid requests.Session.""" + monkeypatch.setenv("RUNPOD_API_KEY", "test-key") + + session = get_authenticated_requests_session() + + assert isinstance(session, requests.Session) + session.close() diff --git a/tests/unit/resources/test_cpu_load_balancer.py b/tests/unit/resources/test_cpu_load_balancer.py new file mode 100644 index 00000000..92a0955d --- /dev/null +++ 
b/tests/unit/resources/test_cpu_load_balancer.py @@ -0,0 +1,330 @@ +""" +Tests for CpuLoadBalancerSlsResource CPU-specific functionality. + +Ensures CPU load balancers exclude GPU-specific fields from RunPod API payloads +and override GPU defaults to CPU-appropriate values. +""" + +import os + +from tetra_rp.core.resources.cpu import CpuInstanceType +from tetra_rp.core.resources.load_balancer_sls_resource import ( + CpuLoadBalancerSlsResource, +) +from tetra_rp.core.resources.serverless import ServerlessType, ServerlessScalerType +from tetra_rp.core.resources.serverless_cpu import CpuServerlessEndpoint + +# Set a dummy API key for tests that create ResourceManager instances +os.environ.setdefault("RUNPOD_API_KEY", "test-key-for-unit-tests") + + +class TestCpuLoadBalancerDefaults: + """Test CpuLoadBalancerSlsResource default configuration.""" + + def test_cpu_load_balancer_creation_with_defaults(self): + """Test creating CpuLoadBalancerSlsResource with minimal config.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + ) + + assert lb.name == "test-cpu-lb-fb" + assert lb.imageName == "test/image:latest" + assert lb.type == ServerlessType.LB + assert lb.scalerType == ServerlessScalerType.REQUEST_COUNT + + def test_cpu_load_balancer_with_custom_instances(self): + """Test explicit CPU instance type configuration.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4, CpuInstanceType.CPU3G_2_8], + ) + + assert lb.instanceIds == [CpuInstanceType.CPU3G_1_4, CpuInstanceType.CPU3G_2_8] + + def test_cpu_load_balancer_any_expansion(self): + """Test CpuInstanceType.ANY expansion.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + instanceIds=[CpuInstanceType.ANY], + ) + + # ANY should expand to all CPU types + assert lb.instanceIds == CpuInstanceType.all() + assert CpuInstanceType.ANY not in lb.instanceIds + assert len(lb.instanceIds) == 12 + + +class TestCpuLoadBalancerGpuFieldOverride: + """Test that GPU fields are correctly overridden to CPU defaults.""" + + def test_sync_cpu_fields_overrides_gpu_defaults(self): + """Test _sync_cpu_fields overrides GPU defaults to CPU values.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4], + ) + + # GPU fields should be overridden to CPU defaults + assert lb.gpuCount == 0, "gpuCount should be 0 for CPU endpoints" + assert lb.allowedCudaVersions == "", "allowedCudaVersions should be empty" + assert lb.gpuIds == "", "gpuIds should be empty" + + def test_gpu_fields_not_hardcoded_in_constructor(self): + """Test that GPU fields are overridden even if passed to constructor.""" + # Attempting to set GPU-specific fields should be overridden + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4], + gpuCount=4, # Should be overridden + allowedCudaVersions="12.0", # Should be overridden + ) + + assert lb.gpuCount == 0 + assert lb.allowedCudaVersions == "" + + +class TestCpuLoadBalancerInputOnlyExclusion: + """Test that _input_only set contains all GPU-specific fields.""" + + def test_input_only_contains_gpu_fields(self): + """Test _input_only set contains all GPU-specific fields.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + ) + + # Critical fields that must be excluded + required_excludes = { + 
"gpuCount", + "allowedCudaVersions", + "gpuIds", + "cudaVersions", + "gpus", + } + for field in required_excludes: + assert field in lb._input_only, f"{field} must be in _input_only" + + def test_input_only_includes_common_fields(self): + """Test _input_only includes expected common fields.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + ) + + expected_fields = { + "id", + "datacenter", + "env", + "flashboot", + "imageName", + "networkVolume", + } + for field in expected_fields: + assert field in lb._input_only + + +class TestCpuLoadBalancerPayloadExclusion: + """Test that GPU fields are excluded from model_dump payload.""" + + def test_model_dump_excludes_gpu_fields_from_payload(self): + """Test model_dump payload excludes GPU fields from API.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4], + ) + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # GPU fields must NOT be in payload + assert "gpuCount" not in payload, "gpuCount should be excluded from payload" + assert "allowedCudaVersions" not in payload, ( + "allowedCudaVersions should be excluded" + ) + assert "gpuIds" not in payload, "gpuIds should be excluded" + assert "cudaVersions" not in payload, "cudaVersions should be excluded" + assert "gpus" not in payload, "gpus should be excluded" + + def test_model_dump_includes_cpu_fields_in_payload(self): + """Test model_dump payload includes CPU-specific fields.""" + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4], + ) + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # CPU fields must be in payload + assert "instanceIds" in payload + assert payload["instanceIds"] == ["cpu3g-1-4"] + + def test_model_dump_contains_required_lb_fields(self): + """Test model_dump includes required load balancer fields.""" + lb = CpuLoadBalancerSlsResource( + name="prod-api", + imageName="myapp/api:v1", + instanceIds=[CpuInstanceType.CPU3G_1_4], + workersMin=1, + workersMax=5, + ) + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # Required LB fields + assert payload["name"] == "prod-api-fb" + assert payload["type"] == "LB" + assert payload["scalerType"] == "REQUEST_COUNT" + assert payload["workersMin"] == 1 + assert payload["workersMax"] == 5 + + def test_model_dump_excludes_template_image_name(self): + """Test imageName is excluded (sent via template object).""" + lb = CpuLoadBalancerSlsResource( + name="test", + imageName="test/image:latest", + ) + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # imageName should be excluded (it's template-specific) + assert "imageName" not in payload + + def test_model_dump_includes_template_object(self): + """Test template object is included in payload.""" + lb = CpuLoadBalancerSlsResource( + name="test", + imageName="test/image:latest", + ) + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # Template object should be present + assert "template" in payload + assert isinstance(payload["template"], dict) + assert "imageName" in payload["template"] + + +class TestCpuLoadBalancerComparison: + """Compare CpuLoadBalancerSlsResource with CpuServerlessEndpoint for consistency.""" + + def test_input_only_alignment_with_cpu_serverless(self): + """Test _input_only aligns with 
CpuServerlessEndpoint for GPU fields.""" + lb = CpuLoadBalancerSlsResource( + name="lb", + imageName="test:latest", + ) + + serverless = CpuServerlessEndpoint( + name="serverless", + imageName="test:latest", + ) + + # Critical GPU fields should be in both _input_only sets + gpu_fields = { + "gpuCount", + "allowedCudaVersions", + "gpuIds", + "cudaVersions", + "gpus", + } + + for field in gpu_fields: + assert field in lb._input_only, f"{field} should be in LB _input_only" + assert field in serverless._input_only, ( + f"{field} should be in Serverless _input_only" + ) + + def test_gpu_field_sync_consistency(self): + """Test GPU field values match between LB and Serverless.""" + lb = CpuLoadBalancerSlsResource( + name="lb", + imageName="test:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4], + ) + + serverless = CpuServerlessEndpoint( + name="serverless", + imageName="test:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4], + ) + + # Both should have identical GPU field values + assert lb.gpuCount == serverless.gpuCount == 0 + assert lb.allowedCudaVersions == serverless.allowedCudaVersions == "" + assert lb.gpuIds == serverless.gpuIds == "" + + +class TestCpuLoadBalancerIntegration: + """Integration tests for CPU load balancer deployment payloads.""" + + def test_deployment_payload_structure_is_valid(self): + """Test deployment payload has correct structure for RunPod API.""" + lb = CpuLoadBalancerSlsResource( + name="prod-api", + imageName="myapp/api:v1", + instanceIds=[CpuInstanceType.CPU3G_1_4], + workersMin=1, + workersMax=5, + scalerValue=10, + ) + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # Verify payload structure + required_fields = {"name", "type", "scalerType", "workersMin", "workersMax"} + for field in required_fields: + assert field in payload, f"Required field {field} not in payload" + + # Verify no GPU fields + gpu_fields = {"gpuCount", "allowedCudaVersions", "gpuIds"} + for field in gpu_fields: + assert field not in payload, f"GPU field {field} should not be in payload" + + def test_cpu_disk_sizing_respects_limits(self): + """Test that CPU load balancer doesn't raise disk sizing errors on creation.""" + # This test verifies that we can create a CPU LB without disk sizing errors + # The actual disk sizing is applied when needed via _apply_cpu_disk_sizing + lb = CpuLoadBalancerSlsResource( + name="test-cpu-lb", + imageName="test/image:latest", + instanceIds=[CpuInstanceType.CPU3G_1_4], + ) + + # Should have a template + assert lb.template is not None + assert lb.template.imageName == "test/image:latest" + + def test_cpu_load_balancer_with_env_vars(self): + """Test CPU load balancer with environment variables.""" + env = { + "FLASH_APP": "my_app", + "LOG_LEVEL": "DEBUG", + } + + lb = CpuLoadBalancerSlsResource( + name="test", + imageName="test/image:latest", + env=env, + ) + + assert lb.env == env + + def test_cpu_load_balancer_with_worker_config(self): + """Test CPU load balancer with worker scaling configuration.""" + lb = CpuLoadBalancerSlsResource( + name="test", + imageName="test/image:latest", + workersMin=1, + workersMax=5, + scalerValue=10, + ) + + assert lb.workersMin == 1 + assert lb.workersMax == 5 + assert lb.scalerValue == 10 diff --git a/tests/unit/resources/test_live_load_balancer.py b/tests/unit/resources/test_live_load_balancer.py new file mode 100644 index 00000000..43981292 --- /dev/null +++ b/tests/unit/resources/test_live_load_balancer.py @@ -0,0 +1,265 @@ +""" +Unit tests for LiveLoadBalancer class and template 
serialization. +""" + +import os + +import pytest + +from tetra_rp.core.resources.cpu import CpuInstanceType +from tetra_rp.core.resources.live_serverless import ( + CpuLiveLoadBalancer, + LiveLoadBalancer, +) +from tetra_rp.core.resources.load_balancer_sls_resource import LoadBalancerSlsResource + + +class TestLiveLoadBalancer: + """Test LiveLoadBalancer class behavior.""" + + def test_live_load_balancer_creation_with_local_tag(self, monkeypatch): + """Test LiveLoadBalancer creates with local image tag.""" + monkeypatch.setenv("TETRA_IMAGE_TAG", "local") + # Need to reload the module to pick up new env var + import importlib + import tetra_rp.core.resources.live_serverless as ls_module + + importlib.reload(ls_module) + + lb = ls_module.LiveLoadBalancer(name="test-lb") + assert lb.imageName == "runpod/tetra-rp-lb:local" + assert lb.template is not None + assert lb.template.imageName == "runpod/tetra-rp-lb:local" + + def test_live_load_balancer_default_image_tag(self): + """Test LiveLoadBalancer uses default image tag.""" + # Clear any custom tag + os.environ.pop("TETRA_IMAGE_TAG", None) + + lb = LiveLoadBalancer(name="test-lb") + + assert "runpod/tetra-rp-lb:" in lb.imageName + assert lb.template is not None + assert lb.template.imageName == lb.imageName + + def test_live_load_balancer_template_creation(self): + """Test LiveLoadBalancer creates proper template from imageName.""" + lb = LiveLoadBalancer(name="cpu_processor") + + # Should have a template created from imageName + assert lb.template is not None + assert lb.template.imageName == lb.imageName + # Template name uses resource IDs, not the original name + assert "LiveLoadBalancer" in lb.template.name + assert "PodTemplate" in lb.template.name + + def test_live_load_balancer_template_env_variables(self): + """Test LiveLoadBalancer template includes environment variables.""" + lb = LiveLoadBalancer( + name="test-lb", + env={"CUSTOM_VAR": "custom_value"}, + ) + + assert lb.template is not None + assert lb.template.env is not None + assert len(lb.template.env) > 0 + + # Check for custom env var + custom_vars = [kv for kv in lb.template.env if kv.key == "CUSTOM_VAR"] + assert len(custom_vars) == 1 + assert custom_vars[0].value == "custom_value" + + def test_live_load_balancer_payload_serialization(self): + """Test LiveLoadBalancer serializes correctly for GraphQL deployment.""" + lb = LiveLoadBalancer(name="data_processor") + + # Generate payload as would be sent to RunPod + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # Template must be in payload (not imageName since that's in _input_only) + assert "template" in payload + assert "imageName" not in payload + + # Template must have all required fields + template = payload["template"] + assert "imageName" in template + assert "name" in template + assert template["imageName"] == lb.imageName + + def test_live_load_balancer_type_is_lb(self): + """Test LiveLoadBalancer has type=LB.""" + lb = LiveLoadBalancer(name="test-lb") + + assert lb.type.value == "LB" + assert str(lb.type) == "ServerlessType.LB" + + def test_live_load_balancer_scaler_is_request_count(self): + """Test LiveLoadBalancer uses REQUEST_COUNT scaler.""" + lb = LiveLoadBalancer(name="test-lb") + + assert lb.scalerType.value == "REQUEST_COUNT" + + +class TestLoadBalancerSlsResourceTemplate: + """Test LoadBalancerSlsResource template handling.""" + + def test_load_balancer_sls_with_image_name(self): + """Test LoadBalancerSlsResource creates template from imageName.""" + lb = 
LoadBalancerSlsResource( + name="test-lb", + imageName="runpod/tetra-rp-lb:latest", + ) + + assert lb.template is not None + assert lb.template.imageName == "runpod/tetra-rp-lb:latest" + + def test_load_balancer_sls_requires_image_template_or_id(self): + """Test LoadBalancerSlsResource requires one of: imageName, template, templateId.""" + with pytest.raises( + ValueError, + match="Either imageName, template, or templateId must be provided", + ): + LoadBalancerSlsResource(name="test-lb") + + def test_load_balancer_sls_with_template_id(self): + """Test LoadBalancerSlsResource works with templateId.""" + lb = LoadBalancerSlsResource( + name="test-lb", + templateId="template-123", + ) + + assert lb.templateId == "template-123" + assert lb.template is None + + +class TestTemplateSerializationRoundtrip: + """Test that template serialization works correctly for GraphQL.""" + + def test_live_load_balancer_serialization_roundtrip(self): + """Test that LiveLoadBalancer can be serialized and contains template.""" + lb = LiveLoadBalancer( + name="test-service", + env={"API_KEY": "secret123"}, + ) + + # Simulate what gets sent to RunPod + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # Verify GraphQL payload has template + assert "template" in payload, "Template must be in GraphQL payload" + assert payload["template"]["imageName"] is not None + assert payload["template"]["name"] is not None + + # Verify imageName is NOT in payload (it's in _input_only) + assert "imageName" not in payload + + # Verify the template has the correct image + assert "tetra-rp-lb:" in payload["template"]["imageName"], ( + "Must have load-balancer image" + ) + + def test_template_env_serialization(self): + """Test template environment variables serialize correctly.""" + lb = LiveLoadBalancer( + name="test-lb", + env={"VAR1": "value1", "VAR2": "value2"}, + ) + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + template_env = payload["template"]["env"] + assert isinstance(template_env, list) + assert len(template_env) >= 2 + + # Check env vars are serialized as {key, value} objects + var_keys = {kv["key"] for kv in template_env} + assert "VAR1" in var_keys + assert "VAR2" in var_keys + + +class TestCpuLiveLoadBalancer: + """Test CpuLiveLoadBalancer class behavior.""" + + def test_cpu_live_load_balancer_creation_with_local_tag(self, monkeypatch): + """Test CpuLiveLoadBalancer creates with local image tag.""" + monkeypatch.setenv("TETRA_IMAGE_TAG", "local") + # Need to reload the module to pick up new env var + import importlib + + import tetra_rp.core.resources.live_serverless as ls_module + + importlib.reload(ls_module) + + lb = ls_module.CpuLiveLoadBalancer(name="test-lb") + assert lb.imageName == "runpod/tetra-rp-lb-cpu:local" + assert lb.template is not None + assert lb.template.imageName == "runpod/tetra-rp-lb-cpu:local" + + def test_cpu_live_load_balancer_default_image_tag(self): + """Test CpuLiveLoadBalancer uses default CPU LB image tag.""" + # Clear any custom tag + os.environ.pop("TETRA_IMAGE_TAG", None) + + lb = CpuLiveLoadBalancer(name="test-lb") + + assert "runpod/tetra-rp-lb-cpu:" in lb.imageName + assert lb.template is not None + assert lb.template.imageName == lb.imageName + + def test_cpu_live_load_balancer_defaults_to_cpu_any(self): + """Test CpuLiveLoadBalancer expands CPU_ANY to all available types.""" + lb = CpuLiveLoadBalancer(name="test-lb") + + # ANY should expand to all available CPU instance types + assert lb.instanceIds == 
CpuInstanceType.all() + assert len(lb.instanceIds) == 12 # 4 cpu3g + 4 cpu3c + 4 cpu5c + + def test_cpu_live_load_balancer_with_specific_cpu_instances(self): + """Test CpuLiveLoadBalancer with explicit CPU instances.""" + lb = CpuLiveLoadBalancer( + name="test-lb", + instanceIds=[CpuInstanceType.CPU3G_1_4], + ) + + assert lb.instanceIds == [CpuInstanceType.CPU3G_1_4] + + def test_cpu_live_load_balancer_type_is_lb(self): + """Test CpuLiveLoadBalancer has type=LB.""" + lb = CpuLiveLoadBalancer(name="test-lb") + + assert lb.type.value == "LB" + assert str(lb.type) == "ServerlessType.LB" + + def test_cpu_live_load_balancer_scaler_is_request_count(self): + """Test CpuLiveLoadBalancer uses REQUEST_COUNT scaler.""" + lb = CpuLiveLoadBalancer(name="test-lb") + + assert lb.scalerType.value == "REQUEST_COUNT" + + def test_cpu_live_load_balancer_payload_serialization(self): + """Test CpuLiveLoadBalancer serializes correctly for GraphQL deployment.""" + lb = CpuLiveLoadBalancer(name="data_processor") + + # Generate payload as would be sent to RunPod + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # Template must be in payload (not imageName since that's in _input_only) + assert "template" in payload + assert "imageName" not in payload + + # Template must have all required fields + template = payload["template"] + assert "imageName" in template + assert "name" in template + assert template["imageName"] == lb.imageName + + def test_cpu_live_load_balancer_excludes_gpu_fields(self): + """Test CpuLiveLoadBalancer excludes GPU fields from payload.""" + lb = CpuLiveLoadBalancer(name="test-lb") + + payload = lb.model_dump(exclude=lb._input_only, exclude_none=True, mode="json") + + # GPU-specific fields should not be in payload + assert "gpus" not in payload + assert "gpuIds" not in payload + assert "cudaVersions" not in payload diff --git a/tests/unit/test_load_balancer_sls_resource.py b/tests/unit/test_load_balancer_sls_resource.py new file mode 100644 index 00000000..d73f694b --- /dev/null +++ b/tests/unit/test_load_balancer_sls_resource.py @@ -0,0 +1,525 @@ +""" +Tests for LoadBalancerSlsResource provisioning and health checks. 
+""" + +import os + +import pytest +from unittest.mock import AsyncMock, MagicMock, patch + +from tetra_rp.core.resources import ( + LoadBalancerSlsResource, + ServerlessType, + ServerlessScalerType, +) +from tetra_rp.core.resources.serverless import ServerlessResource + +# Set a dummy API key for tests that create ResourceManager instances +os.environ.setdefault("RUNPOD_API_KEY", "test-key-for-unit-tests") + + +class TestLoadBalancerSlsResourceCreation: + """Test LoadBalancerSlsResource creation and validation.""" + + def test_create_with_defaults(self): + """Test creating LoadBalancerSlsResource with minimal config.""" + resource = LoadBalancerSlsResource( + name="test-endpoint", + imageName="test-image:latest", + ) + + # Note: name gets -fb suffix added by sync_input_fields due to flashboot=True + assert resource.name == "test-endpoint-fb" + assert resource.imageName == "test-image:latest" + assert resource.type == ServerlessType.LB + assert resource.scalerType == ServerlessScalerType.REQUEST_COUNT + + def test_type_always_lb(self): + """Test that type is always LB regardless of input.""" + # Try to set type to QB - should be overridden to LB + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + type=ServerlessType.QB, # This should be overridden + ) + + assert resource.type == ServerlessType.LB + + def test_scaler_type_defaults_to_request_count(self): + """Test that scaler type defaults to REQUEST_COUNT for LB.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + ) + + assert resource.scalerType == ServerlessScalerType.REQUEST_COUNT + + def test_validate_lb_configuration_rejects_queue_delay(self): + """Test that QUEUE_DELAY scaler is rejected for LB endpoints.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + scalerType=ServerlessScalerType.QUEUE_DELAY, + ) + + with pytest.raises(ValueError, match="requires REQUEST_COUNT scaler"): + resource._validate_lb_configuration() + + def test_with_custom_env_vars(self): + """Test creating LB resource with custom environment variables.""" + env = { + "FLASH_APP": "my_app", + "LOG_LEVEL": "DEBUG", + } + + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + env=env, + ) + + assert resource.env == env + + def test_with_worker_config(self): + """Test creating LB resource with worker scaling config.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + workersMin=1, + workersMax=5, + scalerValue=10, + ) + + assert resource.workersMin == 1 + assert resource.workersMax == 5 + assert resource.scalerValue == 10 + + def test_endpoint_url_format_for_load_balanced_endpoints(self): + """Test that endpoint_url uses load-balanced format, not v2 API format.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="6g2hfns3ar5pti", + ) + + # Load-balanced endpoints use: https://{id}.api.runpod.ai + # NOT: https://api.runpod.ai/v2/{id} + assert resource.endpoint_url == "https://6g2hfns3ar5pti.api.runpod.ai" + + def test_endpoint_url_raises_without_id(self): + """Test that endpoint_url raises error when endpoint ID not set.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + ) + + with pytest.raises(ValueError, match="Endpoint ID not set"): + _ = resource.endpoint_url + + +class TestLoadBalancerSlsResourceHealthCheck: + """Test health check functionality.""" + + @staticmethod + def _create_mock_client( + status_code: int = 200, error: Exception = None + ) -> MagicMock: + """Create properly 
configured async context manager mock client.""" + mock_response = AsyncMock() + mock_response.status_code = status_code + mock_client = MagicMock() + if error: + mock_client.get = AsyncMock(side_effect=error) + else: + mock_client.get = AsyncMock(return_value=mock_response) + mock_client.__aenter__ = AsyncMock(return_value=mock_client) + mock_client.__aexit__ = AsyncMock(return_value=None) + return mock_client + + @pytest.mark.asyncio + async def test_check_ping_endpoint_success(self): + """Test successful ping endpoint check with ID set.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + mock_client = self._create_mock_client(200) + with ( + patch.object( + LoadBalancerSlsResource, + "endpoint_url", + new_callable=lambda: property(lambda self: "https://test-endpoint.com"), + ), + patch( + "tetra_rp.core.utils.http.httpx.AsyncClient", + return_value=mock_client, + ), + ): + result = await resource._check_ping_endpoint() + + assert result is True + + @pytest.mark.asyncio + async def test_check_ping_endpoint_initializing(self): + """Test ping endpoint returning 204 (initializing).""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + mock_client = self._create_mock_client(204) + with ( + patch.object( + LoadBalancerSlsResource, + "endpoint_url", + new_callable=lambda: property(lambda self: "https://test-endpoint.com"), + ), + patch( + "tetra_rp.core.utils.http.httpx.AsyncClient", + return_value=mock_client, + ), + ): + result = await resource._check_ping_endpoint() + + assert result is True + + @pytest.mark.asyncio + async def test_check_ping_endpoint_failure(self): + """Test ping endpoint returning unhealthy status.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + mock_client = self._create_mock_client(503) + with ( + patch.object( + LoadBalancerSlsResource, + "endpoint_url", + new_callable=lambda: property(lambda self: "https://test-endpoint.com"), + ), + patch( + "tetra_rp.core.utils.http.httpx.AsyncClient", + return_value=mock_client, + ), + ): + result = await resource._check_ping_endpoint() + + assert result is False + + @pytest.mark.asyncio + async def test_check_ping_endpoint_connection_error(self): + """Test ping endpoint with connection error.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + mock_client = self._create_mock_client( + error=ConnectionError("Connection refused") + ) + with ( + patch.object( + LoadBalancerSlsResource, + "endpoint_url", + new_callable=lambda: property(lambda self: "https://test-endpoint.com"), + ), + patch( + "tetra_rp.core.utils.http.httpx.AsyncClient", + return_value=mock_client, + ), + ): + result = await resource._check_ping_endpoint() + + assert result is False + + @pytest.mark.asyncio + async def test_check_ping_endpoint_no_id(self): + """Test ping check when endpoint ID is not set.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + # id not set + ) + + result = await resource._check_ping_endpoint() + assert result is False + + @pytest.mark.asyncio + async def test_wait_for_health_success(self): + """Test health check polling with successful response.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + with patch.object(resource, "_check_ping_endpoint") as mock_check: + mock_check.return_value = True + + result = await 
resource._wait_for_health(max_retries=3) + + assert result is True + mock_check.assert_called_once() + + @pytest.mark.asyncio + async def test_wait_for_health_retry_then_success(self): + """Test health check polling with retries before success.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + with patch.object(resource, "_check_ping_endpoint") as mock_check: + # Fail twice, then succeed + mock_check.side_effect = [False, False, True] + + result = await resource._wait_for_health(max_retries=5, retry_interval=0) + + assert result is True + assert mock_check.call_count == 3 + + @pytest.mark.asyncio + async def test_wait_for_health_timeout(self): + """Test health check polling timeout after max retries.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + with patch.object(resource, "_check_ping_endpoint") as mock_check: + mock_check.return_value = False + + result = await resource._wait_for_health(max_retries=3, retry_interval=0) + + assert result is False + assert mock_check.call_count == 3 + + @pytest.mark.asyncio + async def test_wait_for_health_no_id(self): + """Test health check when endpoint ID not set.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + # id not set + ) + + with pytest.raises(ValueError, match="Cannot wait for health"): + await resource._wait_for_health() + + @pytest.mark.asyncio + async def test_is_deployed_async_with_id(self): + """Test is_deployed_async returns True when healthy.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + with patch.object(resource, "_check_ping_endpoint") as mock_check: + mock_check.return_value = True + + result = await resource.is_deployed_async() + + assert result is True + + @pytest.mark.asyncio + async def test_is_deployed_async_without_id(self): + """Test is_deployed_async returns False when ID not set.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + ) + + result = await resource.is_deployed_async() + + assert result is False + + @pytest.mark.asyncio + async def test_is_deployed_async_unhealthy(self): + """Test is_deployed_async returns False when unhealthy.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-endpoint-id", + ) + + with patch.object(resource, "_check_ping_endpoint") as mock_check: + mock_check.return_value = False + + result = await resource.is_deployed_async() + + assert result is False + + +class TestLoadBalancerSlsResourceDeployment: + """Test deployment flow.""" + + @pytest.mark.asyncio + async def test_do_deploy_validates_configuration(self): + """Test that _do_deploy validates LB configuration.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + scalerType=ServerlessScalerType.QUEUE_DELAY, + ) + + with pytest.raises(ValueError, match="requires REQUEST_COUNT scaler"): + await resource._do_deploy() + + @pytest.mark.asyncio + async def test_do_deploy_already_deployed(self): + """Test _do_deploy skips deployment if already deployed.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="existing-id", + ) + + with patch.object(LoadBalancerSlsResource, "is_deployed") as mock_deployed: + mock_deployed.return_value = True + + result = await resource._do_deploy() + + assert result == resource + + @pytest.mark.asyncio + async def test_do_deploy_success(self): + """Test successful deployment with health 
check.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + ) + + mock_deployed = LoadBalancerSlsResource( + name="test", + imageName="image", + id="new-endpoint-id", + ) + + with ( + patch.object( + LoadBalancerSlsResource, "is_deployed", MagicMock(return_value=False) + ), + patch.object( + resource, "_wait_for_health", new_callable=AsyncMock, return_value=True + ) as mock_wait, + patch.object( + ServerlessResource, + "_do_deploy", + new_callable=AsyncMock, + return_value=mock_deployed, + ), + ): + result = await resource._do_deploy() + + assert result == mock_deployed + mock_wait.assert_called_once() + + @pytest.mark.asyncio + async def test_do_deploy_health_check_timeout(self): + """Test deployment fails if health check times out.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + ) + + mock_deployed = LoadBalancerSlsResource( + name="test", + imageName="image", + id="new-endpoint-id", + ) + + with ( + patch.object( + LoadBalancerSlsResource, "is_deployed", MagicMock(return_value=False) + ), + patch.object( + resource, "_wait_for_health", new_callable=AsyncMock, return_value=False + ), + patch.object( + ServerlessResource, + "_do_deploy", + new_callable=AsyncMock, + return_value=mock_deployed, + ), + ): + with pytest.raises(TimeoutError, match="failed to become healthy"): + await resource._do_deploy() + + @pytest.mark.asyncio + async def test_do_deploy_parent_deploy_failure(self): + """Test deployment handles parent deploy failure.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + ) + + with ( + patch.object( + LoadBalancerSlsResource, "is_deployed", MagicMock(return_value=False) + ), + patch.object( + ServerlessResource, + "_do_deploy", + new_callable=AsyncMock, + side_effect=ValueError("RunPod API error"), + ), + ): + with pytest.raises(ValueError, match="RunPod API error"): + await resource._do_deploy() + + +class TestLoadBalancerSlsResourceIntegration: + """Integration tests with ResourceManager.""" + + def test_resource_manager_integration(self): + """Test that LoadBalancerSlsResource can be created and used.""" + # Test that LoadBalancerSlsResource can be instantiated and used + resource = LoadBalancerSlsResource( + name="integration-test", + imageName="test-image:latest", + ) + + assert isinstance(resource, LoadBalancerSlsResource) + assert resource.type == ServerlessType.LB + + def test_is_deployed_sync(self): + """Test synchronous is_deployed method.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + id="test-id", + ) + + # Mock the endpoint property and its health method + mock_endpoint = MagicMock() + mock_endpoint.health.return_value = {"status": "healthy"} + + with patch.object( + LoadBalancerSlsResource, + "endpoint", + new_callable=lambda: property(lambda self: mock_endpoint), + ): + result = resource.is_deployed() + + assert result is True + mock_endpoint.health.assert_called_once() + + def test_is_deployed_sync_no_id(self): + """Test is_deployed returns False when no ID.""" + resource = LoadBalancerSlsResource( + name="test", + imageName="image", + ) + + result = resource.is_deployed() + + assert result is False diff --git a/tests/unit/test_load_balancer_sls_stub.py b/tests/unit/test_load_balancer_sls_stub.py new file mode 100644 index 00000000..c5adcbf6 --- /dev/null +++ b/tests/unit/test_load_balancer_sls_stub.py @@ -0,0 +1,436 @@ +"""Unit tests for LoadBalancerSlsStub functionality.""" + +import base64 +import pytest +from unittest.mock import AsyncMock, 
MagicMock, patch + +import cloudpickle + +from tetra_rp import LoadBalancerSlsResource +from tetra_rp.stubs.load_balancer_sls import LoadBalancerSlsStub + + +# Create test resources +test_lb_resource = LoadBalancerSlsResource( + name="test-lb", + imageName="test:latest", +) + + +class TestLoadBalancerSlsStubPrepareRequest: + """Test suite for _prepare_request method.""" + + def test_prepare_request_with_no_args(self): + """Test request preparation with no arguments.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + def test_func(): + return "result" + + request = stub._prepare_request(test_func, None, None, True) + + assert request["function_name"] == "test_func" + assert "def test_func" in request["function_code"] + assert request["dependencies"] == [] + assert request["system_dependencies"] == [] + assert request["accelerate_downloads"] is True + assert "args" not in request or request["args"] == [] + assert "kwargs" not in request or request["kwargs"] == {} + + def test_prepare_request_with_args(self): + """Test request preparation with positional arguments.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + def add(x, y): + return x + y + + arg1 = 5 + arg2 = 3 + request = stub._prepare_request(add, None, None, True, arg1, arg2) + + assert request["function_name"] == "add" + assert len(request["args"]) == 2 + + # Verify args are properly serialized + decoded_arg1 = cloudpickle.loads(base64.b64decode(request["args"][0])) + decoded_arg2 = cloudpickle.loads(base64.b64decode(request["args"][1])) + assert decoded_arg1 == 5 + assert decoded_arg2 == 3 + + def test_prepare_request_with_kwargs(self): + """Test request preparation with keyword arguments.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + def greet(name, greeting="Hello"): + return f"{greeting}, {name}!" 
+ + request = stub._prepare_request( + greet, None, None, True, name="Alice", greeting="Hi" + ) + + assert "kwargs" in request + assert len(request["kwargs"]) == 2 + + # Verify kwargs are properly serialized + decoded_name = cloudpickle.loads(base64.b64decode(request["kwargs"]["name"])) + decoded_greeting = cloudpickle.loads( + base64.b64decode(request["kwargs"]["greeting"]) + ) + assert decoded_name == "Alice" + assert decoded_greeting == "Hi" + + def test_prepare_request_with_dependencies(self): + """Test request preparation includes dependencies.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + def test_func(): + return "result" + + dependencies = ["requests", "numpy"] + system_deps = ["git"] + + request = stub._prepare_request(test_func, dependencies, system_deps, True) + + assert request["dependencies"] == dependencies + assert request["system_dependencies"] == system_deps + + +class TestLoadBalancerSlsStubHandleResponse: + """Test suite for _handle_response method.""" + + def test_handle_response_success(self): + """Test successful response handling.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + result_value = {"status": "ok", "value": 42} + result_b64 = base64.b64encode(cloudpickle.dumps(result_value)).decode("utf-8") + + response = {"success": True, "result": result_b64} + + result = stub._handle_response(response) + + assert result == result_value + + def test_handle_response_error(self): + """Test error response handling.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + response = {"success": False, "error": "Function execution failed"} + + with pytest.raises(Exception, match="Remote execution failed"): + stub._handle_response(response) + + def test_handle_response_invalid_type(self): + """Test handling of invalid response type.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + with pytest.raises(ValueError, match="Invalid response type"): + stub._handle_response("not a dict") + + def test_handle_response_missing_result(self): + """Test handling of success response without result.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + response = {"success": True, "result": None} + + with pytest.raises( + ValueError, match="Response marked success but result is None" + ): + stub._handle_response(response) + + def test_handle_response_invalid_base64(self): + """Test handling of invalid base64 in result.""" + stub = LoadBalancerSlsStub(test_lb_resource) + + response = {"success": True, "result": "not_valid_base64!!!"} + + with pytest.raises(ValueError, match="Failed to deserialize result"): + stub._handle_response(response) + + +class TestLoadBalancerSlsStubExecuteFunction: + """Test suite for _execute_function method.""" + + @pytest.mark.asyncio + async def test_execute_function_no_endpoint_url(self): + """Test error when endpoint_url is not available.""" + mock_resource = MagicMock() + mock_resource.endpoint_url = None + stub = LoadBalancerSlsStub(mock_resource) + + request = { + "function_name": "test_func", + "function_code": "def test_func(): pass", + } + + with pytest.raises(ValueError, match="Endpoint URL not available"): + await stub._execute_function(request) + + @pytest.mark.asyncio + async def test_execute_function_timeout(self): + """Test timeout error handling.""" + mock_resource = MagicMock() + mock_resource.endpoint_url = "http://localhost:8000" + stub = LoadBalancerSlsStub(mock_resource) + + request = { + "function_name": "test_func", + "function_code": "def test_func(): pass", + } + + import httpx + + with 
patch("tetra_rp.stubs.load_balancer_sls.httpx.AsyncClient") as mock_client: + mock_client.return_value.__aenter__.return_value.post = AsyncMock( + side_effect=httpx.TimeoutException("Timeout") + ) + + with pytest.raises(TimeoutError, match="Execution timeout"): + await stub._execute_function(request) + + @pytest.mark.asyncio + async def test_execute_function_http_error(self): + """Test HTTP error handling.""" + mock_resource = MagicMock() + mock_resource.endpoint_url = "http://localhost:8000" + mock_resource.name = "test-lb" + stub = LoadBalancerSlsStub(mock_resource) + + request = { + "function_name": "test_func", + "function_code": "def test_func(): pass", + } + + import httpx + + mock_response = MagicMock() + mock_response.status_code = 500 + mock_response.text = "Internal server error" + + with patch("tetra_rp.stubs.load_balancer_sls.httpx.AsyncClient") as mock_client: + error = httpx.HTTPStatusError( + "Error", request=MagicMock(), response=mock_response + ) + mock_client.return_value.__aenter__.return_value.post = AsyncMock( + side_effect=error + ) + + with pytest.raises(RuntimeError, match="HTTP error from endpoint"): + await stub._execute_function(request) + + +class TestLoadBalancerSlsStubCall: + """Test suite for __call__ method.""" + + @pytest.mark.asyncio + async def test_call_success(self): + """Test successful stub execution.""" + mock_resource = MagicMock() + stub = LoadBalancerSlsStub(mock_resource) + + def add(x, y): + return x + y + + with patch.object(stub, "_execute_function") as mock_execute: + result_b64 = base64.b64encode(cloudpickle.dumps(8)).decode("utf-8") + mock_execute.return_value = {"success": True, "result": result_b64} + + result = await stub(add, None, None, True, 5, 3) + + assert result == 8 + mock_execute.assert_called_once() + + @pytest.mark.asyncio + async def test_call_with_dependencies(self): + """Test stub execution with dependencies.""" + mock_resource = MagicMock() + stub = LoadBalancerSlsStub(mock_resource) + + def use_requests(): + return "success" + + deps = ["requests"] + + with patch.object(stub, "_execute_function") as mock_execute: + result_b64 = base64.b64encode(cloudpickle.dumps("success")).decode("utf-8") + mock_execute.return_value = {"success": True, "result": result_b64} + + result = await stub(use_requests, deps, None, True) + + assert result == "success" + # Verify dependencies were included in request + call_args = mock_execute.call_args + request = call_args[0][0] + assert request["dependencies"] == deps + + +class TestLoadBalancerSlsStubRouting: + """Test suite for routing detection between /execute and user routes.""" + + def test_should_use_execute_for_live_load_balancer(self): + """Test that LiveLoadBalancer always uses /execute endpoint.""" + from tetra_rp import LiveLoadBalancer + from tetra_rp import remote + + lb = LiveLoadBalancer(name="test-live") + stub = LoadBalancerSlsStub(lb) + + @remote(lb, method="POST", path="/api/test") + def test_func(): + pass + + assert stub._should_use_execute_endpoint(test_func) is True + + def test_should_use_user_route_for_deployed_lb(self): + """Test that deployed LoadBalancerSlsResource uses user-defined route.""" + from tetra_rp import remote + + lb = LoadBalancerSlsResource(name="test-deployed", imageName="test:latest") + stub = LoadBalancerSlsStub(lb) + + @remote(lb, method="POST", path="/api/test") + def test_func(): + pass + + assert stub._should_use_execute_endpoint(test_func) is False + + def test_should_fallback_to_execute_without_routing_metadata(self): + """Test fallback to 
/execute when routing metadata is missing.""" + lb = LoadBalancerSlsResource(name="test", imageName="test:latest") + stub = LoadBalancerSlsStub(lb) + + def func_without_metadata(): + pass + + assert stub._should_use_execute_endpoint(func_without_metadata) is True + + def test_should_fallback_to_execute_with_incomplete_metadata(self): + """Test fallback to /execute when routing metadata is incomplete.""" + lb = LoadBalancerSlsResource(name="test", imageName="test:latest") + stub = LoadBalancerSlsStub(lb) + + def func_with_incomplete_metadata(): + pass + + # Attach incomplete metadata + func_with_incomplete_metadata.__remote_config__ = {"method": "POST"} + + assert stub._should_use_execute_endpoint(func_with_incomplete_metadata) is True + + @pytest.mark.asyncio + async def test_execute_via_user_route_success(self): + """Test successful execution via user-defined route.""" + mock_resource = MagicMock() + mock_resource.endpoint_url = "http://localhost:8000" + mock_resource.name = "test-lb" + stub = LoadBalancerSlsStub(mock_resource) + + def add(x, y): + return x + y + + mock_response = MagicMock() + mock_response.json.return_value = {"result": 8} + + with patch("tetra_rp.stubs.load_balancer_sls.httpx.AsyncClient") as mock_client: + mock_client.return_value.__aenter__.return_value.request = AsyncMock( + return_value=mock_response + ) + + result = await stub._execute_via_user_route(add, "POST", "/api/add", 5, 3) + + assert result == {"result": 8} + # Verify correct HTTP method and URL + mock_client.return_value.__aenter__.return_value.request.assert_called_once() + call_args = ( + mock_client.return_value.__aenter__.return_value.request.call_args + ) + assert call_args[0][0] == "POST" + assert call_args[0][1] == "http://localhost:8000/api/add" + # Verify correct JSON body with mapped parameters + assert call_args[1]["json"] == {"x": 5, "y": 3} + + @pytest.mark.asyncio + async def test_execute_via_user_route_with_kwargs(self): + """Test user route execution with keyword arguments.""" + mock_resource = MagicMock() + mock_resource.endpoint_url = "http://localhost:8000" + mock_resource.name = "test-lb" + stub = LoadBalancerSlsStub(mock_resource) + + def greet(name, greeting="Hello"): + return f"{greeting}, {name}!" + + mock_response = MagicMock() + mock_response.json.return_value = "Hi, Alice!" + + with patch("tetra_rp.stubs.load_balancer_sls.httpx.AsyncClient") as mock_client: + mock_client.return_value.__aenter__.return_value.request = AsyncMock( + return_value=mock_response + ) + + result = await stub._execute_via_user_route( + greet, "POST", "/api/greet", "Alice", greeting="Hi" + ) + + assert result == "Hi, Alice!" 
+
+        # Verify JSON body has both positional arg and kwargs
+        call_args = (
+            mock_client.return_value.__aenter__.return_value.request.call_args
+        )
+        assert call_args[1]["json"] == {"name": "Alice", "greeting": "Hi"}
+
+    @pytest.mark.asyncio
+    async def test_call_routes_to_user_path_for_deployed_endpoint(self):
+        """Test that __call__ routes to user path for deployed endpoints."""
+        mock_resource = MagicMock()
+        mock_resource.endpoint_url = "http://localhost:8000"
+        mock_resource.name = "test-lb"
+        stub = LoadBalancerSlsStub(mock_resource)
+
+        def test_func(x):
+            return x
+
+        test_func.__remote_config__ = {
+            "method": "POST",
+            "path": "/api/test",
+            "resource_config": mock_resource,
+        }
+
+        with patch.object(stub, "_should_use_execute_endpoint") as mock_detect:
+            with patch.object(stub, "_execute_via_user_route") as mock_user_route:
+                mock_detect.return_value = False
+                mock_user_route.return_value = {"result": 42}
+
+                result = await stub(test_func, None, None, True, 42)
+
+                # Should route to _execute_via_user_route, not _execute_function
+                mock_user_route.assert_called_once()
+                assert result == {"result": 42}
+
+    @pytest.mark.asyncio
+    async def test_call_routes_to_execute_for_live_endpoint(self):
+        """Test that __call__ routes to /execute for LiveLoadBalancer."""
+        mock_resource = MagicMock()
+        stub = LoadBalancerSlsStub(mock_resource)
+
+        def test_func():
+            pass
+
+        with patch.object(stub, "_should_use_execute_endpoint") as mock_detect:
+            with patch.object(stub, "_execute_function") as mock_execute:
+                with patch.object(stub, "_handle_response") as mock_handle:
+                    mock_detect.return_value = True
+                    mock_execute.return_value = {"success": True, "result": "test"}
+                    mock_handle.return_value = "handled"
+
+                    result = await stub(test_func, None, None, True)
+
+                    # Should route to _execute_function, not _execute_via_user_route
+                    mock_execute.assert_called_once()
+                    assert result == "handled"