feat: flash build --preview in local docker for smoke-testing deployments #167

Merged: deanq merged 40 commits into main from deanq/ae-1968-build-preview on Feb 2, 2026
deanq (Member) commented Feb 2, 2026

Prerequisite: #166

Summary

Implement a comprehensive local preview environment for testing Flash applications with multiple distributed endpoints. This PR adds the flash build --preview command that launches a multi-container setup with the mothership and all worker endpoints running locally in Docker.

Key Features

  • Local distributed testing: Run complete multi-endpoint system locally without deployment
  • Docker networking: Automatic inter-container communication via Docker DNS
  • One-command workflow: flash build --preview builds and launches preview in one go
  • Integrated mothership detection: Correctly identifies and configures the actual mothership from manifest

What's Changed

Core Features

  • Add --preview flag to flash build command for integrated build + launch workflow
  • Implement launch_preview() function for multi-container orchestration
  • Auto-enable keep_build when preview is requested (needed for archive and build artifacts)
  • Create Docker network for inter-container communication

Mothership Identification Fix

  • Read is_mothership flag from manifest for each resource
  • Correctly identify actual mothership resource instead of using generic default
  • Assign port 8000 to mothership, sequential ports (8001+) to workers
  • Fallback to default mothership only when none specified in manifest
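The identification rules above can be sketched roughly as follows. This is a minimal illustration and not the code in preview.py: the manifest shape (a `resources` mapping whose entries may carry `is_mothership`), the function name `assign_preview_ports`, and the fallback name `mothership` are assumptions made for the example.

```python
from typing import Any

MOTHERSHIP_PORT = 8000  # port stated in the PR description
DEFAULT_MOTHERSHIP_NAME = "mothership"  # assumed name for the fallback resource


def assign_preview_ports(manifest: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Map each resource name to a local port (hypothetical helper)."""
    resources = manifest.get("resources") or {}

    # Use the first resource flagged is_mothership; a missing flag counts as False.
    mothership = next(
        (name for name, cfg in resources.items()
         if (cfg or {}).get("is_mothership", False)),
        None,
    )
    if mothership is None:
        # Fall back to a generic mothership only when the manifest names none.
        mothership = DEFAULT_MOTHERSHIP_NAME

    plan = {mothership: {"port": MOTHERSHIP_PORT, "is_mothership": True}}
    next_port = MOTHERSHIP_PORT + 1
    for name in resources:
        if name == mothership:
            continue
        # Workers get sequential ports: 8001, 8002, ...
        plan[name] = {"port": next_port, "is_mothership": False}
        next_port += 1
    return plan
```

With a manifest that flags one resource as the mothership, that resource receives port 8000 and every other resource is treated as a worker; with an empty manifest only the fallback entry is returned.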

Port Assignment & Docker Configuration

  • Mothership: port 8000
  • Worker resources: ports 8001, 8002, etc.
  • Archive mounting at /root/.runpod/archive.tar.gz (container's expected location)
  • Container port 80 (not 8000) is mapped to the local ports listed above
  • Cross-platform support with arm64 detection
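To make the port mapping and archive mount concrete, here is a hedged sketch of how the `docker run` arguments for one preview container could be assembled. The helper name `docker_run_args` and the network name used in the usage note are hypothetical; the archive path and container port come from this PR, and the flags are standard `docker run` options.

```python
CONTAINER_ARCHIVE_PATH = "/root/.runpod/archive.tar.gz"  # location the images expect
CONTAINER_PORT = 80  # the images run uvicorn on port 80


def docker_run_args(
    name: str, image: str, host_port: int, archive: str, network: str
) -> list[str]:
    """Build a `docker run` argument list for one container (hypothetical helper).

    `archive` should be an absolute host path, as bind mounts require.
    """
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--network", network,  # shared network so containers resolve each other by name
        "--publish", f"{host_port}:{CONTAINER_PORT}",  # local port -> container port 80
        "--volume", f"{archive}:{CONTAINER_ARCHIVE_PATH}:ro",  # read-only archive mount
        image,
    ]
```

For example, `docker_run_args("gpu_worker", "runpod/tetra-rp-lb-cpu:local", 8001, "/tmp/build/archive.tar.gz", "flash-preview")` publishes the worker at http://localhost:8001, while other containers on the same network would reach it at port 80.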

Documentation & Cleanup

  • Update build documentation with preview workflow
  • Remove deprecated test_mothership command
  • Update verification scripts for image constants
  • Centralize Docker image configuration

Test Plan

  • All 846 tests pass
  • Code coverage: 68.22% (exceeds 65% requirement)
  • Quality checks pass (formatting, linting, type checking)
  • Preview resource parsing tests (7 comprehensive tests)
    • Empty manifest creates default mothership
    • Explicit mothership from manifest is used
    • Default mothership created only when needed
    • Named resources work correctly
    • Missing fields handled with sensible defaults

Testing the Preview Command

cd /path/to/flash-examples/01_getting_started/01_hello_world
flash build --preview

Expected output:

Resource                     | Port | URL                   | Type
01_01_hello_world-mothership | 8000 | http://localhost:8000 | Mothership
01_01_gpu_worker             | 8001 | http://localhost:8001 | Worker

Access endpoints:

curl http://localhost:8000/ping   # Mothership
curl http://localhost:8001/ping   # Worker

Related Issues

  • AE-1968: Build preview implementation

deanq and others added 30 commits January 28, 2026 07:39
- Add Python AST indexer (ast_to_sqlite.py) to extract framework symbols
- Add fast query CLI interface (code_intel.py) with Rich formatted output
- SQLite database with optimized indexes for symbol lookup
- Extract classes, functions, methods, decorators, type hints, docstrings
- Update Makefile with index/query targets
- Update CONTRIBUTING.md with setup and usage documentation
- Performance: 466 symbols indexed in 0.09s, database size 260KB
- Reduces token usage by ~85% when exploring framework structure

Commands:
  make index              - Generate/update code intelligence index
  make query SYMBOL=name  - Find symbol by name
  make query-classes      - List all classes
  make query-all          - List all symbols
…ude Code

Adds Model Context Protocol (MCP) server integration to expose the SQLite code
intelligence database as native Claude Code tools. This enables Claude to query
framework symbols, classes, and structure without reading full files, reducing
token usage by ~85% for code exploration tasks.

## Changes

- Add mcp_code_intel_server.py: MCP server with 5 specialized tools
  - find_symbol: Search for classes, functions, methods
  - list_classes: Browse all framework classes
  - get_class_interface: View class methods without implementations
  - list_file_symbols: Explore file structure
  - find_by_decorator: Find decorated symbols

- Create .mcp.json: MCP server configuration for automatic Claude Code discovery
- Add tetra-explorer skill: Guidance for using code intelligence tools
- Update CONTRIBUTING.md: Document MCP integration for developers
- Add mcp>=1.0.0 dependency to pyproject.toml
- Update .gitignore for .claude/ directory

## Benefits

- Zero-configuration Claude Code integration via MCP
- 85% token reduction for code exploration (from ~10k to ~2k tokens)
- <10ms query performance with SQLite indexing
- Type-safe tool interfaces with MCP schemas
- Automatic tool discovery by Claude Code

## Technical Details

- Database: 466 symbols indexed from tetra-rp source
- Storage: .code-intel/flash.db (~250KB)
- Transport: stdio for Claude Code integration
- Python 3.10+ compatible

All quality checks passed. Tests: 34 passed, code coverage: 68.78%

Allow additional open-source licenses used by transitive dependencies:
- Python-2.0: Python standard library and some tools
- Unlicense: Some utility libraries
- MPL-2.0: Some utility packages (certifi, pathspec)

These are all permissive licenses compatible with MIT distribution.

- Extract hardcoded LIMIT values into named constants for maintainability
- Rename handle_list_tools to list_tools for naming consistency
- Extract duplicated error handling into _log_indexing_error helper
- Add clarification about which dependencies require non-standard licenses

Implement two features for the code intelligence MCP server:

1. Smart Re-indexing
   - Index auto-rebuilds on MCP server startup when stale
   - Checks if any Python files in src/ changed since last index
   - Compares file mtimes against index_timestamp in metadata table
   - Triggers rebuilding only when necessary
   - Fast: only checks file mtimes, no AST parsing until rebuild needed

2. Test Output Parser MCP Tool
   - New tool parses pytest output and returns structured data
   - Extracts test summary (passed/failed/errors/skipped counts)
   - Lists failed tests with file locations and error messages
   - Parses coverage statistics if present
   - Returns markdown-formatted summary for easy reading
   - Reduces token usage by ~85% vs reading full test output
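The staleness check described under Smart Re-indexing can be sketched as a pure mtime comparison. This is an illustration under assumptions, not the code in mcp_code_intel_server.py: the signature of `should_reindex` and the `latest_mtime` helper are hypothetical, and the stored `index_timestamp` is assumed to be a Unix timestamp.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def latest_mtime(src_dir: Path) -> float:
    """Newest modification time among Python files under src_dir (0.0 if none)."""
    return max((p.stat().st_mtime for p in src_dir.rglob("*.py")), default=0.0)


def should_reindex(src_dir: Path, index_timestamp: float | None) -> bool:
    """Return True when no index exists yet or any source file is newer than it."""
    if index_timestamp is None:
        return True
    # Only mtimes are compared; no file is parsed until a rebuild is required.
    return latest_mtime(src_dir) > index_timestamp


# Demo: a file last modified at t=1000 makes an index built at t=500 stale.
with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp, "example.py")
    module.write_text("x = 1\n")
    os.utime(module, (1000, 1000))
    stale = should_reindex(Path(tmp), index_timestamp=500.0)
    fresh = should_reindex(Path(tmp), index_timestamp=2000.0)
```

Because the check only calls `stat()` on each file, it stays cheap even when it runs on every server startup.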

Changes to ast_to_sqlite.py:
- Track latest_file_mtime and file_count in metadata table
- Enables staleness detection in MCP server

Changes to mcp_code_intel_server.py:
- Add should_reindex() function for checking if index is stale
- Add smart re-indexing in main() before server starts
- Add parse_test_output() function to parse pytest output
- Add format_test_summary() for markdown formatting
- Add new parse_test_output MCP tool with clear description
- Tool description teaches Claude when/why to use it
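As a rough idea of what a pytest output parser does, the sketch below extracts the counts from the final summary line and the entries from the short test summary. It is not the PR's `parse_test_output()`: the return shape, the regular expressions, and the sample output are assumptions for illustration, and coverage parsing is omitted.

```python
import re

COUNT_RE = re.compile(r"(\d+) (passed|failed|errors?|skipped)")
FAILED_RE = re.compile(r"^FAILED (\S+)(?: - (.*))?$", re.MULTILINE)


def parse_test_output(output: str) -> dict:
    """Extract summary counts and failed tests from pytest output (sketch)."""
    counts = {"passed": 0, "failed": 0, "errors": 0, "skipped": 0}
    # The final summary line looks like "==== 1 failed, 2 passed in 0.12s ====".
    summary_lines = [
        line for line in output.splitlines()
        if line.startswith("=") and " in " in line
    ]
    if summary_lines:
        for number, label in COUNT_RE.findall(summary_lines[-1]):
            key = "errors" if label.startswith("error") else label
            counts[key] = int(number)
    # Short-summary entries look like "FAILED path::test - message".
    failures = [
        {"test": test, "message": message}
        for test, message in FAILED_RE.findall(output)
    ]
    return {"counts": counts, "failures": failures}


# Hypothetical pytest output used only to exercise the parser.
SAMPLE = """\
FAILED tests/unit/test_example.py::test_ports - AssertionError: assert 8000 == 80
========== 1 failed, 2 passed, 1 skipped in 0.12s ==========
"""
result = parse_test_output(SAMPLE)
```

Returning a small dictionary instead of the raw log is what keeps the response compact for the caller.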
- Add parse_test_output tool documentation with usage examples
- Add explicit prohibition of bash commands (tail/grep/cat) for MCP-compatible tasks
- Document bad patterns with token cost comparisons (99% reduction for test output)
- Expand skill file with:
  - Step 1.5: NEVER use bash commands for MCP tool tasks
  - Analyzing Test Results section with good/bad examples
  - Available MCP Tools reference list
  - Strong enforcement notes in Important Notes
- Add 200-token example showing difference between MCP tool vs bash approaches

This forces Claude Code to consistently use parse_test_output (~200 tokens) instead
of bash commands (~20,000+ tokens) when analyzing test output, reducing token usage
by 99% for test analysis workflows.

- Add .claude/settings.json with pre-approved permissions for all contributors
- Update code coverage requirement from 35% to 65% in CONTRIBUTING.md
- Simplify authors to generic Runpod email in pyproject.toml
- Remove empty [tool.ruff] section from pyproject.toml
- Remove outdated docs/PRD.md
- Update dependency lock file (uv.lock)

Manual dependency review during PR review is sufficient.
Removes automated license check that adds friction for legitimate dependencies.

The undeploy command was incorrectly showing NetworkVolumes in the
'Tracked RunPod Serverless Endpoints' list. NetworkVolumes are not
serverless endpoints and shouldn't be managed by the undeploy command.

Added _get_serverless_resources() filter function to exclude non-serverless
resources from undeploy operations. Updated test to use spec=ServerlessResource
for proper isinstance() checks.
Remove handler_generator.py, lb_handler_generator.py, and mothership_handler_generator.py along with their corresponding tests. The manifest builder no longer generates handler_file references, simplifying the build process and removing unused code generation.

- Delete deprecated handler generator modules and tests
- Update manifest tests to remove handler_file assertions
- Handler generation is now handled elsewhere in the deployment pipeline
- Delete docs/Runtime_Generic_Handler.md (handler architecture documentation)
- Remove handler_file fields from manifest examples
- Update build process documentation to remove handler generation steps
- Remove references to generated handler files in troubleshooting sections
- Update runtime architecture documentation to reflect current implementation
- Simplify build process documentation in README and cli docs

This cleanup reflects the removal of the handler generation system from the codebase,
as the runtime now discovers and registers functions dynamically at startup.

Fixes AE-1951

Remove handler_file fields from manifest fixture test data that were part
of deprecated handler file generation feature. These fields are no longer
used by any runtime code and remaining references only create confusion
about the manifest structure.

- Remove handler_file from test_service_registry.py fixtures (2 lines)
- Remove handler_file from test_cross_endpoint_routing.py fixtures (2 lines)
- Update outdated comment in test_manifest_mothership.py

Remove references to deprecated handler file generation feature from
documentation diagrams and build output examples:

- Flash_Deploy_Guide.md: Remove HandlerGenerator and LBHandlerGenerator
  from build phase diagrams, update flow to show Scanner going directly
  to ManifestBuilder
- LoadBalancer_Runtime_Architecture.md: Update deployment lifecycle to
  reflect packaging instead of handler generation
- flash-build.md: Update build output to show manifest creation and
  resource registration instead of handler generation

Move all Docker image constants from live_serverless.py to constants.py
to establish a single source of truth. This enables environment variable
overrides for testing and deployment flexibility, and eliminates scattered
hardcoded image names throughout the codebase.

- Centralized TETRA_*_IMAGE constants with env var support
- Added DEFAULT_WORKERS_MIN/MAX constants for consistent defaults
- Updated live_serverless.py to import from constants
- Updated manifest.py to use centralized image constants
- Updated test_mothership.py to respect environment overrides
- All 856 tests pass with 68.74% coverage
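The centralization pattern might look roughly like this. Only `TETRA_IMAGE_TAG` and the `runpod/tetra-rp-lb-cpu` repository appear in this PR; the constant name `TETRA_LB_CPU_IMAGE`, the default tag `latest`, and the `image_constants` helper are assumptions chosen to match the `TETRA_*_IMAGE` naming.

```python
import os
from collections.abc import Mapping


def image_constants(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Resolve image names once, letting environment variables override defaults."""
    tag = env.get("TETRA_IMAGE_TAG", "latest")  # default tag is an assumption
    return {
        # A full-image override wins; otherwise the shared tag is applied.
        "TETRA_LB_CPU_IMAGE": env.get(
            "TETRA_LB_CPU_IMAGE", f"runpod/tetra-rp-lb-cpu:{tag}"
        ),
    }


defaults = image_constants({})
local = image_constants({"TETRA_IMAGE_TAG": "local"})
custom = image_constants({"TETRA_LB_CPU_IMAGE": "registry.example/custom:1"})
```

Keeping the lookup in one place means callers such as the manifest builder import a resolved value rather than repeating the image name.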
…ants fix

Add three verification scripts and documentation to validate the Docker image
constant configuration fix:

scripts/test-image-constants.py:
  - Quick, focused Python verification script
  - Tests constant definitions, manifest builder integration, LiveServerless
  - Validates environment variable overrides
  - Checks for hardcoded values (code quality)
  - 20 tests covering all aspects of the fix
  - Can be retained for future verification

scripts/verify-manifest-constants.sh:
  - Bash script for testing with actual flash build integration
  - Tests different environment configurations
  - Verifies manifest generation with environment variables
  - Comprehensive test suites for different scenarios

scripts/verify-image-constants.sh:
  - Detailed bash script for Docker and environment testing
  - Tests default behavior and custom overrides
  - Can be extended for additional verification scenarios

VERIFICATION.md:
  - Complete documentation for running verification tests
  - Step-by-step instructions for each test scenario
  - Environment variable reference
  - Troubleshooting guide
  - Can be used as permanent reference guide

All verification scripts pass with:
  ✓ 20/20 tests passed
  ✓ Constants properly centralized
  ✓ Manifest builder uses constants
  ✓ LiveServerless classes use constants
  ✓ Environment variables override constants
  ✓ No hardcoded values remain

These scripts can be retained indefinitely and re-run after any changes
to ensure the fix remains intact.

The test_live_load_balancer_creation_with_local_tag and
test_cpu_live_load_balancer_creation_with_local_tag tests were failing
on CI but passing locally due to module import caching.

The tests set the TETRA_IMAGE_TAG environment variable and reload the
live_serverless module, but the constants module (where the image names
are defined) was not being reloaded. This caused the tests to use stale
cached values on CI where test execution order or isolation differs.

Fix: Also reload the constants module after setting the environment
variable to ensure fresh evaluation of the image name constants.
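The caching behavior behind this fix can be reproduced with two throwaway modules. The module names `demo_constants` and `demo_consumer` and the variable `DEMO_IMAGE_TAG` are invented for this sketch; they stand in for the constants module, live_serverless, and `TETRA_IMAGE_TAG`.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path


def demo_reload() -> tuple[str, str]:
    """Show that reloading only the consumer module keeps a stale constant."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_constants.py").write_text(
            "import os\nIMAGE_TAG = os.environ.get('DEMO_IMAGE_TAG', 'latest')\n"
        )
        Path(tmp, "demo_consumer.py").write_text(
            "from demo_constants import IMAGE_TAG\n"
            "IMAGE = f'example/image:{IMAGE_TAG}'\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            os.environ.pop("DEMO_IMAGE_TAG", None)
            constants = importlib.import_module("demo_constants")
            consumer = importlib.import_module("demo_consumer")

            os.environ["DEMO_IMAGE_TAG"] = "local"
            # Reloading only the consumer re-imports the cached constants module.
            importlib.reload(consumer)
            stale = consumer.IMAGE

            # Reloading constants first re-evaluates the environment lookup.
            importlib.reload(constants)
            importlib.reload(consumer)
            fresh = consumer.IMAGE
            return stale, fresh
        finally:
            sys.path.remove(tmp)
            os.environ.pop("DEMO_IMAGE_TAG", None)
            sys.modules.pop("demo_constants", None)
            sys.modules.pop("demo_consumer", None)


stale_image, fresh_image = demo_reload()
```

The first reload still yields the `latest` tag because the environment lookup only runs when the constants module itself is executed again.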
deanq added 7 commits February 1, 2026 23:22
- Add explicit [tool.ruff] configuration section for clarity
- Add test to verify _get_serverless_resources filtering logic excludes
  non-ServerlessResource types like NetworkVolume (fixes feedback comment)

Add platform-aware Docker image tag selection for flash build --preview:
- Detects arm64/aarch64 machines using platform.machine()
- Appends -arm64 suffix to local development tags automatically
- Preserves production tags (latest, dev, version) unchanged via manifest support

Remove entrypoint.sh requirement from preview containers - use image's default CMD

Update test expectations to verify platform-specific image tags on both architectures

This enables seamless preview experience on Mac (arm64) and Linux/Windows (amd64)
without requiring manual image tag configuration. Some arm64 images (GPU) need to
be built separately, but CPU preview works across all platforms.

Quality checks: 839 tests pass, 67.96% coverage, zero formatting/lint/type errors
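For reference, the suffixing rule this commit describes could be sketched as below. Note that a later commit in this PR removes `_get_platform_aware_tag()` in favor of multi-platform manifests, so this only illustrates the intermediate behavior. The function name `platform_aware_tag`, the `PRODUCTION_TAGS` set, and the version pattern are assumptions.

```python
from __future__ import annotations

import platform
import re

PRODUCTION_TAGS = {"latest", "dev"}  # assumed set of tags left untouched
VERSION_RE = re.compile(r"v?\d+(\.\d+)*")  # assumed shape of version tags


def platform_aware_tag(tag: str, machine: str | None = None) -> str:
    """Append -arm64 to local development tags on arm64/aarch64 hosts (sketch)."""
    machine = (machine or platform.machine()).lower()
    if tag in PRODUCTION_TAGS or VERSION_RE.fullmatch(tag):
        return tag  # production tags are resolved by the registry
    if machine in {"arm64", "aarch64"}:
        return f"{tag}-arm64"
    return tag
```

Passing `machine` explicitly keeps the function testable on any host; when it is omitted, `platform.machine()` supplies the real architecture.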
The Docker images expect to find and extract /root/.runpod/archive.tar.gz
before the application starts. The preview command was only mounting the
extracted build directory, causing containers to crash immediately with:
FileNotFoundError: flash build artifact not found at /root/.runpod/archive.tar.gz

Changes:
- Mount archive.tar.gz at /root/.runpod/archive.tar.gz (read-only)
- Verify archive exists before starting containers
- Add health check to detect container startup failures
- Include container logs in error messages for debugging
- Add comprehensive tests for archive validation and health checks

The container's unpack_volume.py script will extract the archive on startup,
matching the behavior of production deployments.
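The two safeguards listed above, checking the archive before launch and polling for health afterwards, can be sketched as follows. Both helper names are hypothetical, and the health probe is passed in as a callable because the actual check used by preview.py is not shown in this PR.

```python
import time
from collections.abc import Callable
from pathlib import Path


def require_archive(archive: Path) -> Path:
    """Fail early, before any container starts, if the build archive is missing."""
    if not archive.is_file():
        raise FileNotFoundError(f"flash build artifact not found at {archive}")
    return archive


def wait_until_healthy(
    probe: Callable[[], bool], attempts: int = 10, delay: float = 0.5
) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

A caller would typically raise an error that includes the container logs when `wait_until_healthy` returns False, which matches the debugging aid described in the commit.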
The Docker images (runpod/tetra-rp-lb-cpu:*) have uvicorn configured to
listen on port 80 via the --port 80 flag. The preview command was
incorrectly mapping container port 8000, which was not being used by
the application.

Changed the port mapping from 8000 to 80 so localhost ports correctly
map to the Uvicorn server.

The preview command was always creating a generic 'mothership' resource
instead of using the actual mothership specified in the manifest with
is_mothership: true. This caused incorrect port assignment and resource
identification.

Changes:
- Read is_mothership flag from manifest for each resource
- Only create default mothership if none found in manifest
- Updated tests to cover explicit mothership, default fallback, and
  missing is_mothership field behavior

Fixes: Resources with is_mothership: true are now correctly identified
as the mothership and assigned port 8000. Workers get sequential ports.

Add --preview flag to flash build command to automatically launch local
preview environment after successful build. This replaces the standalone
test-mothership command with a more integrated workflow.

Changes:
- Add --preview option to build command
- Auto-enable keep_build when preview is requested (needed for preview)
- Launch preview environment after successful build if --preview specified
- Remove deprecated test_mothership command
- Update documentation with preview option
- Update verification scripts for image constants

The preview environment provides a local distributed system test setup
with the mothership and all endpoints running in Docker containers.
@deanq deanq requested a review from Copilot February 2, 2026 12:16
Copilot AI left a comment

Pull request overview

This PR implements a comprehensive local preview environment for testing Flash applications with multiple distributed endpoints. It enables developers to test their complete multi-endpoint systems locally in Docker before deploying to RunPod, reducing the deployment testing cycle.

Changes:

  • Added flash build --preview command for integrated build and local launch workflow
  • Removed deprecated test_mothership command and replaced with modern preview system
  • Centralized Docker image configuration with environment variable support
  • Removed handler file generation (now uses runtime manifest loading)

Reviewed changes

Copilot reviewed 47 out of 49 changed files in this pull request and generated 7 comments.

File | Description
src/tetra_rp/cli/commands/preview.py | New preview command implementing multi-container orchestration
src/tetra_rp/cli/commands/build.py | Added --preview flag and integration with preview launcher
src/tetra_rp/cli/commands/test_mothership.py | Removed deprecated test command (458 lines deleted)
src/tetra_rp/core/resources/constants.py | Centralized Docker image constants with platform-aware tag resolution
src/tetra_rp/core/resources/live_serverless.py | Updated to import constants from centralized location
tests/unit/cli/commands/test_preview.py | Comprehensive preview command tests (415 lines)
tests/unit/resources/test_live_load_balancer.py | Updated tests for platform-aware image tags
src/tetra_rp/cli/docs/flash-build.md | Updated documentation for preview workflow
Multiple test files | Removed handler_file references from manifests
Multiple doc files | Removed handler generation documentation

Comment thread pyproject.toml
Comment thread tests/unit/cli/commands/test_preview.py Outdated
Comment thread src/tetra_rp/cli/commands/preview.py Outdated
Comment thread src/tetra_rp/cli/commands/preview.py Outdated
Comment thread src/tetra_rp/core/resources/constants.py Outdated
Comment thread src/tetra_rp/cli/commands/preview.py Outdated
Comment thread scripts/verify-manifest-constants.sh Outdated
Improvements based on code review feedback:

1. **test_preview.py**: Improve readability of mock call_args unpacking using
   descriptive call_args.args[0] instead of magic index [0][0]

2. **preview.py**: Extract hardcoded container archive path as CONTAINER_ARCHIVE_PATH
   constant for better maintainability and consistency

3. **preview.py**: Fix Docker DNS example to show correct internal port 80 instead of
   misleading port 8000 (which is external mapping)

4. **preview.py**: Improve port assignment logic documentation to explain hash-based
   deterministic assignment for unknown resources

5. **constants.py**: Expand docstring to explain why manifest support requires avoiding
   architecture suffixes on production tags

6. **verify-manifest-constants.sh**: Replace hardcoded user-specific directory path with
   configurable FLASH_EXAMPLES_DIR environment variable with sensible default

All changes maintain backward compatibility and pass quality checks.
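Item 1 refers to two equivalent ways of reading a mock's recorded call. The snippet below uses only `unittest.mock` from the standard library; the mocked `run` callable and its arguments are made up for the example.

```python
from unittest.mock import Mock

run = Mock()
run(["docker", "run", "image"], check=True)

# Index-based access: call_args[0] is the positional tuple, [0][0] its first item.
by_index = run.call_args[0][0]

# Attribute-based access reads the same data with descriptive names.
by_name = run.call_args.args[0]
keyword_args = run.call_args.kwargs
```

Both forms return the same list; the attribute form simply states which part of the call the assertion is about.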
@deanq deanq changed the base branch from main to deanq/ae-1951-fix-deployment-hosting February 2, 2026 12:29
@deanq deanq changed the title from "feat: local preview environment for distributed system testing" to "feat: flash build --preview in local docker for smoke-testing deployments" Feb 2, 2026
Replace the _get_platform_aware_tag() function that appended -arm64 suffixes
for local builds with direct tag resolution. Docker multi-platform manifests
in the registry now handle architecture selection automatically, so separate
local tags are no longer needed.

This simplifies tag management across local development and CI/CD:
- Local builds use single tags (local, dev, etc.)
- CI/CD builds use multi-platform manifests (amd64 and arm64)
- Docker daemon automatically selects correct architecture

Changes:
- Remove _get_platform_aware_tag() function from constants.py
- Remove platform.machine() detection for tag suffixing
- Update test expectations to use non-suffixed tags
@deanq deanq changed the base branch from deanq/ae-1951-fix-deployment-hosting to main February 2, 2026 23:30
@deanq deanq merged commit a2cc56c into main Feb 2, 2026
5 checks passed
@deanq deanq deleted the deanq/ae-1968-build-preview branch February 2, 2026 23:35
3 participants