feat(build): match local Python by default and add 3.13 #330
Collapse GPU_PYTHON_VERSIONS and CPU_PYTHON_VERSIONS into a single SUPPORTED_PYTHON_VERSIONS tuple now that flash-worker publishes native per-version images for all four image types uniformly. The two old names remain as aliases for any downstream callers. DEFAULT_PYTHON_VERSION stays at 3.12 -- it drives the :latest tag aliases, not SDK behavior. Drops the unused WORKER_PYTHON_VERSION and GPU_BASE_IMAGE_PYTHON_VERSION constants. Updates test_live_serverless.py to use SUPPORTED_PYTHON_VERSIONS parametrization, and updates two tests that used "3.13" as the example unsupported version (now use "3.14"). Refs AE-2827.
Change _reconcile_python_version to use sys.version_info as the resolution default when no --python-version override and no per-resource python_version declaration is set. Eliminates forced cloudpickle drift between the user's local interpreter and the deployed worker.
Resolution order:
1. CLI --python-version override (validated)
2. Single distinct python_version declared across resources (validated)
3. Local sys.version_info (validated against SUPPORTED_PYTHON_VERSIONS)
A local interpreter outside the supported set raises a clear error pointing at the override flag and per-resource declaration paths -- no silent downgrade to 3.12.
Refs AE-2827.
BREAKING CHANGE: Projects that previously deployed to Python 3.12 by default (via the SDK's hardcoded fallback) now deploy to whichever Python the user is running flash from. The first deploy after upgrading will trigger a rolling release because the manifest fingerprint changes when target_python_version flips. Teams that want lockstep behavior across team members and CI should declare python_version explicitly on each resource config.
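The resolution order described in this commit can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: `resolve_python_version`, `_validate`, and the `local` parameter are hypothetical names, and raising on conflicting declarations is an assumption, since the commit message does not say what happens in that case.

```python
import sys
from typing import Iterable, Optional

# Mirrors the support set named in the commit message; the real tuple lives in the SDK.
SUPPORTED_PYTHON_VERSIONS = ("3.10", "3.11", "3.12", "3.13")


def _validate(version: str, origin: str) -> str:
    """Return the version unchanged, or raise if it is outside the support set."""
    if version not in SUPPORTED_PYTHON_VERSIONS:
        raise ValueError(
            f"Python {version} ({origin}) is not supported; choose one of "
            f"{', '.join(SUPPORTED_PYTHON_VERSIONS)} via --python-version "
            "or a per-resource python_version."
        )
    return version


def resolve_python_version(
    override: Optional[str],
    declared: Iterable[Optional[str]],
    local: Optional[str] = None,
) -> str:
    """Hypothetical stand-in for _reconcile_python_version."""
    # 1. An explicit CLI override wins.
    if override is not None:
        return _validate(override, "--python-version override")
    # 2. A single distinct version declared across resources.
    distinct = {v for v in declared if v is not None}
    if len(distinct) > 1:
        # Assumption: conflicting declarations are treated as an error.
        raise ValueError(f"conflicting python_version declarations: {sorted(distinct)}")
    if distinct:
        return _validate(distinct.pop(), "declared on resource")
    # 3. Fall back to the interpreter running the CLI; no hardcoded default.
    if local is None:
        local = f"{sys.version_info.major}.{sys.version_info.minor}"
    return _validate(local, "local interpreter")
```

The `local` argument exists only so the sketch can be exercised without depending on the interpreter that runs it.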
Surface the chosen Python version and its source on a single line during flash build, so users see the parity decision without inspecting the manifest. Source is one of:
- "--python-version override"
- "declared on resource <name>"
- "matched local interpreter"
Refs AE-2827.
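A minimal sketch of how the source label and the build line might be produced. The function names and the shape of `declared` (resource name mapped to its declared version) are assumptions, as is the choice of which resource to name when several declare the same version. The three labels are the ones listed in the commit message, and the line format follows the build output quoted in the review.

```python
from typing import Mapping, Optional


def python_version_source(override: Optional[str], declared: Mapping[str, str]) -> str:
    """Label where the resolved version came from (hypothetical helper)."""
    if override is not None:
        return "--python-version override"
    if declared:
        # Assumption: name the alphabetically first declaring resource.
        name = sorted(declared)[0]
        return f"declared on resource {name}"
    return "matched local interpreter"


def build_line(version: str, override: Optional[str], declared: Mapping[str, str]) -> str:
    """Format the single status line shown during the build."""
    return f"targeting Python {version} ({python_version_source(override, declared)})"


print(build_line("3.11", None, {}))
```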
Lead with the parity contract (local Python = deploy target), document
the bounded {3.10..3.13} support set, document the override and
per-resource declaration paths, and call out the team-consistency
recommendation. Drop the side-by-side / 7 GB cold-start callouts — no
longer accurate after the worker image rearchitecture.
Refs AE-2827.
Phase 3 of AE-2827. Updates the Options bullet for --python-version on both flash build and flash deploy docs to reflect the new behavior: local interpreter is the default; 3.13 is now in the supported set. Refs AE-2827.
runpod-Henrik
left a comment
PR #330 — feat(build): match local Python by default and add 3.13
Henrik's AI-Powered Bug Finder
No prior reviews. First-time review.
1. Missing pyproject.toml update — 3.13 install still blocked
The SDK now lists "3.13" in SUPPORTED_PYTHON_VERSIONS and the docs say "Python 3.10, 3.11, 3.12, or 3.13," but pyproject.toml still has requires-python = ">=3.10,<3.13". A user on Python 3.13 who runs pip install runpod-flash after this PR merges will either fail outright or silently resolve to an old SDK version. The docs and code say 3.13 works; pip says it doesn't. This needs to be in the same PR, or the 3.13 claim is false at the package level.
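The mismatch can be checked mechanically. The sketch below uses only the standard library and a hypothetical `satisfies` helper that models a `>=lower,<upper` range on major.minor versions. A real check would more likely parse the specifier with the `packaging` library's `SpecifierSet`.

```python
def satisfies(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper, comparing components numerically."""
    def as_tuple(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))

    return as_tuple(lower) <= as_tuple(version) < as_tuple(upper)


# Support set from the SDK and the bounds from requires-python = ">=3.10,<3.13".
SUPPORTED_PYTHON_VERSIONS = ("3.10", "3.11", "3.12", "3.13")
blocked = [v for v in SUPPORTED_PYTHON_VERSIONS if not satisfies(v, "3.10", "3.13")]
print(blocked)  # versions the SDK claims to support but pip would refuse
```

A test along these lines would fail whenever the support tuple and the package metadata disagree.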
2. Issue: Rolling release triggered silently on first deploy after upgrade — users on 3.10 or 3.11
Scenario: A user is running Python 3.11 locally and currently has a deployed endpoint. They upgrade the Flash SDK. They run flash deploy. The manifest now stamps target_python_version = "3.11" where it previously stamped "3.12" (the old hardcoded default). Flash detects drift and triggers a rolling release. Workers are terminated and re-provisioned. Any in-flight requests during this window may be dropped.
The user sees "targeting Python 3.11 (matched local interpreter)" in the build output — but no warning that this differs from what's currently deployed, and no confirmation prompt before the rolling release fires. The BREAKING CHANGE commit message documents the intent, but there's nothing user-facing that warns "your deployed workers are about to be recycled because your Python version changed."
Users on Python 3.12 are unaffected (old default = new default). Users on 3.10 and 3.11 take the hit silently on first deploy.
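One possible shape for the missing warning, as a sketch only. `drift_warning` is a hypothetical helper, and how Flash would look up the currently deployed target_python_version is assumed rather than taken from the PR.

```python
from typing import Optional


def drift_warning(deployed: Optional[str], resolved: str) -> Optional[str]:
    """Return a warning when the resolved version differs from the deployed one."""
    # No prior deployment, or no change: nothing to warn about.
    if deployed is None or deployed == resolved:
        return None
    return (
        f"Deployed workers target Python {deployed}; this deploy targets "
        f"Python {resolved}. Continuing will trigger a rolling release."
    )


# A 3.11 user upgrading from the old hardcoded 3.12 default.
print(drift_warning("3.12", "3.11"))
```

The caller could print the message, or require confirmation before proceeding.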
3. Issue: Phase coupling not enforced — deploy-time image-pull failure with no Flash error if 3.13 merges early
Scenario: The 3.13 worker images haven't shipped yet. This PR merges. A user forces install on Python 3.13 (e.g., pip install --ignore-requires-python or installs from source). They run flash deploy. Flash successfully builds the artifact and stamps target_python_version = "3.13". At deploy time, RunPod tries to pull runpod/flash:py3.13-latest — the image doesn't exist. The user sees a RunPod-level image-pull failure, not a Flash error.
The PR asks reviewers not to merge until images ship, but there's no runtime gate in the code. If merge timing slips, the code itself won't catch it. Worth considering whether the provisioner should check image availability or at least emit a warning when target_python_version = "3.13".
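A runtime gate could look roughly like this. It is a sketch under stated assumptions: the tag pattern is extrapolated from the single `runpod/flash:py3.13-latest` example above, `ensure_image_published` is hypothetical, and the registry lookup is injected as a callable because the PR does not show how the provisioner would query Docker Hub.

```python
from typing import Callable


def worker_image_tag(version: str) -> str:
    """Assumed tag pattern, based on the one example tag in this review."""
    return f"runpod/flash:py{version}-latest"


def ensure_image_published(version: str, image_exists: Callable[[str], bool]) -> str:
    """Fail with a Flash-level error before deploy if the worker image is missing."""
    tag = worker_image_tag(version)
    if not image_exists(tag):
        raise RuntimeError(
            f"Worker image {tag} is not published yet; pass --python-version "
            "with a version whose image is available."
        )
    return tag


# Usage with a stand-in registry containing only the pre-3.13 images.
published = {worker_image_tag(v) for v in ("3.10", "3.11", "3.12")}
print(ensure_image_published("3.12", published.__contains__))
```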
4. Question: CI/CD systems with fixed Python versions
Teams that run flash deploy from CI (e.g., GitHub Actions using python-version: 3.12) will pin to 3.12 automatically. But a developer on Python 3.11 running the same flash deploy locally would now deploy to 3.11 — causing the CI and developer environments to produce different manifests. The PR recommends declaring python_version explicitly as the mitigation. Is there a plan to warn when the resolved version differs from the previously deployed version, to make this situation visible before it causes a rolling release?
5. Nit: _python_version_source has no direct test
_python_version_source is a pure function with clear inputs and outputs (override → declaration → local interpreter). It has 4 branches, none of which are tested directly. The reconcile tests cover the correct version string being produced, but not the source label. A (None, {}) → "matched local interpreter" and ("3.12", {}) → "--python-version override" case would close this gap cheaply.
6. Nit: Stale comment in resource_provisioner.py
The diff updates the comment from # GPU uses GPU_BASE_IMAGE_PYTHON_VERSION to reference DEFAULT_PYTHON_VERSION — but the comment still says "Falls back to the caller-provided python_version for backward compatibility" with no explanation of what the caller-provided value is. Worth clarifying who the caller is (the provisioner entry point, not user code) to avoid future confusion.
Verdict: NEEDS WORK
Finding #1 (missing pyproject.toml update) is blocking — the PR advertises 3.13 support at the SDK level while pip continues to refuse installation on 3.13. Finding #2 (silent rolling release) warrants at least a note in the upgrade guide, ideally a deploy-time warning when target_python_version changes. Finding #3 is a known timing risk but worth a code comment or assertion. The core logic is correct and the tests are thorough.
🤖 Reviewed by Henrik's AI-Powered Bug Finder
Summary
Flips _reconcile_python_version from a hardcoded 3.12 default to sys.version_info, adds 3.13 to SUPPORTED_PYTHON_VERSIONS, prints the resolved Python version at build time, and updates user-facing docs.
Eliminates forced cloudpickle drift between the user's local interpreter and the deployed worker. Resolution order: (1) --python-version override, (2) single declared python_version across resources, (3) sys.version_info. No fallback to a hardcoded default.
Phase coupling
Worker images for {3.10..3.13} ship via runpod-workers/flash#94. Do not merge this PR until the worker rearchitecture has merged AND release-please has published the image tags to Docker Hub — otherwise users on local 3.13 hit image-pull failures at deploy time.
Changes
Test plan
Migration
BREAKING CHANGE on the feat(build): match local Python version by default commit (release-please will major-bump). Projects that previously deployed to 3.12 by default now deploy to whichever Python the user is running flash from. First deploy after upgrading triggers a rolling release because target_python_version flips. Teams that want lockstep behavior should declare python_version explicitly on each resource config.