Skip to content

Include hash of pod template when computing fast registration version#3247

Merged
davidmirror-ops merged 1 commit into
masterfrom
danielsola/se-378-account-for-image-and-pod-template-in-registration-command
May 8, 2025
Merged

Include hash of pod template when computing fast registration version#3247
davidmirror-ops merged 1 commit into
masterfrom
danielsola/se-378-account-for-image-and-pod-template-in-registration-command

Conversation

@dansola

@dansola dansola commented May 7, 2025

Copy link
Copy Markdown
Contributor

Why are the changes needed?

We need to account for changes to the image and pod template based on resolution of env variables rather than a code change. e.g. a user may have an env variable that controls the base image of an imagespec. If the env variable changes, the version we compute should change, but currently it does not.

There are two paths fast registration can take:

  1. pyflyte run -> version computed in remote.py
  2. pyflyte register -> version computed in repo.py

In remote.py, we account for a change in container_image but not pod_template. In repo.py, don't account for for either.

What changes were proposed in this pull request?

  1. Add a method to PodTemplate that hashes its contents.
  2. Add a method that searches an entity for a pod template and adds its hash to a list.
  3. Include the list of pod template hashes in the remote.py version computation.
  4. Include the list of pod template hashes and container_images in the repo.py version computation.

How was this patch tested?

Before:

  1. Define a workflow with a pod template where the image is defined in an imagespec which takes a base image from an env variable.
  2. pyflyte register with two different base images defined via the env variable -> get failure:
Request rejected by the API, due to Invalid input.
RPC Failed, with Status: StatusCode.INVALID_ARGUMENT
        Details: wf.get_data task with different structure already exists. (Please register a new version of the task):
/template/Target/K8SPod/pod_spec/containers/0/image:
        - ghcr.io/dansola/test-image:1b8tlwn3V2jJmbKngwS6Pg 
        + ghcr.io/dansola/test-image:r754JC2IneS_Ey0WqswIVw
  1. pyflyte run --remote with two different base images defined via the env variable -> get the same failure as above

After:

  1. Define a workflow with a pod template where the image is defined in an imagespec which takes a base image from an env variable.
  2. pyflyte register with two different base images defined via the env variable -> no failure and a re-registration doesn't generate a new version
  3. pyflyte run --remote with two different base images defined via the env variable -> no failure

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Summary by Bito

This PR enhances versioning by incorporating pod template changes alongside container images in version computation. It adds a hashing method in PodTemplate to capture environment variable changes reflecting image modifications. The implementation spans remote.py and repo.py to ensure consistent behavior across registration and run commands, with comprehensive tests validating functionality.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 2

@flyte-bot

flyte-bot commented May 7, 2025

Copy link
Copy Markdown
Contributor

Code Review Agent Run #47a158

Actionable Suggestions - 0
Additional Suggestions - 1
  • flytekit/core/pod_template.py - 1
    • Missing docstring in public module · Line 1-1
Review Details
  • Files reviewed - 4 · Commit Range: 6c294e3..6c294e3
    • flytekit/core/pod_template.py
    • flytekit/remote/remote.py
    • flytekit/tools/repo.py
    • tests/flytekit/unit/core/test_pod_template_hash.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

Refer to the documentation for additional commands.

Configuration

This repository uses code_review_bito You can customize the agent settings here or contact your Bito workspace admin at haytham@union.ai.

Documentation & Help

AI Code Review powered by Bito Logo

@flyte-bot

Copy link
Copy Markdown
Contributor

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Pod Template Serialization and Hashing Enhancement

pod_template.py - Introduced a serialize_pod_template function and added a version_hash method to compute a consistent hash for PodTemplate objects.

Feature Improvement - Remote Version Computation Enhancement

remote.py - Updated version computation in remote.py to include pod template hashes by adding static helper methods and modifying _version_from_hash calls.

Feature Improvement - Repo Version Computation Enhancement

repo.py - Modified the repo registration path to aggregate pod template hashes along with container images during version computation.

Testing - Pod Template Hashing Unit Tests

test_pod_template_hash.py - Added extensive unit tests to verify hash consistency, order independence, and differences in pod template attributes.

@davidmirror-ops davidmirror-ops merged commit c95a3df into master May 8, 2025
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Sep 15, 2025
- NOTE: the flytekit 1.16 release changed hash calculation a bit with
  flyteorg#3247, so this commit is a
  little different from what was done on the 1.15.4 version of the
  Domino branch (and the 2 commits for this were squashed into one)

- This is necessary due to another pending PR to upstream flytekit:
  flyteorg#2428

  In case this PR is not likely to be merged, we have a plan to move away
  from this change, see the linked Doc in DOM-57472

  [DOM-68737] Refactor version_hash_additional_context to use bytes (#7)

  `bytes` is more flexible. This flexibility is relevant when including the
  serialized input bindings (bytes) in the version hash calculations. This
  is necessary to fix a bug that can occur:

  There is a bug where Flyte rejects a valid workflow. This can happen when
  workflow inputs are read from a file, causing the workflow input bindings
  to be updated, without an update to the version.

  Refactor version_hash_additional_context, which currently uses `str`,
  to use `bytes`
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Sep 16, 2025
- NOTE: the flytekit 1.16 release changed hash calculation a bit with
  flyteorg#3247, so this commit is a
  little different from what was done on the 1.15.4 version of the
  Domino branch (and the 2 commits for this were squashed into one)

- This is necessary due to another pending PR to upstream flytekit:
  flyteorg#2428

  In case this PR is not likely to be merged, we have a plan to move away
  from this change, see the linked Doc in DOM-57472

  [DOM-68737] Refactor version_hash_additional_context to use bytes (#7)

  `bytes` is more flexible. This flexibility is relevant when including the
  serialized input bindings (bytes) in the version hash calculations. This
  is necessary to fix a bug that can occur:

  There is a bug where Flyte rejects a valid workflow. This can happen when
  workflow inputs are read from a file, causing the workflow input bindings
  to be updated, without an update to the version.

  Refactor version_hash_additional_context, which currently uses `str`,
  to use `bytes`
Atharva1723 pushed a commit to Atharva1723/flytekit that referenced this pull request Oct 5, 2025
Signed-off-by: Atharva <atharvakulkarni172003@gmail.com>
ddl-rliu added a commit to ddl-rliu/flytekit that referenced this pull request Mar 23, 2026
- NOTE: the flytekit 1.16 release changed hash calculation a bit with
  flyteorg#3247, so this commit is a
  little different from what was done on the 1.15.4 version of the
  Domino branch (and the 2 commits for this were squashed into one)

- This is necessary due to another pending PR to upstream flytekit:
  flyteorg#2428

  In case this PR is not likely to be merged, we have a plan to move away
  from this change, see the linked Doc in DOM-57472

  [DOM-68737] Refactor version_hash_additional_context to use bytes (flyteorg#7)

  `bytes` is more flexible. This flexibility is relevant when including the
  serialized input bindings (bytes) in the version hash calculations. This
  is necessary to fix a bug that can occur:

  There is a bug where Flyte rejects a valid workflow. This can happen when
  workflow inputs are read from a file, causing the workflow input bindings
  to be updated, without an update to the version.

  Refactor version_hash_additional_context, which currently uses `str`,
  to use `bytes`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants