Extend flytekit version hash calculation to be pluggable #2428

Open
ddl-rliu wants to merge 2 commits into flyteorg:master from dominodatalab:rliu.extend-flytekit-pluggable

@ddl-rliu ddl-rliu commented May 16, 2024

Contributor

Tracking issue

Closes flyteorg/flyte#5364

Why are the changes needed?

Extends https://github.com/flyteorg/flytekit/pull/2039/files – that PR gives the minimum API surface for configuring the API via FlytekitPlugin.

This PR makes pyflyte more pluggable by external libraries, specifically by letting a plugin control the additional context used to generate the version string hash.

What changes were proposed in this pull request?

This can be used with a plugin like:

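from typing import List, Union

from flytekit.configuration.plugin import FlytekitPlugin
from flytekit.core.base_task import PythonTask
from flytekit.core.python_auto_container import PythonAutoContainerTask
from flytekit.core.workflow import WorkflowBase


class MyPlugin(FlytekitPlugin):
    @staticmethod
    def get_additional_context(entity: Union[PythonAutoContainerTask, WorkflowBase]) -> List[str]:
        """Get additional context to be used for calculating the version hash."""
        if isinstance(entity, PythonTask):
            return [str(entity.task_config)]
        if isinstance(entity, WorkflowBase):
            task_configs = []
            for n in entity.nodes:
                # Recurse into each node's entity to collect nested task configs.
                task_configs.extend(MyPlugin.get_additional_context(n.flyte_entity))
            return task_configs
        return []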

This will add the task config to the calculation of the version hash.
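
To illustrate why folding the task config into the hash changes the version, here is a minimal sketch. It is not flytekit's implementation: `version_from_hash` is a hypothetical, simplified stand-in for `FlyteRemote._version_from_hash`, the output encoding is an assumption, and the config strings are invented for illustration.

```python
import base64
import hashlib


def version_from_hash(md5_bytes: bytes, *additional_context: str) -> str:
    """Hypothetical stand-in: fold extra context strings into a version hash."""
    h = hashlib.md5(md5_bytes)
    for item in additional_context:
        # Each context string contributes to the digest, so changing any of
        # them changes the resulting version.
        h.update(item.encode("utf-8"))
    # URL-safe base64 without padding (assumed output format for this sketch).
    return base64.urlsafe_b64encode(h.digest()).decode("ascii").rstrip("=")


# Same source digest, two different (invented) task configs.
source_digest = hashlib.md5(b"def my_task(): ...").digest()
v_small = version_from_hash(source_digest, "TaskConfig(cpu=1)")
v_large = version_from_hash(source_digest, "TaskConfig(cpu=4)")
```

With identical source code, the two versions differ only because the context strings differ, which is the behaviour the linked issue asks for.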

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

codecov Bot commented May 17, 2024


Codecov Report

Attention: Patch coverage is 0%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 72.09%. Comparing base (e6e08f9) to head (a744848).
Report is 15 commits behind head on master.

Current head a744848 differs from pull request most recent head d4cda94

Please upload reports for the commit d4cda94 to get more accurate results.

| Files | Patch % | Lines |
|---|---|---|
| flytekit/remote/remote.py | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2428       +/-   ##
===========================================
+ Coverage   42.77%   72.09%   +29.31%     
===========================================
  Files         185      181        -4     
  Lines       18677    18397      -280     
  Branches     2665     3601      +936     
===========================================
+ Hits         7990    13264     +5274     
+ Misses      10599     4508     -6091     
- Partials       88      625      +537     

@ddl-rliu ddl-rliu changed the title from "x" to "Extend flytekit version hash calculation to be pluggable" May 22, 2024
@ddl-rliu ddl-rliu force-pushed the rliu.extend-flytekit-pluggable branch 3 times, most recently from 8d0645c to 2951085 Compare May 22, 2024 19:56
Comment thread flytekit/remote/remote.py
if isinstance(entity, WorkflowBase):
default_inputs = entity.python_interface.default_inputs_as_kwargs

from flytekit.configuration.plugin import get_plugin

Contributor

Probably not the ideal place to have an import.

It also seems like the additional_context (which could probably be named a bit more descriptively?) value should be passed into the register_script function, shouldn't it? The pattern of pulling in special state like this makes this code a bit harder to test / reason about I think.

@ddl-rliu ddl-rliu May 22, 2024

Contributor Author

I use lazy import here to solve the circular import issue (remote imports plugin, plugin imports remote)

Contributor

Yeah, I've used this tactic to reduce startup time in Ruby -- but:

  • It looks like there's already a lazy_module helper in Flytekit
  • If this is a tactic for dealing with a potential circular dependency, that can be a sign of a flawed design
  • Remote now takes a dependency on this plugin helper method -- it's cleaner if you can invert the behavior so that this function doesn't need to know anything about finding special mutation functions in plugins
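
The inversion suggested in the last bullet could look roughly like the sketch below: the caller resolves any plugin-provided context and passes it in, so the registration function needs no knowledge of plugins. All names here (`register_script`, `version_hash_additional_context`, `plugin_context`) are hypothetical simplifications, not flytekit's actual signatures.

```python
import hashlib
from typing import List, Sequence


def register_script(source: bytes, version_hash_additional_context: Sequence[str] = ()) -> str:
    """Hypothetical registration step that only sees plain values."""
    h = hashlib.sha256(source)
    for item in version_hash_additional_context:
        h.update(item.encode("utf-8"))
    return h.hexdigest()


def plugin_context(entity_config: str) -> List[str]:
    """Hypothetical plugin hook, resolved by the caller rather than by register_script."""
    return [entity_config]


# The caller wires the two together; a test can pass a literal list instead.
version = register_script(b"workflow source", plugin_context("TaskConfig(cpu=1)"))
```

Because the context arrives as an argument, the function can be exercised in a unit test without any plugin machinery or lazy imports.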

@ddl-rliu ddl-rliu May 22, 2024

Contributor Author

I could rename additional_context to version_hash_additional_context? I wouldn't rename it to task_configs because it does not necessarily contain the task configs, that only happens for a particular use case.

Moving it to a parameter of register_script seems sensible, but for the moment I'd like to keep the PR easy to review by not changing too many methods. I'll give it a try at a later point.

edit: will rename to version_hash_additional_context

Comment thread flytekit/remote/remote.py Outdated
# For that add the hash of the compilation settings to hash of file
version = self._version_from_hash(
-    md5_bytes, serialization_settings, default_inputs, *_get_image_names(entity)
+    md5_bytes, serialization_settings, default_inputs, *_get_image_names(entity), *additional_context

Contributor

Would it make more sense to have pluggable versioning instead? i.e. the plugin defines a custom version function that gets passed md5_bytes, serialization_settings, default_inputs, *_get_image_names(entity) if it exists? Would such a function need a dual?
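
One way to read this suggestion is sketched below, purely as an assumption about what "pluggable versioning" might mean: the plugin may supply a whole version function, and the default is used otherwise. `default_version`, `resolve_version`, and `custom_version` are hypothetical names, not flytekit APIs.

```python
import hashlib
from typing import Callable, Optional

VersionFn = Callable[..., str]


def default_version(md5_bytes: bytes, *image_names: str) -> str:
    """Hypothetical default: hash the source digest plus image names."""
    h = hashlib.md5(md5_bytes)
    for name in image_names:
        h.update(name.encode("utf-8"))
    return h.hexdigest()


def resolve_version(md5_bytes: bytes, *image_names: str, custom: Optional[VersionFn] = None) -> str:
    """Use the plugin-supplied function if one exists, else the default."""
    fn = custom if custom is not None else default_version
    return fn(md5_bytes, *image_names)


def custom_version(md5_bytes: bytes, *image_names: str) -> str:
    """Hypothetical plugin override that reuses the default and adds a prefix."""
    return "custom-" + default_version(md5_bytes, *image_names)[:8]
```

The trade-off raised in the reply that follows is visible here: a full override gives the plugin total control, but the core hashing logic can then be bypassed or duplicated by each plugin.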

Contributor Author

My feeling is that core logic i.e. the default version hash logic, should remain in its section, rather than moved to the plugin methods. This mostly follows the existing pattern I noticed in FlytekitPlugin, where core logic is not often moved to its methods – however there are some exceptions, as I suppose with FlytekitPlugin.get_remote

@ddl-rliu ddl-rliu marked this pull request as ready for review May 22, 2024 23:42
ddl-rliu added a commit to dominodatalab/flytekit that referenced this pull request May 23, 2024
- This is necessary due to another pending PR to upstream flytekit:
  flyteorg#2428

  In case this PR is not likely to be merged, we have a plan to move away
  from this change, see the linked Doc in DOM-57472
ddl-rliu added a commit to dominodatalab/flytekit that referenced this pull request Jul 8, 2024
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Aug 8, 2024
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Aug 9, 2024
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Aug 16, 2024
@ddl-ebrown

Contributor

@pingsutw this is a similar "extension" to #2661

Should we try to get these 2 PRs lined up for the new release of flytekit @ddl-rliu ?

Signed-off-by: ddl-rliu <richard.liu@dominodatalab.com>
@ddl-rliu ddl-rliu force-pushed the rliu.extend-flytekit-pluggable branch from d4cda94 to 4992305 Compare August 16, 2024 22:07

@thomasjpfan thomasjpfan left a comment

Contributor

I prefer not to have a new extension point in FlytekitPluginProtocol if there is a simpler alternative.

For your use case, can your external library provide a FlyteRemote subclass and override _version_from_hash? This FlyteRemote subclass can then be returned by your custom FlytekitPluginProtocol.get_remote.

@ddl-rliu

Contributor Author

@thomasjpfan Thanks, it's an interesting idea, and I never thought about that possibility! With a modest refactor of _version_from_hash, particularly to accept entity as a parameter, I think it will work exactly like you say. Proof of concept refactor: https://github.com/flyteorg/flytekit/pull/2688/files

To explain more – the parent issue deals with adding entity.task_config to the version hash calculation, therefore, the refactor would expose entity via a parameter to _version_from_hash.
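
The subclassing approach discussed here can be sketched as follows. This is an illustration under stated assumptions, not the code from the linked proof of concept: `Entity`, `BaseRemote`, and `ConfigAwareRemote` are hypothetical stand-ins, and the real `FlyteRemote._version_from_hash` takes more arguments than shown.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Entity:
    """Hypothetical stand-in for a task entity carrying a task_config."""

    name: str
    task_config: Optional[Any] = None


class BaseRemote:
    """Hypothetical stand-in for FlyteRemote after the proposed refactor."""

    def _version_from_hash(self, md5_bytes: bytes, entity: Entity) -> str:
        # Default behaviour ignores the entity's task config.
        return hashlib.md5(md5_bytes).hexdigest()


class ConfigAwareRemote(BaseRemote):
    """Subclass an external library might return from its get_remote hook."""

    def _version_from_hash(self, md5_bytes: bytes, entity: Entity) -> str:
        h = hashlib.md5(md5_bytes)
        # Fold the task config into the digest so config changes bump the version.
        h.update(str(entity.task_config).encode("utf-8"))
        return h.hexdigest()
```

The override only works because `entity` is passed in, which is why the refactor to expose it as a parameter is needed first.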

ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Aug 29, 2024
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Oct 8, 2024
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Jan 17, 2025
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Jan 17, 2025
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Mar 20, 2025
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Mar 20, 2025
ddl-rliu added a commit to dominodatalab/flytekit that referenced this pull request Jul 7, 2025
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Sep 15, 2025
  [DOM-68737] Refactor version_hash_additional_context to use bytes (#7)

  `bytes` is more flexible. This flexibility is relevant when including the
  serialized input bindings (bytes) in the version hash calculations. This
  is necessary to fix a bug that can occur:

  There is a bug where Flyte rejects a valid workflow. This can happen when
  workflow inputs are read from a file, causing the workflow input bindings
  to be updated, without an update to the version.

  Refactor version_hash_additional_context, which currently uses `str`,
  to use `bytes`
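
A rough sketch of why `bytes` is the more flexible type for this context, assuming a simplified hash function: serialized bindings can be fed to the digest directly, and string context can still be encoded by the caller. `version_from_hash` and `serialize_bindings` are hypothetical helpers, and JSON is used only as a stand-in serialization format.

```python
import hashlib
import json
from typing import Iterable


def version_from_hash(md5_bytes: bytes, version_hash_additional_context: Iterable[bytes]) -> str:
    """Hypothetical stand-in that accepts bytes context instead of str."""
    h = hashlib.md5(md5_bytes)
    for chunk in version_hash_additional_context:
        h.update(chunk)  # bytes need no encoding step
    return h.hexdigest()


def serialize_bindings(bindings: dict) -> bytes:
    """Hypothetical stand-in for serialized workflow input bindings."""
    return json.dumps(bindings, sort_keys=True).encode("utf-8")


# Changing an input binding changes the version, even with identical source.
v1 = version_from_hash(b"src", [serialize_bindings({"x": 1})])
v2 = version_from_hash(b"src", [serialize_bindings({"x": 2})])
```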
ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Sep 15, 2025
- NOTE: the flytekit 1.16 release changed hash calculation a bit with
  flyteorg#3247, so this commit is a
  little different from what was done on the 1.15.4 version of the
  Domino branch (and the 2 commits for this were squashed into one)

ddl-ebrown pushed a commit to dominodatalab/flytekit that referenced this pull request Sep 16, 2025
ddl-rliu added a commit to ddl-rliu/flytekit that referenced this pull request Mar 23, 2026
Development

Successfully merging this pull request may close these issues.

[BUG] Task config should be used when computing the task version hash

3 participants