[Python]Remove get_artifacts in MLTranform since artifacts are stored in artifact location#29016
Conversation
Codecov Report
@@ Coverage Diff @@
## master #29016 +/- ##
==========================================
- Coverage 38.41% 38.39% -0.02%
==========================================
Files 686 686
Lines 101614 101656 +42
==========================================
- Hits 39033 39032 -1
- Misses 60999 61042 +43
Partials 1582 1582
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 12 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
602752d to
6d39769
Compare
6d39769 to
0fab242
Compare
|
Run Python PreCommit 3.8 |
|
Run Python_Integration PreCommit 3.8 |
|
Run Python_Integration PreCommit |
|
Run Python_Runners PreCommit |
|
R: @damccorm |
|
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
| | "MLTransform" >> base.MLTransform( | ||
| write_artifact_location=self.artifact_location).with_transform( | ||
| tft.ScaleTo01(columns=['x']))) | ||
| _ = (list_result | beam.Map(assert_ScaleTo01_artifacts)) |
There was a problem hiding this comment.
Rather than removing all of these checks, can we check that the written artifact is correct?
There was a problem hiding this comment.
I added #29017 to do this but for now, we will have to use some TF tools to do it. It is not clear how to do it but it is not straight forward.
To read the artifacts, we use read_artifact_location and there are tests covering this path.
damccorm
left a comment
There was a problem hiding this comment.
LGTM outside of one minor suggestion (and getting checks green)
MLTransform outputs artifacts in the output dictionary. Initially, it was designed to provide users artifacts such as
minandmaxvalues from the transformScaleTo01. Now since the concept of artifact location is used, we store the artifacts directly in the artifact location and provide the user the option ofread_artifact_locationin MLTransform.In the efforts of read and write artifact location, let us remove outputting artifacts in the output dict since they might create noise in the user data. This is a breaking change but since MLTransform is not yet launched and possibly, there is no usage yet, I would like to mention this in CHANGES.md and merge the breaking changes.
Note: The artifacts are embedded in a TF graph and it is easier to read and use them with MLTransform.
Filed #29017 to export MLTransform artifacts in human readable format along with TF graph.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.