[BEAM-13984] Implement RunInference for PyTorch #17196
TheNeuralBit merged 18 commits into apache:master
R: @ryanthompson591 @AnandInguva Please take a look, thanks! Still working on fixing pytorch imports in the meantime.
I think I should remove any traces of GPU logic lest we give the impression that it has been fully tested. Or is the minimal GPU logic I have ok?
ryanthompson591 left a comment
Looks good. Like the tests.
sdks/python/setup.py (outdated)
Does this mean all Python SDKs require PyTorch going forward? Is this a heavy requirement?
If so, let's make sure it's fine to require this for everyone.
I can't see the original change now, but I think this was in an extra. It would be a requirement that's added when you pip install apache-beam[ml], so it wouldn't be a hard dependency for all users.
Regardless, I think we should hold off on adding a requirement spec anywhere, unless we need to communicate that there's a restricted version range we support.
For now, to allow the most flexibility, we're going to support pip install apache-beam pytorch
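One way to keep PyTorch optional, as this thread discusses, is to defer the import until a PyTorch-specific code path actually runs. The sketch below is an illustration under that assumption, not the code in this PR; `require_module` is a hypothetical helper name, not a Beam API. Note that the PyTorch distribution on PyPI is named `torch`.

```python
import importlib
import importlib.util


def require_module(name, install_hint):
    """Import a top-level module lazily, with a helpful error if it is missing.

    Hypothetical helper for illustration; not part of Beam.
    """
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(name) is None:
        raise ImportError(
            f"{name!r} is required for this feature. "
            f"Install it with: {install_hint}")
    return importlib.import_module(name)
```

A PyTorch-specific module could then call `torch = require_module("torch", "pip install torch")` at the point of use, so that users who never touch the PyTorch code path do not need the package installed.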
R: @TheNeuralBit for review and merge
Shouldn't our Beam abstraction be opinionated about the element type instead of using the native type for each library? Otherwise I'd argue this is just a library of similar-looking extensions for different ML libraries
Should this use FileSystem to add support for gs:// and s3:// paths? This seems to rely on reading from a local filesystem which is problematic when running on distributed workers.
(admittedly I may be misunderstanding when this is used as I'm still catching up on this work)
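A minimal sketch of the suggestion above, assuming the state dict is read through Beam's `FileSystems.open`, which selects a filesystem implementation from the path scheme. The function name `read_state_dict_bytes` and the `open_fn` parameter are hypothetical and exist only so the sketch can be exercised on a local file without Beam installed; this is not necessarily how the PR implements it.

```python
import io


def _beam_open(path):
    # Imported lazily so the rest of this sketch works without Beam installed.
    # FileSystems.open resolves the filesystem from the path scheme
    # (for example gs://, s3://, or a plain local path).
    from apache_beam.io.filesystems import FileSystems
    return FileSystems.open(path)


def read_state_dict_bytes(path, open_fn=_beam_open):
    """Read the file at `path` fully into an in-memory, seekable buffer.

    `open_fn` is a hypothetical injection point used for illustration.
    """
    f = open_fn(path)
    try:
        return io.BytesIO(f.read())
    finally:
        f.close()
```

The returned buffer is a seekable file-like object, so it could be handed to `torch.load`, which accepts file-like objects as well as paths.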
Good point, I can look into adding that. My initial work assumed that we are reading only from the local filesystem.
Summary
- self._state_dict_path is the path to a file that stores model states.
- self._model_class is a Python PyTorch class that defines the model structure.
We're basically reading in a dictionary of coefficients/parameters/states that specify how to populate the model's structure (passed in via the argument model_class) with certain values.
The model loaded by load_model will then be acquired by a Shared() instance.
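The two-step pattern in this summary (build the structure from a class, then fill it from a stored dictionary) can be sketched without PyTorch. Everything below is a plain-Python stand-in: `LinearRegression` mimics only the `load_state_dict` method of a `torch.nn.Module`, and `pickle` stands in for the serialized state dict. In real PyTorch code the corresponding calls would be `model_class(...)`, `torch.load(...)`, and `model.load_state_dict(...)`.

```python
import pickle


class LinearRegression:
    """Plain-Python stand-in for the model class (not a torch.nn.Module)."""

    def __init__(self):
        # The class fixes the structure; the values start as placeholders.
        self.weight = 0.0
        self.bias = 0.0

    def load_state_dict(self, state_dict):
        # Populate the structure with the stored parameter values.
        self.weight = state_dict["weight"]
        self.bias = state_dict["bias"]

    def __call__(self, x):
        return self.weight * x + self.bias


def load_model(model_class, state_dict_path):
    """Instantiate the structure, then fill it from the stored dictionary."""
    model = model_class()
    with open(state_dict_path, "rb") as f:
        state_dict = pickle.load(f)
    model.load_state_dict(state_dict)
    return model
```

Because loading can be expensive, the result of a function like `load_model` is the kind of object one would want to construct once and share, which is the role the Shared() instance plays in the summary.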
Codecov Report
@@            Coverage Diff             @@
##           master   #17196      +/-   ##
==========================================
- Coverage   73.99%   73.99%   -0.01%
==========================================
  Files         685      685
  Lines       89727    89735       +8
==========================================
+ Hits        66395    66399       +4
- Misses      22172    22176       +4
  Partials     1160     1160
3320640 to d0473fd
ryanthompson591 left a comment
Looks good. I like it.
    for example in examples]).reshape(-1, 1))
]

gs_pth = 'gs://apache-beam-ml/pytorch_lin_reg_model_2x+0.5_state_dict.pth'
I didn't test GCS in my implementation. Does it make sense to move these larger E2E tests into a separate module, apart from the more unit-test-like tests?
You're probably right about separating out the E2E tests. I was thinking that this should be small enough to verify the usage of the FileSystems module, though. Perhaps I can break it out when we start adding the E2E testing file.
I think it's preferable not to require GCP credentials for a unit test. I'm actually not clear on this: can someone run this unit test and read this GCS path without setting up GCP credentials?
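If the GCS read does turn out to need credentials, one option is to skip the test when none are configured. The sketch below assumes credentials are signalled by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which is only one of several ways credentials can be supplied; `requires_gcp_credentials` and the test class name are hypothetical, and the test body is a placeholder rather than the PR's test.

```python
import os
import unittest

# Assumption: a service-account key file is advertised via this variable.
_CREDENTIALS_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def requires_gcp_credentials(environ=os.environ):
    """Return a decorator that skips a test unless credentials appear set.

    Hypothetical helper; `environ` is injectable so the check can be tested.
    """
    return unittest.skipUnless(
        bool(environ.get(_CREDENTIALS_VAR)),
        f"{_CREDENTIALS_VAR} is not set; skipping GCS test")


class GcsStateDictTest(unittest.TestCase):
    @requires_gcp_credentials()
    def test_read_state_dict_from_gcs(self):
        gs_pth = (
            "gs://apache-beam-ml/"
            "pytorch_lin_reg_model_2x+0.5_state_dict.pth")
        # Placeholder: the real test would load the model from gs_pth here.
        self.assertTrue(gs_pth.startswith("gs://"))
```

A skip keeps the default unit-test run free of cloud dependencies while still exercising the GCS path wherever credentials are available.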
@TheNeuralBit thanks for providing feedback on this change. Have all your comments been addressed?
toxTask "testPy38pytorch-110", "py38-pytorch-110"
test.dependsOn "testPy38pytorch-110"
preCommitPy38.dependsOn "testPy38pytorch-110"
Does pytorch commonly make changes that will break us between minor versions? We test different minor versions for pandas because our special usage of pandas in the DataFrame API leads to breakages even between minor versions. In the case of pyarrow, every release is a major version, which is meant to communicate that the API can change (https://arrow.apache.org/docs/format/Versioning.html).
How will we keep this up to date as new versions of pytorch come out?
Neither of these are blockers, but these are questions we should consider
Pytorch has a 90-day release cycle, but it doesn't seem that their changes really touch on the APIs that we use.
Maybe we can test the most recent release (pytorch 1.11.0), along with the last patch release of each of the last X minor versions (pytorch 1.10.2, pytorch 1.9.1, pytorch 1.8.2, ...). I'll create a ticket to investigate this.
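Reading the proposal as "the newest patch release of each recent minor line", the selection could be computed as in the sketch below. The function name is hypothetical and the input list is illustrative, mirroring the versions named in the comment; how the result would feed into the tox or Gradle configuration is left open.

```python
def latest_patch_per_minor(versions, num_minors):
    """Pick the newest patch release for each of the newest minor lines."""

    def parse(version):
        # "1.10.2" -> (1, 10, 2), so comparisons are numeric, not textual.
        return tuple(int(part) for part in version.split("."))

    latest = {}
    for version in versions:
        major, minor, _patch = parse(version)
        key = (major, minor)
        if key not in latest or parse(version) > parse(latest[key]):
            latest[key] = version

    # Newest minor lines first, truncated to the requested count.
    newest_first = sorted(latest, reverse=True)
    return [latest[key] for key in newest_first[:num_minors]]


# Illustrative input only.
releases = [
    "1.8.0", "1.8.1", "1.8.2",
    "1.9.0", "1.9.1",
    "1.10.0", "1.10.1", "1.10.2",
    "1.11.0",
]
print(latest_patch_per_minor(releases, 4))
```

Parsing into integer tuples matters here: a plain string sort would place 1.9 after 1.10.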
Just checked: I'm able to read from the path because I have permissions to the
I think it's ok to keep the unit test in for now. It's useful to have it validate that functionality until we have an E2E test that can do it.
Pytorch implementation of RunInference