[GCP] [BigQuery] Handle totalBytesProcessed NoneType #27474
Abacn merged 4 commits into apache:master
In addition to the report in #22701, we started seeing the same failure in our pipelines.
Error message from worker:
Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 623, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 1571, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "/usr/local/lib/python3.9/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1454, in process
    for part, size in self.restriction_provider.split_and_size(
  File "/usr/local/lib/python3.9/site-packages/apache_beam/transforms/core.py", line 331, in split_and_size
    for part in self.split(element, restriction):
  File "/usr/local/lib/python3.9/site-packages/apache_beam/io/iobase.py", line 1641, in split
    estimated_size = restriction.source().estimate_size()
  File "/usr/local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery.py", line 870, in estimate_size
    size = int(job.statistics.totalBytesProcessed)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
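For readers following along: the last frame calls `int()` directly on `totalBytesProcessed`, so a missing statistic raises the `TypeError`. Below is a minimal sketch of one way to guard that conversion. It is not the merged patch; the helper name `estimate_size_from_job` and the `SimpleNamespace` stand-ins for the job object are hypothetical and used only for illustration.

```python
import logging
from types import SimpleNamespace
from typing import Optional

_LOGGER = logging.getLogger(__name__)


def estimate_size_from_job(job) -> Optional[int]:
    """Return the byte estimate from a job, or None when it is unavailable.

    Hypothetical helper, assuming `job.statistics.totalBytesProcessed`
    is either None or something `int()` accepts.
    """
    total = job.statistics.totalBytesProcessed
    if total is None:
        # The statistic is absent, so report "size unknown" instead of raising.
        _LOGGER.warning("totalBytesProcessed is unavailable; size is unknown.")
        return None
    return int(total)


# Stand-ins for the job object (assumed shape, not the real client type).
hidden = SimpleNamespace(statistics=SimpleNamespace(totalBytesProcessed=None))
visible = SimpleNamespace(statistics=SimpleNamespace(totalBytesProcessed="1024"))
```

With these stand-ins, `estimate_size_from_job(hidden)` returns `None` rather than raising, and `estimate_size_from_job(visible)` returns the integer estimate.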
Checks are failing. Will not request review until checks are succeeding.
Codecov Report
@@ Coverage Diff @@
## master #27474 +/- ##
==========================================
+ Coverage 71.12% 71.17% +0.04%
==========================================
Files 860 861 +1
Lines 104573 104523 -50
==========================================
+ Hits 74378 74390 +12
+ Misses 28638 28585 -53
+ Partials 1557 1548 -9
Flags with carried forward coverage won't be shown.
... and 28 files with indirect coverage changes
Assigning reviewers. R: @AnandInguva for label python.
The PR bot will only process comments in the main thread (not review comments).
Reminder, please take a look at this PR: @AnandInguva @ahmedabu98
ahmedabu98 left a comment:
This LGTM and is in line with the BoundedSource documentation (beam/sdks/python/apache_beam/io/iobase.py, lines 156 to 158 in b54bf52).
@ahmedabu98 @Abacn Thank you for the approval and merge ❤️
Hi, 2.50.0 is scheduled for early September.
* [GCP] [BigQuery] Handle totalBytesProcessed NoneType
* Update CHANGES.md
* lint / whitespace
---------
Co-authored-by: Yi Hu <yathu@google.com>
Fixes #22701.
Some queries may not have access to `totalBytesProcessed` as a result of row-level security, per the BigQuery docs.
If any maintainer has some advice on where a good place to implement tests for this is, please let me know :)
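On the testing question, one possible shape for a unit test is sketched below. It mocks the job object rather than calling BigQuery. The function `estimate_size` here is a hypothetical stand-in for the guarded logic, not Beam's actual method, and the mocked attribute layout is an assumption based on the traceback in #22701.

```python
import unittest
from unittest import mock


def estimate_size(job):
    # Hypothetical stand-in for the guarded conversion; not Beam's real code.
    total = job.statistics.totalBytesProcessed
    return None if total is None else int(total)


class EstimateSizeTest(unittest.TestCase):
    def _job(self, total_bytes):
        # Build a mock job exposing only the attribute the logic reads.
        job = mock.Mock()
        job.statistics.totalBytesProcessed = total_bytes
        return job

    def test_returns_none_when_statistic_is_missing(self):
        self.assertIsNone(estimate_size(self._job(None)))

    def test_returns_int_when_statistic_is_present(self):
        self.assertEqual(estimate_size(self._job("2048")), 2048)
```

The two cases cover the missing statistic and the normal path. A real test would presumably patch the BigQuery wrapper used by the source instead of a free function.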