Avoid StackOverflowException when Kafka.poll() 'timeout' param is 0 #2
Implement fetchBatch() as a loop, for the case where the kafka.poll() timeout is too short for data to reach the Kafka client from the Kafka server
Thanks for this @mariobriggs
Should we take out the recursive call below then?
I think the whole if clause can go then?
if (!iter.hasNext) {
  if (requestOffset < part.untilOffset) {
    return getNext()
  }
  assert(requestOffset == part.untilOffset, errRanOutBeforeEnd(part))
  finished = true
  null.asInstanceOf[R]
} else {
I ask because iter.hasNext will always be true when fetchBatch returns, since it won't return until it has a non-empty batch to hand back.
BTW, we should definitely test to make sure this works when the topic is empty and doesn't stall. I will take care of that. Let me know what you think of the above.
I think removing the recursive call is right.
> make sure this works when the topic is empty
The '&& requestOffset < part.untilOffset' condition catches that, I think.
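For reference, the loop being discussed could look roughly like the sketch below. This is not the code from the patch: FakeConsumer, FetchBatchSketch, and the parameter names are hypothetical stand-ins so the example runs without a Kafka broker, and the real fetchBatch works on the consumer and partition state inside the RDD iterator. It only illustrates why a loop keeps the stack flat when poll() returns nothing, and why the 'requestOffset < untilOffset' guard stops an empty range from polling forever.

```scala
// Hypothetical stand-in for the Kafka consumer: each poll() hands back whatever
// has "arrived" so far, which may be nothing. timeoutMs is ignored here.
final class FakeConsumer(arrivals: Iterator[Seq[String]]) {
  var polls = 0
  def poll(timeoutMs: Long): Seq[String] = {
    polls += 1
    if (arrivals.hasNext) arrivals.next() else Seq.empty
  }
}

object FetchBatchSketch {
  // Keep polling until a non-empty batch arrives. The offset guard skips the
  // loop entirely when there is nothing left to read (e.g. an empty topic).
  // Being a plain loop, it uses constant stack however many polls come back empty.
  def fetchBatch(
      consumer: FakeConsumer,
      requestOffset: Long,
      untilOffset: Long,
      timeoutMs: Long): Iterator[String] = {
    var batch: Seq[String] = Seq.empty
    while (batch.isEmpty && requestOffset < untilOffset) {
      batch = consumer.poll(timeoutMs)
    }
    batch.iterator
  }
}
```

In this sketch, a consumer that returns thousands of empty polls before any data just costs more loop iterations, and a call with requestOffset equal to untilOffset returns an empty iterator without polling at all.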
I am merging this, thanks again for this @mariobriggs
## What changes were proposed in this pull request?
This reopens apache#11836, which was merged but promptly reverted because it introduced flaky Hive tests.
## How was this patch tested?
See `CatalogTestCases`, `SessionCatalogSuite` and `HiveContextSuite`.
Author: Andrew Or <andrew@databricks.com>
Closes apache#11938 from andrewor14/session-catalog-again.