Draft: implement non-blocking morsel API #21852

Draft
alamb wants to merge 1 commit into apache:main from alamb:alamb/non_block_morsels

alamb commented Apr 25, 2026

TODO

  • Tests
  • Make a PR to refactor the large poll_scan loop into separate functions to reduce the indent level / control flow

Which issue does this PR close?

Rationale for this change

In #21342 (comment), @adriangb pointed out that the current Morsel API relies on a comment rather than the type system to separate IO and CPU.

Also, the current Parquet opener now actually does IO in the stream reader, which makes overlapping IO and CPU work harder.

What changes are included in this PR?

  1. Change the signature of Morsel::into_stream to return either a Sync or an Async stream.
  2. Adjust the Parquet reader accordingly.
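
As a rough illustration of the first change, here is a minimal sketch of how a return type could encode the IO/CPU split in the type system. It is not the code in this PR: `MorselStream`, `BatchIter`, `InMemoryMorsel`, `RemoteMorsel`, and `is_cpu_only` are hypothetical names, `Batch` is a stand-in for a record batch, and the async variant is modeled as a plain `std` future rather than a real stream so the example needs no external crates.

```rust
use std::future::Future;
use std::pin::Pin;

/// Hypothetical stand-in for a record batch.
type Batch = Vec<i32>;
/// Hypothetical CPU-only source of batches.
type BatchIter = Box<dyn Iterator<Item = Batch> + Send>;

/// Hypothetical return type: the variant, not a doc comment, tells the
/// caller whether consuming the morsel may wait on IO.
pub enum MorselStream {
    /// All data is already in memory; iterating is pure CPU work.
    Sync(BatchIter),
    /// IO may be needed first; the future must be awaited before decoding.
    Async(Pin<Box<dyn Future<Output = BatchIter> + Send>>),
}

/// Hypothetical trait shape, assumed only for this sketch.
pub trait Morsel {
    fn into_stream(self: Box<Self>) -> MorselStream;
}

/// A morsel whose bytes were fetched up front.
struct InMemoryMorsel {
    batches: Vec<Batch>,
}

impl Morsel for InMemoryMorsel {
    fn into_stream(self: Box<Self>) -> MorselStream {
        MorselStream::Sync(Box::new(self.batches.into_iter()))
    }
}

/// A morsel that would still have to fetch its data.
struct RemoteMorsel {
    batches: Vec<Batch>,
}

impl Morsel for RemoteMorsel {
    fn into_stream(self: Box<Self>) -> MorselStream {
        let batches = self.batches;
        MorselStream::Async(Box::pin(async move {
            // A real implementation would await a storage read here.
            Box::new(batches.into_iter()) as BatchIter
        }))
    }
}

/// A scheduler can branch on the variant to decide where to run the work.
pub fn is_cpu_only(stream: &MorselStream) -> bool {
    matches!(stream, MorselStream::Sync(_))
}
```

With a shape like this, a caller could run `Sync` morsels directly on a CPU worker and hand `Async` morsels to an IO-aware executor, which is the kind of interleaving the description hopes to enable later.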

I don't expect this change to have much actual impact (yet), but I do expect it to set us up for better IO interleaving / work stealing.

Are these changes tested?

Yes, by CI and new tests (to be written).

Are there any user-facing changes?

The unreleased Morsel API is slightly different.

alamb force-pushed the alamb/non_block_morsels branch from 7291951 to ad17058 on May 21, 2026 10:47

Labels

datasource Changes to the datasource crate

Development

Successfully merging this pull request may close these issues.

Support Morsel output for Parquet known to be non blocking

1 participant