Avoid EndOfStreamException with empty pages by ishepherd · Pull Request #95 · aloneguid/parquet-dotnet

ishepherd · 2020-12-18T04:53:28Z

Fixes

Fixes #88

Description

Allows a data page to contain no RLE or bit-packed values. Occasionally I find that Spark creates these 'empty' pages: they have a valid header and accurate byte counts, but no values.

I'm not sure whether the spec allows these, but they are 'de facto' correct: Spark writes them, and Python/pandas can read them without problems.
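For illustration only, here is a minimal sketch of the kind of guard described above. It is not the diff from this PR: `ReadLevels` and `ReadVarInt` are hypothetical helpers, not parquet-dotnet APIs. It assumes the length-prefixed RLE/bit-packed hybrid layout used for levels in v1 data pages, handles only RLE runs, and needs a seekable stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class EmptyPageSketch
{
    // Hypothetical helper (not parquet-dotnet's API). Decodes a length-prefixed
    // RLE/bit-packed hybrid block; only RLE runs are handled to keep this short.
    // Assumes a seekable stream so that Position and Length are available.
    static List<int> ReadLevels(BinaryReader reader, int bitWidth, int maxValues)
    {
        var values = new List<int>();
        Stream stream = reader.BaseStream;

        // 'Empty' page: nothing left to read, so return zero values
        // rather than letting ReadInt32 throw EndOfStreamException.
        if (stream.Position >= stream.Length)
            return values;

        int length = reader.ReadInt32();          // 4-byte little-endian byte count
        long end = stream.Position + length;
        int valueBytes = (bitWidth + 7) / 8;      // RLE value width, rounded up to bytes

        while (stream.Position < end && values.Count < maxValues)
        {
            int header = ReadVarInt(reader);
            if ((header & 1) == 1)
                throw new NotSupportedException("Bit-packed runs are omitted from this sketch.");

            int runLength = header >> 1;
            int value = 0;
            for (int i = 0; i < valueBytes; i++)
                value |= reader.ReadByte() << (8 * i);
            for (int i = 0; i < runLength && values.Count < maxValues; i++)
                values.Add(value);
        }
        return values;
    }

    // Unsigned LEB128 varint, as used for run headers.
    static int ReadVarInt(BinaryReader reader)
    {
        int result = 0, shift = 0;
        while (true)
        {
            byte b = reader.ReadByte();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    static void Main()
    {
        // No bytes at all: the guard returns an empty list.
        using var empty = new BinaryReader(new MemoryStream(Array.Empty<byte>()));
        Console.WriteLine(ReadLevels(empty, 1, 10).Count);               // 0

        // length = 2, header 0x06 = RLE run of 3, value = 1
        byte[] page = { 0x02, 0x00, 0x00, 0x00, 0x06, 0x01 };
        using var reader = new BinaryReader(new MemoryStream(page));
        Console.WriteLine(string.Join(",", ReadLevels(reader, 1, 10)));  // 1,1,1
    }
}
```

The check sits before the first read, so a page with a valid header but no payload yields zero values and decoding of the next page can continue.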

There is no test in this PR. This Gist contains the test I use locally; unfortunately, I cannot share the file that goes with it.
⚠️ Help needed. @peteriehl can you help to provide a repro file?

I have included unit tests validating this fix.
I have updated markdown documentation where required. (not applicable)
I understand that successful approval of my pull request requires reproducible tests as per Contribution Guideline.

ishepherd · 2021-03-02T01:21:08Z

@aloneguid My first OSS contribution ✨😄
Thanks for all your work.

Iain Shepherd and others added 2 commits December 18, 2020 14:32

Avoid EndOfStreamException

d223222

Merge branch 'master' into endofstreamexception

6480a79

aloneguid added this to the 3.8.6 milestone Feb 28, 2021

aloneguid approved these changes Feb 28, 2021


aloneguid merged commit 88f37d2 into aloneguid:master Feb 28, 2021

ishepherd deleted the endofstreamexception branch March 2, 2021 01:20
