Change default behavior for Parquet PageEncodingStats #8859

@etseidl

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
#8797 adds an option to condense the page encoding statistics in the Parquet ColumnMetaData from a Vec<PageEncodingStats> to a bitmask. This reduces the number of allocations performed while decoding the Parquet metadata and thus speeds up metadata parsing. Currently, the default behavior is to parse the full vector of encoding stats, but given how rarely this information is used, we should instead default to the more concise and performant bitmask.
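
To illustrate the idea, here is a minimal sketch of condensing a vector of per-page encoding stats into a single bitmask. The types below are hypothetical stand-ins written for this example, not the parquet crate's actual definitions. Counting only data pages is also an assumption of this sketch rather than a description of what #8797 does.

```rust
// Hypothetical stand-ins for illustration; not the parquet crate's real types.
// Discriminants follow the Encoding values in the Parquet Thrift definition.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Encoding {
    Plain = 0,
    PlainDictionary = 2,
    Rle = 3,
    RleDictionary = 8,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PageType {
    DataPage,
    DictionaryPage,
    DataPageV2,
}

// The "full" form: one heap-allocated Vec of these per column chunk.
#[derive(Clone, Debug)]
struct PageEncodingStats {
    page_type: PageType,
    encoding: Encoding,
    count: i32,
}

// The condensed form: one bit per encoding, no allocation.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct EncodingMask(u32);

impl EncodingMask {
    fn insert(&mut self, e: Encoding) {
        self.0 |= 1u32 << (e as u32);
    }

    fn contains(self, e: Encoding) -> bool {
        self.0 & (1u32 << (e as u32)) != 0
    }
}

// Assumption of this sketch: only data pages with a non-zero count
// contribute to the mask; dictionary pages are ignored.
fn condense(stats: &[PageEncodingStats]) -> EncodingMask {
    let mut mask = EncodingMask::default();
    for s in stats {
        let is_data_page = matches!(s.page_type, PageType::DataPage | PageType::DataPageV2);
        if is_data_page && s.count > 0 {
            mask.insert(s.encoding);
        }
    }
    mask
}
```

The mask keeps the question most readers actually ask ("which encodings do the data pages use?") while dropping the per-page-type counts, which is the information that would be lost under the new default.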

Describe the solution you'd like
Change the default behavior, but leave an option to get the full stats if required.
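
One possible shape for this, sketched with hypothetical names (`MetadataOptions`, `EncodingStatsMode`, and `with_full_encoding_stats` are invented for illustration and are not the crate's API): the bitmask becomes the default, and callers who need the full vector opt in explicitly.

```rust
// Hypothetical option type; names and shape are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
enum EncodingStatsMode {
    // Proposed default: keep only a bitmask of encodings.
    #[default]
    Mask,
    // Opt-in: decode the full Vec<PageEncodingStats>.
    Full,
}

#[derive(Clone, Copy, Debug, Default)]
struct MetadataOptions {
    encoding_stats: EncodingStatsMode,
}

impl MetadataOptions {
    // Builder-style opt-in for callers that need the full stats.
    fn with_full_encoding_stats(mut self) -> Self {
        self.encoding_stats = EncodingStatsMode::Full;
        self
    }
}
```

With this shape, `MetadataOptions::default()` selects the mask, and `MetadataOptions::default().with_full_encoding_stats()` restores the current behavior.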

Describe alternatives you've considered
No change to the defaults.

Additional context
This change should only be made in a major release as it is a significant behavior change.

Labels

enhancement: Any new improvement worthy of an entry in the changelog
good first issue: Good for newcomers
