Skip to content

fix(parquet): Handle uint32 to uint64 array widening in reverseTransformArray#721

Merged
kodiakhq[bot] merged 1 commit into
mainfrom
fix/support-uint32array-widening
Mar 23, 2026
Merged

fix(parquet): Handle uint32 to uint64 array widening in reverseTransformArray#721
kodiakhq[bot] merged 1 commit into
mainfrom
fix/support-uint32array-widening

Conversation

@disq
Copy link
Copy Markdown
Member

@disq disq commented Mar 23, 2026

No description provided.

…ormArray

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@disq disq requested a review from a team as a code owner March 23, 2026 10:28
@disq disq requested review from Copilot, erezrokah and marianogappa and removed request for a team March 23, 2026 10:28
@erezrokah erezrokah added the automerge Add to automerge PRs once requirements are met label Mar 23, 2026
@kodiakhq kodiakhq Bot merged commit ee835b6 into main Mar 23, 2026
11 checks passed
@kodiakhq kodiakhq Bot deleted the fix/support-uint32array-widening branch March 23, 2026 10:31
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Parquet reader’s reverse-transform logic to support widening a uint32 Arrow array into a uint64 array when the expected schema type is uint64, preventing type-mismatch failures when reading Parquet data.

Changes:

  • Add *array.Uint32 handling to reverseTransformArray.
  • Introduce reverseTransformFromUint32 to convert uint32 arrays into uint64 arrays (with null preservation).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread parquet/read.go
Comment on lines +106 to +107
switch dt {
case arrow.PrimitiveTypes.Uint64:
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverseTransformFromUint32 matches dt using direct interface equality (case arrow.PrimitiveTypes.Uint64). This will fail if the incoming schema uses a different *arrow.Uint64Type instance (pointer-inequality), causing an unexpected panic even though the logical type is uint64. Prefer a type switch on dt.(type) (like reverseTransformFromDate32) or use arrow.TypeEqual(dt, arrow.PrimitiveTypes.Uint64) for the match.

Suggested change
switch dt {
case arrow.PrimitiveTypes.Uint64:
switch dt.(type) {
case *arrow.Uint64Type:

Copilot uses AI. Check for mistakes.
Comment thread parquet/read.go
Comment on lines +105 to +116
func reverseTransformFromUint32(dt arrow.DataType, arr *array.Uint32) arrow.Array {
switch dt {
case arrow.PrimitiveTypes.Uint64:
builder := array.NewUint64Builder(memory.DefaultAllocator)
for i := 0; i < arr.Len(); i++ {
if arr.IsNull(i) {
builder.AppendNull()
continue
}
builder.Append(uint64(arr.Value(i)))
}
return builder.NewArray()
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new uint32→uint64 widening path is not exercised by existing parquet read/write tests. Add a targeted test that writes a schema with a uint64 column but produces/reads a uint32 physical array (or directly unit-tests reverseTransformArray with dt=uint64 and arr=*array.Uint32, including nulls and sliced arrays) to prevent regressions.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge Add to automerge PRs once requirements are met

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants