fix(parquet): Handle uint32 to uint64 array widening in reverseTransformArray #721
…ormArray Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
This PR updates the Parquet reader’s reverse-transform logic to support widening a uint32 Arrow array into a uint64 array when the expected schema type is uint64, preventing type-mismatch failures when reading Parquet data.
Changes:
- Add *array.Uint32 handling to reverseTransformArray.
- Introduce reverseTransformFromUint32 to convert uint32 arrays into uint64 arrays (with null preservation).
	switch dt {
	case arrow.PrimitiveTypes.Uint64:
reverseTransformFromUint32 matches dt using direct interface equality (case arrow.PrimitiveTypes.Uint64). This will fail if the incoming schema uses a different *arrow.Uint64Type instance (pointer-inequality), causing an unexpected panic even though the logical type is uint64. Prefer a type switch on dt.(type) (like reverseTransformFromDate32) or use arrow.TypeEqual(dt, arrow.PrimitiveTypes.Uint64) for the match.
-	switch dt {
-	case arrow.PrimitiveTypes.Uint64:
+	switch dt.(type) {
+	case *arrow.Uint64Type:
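The difference between the two switch forms can be sketched with the standard library alone. In the sketch below, DataType, uint64Type, and canonical are hypothetical stand-ins for arrow.DataType, *arrow.Uint64Type, and arrow.PrimitiveTypes.Uint64; they are not the Arrow definitions. The stand-in struct carries a field so that separately allocated instances have distinct addresses, since Go leaves pointer equality unspecified for zero-size values.

```go
package main

import "fmt"

// DataType and uint64Type are hypothetical stand-ins for arrow.DataType and
// *arrow.Uint64Type. They exist only to illustrate the comparison semantics.
type DataType interface{ Name() string }

type uint64Type struct{ name string }

func (t *uint64Type) Name() string { return t.name }

// canonical plays the role of arrow.PrimitiveTypes.Uint64 (an assumption
// made for this sketch).
var canonical DataType = &uint64Type{name: "uint64"}

// matchByValue mirrors `switch dt { case canonical: }`. It compares the
// interface values, which for pointer types means comparing addresses.
func matchByValue(dt DataType) bool {
	switch dt {
	case canonical:
		return true
	default:
		return false
	}
}

// matchByType mirrors `switch dt.(type) { case *uint64Type: }`. It looks
// only at the dynamic type, so any instance of the type matches.
func matchByType(dt DataType) bool {
	switch dt.(type) {
	case *uint64Type:
		return true
	default:
		return false
	}
}

func main() {
	// Same logical type as canonical, but a different instance.
	other := &uint64Type{name: "uint64"}

	fmt.Println(matchByValue(canonical), matchByValue(other)) // true false
	fmt.Println(matchByType(canonical), matchByType(other))   // true true
}
```

The value switch only matches the one shared instance, while the type switch matches every value of that dynamic type, which is the behavior the suggestion relies on.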
func reverseTransformFromUint32(dt arrow.DataType, arr *array.Uint32) arrow.Array {
	switch dt {
	case arrow.PrimitiveTypes.Uint64:
		builder := array.NewUint64Builder(memory.DefaultAllocator)
		for i := 0; i < arr.Len(); i++ {
			if arr.IsNull(i) {
				builder.AppendNull()
				continue
			}
			builder.Append(uint64(arr.Value(i)))
		}
		return builder.NewArray()
This new uint32→uint64 widening path is not exercised by existing parquet read/write tests. Add a targeted test that writes a schema with a uint64 column but produces/reads a uint32 physical array (or directly unit-tests reverseTransformArray with dt=uint64 and arr=*array.Uint32, including nulls and sliced arrays) to prevent regressions.
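The cases such a test would need to cover (nulls, the maximum uint32 value, and a sliced input) can be sketched without the Arrow dependency. Here uint32Array and widen are hypothetical stand-ins for *array.Uint32 and the widening loop in reverseTransformFromUint32; a real test would build the input with the Arrow builders and call reverseTransformArray instead.

```go
package main

import "fmt"

// uint32Array is a hypothetical stand-in for *array.Uint32: values plus a
// validity mask, where valid[i] == false marks a null slot.
type uint32Array struct {
	values []uint32
	valid  []bool
}

func (a uint32Array) Len() int           { return len(a.values) }
func (a uint32Array) IsNull(i int) bool  { return !a.valid[i] }
func (a uint32Array) Value(i int) uint32 { return a.values[i] }

// Slice returns a view of elements [i, j), similar to slicing an Arrow array.
func (a uint32Array) Slice(i, j int) uint32Array {
	return uint32Array{values: a.values[i:j], valid: a.valid[i:j]}
}

// widen converts every non-null slot to uint64 and carries the null mask over.
func widen(a uint32Array) (values []uint64, valid []bool) {
	values = make([]uint64, a.Len())
	valid = make([]bool, a.Len())
	for i := 0; i < a.Len(); i++ {
		if a.IsNull(i) {
			continue // keep the zero value and leave valid[i] == false
		}
		values[i] = uint64(a.Value(i))
		valid[i] = true
	}
	return values, valid
}

func main() {
	arr := uint32Array{
		values: []uint32{1, 0, 4294967295, 7}, // includes the maximum uint32
		valid:  []bool{true, false, true, true},
	}

	values, valid := widen(arr)
	fmt.Println(values, valid)

	// A sliced input must be widened relative to the slice, not the parent.
	values, valid = widen(arr.Slice(1, 3))
	fmt.Println(values, valid)
}
```

The sliced case is worth keeping in a real test because an implementation that indexes the underlying buffer directly, rather than through Len, IsNull, and Value, would read the wrong elements.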