[C++][Parquet] Null fixed-length lists cannot be read from parquet file #35692

Description

@spenczar

Describe the bug, including details regarding any error messages, version, and platform.

At least from pyarrow (and possibly in C++ more generally), null fixed-length lists cannot be read back into a table from a Parquet file, even though they can be written without error.

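import pyarrow as pa
import pyarrow.parquet as pq

# A single null entry whose type is a fixed-length list of two int32 values
array = pa.nulls(1, pa.list_(pa.int32(), 2))
table = pa.table([array], names=["values"])
pq.write_table(table, "list_table.parquet")  # writing succeeds
pq.read_table("list_table.parquet")  # reading raises ArrowInvalid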

gives

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/swnelson/code/personal/quivr/.quivr-venv/lib/python3.11/site-packages/pyarrow/parquet/core.py", line 2986, in read_table
    return dataset.read(columns=columns, use_threads=use_threads,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/swnelson/code/personal/quivr/.quivr-venv/lib/python3.11/site-packages/pyarrow/parquet/core.py", line 2614, in read
    table = self._dataset.to_table(
            ^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/_dataset.pyx", line 546, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 3449, in pyarrow._dataset.Scanner.to_table
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Expected all lists to be of size=2 but index 1 had size=0

This error does not occur for variable-length lists.

This is with pyarrow version 12.0.0.

Component(s)

Parquet, C++
