to_pandas(date_as_object=False) returns different dtype in pyarrow 13.0.0 #37545

Description

@0x26res

Describe the bug, including details regarding any error messages, version, and platform.

With pyarrow==12.0.0:

import pyarrow as pa
import datetime
import numpy as np

table = pa.table({"date": pa.array([datetime.date(2023, 1, 1)], type=pa.date32())})
assert table.to_pandas(date_as_object=False)["date"].dtype == np.dtype("<M8[ns]")

With pyarrow==13.0.0 I get np.dtype("<M8[ms]") instead.

In my opinion, since the default timestamp precision in pandas is "ns", the result should stay "ns" when converting dates (though obviously you lose range).
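To make the range trade-off concrete: a nanosecond timestamp stored as a signed 64-bit offset from 1970-01-01 only reaches roughly 292 years in either direction. The sketch below uses only the standard library to estimate those bounds; the variable names are illustrative and not part of pyarrow or pandas.

```python
import datetime

# Largest signed 64-bit integer, interpreted as nanoseconds since the epoch.
INT64_MAX = 2**63 - 1
epoch = datetime.datetime(1970, 1, 1)

# datetime only has microsecond resolution, so truncate ns to whole µs.
span = datetime.timedelta(microseconds=INT64_MAX // 1000)
ns_min = epoch - span
ns_max = epoch + span

print(ns_min.date(), ns_max.date())

# A date that date32 can hold but that lies outside the "ns" window.
far_future = datetime.date(2300, 1, 1)
print(far_future > ns_max.date())
```

Dates outside that window are the ones that cannot be represented once the column is forced to "ns".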

Alternatively, is there a way to use the types_mapper argument to specify the precision? I can't get it to work:

>>> assert table.to_pandas(types_mapper={pa.date32() : np.dtype("<M8[ns]")}.get)["date"].dtype
ValueError: This column does not support to be converted to a pandas ExtensionArray

Lastly, I can't figure out from the 13.0.0 release notes which change caused this regression.

Component(s)

Python
