Describe the bug, including details regarding any error messages, version, and platform.
Timestamps with second resolution are upcast to millisecond resolution after being written to Parquet and read back. They should either round-trip unchanged, or writing them should raise a warning or an error.
from datetime import datetime
import pyarrow as pa
from pyarrow import parquet
dates = [
    datetime(2021, 1, 1, 0, 0, 3),
    datetime(2021, 1, 1, 0, 0, 4),
    datetime(2021, 1, 1, 0, 0, 5),
]
table = pa.table({"time": pa.array(dates, type=pa.timestamp("s"))})
print(table.schema) # timestamp[s]
parquet.write_table(table, "timestamp_roundtrip.parquet")
table2 = parquet.read_table("timestamp_roundtrip.parquet")
print(table2.schema) # timestamp[ms]
Tested with pyarrow 16.0.0
Component(s)
Parquet