Internally, the "legacy" pyarrow.filesystem filesystems are still used in one place: the pyarrow.parquet module.
They are used in:
- ParquetWriter
- ParquetManifest/ParquetDataset
- write_to_dataset
For ParquetWriter, we need to update this to work with the new filesystems (since ParquetWriter is not dataset related, and thus won't be deprecated).
For ParquetManifest/ParquetDataset, an update might not be needed, since those might themselves get deprecated (to be discussed -> ARROW-9720), and when the use_legacy_dataset=False option is used, they already use the new datasets.
For write_to_dataset, this might depend on how the writing capabilities of the dataset project evolve.
Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Joris Van den Bossche / @jorisvandenbossche
Note: This issue was originally created as ARROW-9718. Please see the migration documentation for further details.