doc(parquet): remove content defined chunking example from docstrings
kszucs committed Feb 23, 2026
commit f6a71faedf3f110d0d7020672864752cd2386947
27 changes: 0 additions & 27 deletions parquet/src/arrow/mod.rs
@@ -88,33 +88,6 @@
//! writer.close().unwrap();
//! ```
//!
//! ## EXPERIMENTAL: Content-Defined Chunking
//!
//! Enable content-defined chunking (CDC) via [`WriterProperties`] to improve
//! deduplication efficiency in content-addressable storage (CAS) systems such as
//! Hugging Face Hub. CDC creates data page boundaries based on content rather than
//! fixed sizes, so unchanged data across file versions produces identical byte
//! sequences that CAS backends can deduplicate at the page level.
//!
//! ```no_run
//! # use parquet::arrow::arrow_writer::ArrowWriter;
//! # use parquet::file::properties::WriterProperties;
//! # use std::fs::File;
//! # use arrow_array::RecordBatch;
//! # fn write(batch: &RecordBatch) {
//! let file = File::create("data.parquet").unwrap();
//! let props = WriterProperties::builder()
//!     .set_content_defined_chunking(true)
//!     .build();
//! let mut writer = ArrowWriter::try_new(file, batch.schema(), Some(props)).unwrap();
//! writer.write(batch).unwrap();
//! writer.close().unwrap();
//! # }
//! ```
//!
//! See [`CdcOptions`](crate::file::properties::CdcOptions) for chunk size and
//! normalization level configuration.
//!
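The page-level deduplication described in the removed docs hinges on how CDC picks boundaries. The standalone sketch below is not the parquet crate's implementation — the gear table, mask, and size cap are illustrative choices — but it shows the general technique: a gear-style rolling hash declares a chunk boundary wherever the hash matches a bit mask, so an insertion near the front of the data only disturbs nearby chunks while later boundaries re-synchronize.

```rust
// Illustrative content-defined chunking sketch (NOT the parquet crate's
// algorithm). Boundaries depend on recent bytes, not absolute position.

/// Deterministic pseudo-random per-byte "gear" table (xorshift-generated).
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    for entry in table.iter_mut() {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        *entry = state;
    }
    table
}

/// Split `data` into chunks: a boundary is declared whenever the rolling
/// hash matches `mask`, or when `max_size` is reached (bounding chunk size).
fn chunk(data: &[u8], mask: u64, max_size: usize) -> Vec<&[u8]> {
    let gear = gear_table();
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut hash: u64 = 0;
    for (i, &b) in data.iter().enumerate() {
        // Shift-and-add rolling hash: only recent bytes affect the low bits.
        hash = (hash << 1).wrapping_add(gear[b as usize]);
        if (hash & mask) == 0 || i + 1 - start >= max_size {
            chunks.push(&data[start..=i]);
            start = i + 1;
            hash = 0;
        }
    }
    if start < data.len() {
        chunks.push(&data[start..]);
    }
    chunks
}

fn main() {
    // Two "versions" of a payload: v2 inserts 10 bytes near the front.
    let v1: Vec<u8> = (0..2000u32)
        .map(|i| (i.wrapping_mul(2_654_435_761) >> 7) as u8)
        .collect();
    let mut v2 = vec![7u8; 10];
    v2.extend_from_slice(&v1);

    let c1 = chunk(&v1, 0x3F, 256); // ~64-byte average chunks
    let c2 = chunk(&v2, 0x3F, 256);

    // Chunking is deterministic and lossless.
    assert_eq!(c1.concat(), v1);

    // Count v2 chunks that already exist byte-identically in v1 — the
    // property a CAS backend exploits; boundaries should re-synchronize
    // shortly after the insertion point.
    let set1: std::collections::HashSet<&[u8]> = c1.iter().copied().collect();
    let shared = c2.iter().filter(|c| set1.contains(**c)).count();
    println!("{} of {} v2 chunks already stored", shared, c2.len());
}
```

A fixed-size chunker would shift every boundary after the insertion, invalidating all subsequent chunks; the content-defined boundaries above are why unchanged regions keep producing identical byte sequences across file versions.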
//! # Example: Reading Parquet file into Arrow `RecordBatch`
//!
//! ```rust