Skip to content

[SYSTEMDS-3949] Add native Delta Lake matrix read/write via Delta Kernel#2511

Open
Baunsgaard wants to merge 2 commits into
apache:mainfrom
Baunsgaard:delta-matrix-io
Open

[SYSTEMDS-3949] Add native Delta Lake matrix read/write via Delta Kernel#2511
Baunsgaard wants to merge 2 commits into
apache:mainfrom
Baunsgaard:delta-matrix-io

Conversation

@Baunsgaard

Copy link
Copy Markdown
Contributor

Introduce a DELTA file format that reads and writes Delta Lake tables natively through the Spark-free Delta Kernel library, for matrices on the single-node CP path. DML read/write with format="delta" now operates directly on Delta tables without a Spark DataFrame round-trip.

  • Add FileFormat.DELTA and exclude it from the text formats
  • Accept format="delta" with unknown dimensions in the parser (like CSV) and set blocksize -1 for the columnar format
  • Wire DELTA into the matrix reader and writer factories
  • Add DeltaKernelUtils plus ReaderDelta and WriterDelta with column-at-a-time, boxing-free data transfer
  • Refresh cached matrix metadata after a Delta read (discovered dimensions)
  • Pin parquet 1.13.1 and jackson core/annotations to 2.15.2 to align with Delta Kernel's transitive requirements

Append/overwrite table semantics, distributed execution, frames, and time travel are out of scope.

@codecov

codecov Bot commented Jun 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 83.20000% with 63 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.60%. Comparing base (384a8dc) to head (f2c304d).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
.../org/apache/sysds/runtime/io/DeltaKernelUtils.java 79.54% 17 Missing and 10 partials ⚠️
.../java/org/apache/sysds/runtime/io/ReaderDelta.java 77.77% 11 Missing and 5 partials ⚠️
.../java/org/apache/sysds/runtime/io/WriterDelta.java 85.07% 5 Missing and 5 partials ⚠️
...g/apache/sysds/runtime/io/ReaderDeltaParallel.java 90.47% 3 Missing and 5 partials ⚠️
...n/java/org/apache/sysds/parser/DataExpression.java 83.33% 0 Missing and 1 partial ⚠️
...g/apache/sysds/runtime/io/MatrixReaderFactory.java 75.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2511      +/-   ##
============================================
+ Coverage     71.56%   71.60%   +0.04%     
- Complexity    49110    49228     +118     
============================================
  Files          1575     1579       +4     
  Lines        189793   190162     +369     
  Branches      37235    37297      +62     
============================================
+ Hits         135816   136157     +341     
- Misses        43480    43503      +23     
- Partials      10497    10502       +5     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Baunsgaard

Copy link
Copy Markdown
Contributor Author

Native Delta read performance vs binary reader

Benchmark (org.apache.sysds.performance DeltaIO), 32 vCPU, JDK 17, -Xmx16g,
20 timed repetitions after warmup, isolated (no competing load).

Workload: 5,000,000 × 16 dense FP64 matrix (~640 MB in memory, sparsity 1.0).

Read path Time (ms) Disk throughput
Delta (serial) 5411 ± 79 79 MB/s
Delta (parallel) 875 ± 19 490 MB/s
Binary (parallel) ~100–170 3.7–6.0 GB/s

Takeaways

  • The parallel reader decodes data files concurrently and is ~6.2× faster than the
    serial Delta reader
    (5411 ms → 875 ms).
  • The binary reader is still ~5–9× faster than parallel Delta. This is expected:
    binary is a near-raw double[] dump, while Delta Kernel 3.3.0 decodes parquet
    per-value (row-by-row converters, not vectorized). File-level parallelism is the
    main lever available, which is what this reader exploits.
  • On disk Delta is smaller: 450 MB vs 645 MB for binary (0.70×), thanks to parquet
    encoding — so the read-time cost buys a ~30% smaller footprint.

For reference, the write path is encode-bound: Delta write ~7.0 s vs binary ~0.15 s
(parquet encoding dominates); not the focus of this PR but noted for completeness.

Introduce a DELTA file format that reads and writes Delta Lake tables
natively through the Spark-free Delta Kernel library, for matrices on the
single-node CP path. DML read/write with format="delta" now operates
directly on Delta tables without a Spark DataFrame round-trip.

- Add FileFormat.DELTA and exclude it from the text formats
- Accept format="delta" with unknown dimensions in the parser and set
  blocksize -1 for the columnar format
- Wire DELTA into the matrix reader and writer factories
- Add DeltaKernelUtils plus serial and parallel native Delta readers and
  WriterDelta with column-at-a-time, boxing-free data transfer
- Expose Delta reader batch size and writer target file size via DMLConfig
- Refresh cached matrix metadata after a Delta read (discovered dimensions)
- Add a parquet.version property and pin delta-kernel 3.3.2
- Run Delta component IO tests in CI and broaden matrix coverage

Append/overwrite table semantics, distributed execution, frames, and time
travel are out of scope.
- Bound each per-file decode task in the direct parallel read path to its
  numRecords-derived row slice, so a Delta file that decodes more rows than
  its statistic claims fails with a clear error instead of overflowing into
  the next file's region (concurrent overlapping writes) or off the array.
- Use the parallel recomputeNonZeros(_numThreads) in the buffered read path
  to match the direct path; the buffered fallback handles the largest matrices.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

1 participant