Add parameterized forward index compression codecs #18188

Open
xiangfu0 wants to merge 1 commit into apache:master from xiangfu0:codex/compression-codec-spec-pr1

@xiangfu0
Contributor

Summary

  • add CompressionCodecSpec parsing and validation so forward-index compression stays externally string-based while supporting optional integer parameters
  • thread parameterized codec specs through forward-index creation, metadata persistence, load/rewrite paths, and compressor factory resolution
  • keep existing plain codec configs and deprecated compatibility paths working, and add focused tests for serde, config resolution, factory wiring, and compression round-trips

User manual

Raw forward indexes can now set fieldConfigList[].compressionCodec to either a plain codec name or a codec name followed by an integer level in parentheses.

Examples:

"compressionCodec": "ZSTD"
"compressionCodec": "ZSTD(3)"
"compressionCodec": "GZIP(6)"

Existing values such as SNAPPY, LZ4, PASS_THROUGH, and legacy chunkCompressionType / dictIdCompressionType configs continue to work.
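As a rough illustration of the string format above, the sketch below splits a value such as ZSTD(3) into a codec name and an optional integer level. It is not the CompressionCodecSpec class from this PR: the class name CodecSpecSketch, the regular expression, and the decision to reject anything that is not NAME or NAME(int) are assumptions made for the example. Per-codec checks, such as the valid level range, are left out.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Pinot.
final class CodecSpecSketch {
  // NAME or NAME(int): an upper-case codec name, optionally followed by
  // a parenthesized integer of at most nine digits (assumed grammar).
  private static final Pattern SPEC =
      Pattern.compile("^([A-Z0-9_]+)(?:\\((-?\\d{1,9})\\))?$");

  private final String codec;
  private final OptionalInt level;

  private CodecSpecSketch(String codec, OptionalInt level) {
    this.codec = codec;
    this.level = level;
  }

  String codec() {
    return codec;
  }

  OptionalInt level() {
    return level;
  }

  // Parses the config string, throwing IllegalArgumentException on bad input.
  static CodecSpecSketch parse(String text) {
    if (text == null) {
      throw new IllegalArgumentException("compressionCodec must not be null");
    }
    Matcher m = SPEC.matcher(text.trim());
    if (!m.matches()) {
      throw new IllegalArgumentException("Invalid compressionCodec: " + text);
    }
    OptionalInt level = m.group(2) == null
        ? OptionalInt.empty()
        : OptionalInt.of(Integer.parseInt(m.group(2)));
    return new CodecSpecSketch(m.group(1), level);
  }
}
```

With this sketch, CodecSpecSketch.parse("ZSTD") yields a spec with no level, while CodecSpecSketch.parse("GZIP(6)") yields codec GZIP with level 6. Keeping the level optional is what lets existing plain codec names pass through unchanged.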

Sample table config

{
  "fieldConfigList": [
    {
      "name": "column1",
      "encodingType": "RAW",
      "compressionCodec": "ZSTD(3)"
    }
  ]
}
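To give a feel for what an integer level in a value like GZIP(6) controls, here is a small round-trip sketch using the JDK's java.util.zip.Deflater and Inflater (the DEFLATE algorithm that GZIP is built on). It does not use Pinot's compressor classes or ChunkCompressorFactory; the class name LevelRoundTripSketch and its methods are hypothetical. The point is only that the level changes how the bytes are compressed, while decompression needs no level and restores the original data.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of Pinot.
final class LevelRoundTripSketch {
  // Compresses input with the given DEFLATE level (0 to 9 in the JDK).
  static byte[] compress(byte[] input, int level) {
    Deflater deflater = new Deflater(level);
    try {
      deflater.setInput(input);
      deflater.finish();
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[4096];
      while (!deflater.finished()) {
        int n = deflater.deflate(buffer);
        out.write(buffer, 0, n);
      }
      return out.toByteArray();
    } finally {
      deflater.end(); // release native resources
    }
  }

  // Decompresses data produced by compress(); no level is needed here.
  static byte[] decompress(byte[] compressed) {
    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed);
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[4096];
      while (!inflater.finished()) {
        int n = inflater.inflate(buffer);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new DataFormatException("truncated or unsupported input");
        }
        out.write(buffer, 0, n);
      }
      return out.toByteArray();
    } catch (DataFormatException e) {
      throw new IllegalStateException("corrupt compressed data", e);
    } finally {
      inflater.end();
    }
  }
}
```

Because the decompressor does not depend on the level, data written at one level can still be read after the configured level changes.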

Testing

  • ./mvnw spotless:apply -q
  • ./mvnw license:format -q
  • ./mvnw -pl pinot-segment-local,pinot-segment-spi,pinot-spi,pinot-tools -am -DskipTests compile -q
  • ./mvnw -pl pinot-segment-local,pinot-segment-spi,pinot-spi,pinot-common -am -Dtest=ForwardIndexConfigTest,CompressionCodecSpecTest,CompressionCodecSpecValidatorTest,TableConfigSerDeUtilsTest,ForwardIndexCreatorFactoryTest,BaseSegmentCreatorTest,ForwardIndexTypeTest,IndexLoadingConfigTest,ForwardIndexHandlerTest,TestCompression -Dsurefire.failIfNoSpecifiedTests=false test -q
  • ./mvnw checkstyle:check -q
  • ./mvnw license:check -q

xiangfu0 force-pushed the codex/compression-codec-spec-pr1 branch from fd5672b to 1a5b303 on April 13, 2026 23:26
@codecov-commenter

codecov-commenter commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 76.57143% with 82 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.40%. Comparing base (31eac83) to head (1a5b303).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...he/pinot/segment/spi/index/ForwardIndexConfig.java | 58.33% | 13 Missing and 7 partials ⚠️ |
| ...e/pinot/spi/config/table/CompressionCodecSpec.java | 68.75% | 15 Missing and 5 partials ⚠️ |
| ...spi/config/table/CompressionCodecCapabilities.java | 60.46% | 13 Missing and 4 partials ⚠️ |
| ...t/spi/config/table/CompressionCodecSpecParser.java | 81.25% | 3 Missing and 3 partials ⚠️ |
| ...t/local/io/compression/ChunkCompressorFactory.java | 76.47% | 2 Missing and 2 partials ⚠️ |
| ...ocal/segment/index/loader/ForwardIndexHandler.java | 92.68% | 1 Missing and 2 partials ⚠️ |
| .../pinot/segment/spi/utils/SegmentMetadataUtils.java | 0.00% | 3 Missing ⚠️ |
| ...ot/segment/local/io/compression/LZ4Compressor.java | 66.66% | 2 Missing ⚠️ |
| ...ment/index/forward/ForwardIndexCreatorFactory.java | 83.33% | 2 Missing ⚠️ |
| ...org/apache/pinot/spi/config/table/FieldConfig.java | 88.23% | 1 Missing and 1 partial ⚠️ |

... and 3 more

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18188      +/-   ##
============================================
+ Coverage     63.29%   63.40%   +0.10%     
- Complexity     1627     1646      +19     
============================================
  Files          3226     3230       +4     
  Lines        196636   196904     +268     
  Branches      30401    30447      +46     
============================================
+ Hits         124466   124844     +378     
+ Misses        62192    62042     -150     
- Partials       9978    10018      +40     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-11 | 63.35% <76.57%> (+0.08%) ⬆️ |
| java-21 | 63.31% <76.57%> (+0.05%) ⬆️ |
| temurin | 63.40% <76.57%> (+0.10%) ⬆️ |
| unittests | 63.40% <76.57%> (+0.10%) ⬆️ |
| unittests1 | 55.30% <63.71%> (+0.04%) ⬆️ |
| unittests2 | 35.06% <62.85%> (+0.10%) ⬆️ |

Flags with carried forward coverage won't be shown.
