
Fix parallel segment reload race on IndexLoadingConfig tier; add IndexLoadingConfig.copy() to avoid per-segment ZK fetches#18174

Open
rsrkpatwari1234 wants to merge 11 commits into apache:master from rsrkpatwari1234:rsrkpatwari1234-issue-18164

Conversation


@rsrkpatwari1234 rsrkpatwari1234 commented Apr 12, 2026

Problem

When multiple segments were reloaded in parallel (reloadAllSegments / batched reloadSegments), all tasks shared a single IndexLoadingConfig from one fetchIndexLoadingConfig() call. Each reload path calls setSegmentTier(...) (and related updates) on that shared instance, so concurrent tasks could overwrite each other’s tier. With tier overrides in table config, that could apply the wrong preprocessing / loading settings (#18164).

Fix

  • BaseTableDataManager.reloadSegments: Renamed to reloadSegmentDataManagersInParallel. It now calls fetchIndexLoadingConfig() once per batch and passes indexLoadingConfigTemplate.copy() into reloadSegment for each parallel task, so every segment gets its own config for tier and other per-segment mutations.
  • IndexLoadingConfig.copy(): New method that builds a new IndexLoadingConfig sharing the same instance / table / schema references and tableDataDir, without copying segmentTier (each copy starts clean, like a fresh fetch). This preserves correctness while avoiding repeated ZK reads: one fetch plus N lightweight copies instead of N fetches.
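The race and the per-task-copy fix can be sketched with a simplified stand-in class (LoadingConfig and its fields below are illustrative, not Pinot's actual IndexLoadingConfig API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical, simplified stand-in for IndexLoadingConfig: only the
// mutable per-segment tier matters for this illustration.
public class CopyPerTaskSketch {
    static class LoadingConfig {
        private final String tableDataDir;   // shared, immutable reference
        private String segmentTier;          // mutated per segment

        LoadingConfig(String tableDataDir) { this.tableDataDir = tableDataDir; }
        void setSegmentTier(String tier) { segmentTier = tier; }
        String getSegmentTier() { return segmentTier; }

        // Each copy shares the immutable references but starts with a
        // clean tier, like a fresh fetch.
        LoadingConfig copy() { return new LoadingConfig(tableDataDir); }
    }

    public static void main(String[] args) throws Exception {
        LoadingConfig template = new LoadingConfig("/data/myTable_OFFLINE");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        String[] tiers = {"hot", "warm", "cold", "hot"};
        List<Future<String>> observed = new ArrayList<>();
        for (String tier : tiers) {
            observed.add(pool.submit(() -> {
                // Per-task copy: no task can overwrite another's tier,
                // unlike mutating the shared template directly.
                LoadingConfig perSegment = template.copy();
                perSegment.setSegmentTier(tier);
                return perSegment.getSegmentTier();
            }));
        }
        for (int i = 0; i < tiers.length; i++) {
            System.out.println(tiers[i] + " -> " + observed.get(i).get());
        }
        pool.shutdown();
    }
}
```

Had every task called setSegmentTier on the single shared template instead, a task could observe a tier written by a concurrent task, which is the race described above.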

Tests

IndexLoadingConfigTest: Asserts copy shares TableConfig / Schema, matches tableDataDir, does not inherit segmentTier from the template, and that tier changes on the copy do not affect the original.
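The four assertions can be mirrored against a minimal stand-in class (Cfg and its fields are hypothetical, not the real IndexLoadingConfig):

```java
// Hypothetical sketch of the copy-semantics checks described above,
// using a stand-in class rather than Pinot's real IndexLoadingConfig.
public class CopySemanticsSketch {
    static class Cfg {
        final Object tableConfig;   // shared reference, like TableConfig
        final String tableDataDir;
        String segmentTier;

        Cfg(Object tableConfig, String tableDataDir) {
            this.tableConfig = tableConfig;
            this.tableDataDir = tableDataDir;
        }
        // Tier intentionally not carried over: each copy starts clean.
        Cfg copy() { return new Cfg(tableConfig, tableDataDir); }
    }

    public static void main(String[] args) {
        Object tableConfig = new Object();
        Cfg template = new Cfg(tableConfig, "/data/myTable_OFFLINE");
        template.segmentTier = "hot";

        Cfg copy = template.copy();
        System.out.println("shares TableConfig reference: " + (copy.tableConfig == template.tableConfig));
        System.out.println("matches tableDataDir: " + copy.tableDataDir.equals(template.tableDataDir));
        System.out.println("copy tier starts clean: " + (copy.segmentTier == null));

        copy.segmentTier = "cold";
        System.out.println("original tier unaffected: " + "hot".equals(template.segmentTier));
    }
}
```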

Fixes #18164


codecov-commenter commented Apr 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.23%. Comparing base (7e10a36) to head (f8e2537).

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18174      +/-   ##
============================================
+ Coverage     63.18%   63.23%   +0.05%     
  Complexity     1616     1616              
============================================
  Files          3214     3214              
  Lines        195838   195842       +4     
  Branches      30251    30251              
============================================
+ Hits         123734   123836     +102     
+ Misses        62236    62105     -131     
- Partials       9868     9901      +33     
Flag                  Coverage           Δ
custom-integration1   100.00% <ø>        (ø)
integration           100.00% <ø>        (ø)
integration1          100.00% <ø>        (ø)
integration2          0.00% <ø>          (ø)
java-11               63.19% <100.00%>   (+0.07%) ⬆️
java-21               63.13% <100.00%>   (-0.03%) ⬇️
temurin               63.23% <100.00%>   (+0.05%) ⬆️
unittests             63.22% <100.00%>   (+0.05%) ⬆️
unittests1            55.43% <100.00%>   (+0.04%) ⬆️
unittests2            34.82% <42.85%>    (+0.04%) ⬆️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@rsrkpatwari1234 (Contributor, Author)

Requesting review on this. The integration test failure seems unrelated to this PR:

java.lang.AssertionError: [ExactlyOnce] Transaction markers were not propagated within 120s; committed records are not visible to read_committed consumers. read_committed=0, read_uncommitted=153636
	at org.apache.pinot.integration.tests.ExactlyOnceKafkaRealtimeClusterIntegrationTest.waitForCommittedRecordsVisible(ExactlyOnceKafkaRealtimeClusterIntegrationTest.java:181)


- private void reloadSegments(List<SegmentDataManager> segmentDataManagers, IndexLoadingConfig indexLoadingConfig,
+ private void reloadSegmentDataManagersInParallel(List<SegmentDataManager> segmentDataManagers,
Collaborator


I think we can keep the original name, and mention the parallel execution in this method's javadoc if you like.

IndexLoadingConfig copy = new IndexLoadingConfig(_instanceDataManagerConfig, _tableConfig, _schema);
copy.setTableDataDir(_tableDataDir);
return copy;
}
Collaborator


Let's not have this method. It doesn't copy everything in this object (e.g. _readMode is not copied here), so it fails to meet the semantics of "copy" — unless you make sure every field is honored while performing the copy.
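For illustration, a field-complete copy along the lines of that concern could look like the sketch below, using a hypothetical stand-in class (Cfg, readMode) rather than Pinot's real IndexLoadingConfig: every field is carried over explicitly, and the one deliberate exception is documented rather than accidental.

```java
// Hypothetical sketch: a copy constructor that honors every field,
// with segmentTier as the single documented exception.
public class FullCopySketch {
    static class Cfg {
        final String tableDataDir;
        String readMode;      // stand-in for _readMode
        String segmentTier;   // deliberately reset on copy

        Cfg(String tableDataDir, String readMode) {
            this.tableDataDir = tableDataDir;
            this.readMode = readMode;
        }

        // Copy constructor: copies everything except segmentTier, which
        // intentionally starts clean, like a fresh fetch.
        Cfg(Cfg other) {
            this.tableDataDir = other.tableDataDir;
            this.readMode = other.readMode;
            // segmentTier intentionally not copied
        }
    }

    public static void main(String[] args) {
        Cfg original = new Cfg("/data/myTable_OFFLINE", "heap");
        original.segmentTier = "hot";
        Cfg copy = new Cfg(original);
        System.out.println("readMode honored: " + "heap".equals(copy.readMode));
        System.out.println("tier starts clean: " + (copy.segmentTier == null));
    }
}
```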

  _segmentReloadSemaphore.acquire(segmentName, _logger);
  try {
-     reloadSegment(segmentDataManager, indexLoadingConfig, forceDownload);
+     reloadSegment(segmentDataManager, indexLoadingConfigTemplate.copy(), forceDownload);
Collaborator


If we have 100k segments, we'll have as many copies of IndexLoadingConfig objects here, differing only in segment tier.

Segment tiers are expected to be only a few per server. Can we have, for example, a map from tier to index loading config, so we only create as many copies as there are tiers?
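A minimal sketch of that suggestion, assuming configs become effectively read-only once built (class and method names are hypothetical, not Pinot's actual API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one config per tier (a handful per server)
// instead of one copy per segment (potentially 100k).
public class TierConfigCache {
    static class LoadingConfig {
        private String segmentTier;
        LoadingConfig copy() { return new LoadingConfig(); }
        void setSegmentTier(String tier) { segmentTier = tier; }
        String getSegmentTier() { return segmentTier; }
    }

    private final LoadingConfig template;
    private final Map<String, LoadingConfig> configPerTier = new ConcurrentHashMap<>();

    TierConfigCache(LoadingConfig template) { this.template = template; }

    // Safe only if callers treat the returned config as read-only:
    // all segments on the same tier share one instance.
    LoadingConfig forTier(String tier) {
        return configPerTier.computeIfAbsent(tier, t -> {
            LoadingConfig c = template.copy();
            c.setSegmentTier(t);
            return c;
        });
    }

    public static void main(String[] args) {
        TierConfigCache cache = new TierConfigCache(new LoadingConfig());
        LoadingConfig hot1 = cache.forTier("hot");
        LoadingConfig hot2 = cache.forTier("hot");
        cache.forTier("cold");
        System.out.println("one instance per tier: " + (hot1 == hot2));
        System.out.println("configs created: " + cache.configPerTier.size());
    }
}
```

The trade-off versus per-segment copies is that the shared per-tier config must never be mutated after creation, which the per-segment copy() approach does not require.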



Development

Successfully merging this pull request may close these issues.

Potential Race Condition against Segment Tier during Reload All Segments

3 participants