virtual storage improvements#18683
Merged
clintropolis merged 5 commits intoOct 24, 2025
Merged
Conversation
changes: * in virtual storage mode, it is now impossible for a segment to be 'missing' after a caller has obtained a `DataSegment`. Now, the caller can still mount and use a segment (downloading it if necessary) even if the coordinator has issued a drop command * to support there being no such thing as a missing segment in vsf mode, drops of weakly held cache entries in a storage location are now a no-op. instead, these cache entries will remain on disk until an eviction needs to reclaim the space for some other entry
…-of-drop-and-no-missing-segments
gianm
reviewed
Oct 23, 2025
| // write the segment info file if it doesn't exist. this can happen if we are loading after a drop | ||
| final File segmentInfoCacheFile = new File(getEffectiveInfoDir(), dataSegment.getId().toString()); | ||
| if (!segmentInfoCacheFile.exists()) { | ||
| jsonMapper.writeValue(segmentInfoCacheFile, dataSegment); |
Contributor
There was a problem hiding this comment.
Use FileUtils.writeAtomically to avoid leaving a partially-written file after a crash.
| for (StorageLocation location : locations) { | ||
| final SegmentCacheEntry cacheEntry = location.getCacheEntry(cacheEntryIdentifier); | ||
| if (cacheEntry != null) { | ||
| cacheEntry.onUnmount.set(null); |
Contributor
There was a problem hiding this comment.
Can this race with the call in removeInfoFile? That one doesn't seem to acquire a lock. Same comment for the other various calls to set(null).
| final SegmentCacheEntry cacheEntry = location.getCacheEntry(entryId); | ||
| if (cacheEntry != null) { | ||
| isCached = true; | ||
| cacheEntry.onUnmount.set(delete); |
Contributor
There was a problem hiding this comment.
Can this race with an actual unmounting? A lock doesn't seem to be acquired at this point in time.
Member
Author
There was a problem hiding this comment.
yea, this one isn't under a segment lock so could, fixed to just make methods on cache entry to synchronize with mount/unmount
gianm
approved these changes
Oct 24, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This PR makes some behavioral changes to how historical servers manage their cache files when operating in the 'virtual storage' mode added in #18176, to provide a better experience and more flexibility to callers.
The main change is that in virtual storage mode, it is now impossible for a segment to be 'missing' in the
SegmentManagerif a caller has aDataSegment. By 'missing' segments, I am referring to the specific Druid response a historical server provides in the query response context to cover the case of a segment drop command occurring before the query engine has a chance to obtain a reference to the files (or even before ever receiving the request), so that the broker can retry these requests to the new location the segment has been moved to (if retries are enabled the the segment is available elsewhere).Now, in virtual storage mode, the historical will simply download the segment on demand and process it for any
DataSegmentobject. So for example, when usingServerManagerto get segments, this means that after resolving the set ofDataSegmentsparticipating in a query from the timeline, the query engine can always obtain aSegmentReferenceso long as there is adequate disk space to download (and it is present in deep storage of course).To complement this behavior, the segment cache itself no longer actively unmounts weakly held segment cache entries, making the coordinator drop command a near no-op. Instead, files will remain on disk until they are evicted because some other segment needs to reclaim the space so that it can be loaded. This does mean that the disk will 'fill up' over time, but is worth the benefits. To accommodate this and ensure that segment files are tracked across process boundaries (in the event of a failure or whatever), segment info files are now deleted via an 'unmount' hook on the segment files cache entry, meaning that they will not be removed until the segment files are also removed.