perf: Streamline s3 backend #19394

Open
jtuglu1 wants to merge 1 commit into apache:master from jtuglu1:optimize-s3-operations

Conversation

@jtuglu1
Contributor

@jtuglu1 jtuglu1 commented Apr 30, 2026

Description

S3 has offered strong read-after-write consistency since 2020. The current S3 backend assumes the earlier, weaker consistency model and therefore makes redundant calls that are both slow and costly.

Something else to look into in a separate PR is parallelizing the download of a single file using byte-range requests (currently we use a single TCP connection, which isn't the fastest and is subject to S3's single-connection bandwidth limits).
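For illustration only, here is a rough sketch of what a byte-range split could look like. It is not part of this PR: `RangeReader` and `ParallelRangeDownload` are hypothetical names, the reader stands in for a ranged GET (HTTP `Range: bytes=start-end`, end inclusive), and the whole object is buffered in memory, which a real implementation would likely avoid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical abstraction over a ranged GET; not an AWS SDK or Druid type.
interface RangeReader {
  byte[] read(long startInclusive, long endInclusive) throws Exception;
}

final class ParallelRangeDownload {
  // Splits [0, size) into consecutive inclusive ranges of at most partSize bytes.
  static List<long[]> splitRanges(long size, long partSize) {
    if (partSize <= 0) {
      throw new IllegalArgumentException("partSize must be positive");
    }
    List<long[]> ranges = new ArrayList<>();
    for (long start = 0; start < size; start += partSize) {
      ranges.add(new long[]{start, Math.min(start + partSize, size) - 1});
    }
    return ranges;
  }

  // Fetches every range on its own worker and reassembles the parts in order.
  static byte[] download(RangeReader reader, long size, long partSize, int parallelism) {
    if (size > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("sketch buffers in memory; object too large");
    }
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      List<long[]> ranges = splitRanges(size, partSize);
      List<Future<byte[]>> parts = new ArrayList<>();
      for (long[] range : ranges) {
        Callable<byte[]> task = () -> reader.read(range[0], range[1]);
        parts.add(pool.submit(task));
      }
      byte[] out = new byte[(int) size];
      for (int i = 0; i < parts.size(); i++) {
        byte[] part = parts.get(i).get();
        System.arraycopy(part, 0, out, (int) ranges.get(i)[0], part.length);
      }
      return out;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while downloading", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("range request failed", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }
}
```

Part size and parallelism would need tuning against real S3 throughput; the values are left as parameters here rather than guessed.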

High-level list of changes

  1. Removed isObjectInBucket guards before zip and gzip downloads (and the now-dead private method). The 404 from GetObject propagates as a SegmentLoadingException.
  2. Replaced the doesObjectExist + listObjectsV2 two-call sequence with a single getObjectMetadata request in S3DataSegmentMover.
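As a simplified illustration of the first change, the sketch below issues the GET directly and translates a 404 into a loading failure instead of checking for existence first. All types here (`ObjectStore`, `StorageException`, `LoadingException`, `SegmentFetcher`) are hypothetical stand-ins, not the Druid or AWS SDK classes touched by this PR.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for an S3 client error carrying an HTTP status code.
class StorageException extends RuntimeException {
  final int statusCode;

  StorageException(int statusCode, String message) {
    super(message);
    this.statusCode = statusCode;
  }
}

// Hypothetical stand-in for a checked segment-loading failure.
class LoadingException extends Exception {
  LoadingException(String message, Throwable cause) {
    super(message, cause);
  }
}

// Hypothetical client: getObject throws StorageException(404) when the key is absent.
interface ObjectStore {
  InputStream getObject(String bucket, String key);
}

final class SegmentFetcher {
  private final ObjectStore store;

  SegmentFetcher(ObjectStore store) {
    this.store = store;
  }

  // One request per fetch: no existence check before the GET.
  byte[] fetch(String bucket, String key) throws LoadingException {
    try (InputStream in = store.getObject(bucket, key)) {
      return in.readAllBytes();
    } catch (StorageException e) {
      if (e.statusCode == 404) {
        throw new LoadingException("Object not found: s3://" + bucket + "/" + key, e);
      }
      throw e;
    } catch (IOException e) {
      throw new LoadingException("Failed reading s3://" + bucket + "/" + key, e);
    }
  }
}
```

With strong read-after-write consistency, a missing object is reported by the GET itself, so the extra round trip adds cost without adding information.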

Release note


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@jtuglu1 jtuglu1 force-pushed the optimize-s3-operations branch 3 times, most recently from 180aa0f to 4986d81 Compare May 1, 2026 16:52
@jtuglu1 jtuglu1 requested a review from clintropolis May 11, 2026 17:44
@jtuglu1 jtuglu1 marked this pull request as ready for review May 11, 2026 17:44
@jtuglu1 jtuglu1 requested review from gianm May 11, 2026 17:47
@jtuglu1 jtuglu1 force-pushed the optimize-s3-operations branch from 4986d81 to 50af918 Compare May 11, 2026 18:07
Member

@FrankChen021 FrankChen021 left a comment


| Severity | Findings |
| --- | --- |
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |

Reviewed 8 of 8 changed files.


This is an automated review by Codex GPT-5.5

  sourceMetadata = s3Client.getObjectMetadata(s3Bucket, s3Path);
}
catch (S3Exception e) {
  if (e.statusCode() == 404 && "NoSuchKey".equals(S3Utils.getS3ErrorCode(e))) {
Member


[P2] Treat any HEAD 404 as a missing source

The idempotent already-moved path now only runs when headObject fails with status 404 and error code NoSuchKey. S3 HEAD failures for absent objects can surface as a generic 404/NotFound response, and other Druid S3 code handles missing HEAD results by status alone. In that case this code rethrows before checking whether the target object already exists, so a retry or competing move where the source was deleted after a successful copy can incorrectly fail instead of returning the moved segment. Please key this fallback off statusCode() == 404 rather than requiring NoSuchKey.
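A minimal sketch of the suggested control flow, assuming hypothetical stand-in types (`HeadClient`, `HeadException`, `MoveResolver`) rather than the actual S3DataSegmentMover code: the fallback is keyed on the status code alone, so both `NoSuchKey` and a generic `NotFound` reach the target check.

```java
// Hypothetical stand-in for an S3 error with an HTTP status and an error code.
class HeadException extends RuntimeException {
  final int statusCode;
  final String errorCode;

  HeadException(int statusCode, String errorCode) {
    super(statusCode + " " + errorCode);
    this.statusCode = statusCode;
    this.errorCode = errorCode;
  }
}

// Hypothetical metadata client: head throws HeadException(404, ...) for absent keys.
interface HeadClient {
  void head(String bucket, String key);
}

final class MoveResolver {
  enum Outcome { SOURCE_PRESENT, ALREADY_MOVED }

  static Outcome resolve(
      HeadClient client,
      String srcBucket,
      String srcKey,
      String dstBucket,
      String dstKey
  ) {
    try {
      client.head(srcBucket, srcKey);
      return Outcome.SOURCE_PRESENT;
    } catch (HeadException e) {
      // Status only: the error code is deliberately not inspected.
      if (e.statusCode != 404) {
        throw e;
      }
    }
    try {
      // Source is gone; a previous attempt may already have completed the move.
      client.head(dstBucket, dstKey);
      return Outcome.ALREADY_MOVED;
    } catch (HeadException e) {
      if (e.statusCode == 404) {
        throw new IllegalStateException("Neither source nor target object exists", e);
      }
      throw e;
    }
  }
}
```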

2 participants