Skip to content

S3FileIO: Always normalize listPrefix key to end with trailing slash#16656

Closed
bolma-lila wants to merge 2 commits into
apache:mainfrom
bolma-lila:fix/s3fileio-listprefix-trailing-slash
Closed

S3FileIO: Always normalize listPrefix key to end with trailing slash#16656
bolma-lila wants to merge 2 commits into
apache:mainfrom
bolma-lila:fix/s3fileio-listprefix-trailing-slash

Conversation

@bolma-lila

Copy link
Copy Markdown

Description

S3FileIO.listPrefix() currently only normalizes the prefix with a trailing slash for S3 Directory Buckets (via toDirectoryPath()), but not for standard S3 buckets. This causes the prefix passed to ListObjectsV2 to potentially match sibling keys.

Problem

Given a table location s3://bucket/warehouse/ns/table, calling listPrefix("s3://bucket/warehouse/ns/table") sets prefix=warehouse/ns/table in the ListObjectsV2Request. S3 returns all keys starting with that string, including:

  • warehouse/ns/table/metadata/... (correct — belongs to this table)
  • warehouse/ns/table_archive/data/... (wrong — different table with similar name)

This also breaks STS credential vending from REST catalogs like Lakekeeper, which scope the session policy to:

{"Condition": {"StringLike": {"s3:prefix": "warehouse/ns/table/*"}}}

A prefix without trailing slash does not match StringLike pattern .../*, resulting in 403 AccessDenied.

Fix

Apply toDirectoryPath() unconditionally in listPrefix(), not just for Directory Buckets. This ensures the key always ends with / before being used as a ListObjects prefix, matching how Trino normalizes keys via directoryKey().

Testing

  • Existing integration test (TestS3FileIO.testListPrefix) creates objects under prefix/<scale>/ subdirectories, so the normalized prefix path/to/list/ still matches all expected objects
  • Verified manually with vended STS credentials: ListObjectsV2(prefix="key/") succeeds where ListObjectsV2(prefix="key") returns 403

Related

S3FileIO.listPrefix() only normalizes the prefix with a trailing slash
for S3 Directory Buckets, but not for standard S3 buckets. This causes
two problems:

1. Security: ListObjectsV2 with prefix "warehouse/ns/table" also
   matches sibling keys like "warehouse/ns/table_secret/..." since S3
   does pure string prefix matching.

2. STS compatibility: REST catalogs that vend credentials (e.g.,
   Lakekeeper) scope the session policy to "s3:prefix": "<key>/*".
   A prefix without trailing slash doesn't match, causing 403.

Apply toDirectoryPath() unconditionally, consistent with how Trino's
S3FileSystem.directoryKey() normalizes keys before listing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- testListPrefixNormalizesTrailingSlashForGeneralPurposeBucket: verifies
  via mock that ListObjectsV2Request always receives a trailing-slash
  prefix even when the caller omits it (general-purpose bucket)
- testListPrefixDoesNotMatchSiblingPaths: verifies via MinIO that
  listPrefix("s3://bucket/warehouse/ns/table") only returns objects
  under "table/" and not under "table_sibling/"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
// Always normalize the prefix to end with "/" to prevent matching sibling prefixes.
// Without this, prefix "warehouse/ns/table" would also match "warehouse/ns/table_other/...".
// This is also required for STS session policies that scope ListBucket to "<key>/*".
uri = uri.toDirectoryPath();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this change s3.directory-bucket.list-prefix-as-directory (default true) has no production reader: isS3DirectoryBucketListPrefixAsDirectory() is now referenced only by a unit test, and S3URI.useS3DirectoryBucket() becomes production-dead. It also removes the only opt-out that existed for directory buckets. Either repurpose the flag to gate this now-unconditional normalization, or remove the dead property and its getter from S3FileIOProperties.

&& client.s3FileIOProperties().isS3DirectoryBucketListPrefixAsDirectory()) {
uri = uri.toDirectoryPath();
}
// Always normalize the prefix to end with "/" to prevent matching sibling prefixes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SupportsPrefixOperations.listPrefix documents that object stores "may allow for arbitrary prefixes"; this forces directory-only semantics for general-purpose buckets with no opt-out, while GCSFileIO.listPrefix (GCSFileIO.java:298) and ADLSFileIO.listPrefix (ADLSFileIO.java:219) still pass the raw prefix - so the sibling-match behavior persists there and S3 now diverges. Iceberg's own orphan-file listing already appends "/" before calling listPrefix (FileSystemWalker.java:68), so the remaining benefit is external/STS callers. Consider normalizing consistently across FileIOs or documenting why only S3 changes.


// Call listPrefix WITHOUT trailing slash
List<FileInfo> fileInfoList =
StreamSupport.stream(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Streams.stream(...) (already imported and used in the sibling test below) is simpler than the StreamSupport/Spliterators construction: Streams.stream(localS3FileIo.listPrefix("s3://bucket/warehouse/ns/table")).collect(Collectors.toList()).

@bolma-lila

Copy link
Copy Markdown
Author

Thanks @wombatu-kun for the careful review — all three points are fair, and digging into them led me to conclude this change isn't the right fix, so I'm going to close it.

Context on what I was actually chasing: the concrete failure was a 403 from an STS-scoped session policy (Lakekeeper vends credentials scoped to s3:prefix = "<table>/*"). The runtime stack trace was S3FileIO.deletePrefix ← Airbyte's IcebergTableCleaner.clearTable, which passes a table location without a trailing slash. Because deletePrefix(p) forwards p straight into listPrefix(p), the unslashed prefix reached ListObjectsV2 and got denied.

Why this PR isn't needed: that's an Airbyte-caller problem, and it's fixed at the caller in airbytehq/airbyte#78624 by normalizing the location before deletePrefix. Tracing the rest of the paths:

  • Iceberg-core's only internal listPrefix caller during writes is via ResolvingFileIO (pass-through) — and nothing in the commit/append path lists or deletes prefixes.
  • FileSystemWalker (orphan-file removal) already appends / before calling listPrefix, so it's unaffected either way.

So as you noted, the only remaining beneficiary would be external/STS callers — and for those, the right place is the caller (as Iceberg already does in FileSystemWalker), not an unconditional change in S3FileIO that:

Closing this in favor of the caller-side fix. If there's appetite for sibling-prefix hardening as defense-in-depth, I'd be happy to revisit it as a properly flag-gated, cross-FileIO change in a separate PR. Thanks again.

@bolma-lila bolma-lila closed this Jun 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants