Skip to content

Core: Clean up encryption keys as part of RemoveSnapshots#16353

Open
Hugo-WB wants to merge 4 commits into
apache:mainfrom
Hugo-WB:cleanup-encryption-keys
Open

Core: Clean up encryption keys as part of RemoveSnapshots#16353
Hugo-WB wants to merge 4 commits into
apache:mainfrom
Hugo-WB:cleanup-encryption-keys

Conversation

@Hugo-WB

@Hugo-WB Hugo-WB commented May 15, 2026

Copy link
Copy Markdown
Contributor

Fixes the first part of: #16352 for ExpireSnapshots
By calling removeEncryptionKey when we remove snapshot from metadata.

This assumes that snapshot <> encryption key is 1:1. I think this is a valid assumption given a new encryption key is created per manifest list here? But keen to get thoughts from others on this invariant, I am relatively unfamiliar with all this. Do people know what the intention here was? Are there cases where key re-use could happen?

@github-actions github-actions Bot added the core label May 15, 2026
@Hugo-WB Hugo-WB force-pushed the cleanup-encryption-keys branch from 3166d6c to 3d534d6 Compare May 15, 2026 16:19
@Hugo-WB Hugo-WB marked this pull request as ready for review May 18, 2026 12:20
@Hugo-WB

Hugo-WB commented May 18, 2026

Copy link
Copy Markdown
Contributor Author

@ggershinsky curious to hear your thoughts on above. Specifically the questions I left in the PR description. Thanks!

@ggershinsky

Copy link
Copy Markdown
Contributor

sgtm. manifest list file keys are generated per snapshot, and are not re-used. if a snapshot is removed, no reason to keep its key.

@Hugo-WB

Hugo-WB commented May 18, 2026

Copy link
Copy Markdown
Contributor Author

sounds good. @huaxingao could I get a review on the above please!

@Hugo-WB

Hugo-WB commented May 21, 2026

Copy link
Copy Markdown
Contributor Author

friendly ping on above! Sorry still new 😅 @ggershinsky would appreciate a review and if you could tag in someone else who could give this a review and is familiar with the encryption space!

@ggershinsky

Copy link
Copy Markdown
Contributor

sure, maybe @huaxingao and @singhpk234 can help

@Hugo-WB

Hugo-WB commented May 26, 2026

Copy link
Copy Markdown
Contributor Author

Hey @huaxingao / @singhpk234 friendly ping on above 😄

@singhpk234 singhpk234 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change makes sense to me ! can we add an E2E where post removing the expired snapshot (MLK) one is still able to read the table ?

- While adding E2E test found a bug in the HiveTableOperations.
- HiveTableOperations, as well as the currently open PR for the
  RestTableOperations, adds all keys from EncryptionManager into
  TableMetadata
@huaxingao

Copy link
Copy Markdown
Contributor

@Hugo-WB Could you resolve the conflict?

@Hugo-WB Hugo-WB left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @huaxingao!

Merged upstream/main and updated a comment.

I still have some concerns about the correctness here long term. Left a comment, would be keen to get the answer to it before merging. Thanks!

Comment on lines +198 to +216
Set<String> referencedEncryptionKeys =
Sets.union(
metadata.snapshots().stream()
.map(Snapshot::keyId)
.filter(Objects::nonNull)
.collect(Collectors.toUnmodifiableSet()),
metadata.encryptionKeys().stream()
.map(EncryptedKey::encryptedById)
.filter(Objects::nonNull)
.collect(Collectors.toUnmodifiableSet()));
TableMetadata.Builder metadataBuilder = TableMetadata.buildFrom(metadata);
standardEncryptionManager.encryptionKeys().values().stream()
.filter(
encryptedKey ->
referencedEncryptionKeys.contains(encryptedKey.keyId())
// The KEK may not be referenced but must still be saved in the metadata.
|| standardEncryptionManager.keyEncryptionKeyID().equals(encryptedKey.keyId()))
.forEach(metadataBuilder::addEncryptionKey);
return metadataBuilder.build();

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new logic depends on an invariance that the manifestListKey -> KEK -> table encryption key. are all one "hop". E.g. the -> means encryptedBy.

If there was a world where a key in the encryptionKeys() list is encrypted by another key in the encryptionKey list, which is then in itself encrypted by the KEK, and they were all added in a single commit, this PR would fail to propagate the middle key.
E.g.:

EncryptionKeys:[{key: A, encryptedBy: B}, {key: B, encryptedBy: C}, {key: C, encryptedBy: KEK}]
snapshot: [snapshotId: x, keyId: A]

And all of the above happens within one commit.
In the above code, we would not add key B to the metadata file. As none of the encryptionKeys will be part of metadata.encryptionKeys() as that will be empty. We will keep A and C, since A is referenced by snapshot and C is a KEK.

TLDR: question @ggershinsky : we expect the depth of encryption key referencing from manifest list -> KEK to remain at a single hop right, we don't expect there to be another hop?

@Hugo-WB Hugo-WB Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought a behaviour change here would get caught by e2e tests. But worth noting.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, the approach is direct - to save KMS interactions, ml keys are encrypted by an intermediate local KEK, that in turn is encrypted by KMS. So it's always ml key -> KEK -> KMS. No reason to introduce another hop.

} else {
tableMetadata = metadata;
}
tableMetadata = EncryptionUtil.addEmKeysToMetadata(metadata, encrManager);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @smaheshwar-pltr for: #13225. If you could update your REST implementation to use this utility once it merges.

@Hugo-WB

Hugo-WB commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

Friendly ping @huaxingao @singhpk234

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants