Core: Include row lineage and key-id in snapshot value methods#17015
Conversation
31752ae to
7891e9f
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates Iceberg Core’s BaseSnapshot identity semantics so snapshot equality, hashing, and debug output reflect row-lineage state (firstRowId, addedRows). This prevents snapshots that differ only in assigned row ID ranges from comparing equal and makes those values visible when logging/inspecting snapshots.
Changes:
- Extend
BaseSnapshot.equalsandhashCodeto includefirstRowIdandaddedRows. - Extend
BaseSnapshot.toStringto includefirst-row-idandadded-rows. - Add a focused unit test asserting equality/hash behavior and
toStringvisibility for row-lineage fields.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| core/src/main/java/org/apache/iceberg/BaseSnapshot.java | Includes row-lineage fields in equals, hashCode, and toString for correct snapshot identity/debugging. |
| core/src/test/java/org/apache/iceberg/TestRowLineageMetadata.java | Adds coverage verifying snapshot equality/hash/toString change with/without row-lineage values. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
singhpk234
left a comment
There was a problem hiding this comment.
Thanks for the change @manuzhang
I understand this is focusing on RL, though while we are at it, we can also include key-id as well
7891e9f to
08d0447
Compare
Include snapshot row-lineage fields and manifest list key IDs in BaseSnapshot equals/hashCode and toString. This keeps comparisons and hashes aligned with metadata that changes row ID assignment or manifest-list encryption key selection, and makes those values visible in debug output. Co-authored-by: Codex <codex@openai.com>
08d0447 to
34e733c
Compare
RussellSpitzer
left a comment
There was a problem hiding this comment.
Feels like this was just an oversight before ... I can't think of why we wouldn't include it
There was a problem hiding this comment.
LGTM too, thanks @manuzhang ! Thanks for review @RussellSpitzer
Summary
BaseSnapshot.equals,hashCode, andtoStringnow include the snapshot row-lineage fieldsfirstRowIdandaddedRows, plus the manifest-list encryptionkeyId.This keeps snapshot comparisons and hashes aligned with metadata that changes row ID assignment or encrypted manifest-list key selection. It also makes those values visible in snapshot debug output.
The regression coverage is in
TestSnapshot, alongside the snapshot behavior tests.Testing
env GRADLE_USER_HOME=/tmp/gradle ./gradlew :iceberg-core:test --tests org.apache.iceberg.TestSnapshot.snapshotValueMethodsIncludeMetadataFields --console=plainenv GRADLE_USER_HOME=/tmp/gradle ./gradlew :iceberg-core:spotlessJavaCheck --console=plaingit diff --checkAI Disclosure