HDDS-15125. MVCC-Style Snapshot Reclaimability Using seqNumMin / seqNumMax #14
peterxcli wants to merge 1 commit into
Signed-off-by: peterxcli <peterxcli@gmail.com>
Code Review
This pull request introduces MVCC-style visibility intervals for Ozone snapshots to optimize the key reclamation process. By adding seqNumMin and seqNumMax fields to key versions, the system can determine reclaimability via sequence number checks rather than expensive snapshot lookups. Feedback identifies a logic error in ReclaimableKeyFilter where exclusive size metrics may be over-counted if multiple snapshots fall within a key's visibility interval. Additionally, it is recommended to set seqNumMax for overwritten keys in OMKeyRequest to ensure they utilize the optimized reclamation path instead of falling back to legacy logic.
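The sequence-number check summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the class and method names are hypothetical, and it assumes that snapshot creation sequence numbers are kept in an ascending array and that a snapshot references a key version exactly when its creation sequence number lies in the half-open interval [seqNumMin, seqNumMax).

```java
// Hypothetical sketch; names are illustrative and not taken from the Ozone code base.
final class IntervalReclaimSketch {

  /** Returns the first index i with sorted[i] >= target, or sorted.length if there is none. */
  static int lowerBound(long[] sorted, long target) {
    int lo = 0;
    int hi = sorted.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (sorted[mid] < target) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  /**
   * Assumed rule: a key version visible on [seqNumMin, seqNumMax) is reclaimable
   * when no snapshot creation sequence number falls inside that interval.
   * createSeqNums must be sorted in ascending order.
   */
  static boolean isReclaimable(long[] createSeqNums, long seqNumMin, long seqNumMax) {
    int pos = lowerBound(createSeqNums, seqNumMin);
    boolean referenced = pos < createSeqNums.length && createSeqNums[pos] < seqNumMax;
    return !referenced;
  }

  public static void main(String[] args) {
    long[] snapshots = {10L, 20L, 30L};
    // No snapshot was created inside [11, 20).
    System.out.println(isReclaimable(snapshots, 11L, 20L));
    // The snapshot created at 20 lies inside [11, 21).
    System.out.println(isReclaimable(snapshots, 11L, 21L));
    // The key version was created after the last snapshot.
    System.out.println(isReclaimable(snapshots, 31L, 40L));
  }
}
```

Under these assumptions each decision costs one binary search over the snapshot sequence array, and an empty interval (seqNumMin equal to seqNumMax) is never referenced by any snapshot.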
int pos = lowerBound(snapshotSequences.getCreateSeqNums(), seqNumMin);
boolean referenced = pos < snapshotSequences.getCreateSeqNums().length &&
    snapshotSequences.getCreateSeqNums()[pos] < seqNumMax;
if (referenced) {
  addExclusiveSize(snapshotSequences.getSnapshotIds()[pos], keyInfo);
}
The current logic for updating exclusiveSizeMap over-counts data size. A key is only exclusive to a snapshot if that snapshot is the only one within the visibility interval [seqNumMin, seqNumMax). The current implementation adds the key's size to the first snapshot that references it, even if subsequent snapshots also reference it. This will lead to incorrect exclusive size metrics.
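A small worked case may make the over-counting concrete. The sketch below is an assumption-laden illustration with hypothetical names: it treats a key as exclusive to a snapshot only when exactly one snapshot creation sequence number lies in [seqNumMin, seqNumMax), and returns -1 otherwise. With snapshots created at 10, 20 and 30 and a key visible on [5, 25), both the first and the second snapshot reference the key, so no exclusive size should be charged.

```java
// Hypothetical sketch; names are illustrative and not taken from the Ozone code base.
final class ExclusiveSizeSketch {

  /** Returns the first index i with sorted[i] >= target, or sorted.length if there is none. */
  static int lowerBound(long[] sorted, long target) {
    int lo = 0;
    int hi = sorted.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (sorted[mid] < target) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  /**
   * Returns the index of the only snapshot whose creation sequence number lies
   * in [seqNumMin, seqNumMax), or -1 if zero or several snapshots do.
   * createSeqNums must be sorted in ascending order.
   */
  static int exclusiveOwner(long[] createSeqNums, long seqNumMin, long seqNumMax) {
    int pos = lowerBound(createSeqNums, seqNumMin);
    if (pos == createSeqNums.length || createSeqNums[pos] >= seqNumMax) {
      return -1; // no snapshot references the key
    }
    boolean nextAlsoReferences =
        pos + 1 < createSeqNums.length && createSeqNums[pos + 1] < seqNumMax;
    return nextAlsoReferences ? -1 : pos;
  }

  public static void main(String[] args) {
    long[] snapshots = {10L, 20L, 30L};
    // Snapshots at 10 and 20 both fall in [5, 25): shared, not exclusive.
    System.out.println(exclusiveOwner(snapshots, 5L, 25L));
    // Only the snapshot at 10 falls in [5, 15): exclusive to index 0.
    System.out.println(exclusiveOwner(snapshots, 5L, 15L));
  }
}
```

Because the array is sorted, checking only the next entry is enough: if the snapshot at pos + 1 is outside the interval, every later snapshot is outside it as well.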
int pos = lowerBound(snapshotSequences.getCreateSeqNums(), seqNumMin);
boolean referenced = pos < snapshotSequences.getCreateSeqNums().length &&
    snapshotSequences.getCreateSeqNums()[pos] < seqNumMax;
if (referenced) {
  // A key is exclusive to a snapshot only if no other snapshot references it.
  boolean exclusive = (pos + 1 == snapshotSequences.getCreateSeqNums().length) ||
      (snapshotSequences.getCreateSeqNums()[pos + 1] >= seqNumMax);
  if (exclusive) {
    addExclusiveSize(snapshotSequences.getSnapshotIds()[pos], keyInfo);
  }
}

protected static OmKeyInfo prepareKeyInfoForDeleteMap(long trxnLogIndex, OmKeyInfo keyInfo) {
  if (keyInfo.getObjectID() == OBJECT_ID_RECLAIM_BLOCKS) {
    return keyInfo.toBuilder()
        .setUpdateID(trxnLogIndex)
        .setSeqNumMin(trxnLogIndex)
        .setSeqNumMax(trxnLogIndex)
        .build();
  }
  return keyInfo;
}
The prepareKeyInfoForDeleteMap method should also set seqNumMax for normal keys if seqNumMin is present. This ensures that when a key is overwritten (which calls this method via addKeyInfoToDeleteMap), its visibility interval is correctly closed at the transaction index of the overwrite. Without this, overwritten keys will fall back to the legacy reclamation logic because they will lack a seqNumMax.
protected static OmKeyInfo prepareKeyInfoForDeleteMap(long trxnLogIndex, OmKeyInfo keyInfo) {
  if (keyInfo.getObjectID() == OBJECT_ID_RECLAIM_BLOCKS) {
    return keyInfo.toBuilder()
        .setUpdateID(trxnLogIndex)
        .setSeqNumMin(trxnLogIndex)
        .setSeqNumMax(trxnLogIndex)
        .build();
  }
  if (keyInfo.hasSeqNumMin()) {
    return keyInfo.toBuilder()
        .setSeqNumMax(trxnLogIndex)
        .build();
  }
  return keyInfo;
}
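The effect of closing the interval on overwrite can be shown in isolation. The sketch below is a simplified stand-in, not Ozone code: KeyVersion is a hypothetical class in which a null field means "not set", and the OBJECT_ID_RECLAIM_BLOCKS branch from the suggestion is left out for brevity.

```java
// Hypothetical sketch; KeyVersion is an illustrative stand-in, not OmKeyInfo.
final class CloseIntervalSketch {

  /** Minimal immutable key version; a null field means the value was never set. */
  static final class KeyVersion {
    final Long seqNumMin;
    final Long seqNumMax;

    KeyVersion(Long seqNumMin, Long seqNumMax) {
      this.seqNumMin = seqNumMin;
      this.seqNumMax = seqNumMax;
    }
  }

  /**
   * Closes the visibility interval at the overwriting transaction's index.
   * Keys without a seqNumMin (assumed here to be legacy keys) are returned
   * unchanged, so they would still need the legacy reclamation path.
   */
  static KeyVersion prepareForDelete(long trxnLogIndex, KeyVersion key) {
    if (key.seqNumMin != null) {
      return new KeyVersion(key.seqNumMin, trxnLogIndex);
    }
    return key;
  }

  public static void main(String[] args) {
    KeyVersion overwritten = prepareForDelete(42L, new KeyVersion(7L, null));
    System.out.println(overwritten.seqNumMin + " .. " + overwritten.seqNumMax);

    KeyVersion legacy = prepareForDelete(42L, new KeyVersion(null, null));
    System.out.println(legacy.seqNumMin + " .. " + legacy.seqNumMax);
  }
}
```

With the interval closed at the overwrite's transaction index, a later reclaimability check has both bounds available and does not have to fall back.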
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.
https://issues.apache.org/jira/browse/HDDS-15125
This pull request introduces MVCC-style sequence number intervals to improve snapshot key reclamation in Ozone. The main changes add seqNumMin and seqNumMax fields to OmKeyInfo and the protocol buffer, update the logic for setting these fields during key creation and deletion, and provide configuration and metrics to enable and monitor this optimization. Comprehensive tests are also added to verify correct behavior.

MVCC Interval Support for Snapshot Key Reclamation:
- Added seqNumMin and seqNumMax fields to OmKeyInfo, including builder methods, accessors, equality checks, and serialization/deserialization logic in both Java and protobuf definitions. These fields represent the sequence number interval for which a key version is visible, supporting more efficient snapshot key reclamation.
- Updated key commit and deletion logic to set seqNumMin and seqNumMax appropriately during key lifecycle events, ensuring the interval is tracked for MVCC-style reclaimability checks.

Configuration and Metrics:
- Introduced a new configuration property ozone.om.snapshot.key.reclaim.interval.enabled (default: true) to enable the new reclaim logic, and added corresponding constants.
- Added new metrics to DeletingServiceMetrics to track the number of key reclaimability decisions optimized by seqNum intervals and those that fall back to legacy logic.

Testing:
- Added tests in TestOmKeyInfo to verify correct preservation and behavior of the new interval fields during protobuf conversion and key deletion.

These changes collectively enable more efficient and accurate snapshot key reclamation by leveraging MVCC-style sequence number intervals, with configuration options and metrics to monitor and control the feature.
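How the configuration flag and the two metrics might interact can be sketched as below. Everything here is hypothetical: the class, the counter names and the decision rule are illustrative assumptions, with the constructor argument standing in for the value of ozone.om.snapshot.key.reclaim.interval.enabled. The assumed rule is that the interval path is taken only when the feature is enabled and both bounds are present, and every other decision is counted as a fallback.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; class, field and method names are not from DeletingServiceMetrics.
final class ReclaimDecisionSketch {
  final boolean intervalEnabled;
  final AtomicLong optimized = new AtomicLong(); // decisions made via seqNum intervals
  final AtomicLong fallback = new AtomicLong();  // decisions left to the legacy logic

  /** intervalEnabled stands in for ozone.om.snapshot.key.reclaim.interval.enabled. */
  ReclaimDecisionSketch(boolean intervalEnabled) {
    this.intervalEnabled = intervalEnabled;
  }

  /**
   * Records which path handles a key version. Returns true when the interval
   * path applies, false when the caller should use the legacy logic.
   * A null bound means the field is not set on the key.
   */
  boolean recordDecision(Long seqNumMin, Long seqNumMax) {
    if (intervalEnabled && seqNumMin != null && seqNumMax != null) {
      optimized.incrementAndGet();
      return true;
    }
    fallback.incrementAndGet();
    return false;
  }

  public static void main(String[] args) {
    ReclaimDecisionSketch decisions = new ReclaimDecisionSketch(true);
    decisions.recordDecision(5L, 9L);     // both bounds set: interval path
    decisions.recordDecision(5L, null);   // open interval: legacy path
    decisions.recordDecision(null, null); // legacy key: legacy path
    System.out.println("optimized=" + decisions.optimized.get()
        + " fallback=" + decisions.fallback.get());
  }
}
```

A rising fallback count relative to the optimized count would suggest that many keys still lack a complete interval, which is the situation the review comment about overwritten keys describes.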