-
Notifications
You must be signed in to change notification settings - Fork 438
RATIS-2009. ReferenceCount should work for all LogEntry types. #1021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@szetszwo can you have a look? |
…ing submitted to RaftServer by SlidingWindow.
|
I also just attempted to solve the following test failure in this PR. @szetszwo
|
szetszwo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing also testWatchRequestAsyncChangeLeader! Please see the comments inlined.
| default CompletableFuture<Long> appendEntry(LogEntryProto entry, TransactionContext context) { | ||
| return appendEntry(entry); | ||
| return appendEntry(ReferenceCountedObject.wrap(entry), context); | ||
| } | ||
|
|
||
| CompletableFuture<Long> appendEntry(ReferenceCountedObject<LogEntryProto> entryRef, TransactionContext context); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is in ratis-server-api, we have to provide a default implementation of the new method for compatibility.
/**
* Append asynchronously an entry.
* Used by the leader.
*/
default CompletableFuture<Long> appendEntry(ReferenceCountedObject<LogEntryProto> entryRef, TransactionContext context) {
return appendEntry(entryRef.get(), context);
}
/**
* @deprecated use {@link #append(ReferenceCountedObject)}.
*/
@Deprecated
default CompletableFuture<Long> appendEntry(LogEntryProto entry, TransactionContext context) {
throw new UnsupportedOperationException();
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update also RaftLogBase:
@@ -345,15 +346,17 @@ public abstract class RaftLogBase implements RaftLog {
@Override
public final CompletableFuture<Long> appendEntry(LogEntryProto entry) {
- return appendEntry(entry, null);
+ return appendEntry(ReferenceCountedObject.wrap(entry), null);
}
| final long seq = pending.getSeqNum(); | ||
| processClientRequest(pending.getRequestRef(), | ||
| reply -> slidingWindow.receiveReply(seq, reply, this::sendReply)); | ||
| pending.release(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use whenComplete(..).
final ReferenceCountedObject<RaftClientRequest> ref = pending.getRequestRef();
processClientRequest(ref, reply -> slidingWindow.receiveReply(seq, reply, this::sendReply))
.whenComplete((v, e) -> ref.release());There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The processClientRequest eventually calls submitClientRequestAsync(ReferenceCountedObject<RaftClientRequest>) that returns a Future.
I realized that when calling any asynchronous method accepting a reference counter, we hardly need to wait for whenComplete to release the counter. Look at an example below
final RaftClientRequest request = requestRef.retain();
try {
// do something with the request.
// call asynchronous
return submitClientRequestAsync()
// We don't need and should not do this
// .whenComplete((r, e) -> requestRef.release());
;
} finally {
// instead just release the counter here.
requestRef.release()
}
This is because:
- Inside
submitClientRequestAsync, there's alreadyretainwhen needed (in LogSegment cache). Other words, if the internal logic ofsubmitClientRequestAsyncneeds to extend the life of the referenced object, that should be done there. The outside client code responsibility is to ensure the reference isretainablewhen it reaches thesubmitClientRequestAsyncand this is done by the outsideretain. The outside code releases the counter as soon as it finishes using it. - There're cases that
whenCompleteis not called. For example, when client times out or disconects, the return future seems to be discarded (documented per RATIS-2007).
| void release() { | ||
| requestRef.release(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method could be removed.
szetszwo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duongkame , thanks for the update! Just a minor comment inlined.
| /** | ||
| * Append asynchronously an entry. | ||
| * Used by the leader. | ||
| * Used by the leader for scenarios that there is no needs to clean up resources when the given entry is no | ||
| * longer used/referenced by this log. | ||
| */ | ||
| default CompletableFuture<Long> appendEntry(LogEntryProto entry, TransactionContext context) { | ||
| return appendEntry(entry); | ||
| return appendEntry(ReferenceCountedObject.wrap(entry), context); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method is not used anywhere except for the new append(ReferenceCountedObject). It should throw an exception and annotated as deprecated.
/**
* @deprecated use {@link #append(ReferenceCountedObject)}.
*/
@Deprecated
default CompletableFuture<Long> appendEntry(LogEntryProto entry, TransactionContext context) {
throw new UnsupportedOperationException();
}
szetszwo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 the change looks good.
What changes were proposed in this pull request?
See RATIS-2009
Found this issue when running tests for #1014, TestServerRestartWithGrpc failed because RaftConfiguration entry zero-copy buffer are released too early (while the log entry are still in the EntryCache.)
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/RATIS-2009
How was this patch tested?
Existing tests.