Skip to content

HDDS-15328. Fix sleep time in RetryInvocationHandler#10317

Open
adoroszlai wants to merge 4 commits into
apache:masterfrom
adoroszlai:HDDS-15328
Open

HDDS-15328. Fix sleep time in RetryInvocationHandler#10317
adoroszlai wants to merge 4 commits into
apache:masterfrom
adoroszlai:HDDS-15328

Conversation

@adoroszlai
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Sleep period in RetryInvocationHandler should be waitTime, not delay (see #10310 (comment)).

  1. RetryInfo.retryTime is calculated when it is created:
    this.delay = delay;
    this.retryTime = Time.monotonicNow() + delay;
  2. waitTime is calculated before sleep as:
    synchronized Long getWaitTime(final long now) {
    return retryInfo == null? null: retryInfo.retryTime - now;
  3. but then original RetryInfo.delay is used for actual sleep time:
    final Long waitTime = getWaitTime(Time.monotonicNow());
    LOG.trace("#{} processRetryInfo: retryInfo={}, waitTime={}",
    callId, retryInfo, waitTime);
    if (waitTime != null && waitTime > 0) {
    try {
    Thread.sleep(retryInfo.delay);

Since this is such a small change, also make some improvements in this area:

  • Remove some leftover references to async call handling
  • Port TestRetryProxy from Hadoop
  • Change test-client profile in pom.xml to include tests ported from Hadoop (more tests to come in follow-up PRs)

https://issues.apache.org/jira/browse/HDDS-15328

How was this patch tested?

Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.274 s -- in org.apache.hadoop.io_.retry.TestRetryProxy

https://github.com/adoroszlai/ozone/actions/runs/26150721145

@adoroszlai adoroszlai self-assigned this May 20, 2026
Copy link
Copy Markdown
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adoroszlai Thanks for catching this issue, LGTM +1.

I think this will subtly change the retry wait behavior where instead of waiting for a fixed delay, it will wait based on the calculated time when RetryInfo is created. However, I think the difference should not be noticeable in most cases since RetryInfo creation in handleException and waitTime calculation in processWaitTimeAndRetryInfo should be similar most of the time (unless there is a huge pause between handleException and processWaitTimeAndRetryInfo or other cases like when retryTime is not set to null in processRetryInfo somehow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants