@szetszwo szetszwo commented May 6, 2024

szetszwo commented May 6, 2024

@adoroszlai, could you test whether this fixes the problem?

@adoroszlai adoroszlai left a comment

Thanks @szetszwo for the patch.

Tried 100 runs with it:

`Failed to close channel localhost:... in 10s` appears in 2 runs:

  • split 7 iteration 5
  • split 9 iteration 1

It may have appeared in successful runs, too, but those logs are not kept.

`testBasicAppendEntriesKillLeader` succeeds even if the channel cannot be closed.

So the patch fixes the fork timeout.

szetszwo commented May 7, 2024

> `Failed to close channel localhost:... in 10s` appears in 2 runs:

Both cases seem okay:

  • split 7 iteration 5: Interrupted
  • split 9 iteration 1: Connection refused: localhost/127.0.0.1:15052 (network problem?)

szetszwo commented May 7, 2024

RATIS-2076 happened in 6 runs (about the same rate as previously)

There were 300 threads running. I suspect the failures were due to slowness. Let's try increasing the timeout.
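
For illustration only, the general close-with-timeout pattern under discussion can be sketched as below. This is not the code changed in this PR: it uses `java.util.concurrent.ExecutorService` as a stand-in for a network channel, and the class name `BoundedClose`, the method `closeWithin`, and the wording of the log line are all hypothetical. The point is that a larger timeout gives a slow but healthy shutdown more room before the forced path is taken.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: graceful shutdown with a bounded wait, then a forced shutdown. */
final class BoundedClose {
  private BoundedClose() {}

  /**
   * Requests an orderly shutdown and waits up to the given timeout.
   * If the resource has not terminated by then (or the wait is interrupted),
   * falls back to a forced shutdown.
   *
   * @return true if the resource terminated within the timeout, false otherwise.
   */
  static boolean closeWithin(ExecutorService resource, long timeout, TimeUnit unit) {
    resource.shutdown(); // stop accepting new work, let running work finish
    try {
      if (resource.awaitTermination(timeout, unit)) {
        return true;
      }
      // Illustrative log line only; not the message format used by the project.
      System.err.println("Failed to close in " + timeout + " " + unit);
    } catch (InterruptedException e) {
      // Preserve the interrupt status for the caller.
      Thread.currentThread().interrupt();
    }
    resource.shutdownNow(); // forced path: interrupt whatever is still running
    return false;
  }
}
```

Calling `BoundedClose.closeWithin(executor, 10, TimeUnit.SECONDS)` returns `true` when shutdown completes in time; on a heavily loaded machine the same call may return `false` simply because the work was slow, which is the situation a longer timeout is meant to address.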

@adoroszlai

> Let's try increasing the timeout.

Only 1/100 runs failed with 63490ea:
https://github.com/adoroszlai/ratis/actions/runs/8989894896/job/24693966187#step:6:7592

szetszwo commented May 7, 2024

@adoroszlai, thanks for testing and reviewing this! Let's merge this PR first. We can continue fixing the remaining problem in RATIS-2076.

@szetszwo szetszwo merged commit ac05d64 into apache:master May 7, 2024
SzyWilliam pushed a commit that referenced this pull request Jun 12, 2024
szetszwo added a commit to szetszwo/ratis that referenced this pull request Jun 16, 2024