Skip to content

Adjust the auto rebalancer state assignment logic to reduce top state transition.#986

Merged
jiajunwang merged 1 commit into
apache:masterfrom
jiajunwang:statetest
May 6, 2020
Merged

Adjust the auto rebalancer state assignment logic to reduce top state transition.#986
jiajunwang merged 1 commit into
apache:masterfrom
jiajunwang:statetest

Conversation

@jiajunwang
Copy link
Copy Markdown
Contributor

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

#985

Description

  • Here are some details about my PR, including screenshots of any UI changes:

The old state assignment logic assign the states to selected nodes according to the priority of the current replica state that is on the instance. Moreover, the sorting algorithm is designed to prioritize both current topstate and current secondary states equally. The result is that we will have premature mastership handoff to a current seconardy state host before the real desired master host is ready. For example,

  1. The current states are: [N1:M, N2:S, N3,S]
  2. The desired states are: [N4:M, N2:S, N1:S]
  3. Due to the sorting logic based on current states, we will have a transient preference list ordered like: [N2, N1, N4]. In which case, the controller will assign master to N2 before N4 has a slave state replica.
  4. When N4 finishes the Offline to Slave transition, the same sorting logic will sort the preference list to be: [N4, N2, N1]. Then we have another mastership handoff.
    To be clear, we don't want step 3. But only the state transition in step 4.

In this PR, we refactor the sorting logic so that it will only move the master whenever the candidate has a "ready" state replica, in which case, only one mastership handoff happens.

Tests

  • The following tests are written for this issue:

Add the test scenario to TestAbstractRebalancer.

  • The following is the result of the "mvn test" command on the appropriate module:

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestJobQueueCleanUp.testJobQueueAutoCleanUp » ThreadTimeout Method org.testng....
[ERROR] TestWorkflowTimeout.testWorkflowTimeoutWhenWorkflowCompleted:116 expected: but was:
[INFO]
[ERROR] Tests run: 1145, Failures: 2, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:21 h
[INFO] Finished at: 2020-04-30T23:30:07-07:00
[INFO] ------------------------------------------------------------------------

Rerun:
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.931 s - in TestSuite
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 37.212 s
[INFO] Finished at: 2020-04-30T23:32:13-07:00
[INFO] ------------------------------------------------------------------------

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation (Optional)

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

stateCount--;
String proposedInstance = instance;
// Additional check and alternate the assignment for reducing top state handoff.
if (state.equals(stateModelDef.getTopState()) && !stateModelDef.getSecondTopStates()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you already sorted the preference list based on the state. Why not just replace the looping of preference list with the sorted list but add this logic?

This logic makes two order of preference list intersected each other that logic is not very clear.

Copy link
Copy Markdown
Contributor Author

@jiajunwang jiajunwang May 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it. But even I move this logic to the sorting logic, the sorting itself will be comparing one list with 2 logics. This interesting trick is just moved to another place. I don't have a way to eliminate it for now.
Then why I want to put it here instead of the sorting logic? Because the sorting logic is away from this business logic block. That would be even harder for me to reason even with comments.
Put it here, I understand it is intersecting the same list, but I can at least give it a good reason.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider this as a skiplist type design/implementation.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you point. Purely relies on sorting is fine because the previous author use it in wrong way. We suppose to sort it before looping.

Anyway, I am OK to keep the current behavior. But can you add a TODO here. To remind that we can refactor the logic based on sorting with comprehensive test in the future?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly. Actually, I don't agree this is a TODO work to put everything in a sorting method. That is just an alternative design, and I don't think it is cleaner than the current one.
Considering the proposed universal sorting method, it is actually merging 2 orders (preference list order and the current state order). However, there is no clear way to define the priority score.
Note the logic here is that if the node is not on secondary state or above but it is attempted to be a master, we should find alternatives according to the current state order.
This logic cannot be done with only the 2 orders. We have to know the proposed states assignment. For example, how many top states and where to assign based on the preference state.
Give this, to merging the sort in one method (instead of the current design), we will need to:

  1. propose a state assignment based on the preference list
  2. input this proposed state mapping as one additional input to the sorting method, and then sort accordingly.
  3. calculate another state assignment based on the re-sorted list.

It seems more confusing and slower than the current design.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. The sorting logic could be complicated. Alright, let's revisit the logic if it requires refactoring later.

Copy link
Copy Markdown
Contributor

@junkaixue junkaixue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, please add comments to that.

… transition.

The old state assignment logic assign the states to selected nodes according to the priority of the current replica state that is on the instance. Moreover, the sorting algorithm is designed to prioritize both current topstate and current secondary states equally. The result is that we will have premature mastership handoff to a current seconardy state host before the real desired master host is ready. For example,
1. The current states are: [N1:M, N2:S, N3,S]
2. The desired states are: [N4:M, N2:S, N1:S]
3. Due to the sorting logic based on current states, we will have a transient preference list ordered like: [N2, N1, N4]. In which case, the controller will assign master to N2 before N4 has a slave state replica.
4. When N4 finishes the Offline to Slave transition, the same sorting logic will sort the preference list to be: [N4, N2, N1]. Then we have another mastership handoff.
To be clear, we don't want step 3. But only the state transition in step 4.

In this PR, we refactor the sorting logic so that it will only move the master whenever the candidate has a "ready" state replica, in which case, only one mastership handoff happens.
@jiajunwang
Copy link
Copy Markdown
Contributor Author

This PR is ready to be merged, approved by @dasahcc

@jiajunwang jiajunwang merged commit a1aba60 into apache:master May 6, 2020
@jiajunwang jiajunwang deleted the statetest branch May 6, 2020 22:48
jiajunwang pushed a commit to jiajunwang/helix that referenced this pull request Jun 16, 2020
…cted.

This is to fix a regression which has been introduced by PR apache#986.
The PR tried to prioritize the preference list to avoid unnecessary top state transition. However, there was a bug in the prioritizing logic and if one participant is skipped due to low priority, it won't be picked up again during the cauculating. As a result, this participant won't be assigned with any replica even it is originally in the preference list.
This fix will ensure the skipped participant being checked again until it gets the assignment.
jiajunwang pushed a commit to jiajunwang/helix that referenced this pull request Jun 16, 2020
…cted.

This is to fix a regression which was introduced by PR apache#986.
The PR tried to prioritize the preference list to avoid unnecessary top state transition. However, there was a bug in the prioritizing logic and if one participant is skipped due to low priority, it won't be picked up again during the cauculating. As a result, this participant won't be assigned with any replica even it is originally in the preference list.
This fix will ensure the skipped participant being checked again until it gets the assignment.
jiajunwang added a commit that referenced this pull request Jun 18, 2020
…cted. (#1098)

This is to fix a regression that was introduced by PR #986
The PR tried to prioritize the preference list to avoid unnecessary top state transitions. However, there was a bug in the prioritizing logic and if one participant is skipped due to low priority, it won't be picked up again during the calculating. As a result, this participant won't be assigned with any replica even it is originally in the preference list.
This only happens if the state model has been customized so it is multiple top states and there is an intermediate state with expected count -1 between the top state and the other states.

This fix will ensure the skipped participant being checked again until it gets the assignment.
huizhilu pushed a commit to huizhilu/helix that referenced this pull request Aug 16, 2020
… transition. (apache#986)

The old state assignment logic assign the states to selected nodes according to the priority of the current replica state that is on the instance. Moreover, the sorting algorithm is designed to prioritize both current topstate and current secondary states equally. The result is that we will have premature mastership handoff to a current seconardy state host before the real desired master host is ready.
For example,
1. The current states are: [N1:M, N2:S, N3,S]
2. The desired states are: [N4:M, N2:S, N1:S]
3. Due to the sorting logic based on current states, we will have a transient preference list ordered like: [N2, N1, N4]. In which case, the controller will assign master to N2 before N4 has a slave state replica.
4. When N4 finishes the Offline to Slave transition, the same sorting logic will sort the preference list to be: [N4, N2, N1]. Then we have another mastership handoff.
To be clear, we don't want step 3. But only the state transition in step 4.

In this PR, we refactor the sorting logic so that it will only move the master whenever the candidate has a "ready" state replica, in which case, only one mastership handoff happens.
huizhilu pushed a commit to huizhilu/helix that referenced this pull request Aug 16, 2020
…cted. (apache#1098)

This is to fix a regression that was introduced by PR apache#986
The PR tried to prioritize the preference list to avoid unnecessary top state transitions. However, there was a bug in the prioritizing logic and if one participant is skipped due to low priority, it won't be picked up again during the calculating. As a result, this participant won't be assigned with any replica even it is originally in the preference list.
This only happens if the state model has been customized so it is multiple top states and there is an intermediate state with expected count -1 between the top state and the other states.

This fix will ensure the skipped participant being checked again until it gets the assignment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants