Skip to content

Fix wrong tikv store leader count#762

Merged
lucklove merged 16 commits intopingcap:masterfrom
lucklove:fix-evict-tikv
Sep 11, 2020
Merged

Fix wrong tikv store leader count#762
lucklove merged 16 commits intopingcap:masterfrom
lucklove:fix-evict-tikv

Conversation

@lucklove
Copy link
Copy Markdown
Member

@lucklove lucklove commented Sep 7, 2020

Fix #761

Signed-off-by: lucklove gnu.crazier@gmail.com

What problem does this PR solve?

After pd restart, the region leader information will be lost and tiup can't evict tikv leaders correctly.

What is changed and how it works?

Use kv metrics api to count leader instead of read it from PDs

Check List

Tests

  • Manual test (add detailed scripts or steps below)

Both tls and non-tls works:
屏幕快照 2020-09-07 下午7 31 18

Related changes

  • Need to cherry-pick to the release branch

Release notes:

NONE

Fix pingcap#761

Signed-off-by: lucklove <gnu.crazier@gmail.com>
@lucklove lucklove added status/PTAL type/bug-fix Categorizes PR as a bug-fix labels Sep 7, 2020
Comment thread pkg/cluster/api/pdapi.go
Comment thread pkg/cluster/api/pdapi.go Outdated
Comment thread pkg/cluster/api/pdapi.go Outdated
Comment thread pkg/cluster/spec/tikv.go Outdated
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Copy link
Copy Markdown
Contributor

@AstroProfundis AstroProfundis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 9, 2020
AstroProfundis and others added 8 commits September 9, 2020 12:30
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Comment thread pkg/cluster/spec/tikv.go Outdated
Comment thread pkg/cluster/spec/tikv.go
Comment on lines +379 to +380
for mf := range mfChan {
fm := prom2json.NewFamily(mf)
fms = append(fms, fm)
}
for _, fm := range fms {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not merge the two iterations to avoid use a slice to save the *prom2json.Family?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that there may be goroutine leak. eg: the loop found target metric and returns but the sender has not finish all metrics

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can resolve it via:

defer close(mfChan)
for {
}

Comment thread pkg/cluster/spec/tikv.go Outdated
Comment thread pkg/cluster/spec/tikv.go Outdated
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Signed-off-by: lucklove <gnu.crazier@gmail.com>
Copy link
Copy Markdown
Contributor

@lonng lonng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot ti-srebot removed the status/LGT1 Indicates that a PR has LGTM 1. label Sep 10, 2020
@ti-srebot ti-srebot added the status/LGT2 Indicates that a PR has LGTM 2. label Sep 10, 2020
Signed-off-by: lucklove <gnu.crazier@gmail.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Sep 11, 2020

Codecov Report

Merging #762 into master will increase coverage by 0.08%.
The diff coverage is 56.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #762      +/-   ##
==========================================
+ Coverage   54.62%   54.71%   +0.08%     
==========================================
  Files         260      260              
  Lines       21015    21070      +55     
==========================================
+ Hits        11479    11528      +49     
+ Misses       7993     7989       -4     
- Partials     1543     1553      +10     
Flag Coverage Δ
#cluster 46.85% <56.45%> (+0.09%) ⬆️
#dm 26.93% <0.00%> (-0.09%) ⬇️
#integrate 49.99% <56.45%> (+0.11%) ⬆️
#playground 23.07% <0.00%> (-0.02%) ⬇️
#tiup 19.80% <ø> (ø)
#unittest 19.50% <0.00%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/cluster/api/pdapi.go 52.69% <41.66%> (-4.63%) ⬇️
pkg/cluster/spec/tikv.go 59.90% <60.00%> (+1.67%) ⬆️
pkg/cluster/spec/alertmanager.go 66.12% <0.00%> (-3.23%) ⬇️
pkg/cluster/spec/prometheus.go 75.89% <0.00%> (-1.79%) ⬇️
pkg/cluster/manager.go 68.21% <0.00%> (+0.14%) ⬆️
pkg/cluster/operation/action.go 55.06% <0.00%> (+0.84%) ⬆️
pkg/cluster/spec/tispark.go 69.52% <0.00%> (+2.85%) ⬆️
pkg/cluster/task/update_meta.go 74.19% <0.00%> (+3.22%) ⬆️
pkg/cliutil/cliutil.go 72.85% <0.00%> (+8.57%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6296bca...08294df. Read the comment docs.

Signed-off-by: lucklove <gnu.crazier@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/LGT2 Indicates that a PR has LGTM 2. type/bug-fix Categorizes PR as a bug-fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PD returns 0 leader count for tikv store after restart

5 participants