Scale ha combined by jayunit100 · Pull Request #2 · roofmonkey/kubernetes

jayunit100 · 2015-08-10T14:19:30Z

@timothysc @rrati I merged everything into one branch and am testing now.
Just FYI: if you have any thoughts, feel free to put them in here!

timothysc · 2015-08-14T00:45:36Z

Guys, we need to finish:

Finish cleanup
shift to experimental
rebase
Once we are green, let's put up a WIP PR so we can get feedback while we give it cycles.

Ideally getting a WIP up tomorrow just to start getting feedback would be good.

jayunit100 · 2015-08-14T03:51:35Z

I think cleanup is done and the earlier comments are addressed, but I still have to copy the final struct over to the scheduler. I will do that in the morning; then we can open the PR to upstream and start rebasing.

jayunit100 · 2015-08-14T13:03:42Z

does this change affect more than just the locking stuff?

Yes. Before this change, kube ignored expire events from etcd. I don't know of anything that used etcd's TTL, so handling them probably wasn't needed before. An expire event occurs only when a key expires in etcd, so from kube's perspective it really should be handled like a delete. The only difference between a delete and an expire event in etcd is who initiated the key removal: a delete is explicitly requested by an external entity, whereas an expire is done by etcd without external prompting.
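A minimal sketch of the idea, not the code in this PR: the action strings are the ones etcd v2 reports on a response, while the `EventType` constants and the `eventTypeFor` helper are hypothetical names used only for illustration.

```go
package main

import "fmt"

// EventType is a simplified, hypothetical stand-in for a watch event type.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// eventTypeFor maps an etcd v2 response action to a watch event type.
// "expire" shares a case with "delete": the key is gone either way, and
// only the initiator of the removal differs.
// Simplification: a real "set" can also create a key; here it is always
// reported as a modification.
func eventTypeFor(action string) (EventType, bool) {
	switch action {
	case "create":
		return Added, true
	case "set", "update", "compareAndSwap":
		return Modified, true
	case "delete", "compareAndDelete", "expire":
		return Deleted, true
	default:
		return "", false // unknown action: the caller decides what to do
	}
}

func main() {
	for _, a := range []string{"delete", "expire", "get"} {
		t, ok := eventTypeFor(a)
		fmt.Println(a, t, ok)
	}
}
```

With this mapping, a watcher that already reacts to deletes needs no extra logic to notice that a TTL'd lock key has lapsed.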

timothysc · 2015-08-14T15:54:51Z

This should be the full process name.

timothysc · 2015-08-14T16:22:23Z

Should check with decarr where is that recursive delete @...
This is the same iterative problem.

timothysc · 2015-08-14T16:32:24Z

The lock name needs to be an input option with a default, so you need to plumb it through to the command-line flags.

Should it? This is the controller-manager. Is there value in having the lock name configurable for the kcm? I was thinking we would want to hard-code the lock name in the processes so there's no chance of a configuration snafu leading to issues.
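For reference, the "input option with a default" side of this discussion could look roughly like the sketch below. It uses the standard library `flag` package rather than whatever flag library the daemons actually use, and the `--lock-name` flag, the `options` struct, and `parseOptions` are hypothetical names; the default value is the lock name that appears in the etcd entry quoted in this thread.

```go
package main

import (
	"flag"
	"fmt"
)

// defaultLockName matches the lock name seen in the etcd entry in this thread.
const defaultLockName = "ha.cm.lock"

// options holds hypothetical HA settings for one daemon.
type options struct {
	LockName string
}

// parseOptions registers a hypothetical --lock-name flag and parses args.
// When the flag is absent, the hard-coded default is used.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("kube-controller-manager", flag.ContinueOnError)
	fs.StringVar(&o.LockName, "lock-name", defaultLockName, "name of the HA lock to contend for")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions([]string{"--lock-name=custom.lock"})
	if err != nil {
		panic(err)
	}
	fmt.Println(o.LockName)
}
```

The hard-coded alternative argued for above amounts to dropping the flag and using the constant directly, which removes the possibility of two instances of the same daemon contending for different locks.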

jayunit100 · 2015-08-16T18:52:50Z

(Last thing I did Friday; just remembered to add these notes here!) I got some testing done on a cluster (you can do this by modifying the vagrant scripts to launch 2 masters). I can add the code for that tomorrow. But... failover didn't seem to happen. Monday:

We need to see what the story is with TTLs, to make sure they are getting propagated properly.
I may have to add more logic to the utility class... I think the "update" logic may be messed up.

jayunit100 · 2015-08-18T22:20:14Z

Interesting. I tested the renewtime and it seems to properly add a TTL to the etcd entries, but I can't see where?

Created-Index: 12
Modified-Index: 76
TTL: 26
Etcd-Index: 77
Raft-Index: 317
Raft-Term: 2
{"kind":"Lock","apiVersion":"v1","metadata":{"name":"ha.cm.lock","namespace":"default","uid":"f565e30b-45f6-11e5-b222-50e549b89d2f","resourceVersion":"73","creationTimestamp":"2015-08-18T22:17:48Z","deletionTimestamp":"2015-08-18T22:20:04Z"},"spec":{"heldby":"andross","duration":30,"atime":"2015-08-18 18:17:48.603475865 -0400 EDT","rtime":"2015-08-18 18:19:39.734959751 -0400 EDT"}}

jayunit100 · 2015-08-20T02:04:42Z

FYI, I was on the fence about keeping these changes... but they allow you to test HA in local mode by inducing failover. I'm happy to include them in the final PR, unless folks think it's a bad idea. For now, while this is WIP, it's good to have these for regression testing as we change the code base, and especially before we rebase.

The way it works: Just tail the logs of controller-manager-1.log and controller-manager-2.log, and you'll see the handoff happen after the time bomb goes off.
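The handoff can be illustrated with a toy simulation. Everything in it is hypothetical: the in-memory `lease` stands in for the etcd-backed lock, the fake clock replaces real time, and the "time bomb" is modeled simply as the first contender ceasing to renew after a given tick. It is not the code in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// lease is a hypothetical in-memory stand-in for the etcd-backed lock.
type lease struct {
	holder  string
	renewed time.Time
	ttl     time.Duration
}

// tryAcquire lets id take or renew the lease at time now. It succeeds when
// the lease is free, already held by id, or unrenewed for a full ttl.
func (l *lease) tryAcquire(id string, now time.Time) bool {
	expired := now.Sub(l.renewed) >= l.ttl
	if l.holder == "" || l.holder == id || expired {
		l.holder = id
		l.renewed = now
		return true
	}
	return false
}

// simulate runs two contenders on a fake one-second clock. "cm-1" stops
// renewing at tick bombAt (the time bomb). It returns the holder per tick.
func simulate(ticks, bombAt int) []string {
	l := &lease{ttl: 3 * time.Second}
	now := time.Unix(0, 0)
	holders := make([]string, 0, ticks)
	for i := 0; i < ticks; i++ {
		if i < bombAt {
			l.tryAcquire("cm-1", now)
		}
		l.tryAcquire("cm-2", now)
		holders = append(holders, l.holder)
		now = now.Add(time.Second)
	}
	return holders
}

func main() {
	fmt.Println(simulate(8, 3))
}
```

In this sketch the standby only takes over once a full TTL has passed since the last renewal, which mirrors the delay you would expect to see between the two log files.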

jayunit100 and others added 7 commits August 6, 2015 18:12

Controller Lease: Run loop

cd3449d

lease.go LConfig->Conf WIP SCALE HA, moved lock up 1 level ready for first review

Fix k8s io on lease.go

db06889

k8 io

8e3cbb3

First pass to implement the Lock type

958e5be

Merge remote-tracking branch 'ROOFMONKEY/ha-api' into scale-ha-combined

1ce58a5

Fixed issues with tests

8839c6d

more merging

7a2a512

locks imports make it compile

jayunit100 force-pushed the scale-ha-combined branch from 772a6ac to 7a2a512 Compare August 10, 2015 21:03

jayunit100 and others added 13 commits August 10, 2015 17:16

Merge remote-tracking branch 'ROOFMONKEY/ha-api' into scale-ha-combined

b4b63cd

Cleanup locks imports to k8s.io

c148304

Merge branch 'scale-ha-jay' into scale-ha-combined

afcb821

Converted fack_locks from FakeAction->New{action}Action

9277019

Fixed some copyright header dates

823c0d3

First pass at integrating lease.go w robs stuff.

c6ef4f8

First working version of lease.go which uses robs api

c95f45f

Added unit tests for lock api and fixed the getAttrs method for locks

9182320

Added support to treat etcd expiration events like a delete

7a6fac4

Corrected copyright date

e26f738

Fixes to use structs for lease election and log LeasesGained/LeasesLost

0bca856

User pointer for LeaseUserInfo so that state updates are happening to the same object

dbb801c

Docs TODO commit, will need love and squashing

f997c7c

jayunit100 reviewed Aug 14, 2015

jayunit100 added 2 commits August 14, 2015 11:01

Refactored all reusable stuffs into lease.go

ba160ba

Cleaned up and consolidated lock code, first iteration with both scheduler and controller manager using the new lock API

1184536

timothysc reviewed Aug 14, 2015

Comment thread cmd/kube-controller-manager/controller-manager.go Outdated

Set default sleep to 5 seconds. Parameterizing it in the daemons is of course also an option

2378a8c

timothysc reviewed Aug 14, 2015

remove comments

faa6819

timothysc reviewed Aug 14, 2015

jayunit100 added 2 commits August 17, 2015 18:35

Lease updates (warning: lots of debug code in here) to make sure that times are updated properly and so on.

9db3ff8

intermediate: lock cleaning/lifecycle to use RenewTime

60a3e8a

jayunit100 force-pushed the scale-ha-combined branch from 444e116 to 60a3e8a Compare August 18, 2015 14:23

Updated lease.go to parse dates from RenewTime properly

e953d1d

jayunit100 added 3 commits August 18, 2015 22:38

Added back in flags and the important check for whoami. working but there is an error around startup after lock acquisition

47ea5d6

Fix name of CM

1995479

Now ready for cluster testing.... Added debugging, timebomb, and also, a mod to local-up-cluster which starts 2 kcms.

36484dd

jayunit100 reviewed Aug 20, 2015

jayunit100 added 3 commits August 19, 2015 22:16

remove comment

2e8a19a

Intermediate update, comments/consolidation

a09a4b4

Move verflags to the top

2254d78

jayunit100 wants to merge 33 commits into master from scale-ha-combined
4 participants