Expose flags for new NodeEviction logic in NodeController #30138
Conversation
@@ -173,5 +174,10 @@ func (s *CMServer) AddFlags(fs *pflag.FlagSet) {
 	fs.DurationVar(&s.ControllerStartInterval.Duration, "controller-start-interval", s.ControllerStartInterval.Duration, "Interval between starting controller managers.")
 	fs.BoolVar(&s.EnableGarbageCollector, "enable-garbage-collector", s.EnableGarbageCollector, "Enables the generic garbage collector. MUST be synced with the corresponding flag of the kube-apiserver. WARNING: the generic garbage collector is an alpha feature.")
 	fs.Int32Var(&s.ConcurrentGCSyncs, "concurrent-gc-syncs", s.ConcurrentGCSyncs, "The number of garbage collector workers that are allowed to sync concurrently.")
+	fs.Float32Var(&s.NodeEvictionRate, "node-eviction-rate", 0.1, "Number of nodes per second on which pods are deleted in case of node failure when a zone is healthy.")
s/healthy./healthy (see --unhealthy-zone-threshold for definition of healthy/unhealthy). Zone refers to entire cluster in non-multizone clusters./
Done.
@davidopp PTAL
ping @davidopp
Super-busy right now, will review within the next couple of days.
(Branch updated from d8892bb to f94b2fa.)
@@ -173,5 +174,10 @@ func (s *CMServer) AddFlags(fs *pflag.FlagSet) {
 	fs.DurationVar(&s.ControllerStartInterval.Duration, "controller-start-interval", s.ControllerStartInterval.Duration, "Interval between starting controller managers.")
 	fs.BoolVar(&s.EnableGarbageCollector, "enable-garbage-collector", s.EnableGarbageCollector, "Enables the generic garbage collector. MUST be synced with the corresponding flag of the kube-apiserver. WARNING: the generic garbage collector is an alpha feature.")
 	fs.Int32Var(&s.ConcurrentGCSyncs, "concurrent-gc-syncs", s.ConcurrentGCSyncs, "The number of garbage collector workers that are allowed to sync concurrently.")
+	fs.Float32Var(&s.NodeEvictionRate, "node-eviction-rate", 0.1, "Number of nodes per second on which pods are deleted in case of node failure when a zone is healthy (see --unhealthy-zone-threshold for definition of healthy/unhealthy). Zone refers to entire cluster in non-multizone clusters.")
+	fs.Float32Var(&s.SecondaryNodeEvictionRate, "secondary-node-eviction-rate", 0.01, "Number of nodes per second on which pods are deleted in case of node failure when a zone is unhealthy (see --unhealthy-zone-threshold for definition of healthy/unhealthy). Zone refers to entire cluster in non-multizone clusters. This value is implicitly overridden to 0 if the cluster size is smaller than --large-cluster-size-threshold.")
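Editor's note: for readers skimming the thread, here is a minimal sketch of how these flags interact according to the help text above. This is not the controller's actual code; the names `evictionConfig` and `effectiveRate` are invented for illustration, and the values 0.55 and 50 are the defaults settled on later in this discussion.

```go
package main

import "fmt"

// Hypothetical config mirroring the new flags; identifiers here are
// illustrative only, not the controller's actual names.
type evictionConfig struct {
	NodeEvictionRate          float32 // --node-eviction-rate (default 0.1)
	SecondaryNodeEvictionRate float32 // --secondary-node-eviction-rate (default 0.01)
	UnhealthyZoneThreshold    float32 // --unhealthy-zone-threshold
	LargeClusterSizeThreshold int     // --large-cluster-size-threshold
}

// effectiveRate picks the per-second eviction rate for a zone, following
// the semantics spelled out in the flag help text above.
func effectiveRate(c evictionConfig, zoneSize, notReady int) float32 {
	if float32(notReady)/float32(zoneSize) < c.UnhealthyZoneThreshold {
		return c.NodeEvictionRate // healthy zone: full rate
	}
	if zoneSize < c.LargeClusterSizeThreshold {
		return 0 // small unhealthy zone: evictions are stopped entirely
	}
	return c.SecondaryNodeEvictionRate // large unhealthy zone: slow down
}

func main() {
	c := evictionConfig{0.1, 0.01, 0.55, 50}
	fmt.Println(effectiveRate(c, 100, 10)) // healthy -> 0.1
	fmt.Println(effectiveRate(c, 100, 60)) // large and unhealthy -> 0.01
	fmt.Println(effectiveRate(c, 20, 15))  // small and unhealthy -> 0
}
```

The sketch only covers rate selection; the controller applies whichever rate is chosen through its rate-limited eviction queue.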
I agree with Daniel that 0.01 is probably too large, especially if --large-cluster-size-threshold is 20.
I think maybe we should change the default for --unhealthy-zone-threshold to 0.55 (this is a rough approximation for "can't reschedule anywhere" if the pods are evenly spread), and change --secondary-node-eviction-rate to 0.002.
We should also mention this new default behavior very prominently in the release notes so we don't get a ton of questions from people about why their pods aren't being evicted.
Actually, it would be great to add some kind of NodeCondition (generated by node controller) that indicates when evictions from that node are being rate-limited due to this new feature. Borg has something like this and it's really helpful for understandability. I would say it's not critical if you can't finish it by Friday (certainly I would not suggest to put it in this PR).
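Editor's note: to make the suggested condition concrete, here is a hypothetical shape it could take, using today's k8s.io/api/core/v1 types (the tree layout differed at the time of this PR). The condition type "EvictionsRateLimited" is invented here and does not exist in the Kubernetes API.

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func main() {
	// "EvictionsRateLimited" is a made-up condition type for illustration;
	// no such NodeCondition exists in the API.
	cond := v1.NodeCondition{
		Type:    v1.NodeConditionType("EvictionsRateLimited"),
		Status:  v1.ConditionTrue,
		Reason:  "ZoneUnhealthy",
		Message: "pod evictions from this node are rate-limited by the node controller",
	}
	fmt.Printf("%s=%s: %s\n", cond.Type, cond.Status, cond.Message)
}
```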
0.002 means that we'll evict ~7 Nodes an hour - I think it's too slow.
For 20 Nodes with 1 Node per 100s it will take over 30 minutes to evict the whole cluster. If you think that's too little, I'd rather increase the large-cluster-threshold to 40 (it'll give an hour for full eviction) than decrease the rate drastically. This is actually a reason why I wanted to have 'time-to-drain' as a second flag here... @davidopp PTAL.
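Editor's note: for reference, the arithmetic behind the numbers quoted in this exchange, as a standalone sketch (nothing here is controller code):

```go
package main

import "fmt"

// drainMinutes: how long evicting a whole zone takes at a given
// eviction rate (nodes per second).
func drainMinutes(nodes int, rate float64) float64 {
	return float64(nodes) / rate / 60
}

func main() {
	fmt.Printf("%.0f min\n", drainMinutes(20, 0.01)) // ~33 min ("over 30 minutes")
	fmt.Printf("%.0f min\n", drainMinutes(40, 0.01)) // ~67 min ("an hour for full eviction")
	fmt.Printf("%.1f nodes/hour\n", 0.002*3600)      // 7.2 ("~7 Nodes an hour")
}
```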
I don't know... I mean, you probably never really want to evict the whole cluster, so 30 minutes is probably too short... I'm trying to think in terms of an ops team with a pager, what is a reasonable SLO for them to respond and stop the bleeding... But yeah, if you want to change large-cluster-threshold to 40 (or probably 50 would be better) instead of changing the rate, I think it's OK.
Made it 50.
I made a few small comments. Please add LGTM after you've made the changes. Do you think the tests are adequate or should there be more? Please file an issue and assign to yourself for the node condition I suggested in the comments. It would be great to get that in 1.4 but not the end of the world if you can't.
BTW what is the PodStatus/ContainerStatus for pods/containers on nodes that are NotReady but not being evicted? (I didn't review your other PRs for this and am too lazy or busy, depending on one's perspective, at the moment to check.)
I didn't change the logic there, so Pods are marked as NotReady, but containers are left as-is (it has been this way for as long as I can remember). I can send another PR that changes this behavior, but I don't want to do it in this PR.
That behavior sounds fine, actually.
I'm trying to check why the integration test failed...
LGTM
Self-applying LGTM label per #30138 (comment)
GCE e2e build/test passed for commit 4cf698e.
Automatic merge from submit-queue
Fix #28832
Last PR from the NodeController NodeEviction logic series.
cc @davidopp @lavalamp @mml