
NodeController doesn't evict Pods if no Nodes are Ready #25571

Merged: 1 commit into kubernetes:master on May 20, 2016

Conversation

@gmarek (Contributor, Author) commented May 13, 2016

Fixes #13412, #24597

When the NodeController doesn't see any Ready Node, it goes into "network segmentation mode". In this mode it cancels all evictions and doesn't evict any Pods.

It leaves network segmentation mode when it sees at least one Ready Node. When leaving, it resets all timers, so each Node gets the full grace period to reconnect to the cluster.

cc @lavalamp @davidopp @mml @wojtek-t @fgrzadkowski
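
A minimal Go sketch of the transition logic described above; the names and structure are illustrative only, not the actual NodeController code:

package main

import "fmt"

// evictionState tracks whether the controller currently believes it is
// network-segmented from its Nodes.
type evictionState struct {
	segmented bool
}

// Hypothetical stand-ins for the controller's real eviction machinery.
func cancelAllEvictions()  { fmt.Println("cancelling all pending evictions") }
func resetEvictionTimers() { fmt.Println("resetting all eviction timers") }

// observe applies the rule from the PR description: zero Ready Nodes means
// enter segmentation mode and evict nothing; once at least one Node is
// Ready again, leave the mode and reset timers so every Node gets the full
// grace period to reconnect.
func (s *evictionState) observe(readyNodes int) {
	if readyNodes == 0 {
		if !s.segmented {
			s.segmented = true
			cancelAllEvictions()
		}
		return
	}
	if s.segmented {
		s.segmented = false
		resetEvictionTimers()
	}
}

func main() {
	s := &evictionState{}
	s.observe(0) // no Ready Nodes: enter segmentation mode
	s.observe(2) // recovery: reset timers, grace periods start over
}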

@gmarek gmarek self-assigned this May 13, 2016
@gmarek added the do-not-merge label May 13, 2016
@k8s-github-robot added the size/L and release-note-label-needed labels May 13, 2016
@gmarek (Contributor, Author) commented May 13, 2016

I think this is it. I want to run tests on a real cluster, but I believe it's ready for review.

@lavalamp @davidopp

@gmarek gmarek assigned lavalamp and unassigned gmarek May 13, 2016
Review thread on the diff:

@@ -63,6 +63,11 @@ type UniqueQueue struct {
set sets.String
}

// Length returns a number of items currently in the queue
func (q *UniqueQueue) Length() int {
Member (reviewer):
looks unsafe re: locking. Please don't add this, esp don't make it public.

gmarek (Contributor, Author):
Done.
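
For context on the locking concern: Length as proposed reads q.queue without holding the lock that guards the rest of UniqueQueue. A lock-safe variant would look roughly like this, assuming the struct has a sync.Mutex field named lock, which is an assumption about code outside this hunk:

func (q *UniqueQueue) Length() int {
	q.lock.Lock() // assumption: this mutex guards queue and set
	defer q.lock.Unlock()
	return len(q.queue)
}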

@lavalamp (Member):
looks OK to me.

@davidopp davidopp added this to the v1.3 milestone May 13, 2016
@lavalamp (Member):
Ping me when you get tests working. :)

@davidopp (Member):
@lavalamp tests are passing now
(I didn't review the PR, just saw your comment)

@gmarek (Contributor, Author) commented May 16, 2016

I guess @lavalamp meant the tests I mentioned in my previous comment. There are no e2e tests for this, and I didn't have time to write them. I'll hand-test it today and add tests sometime next week.

@gmarek force-pushed the nodecontroller branch 3 times, most recently from ac2be30 to 59a83a4 on May 16, 2016 at 09:22
@gmarek gmarek changed the title WIP: NodeController doesn't evict Pods if no Nodes are Ready NodeController doesn't evict Pods if no Nodes are Ready May 16, 2016
@gmarek removed the do-not-merge label May 16, 2016
@gmarek (Contributor, Author) commented May 16, 2016

@lavalamp I ran manual tests and it seems to work now (I had a bug; now I skip the master node when looking for Ready Nodes).
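
A hedged sketch of the kind of check the fix implies; the "-master" name suffix is inferred from the review exchange below, and the node type is simplified:

package main

import "strings"

// node is a simplified stand-in for the real api.Node.
type node struct {
	name  string
	ready bool
}

// readyNodeCount counts Ready Nodes while skipping the master, so a lone
// Ready master cannot keep the controller out of segmentation mode.
func readyNodeCount(nodes []node) int {
	count := 0
	for _, n := range nodes {
		if strings.HasSuffix(n.name, "-master") { // assumed master-detection rule
			continue
		}
		if n.ready {
			count++
		}
	}
	return count
}

func main() {
	nodes := []node{{"kubernetes-master", true}, {"node-1", false}}
	println(readyNodeCount(nodes)) // prints 0: the Ready master does not count
}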

@lavalamp (Member):
Can you exercise your bugfix in the unit test? Then LGTM.

@gmarek (Contributor, Author) commented May 17, 2016

There is a unit test for this - the last case I added checks exactly this.

@gmarek (Contributor, Author) commented May 17, 2016

@k8s-bot test this issue: #IGNORE
(yesterday's Jenkins problems)

@lavalamp (Member):
Sorry, maybe I'm blind, but I can't see a test case with a node having a name ending in "-master"?

@gmarek (Contributor, Author) commented May 17, 2016

Oh, sorry, the master thing. I misread your comment.

@gmarek (Contributor, Author) commented May 17, 2016

@lavalamp PTAL

@lavalamp added the lgtm and release-note labels and removed the release-note-label-needed label May 18, 2016
@lavalamp (Member):
LGTM!

@gmarek (Contributor, Author) commented May 19, 2016

It's pretty high in the queue, but we might need to merge it by hand if it doesn't make it in by tomorrow.

@gmarek added the priority/critical-urgent label May 20, 2016
@k8s-github-robot:
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot commented May 20, 2016

GCE e2e build/test passed for commit 6d27009.

@k8s-github-robot:
Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 3b0a6da into kubernetes:master May 20, 2016
@pesho commented May 20, 2016

This looks like it might also resolve the scenario described in #19124. Will it be backported to 1.2 as well?

@davidopp (Member):
@pesho No. But 1.3 should be out relatively soon (less than a month).

@manojlds commented Oct 2, 2016

I am seeing this error in the kubelet logs:

E0930 18:21:12.386245    2565 kubelet.go:2913] Error updating node status, will retry: failed to get node address from cloud provider: RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/local-ipv4: dial tcp 169.254.169.254:80: getsockopt: no route to host

All pods on the node were rescheduled to other nodes. Is this how it is supposed to behave? Also, did it reschedule without at least a few retries?

@gmarek (Contributor, Author) commented Oct 2, 2016

@manojlds - yes, it is as expected, and the NodeController does a number of retries before it evicts Pods (by default 30 times, but it depends on how often NodeStatus is reported and on the PodEvictionGracePeriod settings). From what you wrote, there were other Nodes that were serving, so the behavior seems correct. Or did you expect something else?

This log line suggests that the cloud provider became unreachable from some of your Nodes. @yujuhong
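
For concreteness, with the approximate defaults of that era (treat the exact values as assumptions), one failed status update is far below the eviction threshold:

kubelet --node-status-update-frequency=10s                # NodeStatus posted every 10s
kube-controller-manager --node-monitor-grace-period=40s   # Node marked NotReady after ~4 missed updates
kube-controller-manager --pod-eviction-timeout=5m         # Pods evicted only after a further 5m of NotReady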

@manojlds commented Oct 3, 2016

@gmarek - thanks. This is the only instance of the error I could see in the log, so I'm not sure whether it retried before actually evicting the pods. The behaviour seems fine; my only question is about the retry.

@gmarek (Contributor, Author) commented Oct 3, 2016

Even if the kubelet does not retry, it's fine, as the NodeController requires many failed NodeStatus updates before it actually starts doing anything.

For the kubelet side of this, @yujuhong should be able to answer or point you to someone who can.

@yujuhong (Contributor) commented Oct 3, 2016

The kubelet should retry up to 5 times by default. On top of that, the kubelet tries to update the node status every 10s (or a customized period passed via flags/configs), and the node controller only reacts on the order of minutes. One failed attempt should not cause any serious reaction.
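
A rough sketch of the retry behavior described here; the constant and function names mirror the kubelet code of that era but should be treated as illustrative:

// Stand-in for the real API call that posts NodeStatus.
func tryUpdateNodeStatus() error { return nil }

const nodeStatusUpdateRetry = 5 // assumed default retry count per sync cycle

// updateNodeStatus retries the post a few times within one cycle; even a
// fully failed cycle only delays the status by one ~10s period, far below
// the node controller's minutes-scale reaction threshold.
func updateNodeStatus() error {
	var err error
	for i := 0; i < nodeStatusUpdateRetry; i++ {
		if err = tryUpdateNodeStatus(); err == nil {
			return nil
		}
	}
	return err
}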

@manojlds commented Oct 4, 2016

@yujuhong thanks. I'll take a look at the flags and see how best to use them in my cluster.
