nodecontroller never finishes updates on 2k nodes #26211

Closed
zmerlynn opened this issue May 24, 2016 · 34 comments

Labels: area/nodecontroller, kind/bug, priority/important-soon, sig/node
Milestone: v1.3

Comments

@zmerlynn
Member

My cluster has been up for 20m or so and the nodecontroller is going crazy (this is filtering out "observed a new Node" messages, because they're very spammy):

W0524 21:47:27.424786      19 reflector.go:334] pkg/controller/node/nodecontroller.go:271: watch of *api.Node ended with: too old resource version: 574213 (575678)
E0524 21:47:28.035688      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-yhg1: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-yhg1": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:28.103729      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-ymsy: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-ymsy": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:28.128099      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-ynx0: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-ynx0": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:28.158973      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-yrcz: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-yrcz": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:28.721125      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-yw2l: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-yw2l": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:30.945799      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-yxk4: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-yxk4": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:31.097177      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-z63d: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-z63d": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:32.075794      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-zelk: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-zelk": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:33.662551      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-zfcu: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-zfcu": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:34.095979      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-znja: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-znja": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:35.965741      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-zt9s: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-zt9s": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:36.900036      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-ztv5: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-ztv5": the object has been modified; please apply your changes to the latest version and try again
E0524 21:47:36.935768      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-zv49: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-zv49": the object has been modified; please apply your changes to the latest version and try again
W0524 21:47:43.901803      19 reflector.go:334] pkg/controller/node/nodecontroller.go:271: watch of *api.Node ended with: too old resource version: 575746 (576929)
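
(For context: the "Operation cannot be fulfilled ... the object has been modified" messages are the apiserver's optimistic-concurrency conflicts: a write that carries a stale resourceVersion is rejected with 409 Conflict. The sketch below shows how a controller can hit this; it uses the current client-go API, and the clientset, node name, and condition edit are illustrative placeholders, not the actual nodecontroller code.)

package example

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// markNodeNotReady reads a Node, flips its Ready condition, and writes the
// status back. If the kubelet posts a status heartbeat between the Get and
// the UpdateStatus, the resourceVersion sent back is stale and the apiserver
// answers 409 Conflict -- exactly the error spammed in the log above.
func markNodeNotReady(ctx context.Context, cs kubernetes.Interface, name string) error {
	node, err := cs.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	for i := range node.Status.Conditions {
		if node.Status.Conditions[i].Type == "Ready" {
			node.Status.Conditions[i].Status = "Unknown"
			node.Status.Conditions[i].Reason = "NodeStatusUnknown"
		}
	}
	if _, err := cs.CoreV1().Nodes().UpdateStatus(ctx, node, metav1.UpdateOptions{}); err != nil {
		if apierrors.IsConflict(err) {
			// Retryable: someone else updated the Node first.
			return fmt.Errorf("node %s was modified concurrently: %w", name, err)
		}
		return err
	}
	return nil
}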
@zmerlynn added the priority/important-soon, sig/node, and team/control-plane labels on May 24, 2016
@zmerlynn
Member Author

@dchen1107: Can you find someone to maybe look at this or triage further?

@dchen1107 assigned Random-Liu and unassigned dchen1107 on May 24, 2016
@dchen1107 added the kind/bug label on May 24, 2016
@dchen1107 added this to the v1.3 milestone on May 24, 2016
@dchen1107
Member

@Random-Liu Can you take a look at this to make sure the issue is not caused by NodeProblemDetector updating a new NodeCondition?

@Random-Liu
Member

Sure!

@Random-Liu
Member

Random-Liu commented May 24, 2016

@dchen1107 Normally node-problem-detector should not update the node object so frequently.
@zmerlynn Could you paste the output of kubectl describe nodes for one of the nodes here?

@Random-Liu
Member

Random-Liu commented May 24, 2016

@dchen1107 I checked the cluster. There is no NodeProblemDetector running in this cluster. :)
In that case, this looks like a race between the kubelet and the node controller. I'll dig into it more.

@dchen1107
Member

output of kubectl describe nodes?

@Random-Liu
Member

Random-Liu commented May 24, 2016

Here is the result of kubectl describe nodes gke-jenkins-e2e-default-pool-4-25892338-ztv5:

lantaol@gke-51872839970-de36245b0d6cd411088b ~ $ sudo kubectl describe nodes gke-jenkins-e2e-default-pool-4-25892338-ztv5
Name:           gke-jenkins-e2e-default-pool-4-25892338-ztv5
Labels:         beta.kubernetes.io/arch=amd64
            beta.kubernetes.io/instance-type=n1-standard-1
            beta.kubernetes.io/os=linux
            cloud.google.com/gke-nodepool=default-pool-4
            failure-domain.beta.kubernetes.io/region=us-east1
            failure-domain.beta.kubernetes.io/zone=us-east1-a
            kubernetes.io/hostname=gke-jenkins-e2e-default-pool-4-25892338-ztv5
Taints:         <none>
CreationTimestamp:  Tue, 24 May 2016 20:12:08 +0000
Phase:          
Conditions:
  Type          Status  LastHeartbeatTime           LastTransitionTime          Reason              Message
  ----          ------  -----------------           ------------------          ------              -------
  OutOfDisk         False   Tue, 24 May 2016 23:15:06 +0000     Tue, 24 May 2016 20:57:58 +0000     KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure    False   Tue, 24 May 2016 23:15:06 +0000     Tue, 24 May 2016 20:12:08 +0000     KubeletHasSufficientMemory  kubelet has sufficient memory available
  Ready         True    Tue, 24 May 2016 23:15:06 +0000     Tue, 24 May 2016 20:57:58 +0000     KubeletReady            kubelet is posting ready status. WARNING: CPU hardcapping unsupported
Addresses:      10.240.3.38,104.196.50.75
Capacity:
 alpha.kubernetes.io/nvidia-gpu:    0
 cpu:                   1
 memory:                3801020Ki
 pods:                  110
System Info:
 Machine ID:            
 System UUID:           A4B3A092-ECD9-2651-4200-2B7F8832FB12
 Boot ID:           306f3786-f6a7-4452-a187-a6d38650cbe1
 Kernel Version:        3.16.0-4-amd64
 OS Image:          Debian GNU/Linux 7 (wheezy)
 Operating System:      linux
 Architecture:          amd64
 Container Runtime Version: docker://1.9.1
 Kubelet Version:       v1.3.0-alpha.4.438+6e4f494ad0d749
 Kube-Proxy Version:        v1.3.0-alpha.4.438+6e4f494ad0d749
PodCIDR:            10.10.53.0/24
ExternalID:         2804474502546262545
Non-terminated Pods:        (13 in total)
  Namespace         Name                                        CPU Requests    CPU Limits  Memory Requests Memory Limits
  ---------         ----                                        ------------    ----------  --------------- -------------
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-2mw9g             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-5th8k             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-axdew             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-eemie             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-ezc1o             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-f1eex             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-i9a5b             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-oc497             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-w3ebg             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-ye5vy             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  e2e-tests-kubelet-1ofyw   cleanup20000-09ec9032-21ef-11e6-b5fa-0242ac11000e-z0pr8             0 (0%)      0 (0%)      0 (0%)      0 (0%)
  kube-system           fluentd-cloud-logging-gke-jenkins-e2e-default-pool-4-25892338-ztv5      80m (8%)    0 (0%)      200Mi (5%)  200Mi (5%)
  kube-system           kube-proxy-gke-jenkins-e2e-default-pool-4-25892338-ztv5             20m (2%)    0 (0%)      0 (0%)      0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits  Memory Requests Memory Limits
  ------------  ----------  --------------- -------------
  100m (10%)    0 (0%)      200Mi (5%)  200Mi (5%)
No events.

We can see that the transition time of Ready and OutOfDisk is around 20:57:58, and the transition time of MemoryPressure is around 20:12:08.

Then in the controller log, the first race happens at 21:29:15.

E0524 21:29:15.400731      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-0-44232fbd-2hnk: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-0-44232fbd-2hnk": the object has been modified; please apply your changes to the latest version and try again

The last one happens at 21:51:11:

E0524 21:51:11.363353      19 nodecontroller.go:856] Error updating node gke-jenkins-e2e-default-pool-4-25892338-zztn: Operation cannot be fulfilled on nodes "gke-jenkins-e2e-default-pool-4-25892338-zztn": the object has been modified; please apply your changes to the latest version and try again

@dchen1107
Member

@zmerlynn Can we have related Kubelet logs?

@zmerlynn
Member Author

I gave @Random-Liu access.

@dchen1107
Member

Looks like we can rule out two potential causes based on the above output: 1) NodeProblemDetector 2) Out-of-Resource condition updates. Looks like this is a specific failure caused by those hollow nodes. cc/ @wojtek-t

@dchen1107
Member

cc/ @lavalamp, who just mentioned that the Kubemark test was disabled this morning due to some failure. @zmerlynn, you might be running into the same issue?

@zmerlynn
Member Author

What do you mean by hollow nodes?

@zmerlynn
Member Author

(This is a real 2k node cluster.)

@Random-Liu
Member

Random-Liu commented May 25, 2016

@zmerlynn Sorry, I misunderstood what you said. I thought you meant the "kubernetes e2e scalability test".

@lavalamp
Member

@zmerlynn This doesn't look like a node team problem; it's a nodecontroller problem, and nodecontroller is owned by the control plane.

node controller clearly needs to do something in a retry loop.

@mikedanese @gmarek <--- people who know something about node controller.
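
A common way to do that (a sketch only, using today's client-go retry helper, not necessarily the fix that eventually landed here) is to wrap the read-modify-write in retry.RetryOnConflict, which re-reads the object and reapplies the change whenever the apiserver returns a conflict:

package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

// updateNodeStatusWithRetry re-reads the Node and reapplies mutate() whenever
// the apiserver rejects the write with 409 Conflict, instead of logging an
// error and moving on the way the log above shows.
func updateNodeStatusWithRetry(ctx context.Context, cs kubernetes.Interface, name string, mutate func(*corev1.Node)) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		node, err := cs.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		mutate(node)
		_, err = cs.CoreV1().Nodes().UpdateStatus(ctx, node, metav1.UpdateOptions{})
		return err
	})
}

retry.DefaultRetry retries a handful of times with short pauses, which is usually enough to absorb this kind of transient contention.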

@lavalamp
Member

@Random-Liu the only way this is your bug is if your component does many many writes to the node status on startup.

@zmerlynn
Member Author

@lavalamp - I took a fast stab and asked for triage, sorry. To be fair, there's about 2k nodes posting status, but yes, that's probably the controller falling over, not the node team's fault.

@wojtek-t
Member

But IIUC, this should be fixed by #26207. Or am I missing something?

@Random-Liu
Member

Random-Liu commented May 25, 2016

the only way this is your bug is if your component does many many writes to the node status on startup.

@lavalamp There is no node-problem-detector running in this cluster, but as you said, node-problem-detector should also avoid updating node status at the same time when the cluster starts up.

@zmerlynn I checked the cluster yesterday. It's weird that all the nodes started at 20:12:00, but the apiserver started at 21:27:00. After the apiserver started, all the nodes were trying to update their node statuses at the same time, and the nodecontroller also started complaining about the update error at 21:29:15. After 21:51:11, there is no such error in the nodecontroller any more.
A wild guess: this is just a temporary error that occurs when the kubelet and the node controller try to update the same node status at the same time. There are tons of log entries in a short period of time just because there are so many nodes updating statuses when the control plane starts up.

But IIUC, this should be fixed by #26207. Or am I missing something?

It looks like #26207 doesn't solve this one. This one is a real update failure, not an incorrect log. :)
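
Another way to make these startup collisions mostly harmless (again only a sketch, not what the controllers did at the time) is to PATCH just the fields being changed instead of PUTting the whole object; a patch does not carry a resourceVersion, so concurrent kubelet heartbeats no longer produce 409s. Node status conditions are strategically merged by their "type" field:

package example

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// patchReadyCondition merges a single condition into node.status. Because
// NodeStatus.Conditions uses "type" as its merge key, only the Ready entry is
// touched and the kubelet's concurrent heartbeat writes are left alone.
func patchReadyCondition(ctx context.Context, cs kubernetes.Interface, name, status, reason string) error {
	patch := []byte(fmt.Sprintf(
		`{"status":{"conditions":[{"type":"Ready","status":%q,"reason":%q}]}}`,
		status, reason))
	_, err := cs.CoreV1().Nodes().Patch(ctx, name,
		types.StrategicMergePatchType, patch, metav1.PatchOptions{}, "status")
	return err
}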

@zmerlynn
Member Author

@Random-Liu: That time delta is really odd. The master is sometimes slow to create compared to the individual IGMs on GKE, but not the nodes themselves, which shouldn't even be that uniform.

@lavalamp
Member

I am suspicious of the time delta-- is one of those from a source that doesn't include daylight savings or something?

@Random-Liu
Member

Random-Liu commented May 25, 2016

I am suspicious of the time delta-- is one of those from a source that doesn't include daylight savings or something?

The time on the node:

lantaol@gke-jenkins-e2e-default-pool-4-25892338-zztn:~$ date
Wed May 25 19:51:00 UTC 2016

The time on the master:

lantaol@gke-51872839970-de36245b0d6cd411088b /var/log/containers $ date
Wed May 25 19:51:22 UTC 2016

The creation timestamp of node gke-jenkins-e2e-default-pool-4-25892338-zztn:

Tue, 24 May 2016 20:14:37 +0000

The first log in kube-apiserver.log on the master gke-51872839970-de36245b0d6cd411088b:

I0524 21:27:05.295795      20 genericapiserver.go:604] Will report 104.196.50.83 as public IP address.

@davidopp
Member

Is this kubernetes/node-problem-detector#9?

@Random-Liu
Member

@davidopp Nope, node-problem-detector is not running in the cluster.

@lavalamp assigned davidopp and unassigned Random-Liu on May 26, 2016
@lavalamp
Member

I'm giving this to @davidopp to delegate-- I can't see any reason why it should be @Random-Liu's problem. :)

@davidopp
Member

davidopp commented Jun 3, 2016

@gmarek do you think you could take a look at this?

@gmarek
Contributor

gmarek commented Jun 3, 2016

I'll look into it on Monday.

@davidopp
Member

davidopp commented Jun 3, 2016

Thanks!

@gmarek
Contributor

gmarek commented Jun 6, 2016

I have logs from the last enormous-cluster run. I expect that we're throttling something on the client side, but I need to confirm.

@gmarek
Contributor

gmarek commented Jun 6, 2016

Though I see the problem in the route-controller, not the node-controller itself.

@wojtek-t
Member

wojtek-t commented Jun 6, 2016

This was observed 2 weeks ago. We've made a bunch of changes since then, especially in the route-controller. Maybe this is no longer an issue. Did we check it?

@gmarek
Contributor

gmarek commented Jun 6, 2016

I'm looking at the last enormous cluster run from June 3rd.

@gmarek
Contributor

gmarek commented Jun 6, 2016

I looked at the apiserver logs and it looks like it's working as intended (more or less) - the conflicts I was able to find were caused by StatusUpdates sent by Kubelets.

This shouldn't be too surprising in 2k Node clusters.

@zmerlynn - what was the result of this behavior? NodeStatuses weren't up to date?

@zmerlynn
Member Author

zmerlynn commented Jun 6, 2016

At the time, it looked like the nodecontroller was flapping and just not getting around to updating node state.

I'm happy to close this for now, though. We've certainly changed enough and I also haven't seen it.

@zmerlynn closed this as completed on Jun 6, 2016