
Nodes seem to be updating its status way too frequently #5864

Closed
fabioy opened this issue Mar 24, 2015 · 16 comments
Assignees
Labels
area/kubelet kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.
Milestone

Comments

@fabioy
Contributor

fabioy commented Mar 24, 2015

This may be a result of PR #5714. On GCE, the master kube-apiserver.log is flooded with entries like:

I0324 18:53:46.443512 9483 handlers.go:109] GET /api/v1beta1/minions/e2e-test-fabioy-minion-ipwx.c.fabioy-cloud-test-1.internal: (2.978675593s) 200 [[kubelet/v0.10.0 (linux/amd64) kubernetes/9707a94] 10.240.151.207:46823]
I0324 18:53:46.444040 9483 handlers.go:109] GET /api/v1beta1/minions/e2e-test-fabioy-minion-ipwx.c.fabioy-cloud-test-1.internal: (2.979451401s) 200 [[kubelet/v0.10.0 (linux/amd64) kubernetes/9707a94] 10.240.151.207:46800]
I0324 18:53:46.450698 9483 handlers.go:109] PUT /api/v1beta1/minions/e2e-test-fabioy-minion-ipwx.c.fabioy-cloud-test-1.internal: (3.778122997s) 409 [[kubelet/v0.10.0 (linux/amd64) kubernetes/9707a94] 10.240.151.207:41179]
I0324 18:53:46.451696 9483 handlers.go:109] PUT /api/v1beta1/minions/e2e-test-fabioy-minion-ipwx.c.fabioy-cloud-test-1.internal: (3.772748134s) 409 [[kubelet/v0.10.0 (linux/amd64) kubernetes/9707a94] 10.240.151.207:46706]

The log is filling at a rate of hundreds of these requests per second (this is during an e2e test on GCE).

At the very least, there should be a rate limiter on how often pods update/fetch their status.
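
As a rough illustration of what such a client-side limiter could look like (a minimal sketch using golang.org/x/time/rate; updateNodeStatus is a hypothetical stand-in, not actual kubelet code):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

// updateNodeStatus is a hypothetical stand-in for the kubelet's status PUT.
func updateNodeStatus() {
	fmt.Println("PUT node status at", time.Now().Format(time.StampMilli))
}

func main() {
	// Allow at most one status update every 2 seconds, with no extra burst.
	limiter := rate.NewLimiter(rate.Every(2*time.Second), 1)
	ctx := context.Background()

	for i := 0; i < 5; i++ {
		// Wait blocks until the limiter permits the next update.
		if err := limiter.Wait(ctx); err != nil {
			return
		}
		updateNodeStatus()
	}
}
```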

@fabioy
Contributor Author

fabioy commented Mar 24, 2015

@fgrzadkowski, @vmarmol

@fabioy fabioy added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/kubelet kind/bug Categorizes issue or PR as related to a bug. labels Mar 24, 2015
@vmarmol vmarmol self-assigned this Mar 24, 2015
@vmarmol
Contributor

vmarmol commented Mar 24, 2015

Taking a look, thanks for reporting @fabioy!

@vmarmol vmarmol added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Mar 24, 2015
@vmarmol
Contributor

vmarmol commented Mar 24, 2015

@fabioy these look like node statuses not pod statuses?

@fabioy
Contributor Author

fabioy commented Mar 24, 2015

You're probably right. I couldn't readily tell from the logs.


@vmarmol
Contributor

vmarmol commented Mar 24, 2015

Load-balancing to @dchen1107 :) assigning to you

@vmarmol vmarmol assigned dchen1107 and unassigned vmarmol Mar 24, 2015
@brendandburns brendandburns added this to the v1.0 milestone Mar 24, 2015
@dchen1107 dchen1107 changed the title Pods seem to be updating its status way too frequently Nodes seem to be updating its status way too frequently Mar 24, 2015
@dchen1107
Member

Found the bug in the code, will send out a fix shortly.

@dchen1107
Member

It turns out this is not a bug; it works as intended. Under the current design, we treat nodeStatus as a heartbeat message:

When the kubelet first starts up, it posts nodeStatus every 500 milliseconds for the first 2 seconds, for faster cluster startup. After that, the kubelet posts nodeStatus every 2 seconds. The NodeController processes those heartbeat messages, and if 4 consecutive heartbeats are missed, it marks the node unreachable.

The above intervals seem reasonable to me, though we could tune them. I am going to close the issue. (A rough sketch of this heartbeat loop follows below.)

cc/ @ddysher @bgrant0607
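
For illustration, the behavior described above boils down to a loop roughly like this (a minimal sketch mirroring the stated intervals; postNodeStatus and the surrounding wiring are placeholders, not the actual kubelet code):

```go
package main

import (
	"fmt"
	"time"
)

const (
	startupInterval = 500 * time.Millisecond // fast heartbeats right after startup
	startupWindow   = 2 * time.Second        // how long the fast phase lasts
	steadyInterval  = 2 * time.Second        // normal heartbeat interval
	missedThreshold = 4                      // NodeController marks a node unreachable after this many missed heartbeats
)

// postNodeStatus is a placeholder for the kubelet's real nodeStatus PUT.
func postNodeStatus() {
	fmt.Println("nodeStatus posted at", time.Now().Format(time.StampMilli))
}

func main() {
	fmt.Println("worst-case detection latency:", missedThreshold*steadyInterval)

	start := time.Now()
	for {
		postNodeStatus()
		if time.Since(start) < startupWindow {
			time.Sleep(startupInterval) // faster cluster startup
		} else {
			time.Sleep(steadyInterval)
		}
	}
}
```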

@ghost

ghost commented Mar 25, 2015

As an aside, TCP keepalives are a very cheap way to do these sorts of remote heartbeats (all in-kernel). Then you can potentially do low-latency, edge-triggered rather than level-triggered event reporting.
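
For reference, enabling in-kernel keepalives on a Go TCP connection looks roughly like this (a minimal sketch; the address and periods are arbitrary, and this is not how the kubelet reports status today):

```go
package main

import (
	"log"
	"net"
	"time"
)

func main() {
	// Ask the dialer to enable keepalive probes; the kernel sends them,
	// so the application pays essentially nothing per heartbeat.
	dialer := net.Dialer{KeepAlive: 30 * time.Second}
	conn, err := dialer.Dial("tcp", "example.com:80")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The probe period can also be tuned on the raw TCP connection.
	if tcpConn, ok := conn.(*net.TCPConn); ok {
		tcpConn.SetKeepAlive(true)
		tcpConn.SetKeepAlivePeriod(15 * time.Second)
	}
}
```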

@dchen1107
Member

@quinton-hoole I raised the same point when nodeStatus was first introduced as a heartbeat. If I remember correctly, the answer was that we can evolve it later. @ddysher Correct me if I'm wrong here.

@ddysher
Contributor

ddysher commented Mar 25, 2015

Based on the log timing, it's just for faster startup, like @dchen1107 said.

As to performance, yes, we wanted to evolve it later (which is 'now').

@ghodss
Contributor

ghodss commented Apr 7, 2015

Is there another issue tracking the reduction of these PUTs?

@ghodss
Contributor

ghodss commented Apr 7, 2015

Just as a heads up, I started a completely idle 0.14.2 500-node cluster with an n1-standard-8 instance as the master, and the master is pegged at 100% to the point that 50% of requests return 429 due to node status GETs and PUTs. I know the target 1.0 size is 100 nodes, but it would be great to optimize this at least a bit.

@gmarek
Contributor

gmarek commented Apr 7, 2015

@wojtek-t @fgrzadkowski

@wojtek-t
Member

wojtek-t commented Apr 7, 2015

In my opinion, the first thing we should do is increase the heartbeat interval.
IIUC, the main reason for the heartbeat is to move pods running on a broken machine to another machine when that machine becomes unreachable. But I think this case is pretty similar to restarting a pod that failed (e.g. due to a crash), and we currently examine pods only every 10 seconds, so sending a heartbeat from a node every 2 seconds doesn't buy us much (rough numbers below).
But I also agree that we should probably come up with a better mechanism than sending the whole NodeStatus from the Kubelet as a heartbeat.
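
To make that comparison concrete (a back-of-the-envelope sketch using the 2-second heartbeat and 4 missed beats described above, plus the 10-second pod sync period mentioned here):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	heartbeat := 2 * time.Second
	missedBeats := 4
	podSyncPeriod := 10 * time.Second

	// A dead node is declared unreachable after ~4 missed heartbeats,
	// yet pods are only re-examined every 10 seconds anyway.
	fmt.Println("node failure detected after:", time.Duration(missedBeats)*heartbeat) // 8s
	fmt.Println("pod sync period:", podSyncPeriod)                                    // 10s
}
```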

@dchen1107
Member

The issue was filed because we believed there was a bug, but it turned out to be the configured intervals at startup time. Tuning the NodeStatus interval and other NodeStatus-related performance issues are covered by #5953 and several other issues.

@dchen1107
Member

@ghodss Let's move the discussion to #5953.


8 participants