Kubelet to POST pod status to apiserver #4561

Closed
erictune opened this issue Feb 18, 2015 · 13 comments · Fixed by #5205 or #5555

Comments

@erictune
Member

Split off from #156 as a smaller, more specific work item.

Once every N sync loops, the kubelet should POST to /api/$VERSION/namespaces/$NS/pods/$NAME/status for each pod.

The kubelet would do this if enabled by a flag, and emit a warning if it failed to POST the update.

The kubelet would ideally handle a 429 by retrying after the delay indicated by the Retry-After header.
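
For illustration, here is a minimal sketch of that behavior in Go, assuming a plain `net/http` client and a hypothetical `postPodStatus` helper (none of this is the actual kubelet code; the retry count and default delay are arbitrary):

```go
package statusupdate

import (
	"bytes"
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// postPodStatus POSTs a pod's status document to
// /api/$VERSION/namespaces/$NS/pods/$NAME/status and honors Retry-After
// when the apiserver answers 429.
func postPodStatus(client *http.Client, apiServer, version, ns, name string, statusJSON []byte) error {
	url := fmt.Sprintf("%s/api/%s/namespaces/%s/pods/%s/status", apiServer, version, ns, name)
	for attempt := 0; attempt < 3; attempt++ {
		resp, err := client.Post(url, "application/json", bytes.NewReader(statusJSON))
		if err != nil {
			return err
		}
		resp.Body.Close()
		switch {
		case resp.StatusCode == http.StatusTooManyRequests:
			// Back off for the interval the apiserver asked for, then retry.
			delay := time.Second
			if s := resp.Header.Get("Retry-After"); s != "" {
				if secs, convErr := strconv.Atoi(s); convErr == nil {
					delay = time.Duration(secs) * time.Second
				}
			}
			time.Sleep(delay)
		case resp.StatusCode >= 400:
			// The caller would log this as a warning rather than failing the sync loop.
			return fmt.Errorf("status POST for %s/%s failed: %s", ns, name, resp.Status)
		default:
			return nil
		}
	}
	return fmt.Errorf("status POST for %s/%s still rate limited after retries", ns, name)
}
```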

@dchen1107
Member

In the long run, we want to do a bulk POST to the apiserver, but that is not a v1 blocker.

@dchen1107 added the priority/backlog and sig/node labels on Feb 18, 2015
@dchen1107 added this to the v1.0 milestone on Feb 18, 2015
@derekwaynecarr
Member

api/$VERSION/namespaces/$NS/pods/$NAME/status for each pod.

Sent from my iPhone

On Feb 18, 2015, at 5:55 PM, Eric Tune notifications@github.com wrote:

> api/$VERSION/namespace/$NS/pods/$NAME/status for each pod.

@fgrzadkowski
Contributor

As suggested by @dchen1107, I'll work on this.

(For some reason I can't assign myself to this issue.)

@timothysc
Member

Out of curiosity, why can't we roll up Node and Pod status into a single status update?

@dchen1107
Member

That is the plan: a bulk status update eventually, but it's not strictly required at this moment.

@erictune
Member Author

erictune commented Mar 2, 2015

@timothysc what URL do you propose for doing a node + pod update?

@yujuhong
Contributor

yujuhong commented Mar 4, 2015

@fgrzadkowski, FYI, my PR #5019 modifies the kubelet to reject (i.e. set the pod status to failed) pods that have a port conflict. The status is stored in a map and gets reported back later via status polling.
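
A rough sketch of that pattern, with a mutex-guarded map keyed by pod (the type and method names here are illustrative assumptions, not the actual code in #5019):

```go
package statuscache

import "sync"

// PodStatus is a simplified stand-in for the real API type.
type PodStatus struct {
	Phase   string
	Message string
}

// statusManager keeps the most recent status per pod so that pods rejected up
// front (e.g. for a port conflict) can still be reported via status polling.
type statusManager struct {
	mu       sync.Mutex
	statuses map[string]PodStatus // keyed by "namespace/name"
}

func newStatusManager() *statusManager {
	return &statusManager{statuses: make(map[string]PodStatus)}
}

// SetFailed records a failed status for a pod rejected during admission.
func (s *statusManager) SetFailed(podKey, reason string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.statuses[podKey] = PodStatus{Phase: "Failed", Message: reason}
}

// Get returns the cached status, if any, when status is polled or pushed.
func (s *statusManager) Get(podKey string) (PodStatus, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	status, ok := s.statuses[podKey]
	return status, ok
}
```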

@smarterclayton
Contributor

#5085 is quasi-blocked on this: if we do graceful deletion with a TTL (the optimal approach), we won't be able to clear the binding at the point the pod is actually deleted. We could still delete the binding at the point the TTL starts, which is somewhat reasonable (since you can't stop or delay a deletion as I've implemented it so far) because it will trigger the kubelet to remove the pod gracefully. However, true graceful deletion would mean sending SIGTERM to Docker with the remaining TTL window as soon as the pod sees the deletion, then SIGKILL when the delete actually happens, and that's harder to do if the pod has already disappeared from the binding.
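
For readers following along, a simplified sketch of that SIGTERM-then-SIGKILL sequence, with an os.Process standing in for the Docker container (purely illustrative; the real kubelet would go through the Docker API):

```go
package graceful

import (
	"os"
	"syscall"
	"time"
)

// terminateGracefully sends SIGTERM as soon as the deletion (with TTL) is
// observed, waits out the remaining grace window, then force-kills if the
// container hasn't exited on its own.
func terminateGracefully(proc *os.Process, remainingTTL time.Duration, exited <-chan struct{}) error {
	if err := proc.Signal(syscall.SIGTERM); err != nil {
		return err
	}
	select {
	case <-exited:
		// Container shut down on its own within the grace window.
		return nil
	case <-time.After(remainingTTL):
		// Grace window elapsed; force termination.
		return proc.Kill()
	}
}
```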

@fgrzadkowski
Contributor

I have a PR for this that's almost ready (some tests are failing). I'll send it on Monday.

@fgrzadkowski
Contributor

I had to revert PR #5305 due to bugs. Reopening the issue. I'll send a fixed version soon.

@timothysc
Member

In the case of deletion, we're now seeing gobs of traffic from kubelets trying to send updated status for deleted pods, with the apiserver responding NOT FOUND.

Easy repro:

  1. Run density tests.
  2. Start traffic monitoring on the apiserver: `tcpdump -nnvvXSs 1514 'tcp port 8080 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'`
  3. Wait for cleanup.

This results in what appears to be a death spiral that we cannot exit without a hard cluster reboot.

More details:
On a 23-node cluster running 1001 pods in a steady-state environment (steady state = no external load, just internal Kubernetes traffic), we see the apiserver consuming >50% CPU on a 40-core box.

Numerous `kubectl get pods` calls return no result and no error; I'm guessing they're being answered with 429s.

@vmarmol
Contributor

vmarmol commented Mar 18, 2015

Sent out #5619 to lower and spread that load. It lowers the qps from 100 to ~9.
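
Sketching the general "lower and spread" idea (not necessarily what #5619 actually implements): pace the per-pod POSTs evenly across a window instead of sending them in one burst.

```go
package spread

import "time"

// spreadUpdates paces per-pod status POSTs evenly across a window instead of
// firing them all at once, capping the effective QPS at len(podKeys)/window.
func spreadUpdates(podKeys []string, window time.Duration, post func(podKey string)) {
	if len(podKeys) == 0 {
		return
	}
	interval := window / time.Duration(len(podKeys))
	if interval <= 0 {
		interval = time.Millisecond
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for _, key := range podKeys {
		post(key)
		<-ticker.C // wait out the pacing interval before the next pod
	}
}
```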

@vmarmol
Contributor

vmarmol commented Mar 18, 2015

After discussions with @bgrant0607 and @dchen1107, the suggestion is to update the status only when it changes and on startup. The heartbeat will be handled by the node controller rather than per pod. I'll file a separate issue for that; #5619 will go in for now.
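
A minimal sketch of the update-on-change idea, comparing against a locally cached copy before POSTing (the names and the use of reflect.DeepEqual are assumptions for illustration, not the eventual implementation):

```go
package statuschange

import (
	"reflect"
	"sync"
)

// PodStatus is a simplified stand-in for the real API type.
type PodStatus struct {
	Phase   string
	Message string
}

// changeReporter remembers the last status sent per pod and only POSTs again
// when the status actually differs (plus the unconditional send at startup,
// when nothing is cached yet).
type changeReporter struct {
	mu   sync.Mutex
	last map[string]PodStatus
	post func(podKey string, status PodStatus) error // hypothetical POST helper
}

func newChangeReporter(post func(string, PodStatus) error) *changeReporter {
	return &changeReporter{last: make(map[string]PodStatus), post: post}
}

// Report sends the status only if it changed since the last successful send.
func (r *changeReporter) Report(podKey string, status PodStatus) error {
	r.mu.Lock()
	prev, seen := r.last[podKey]
	r.mu.Unlock()
	if seen && reflect.DeepEqual(prev, status) {
		return nil // nothing changed; skip the apiserver round trip
	}
	if err := r.post(podKey, status); err != nil {
		return err
	}
	r.mu.Lock()
	r.last[podKey] = status
	r.mu.Unlock()
	return nil
}
```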
