Kubelet to fail pods that have hostPort conflicts, etc #4623

Closed
erictune opened this issue Feb 19, 2015 · 16 comments · Fixed by #5019

@erictune (Member)

Although the scheduler tries not to coschedule pods with hostPort conflicts, there could still be a conflict:

  1. once #4619 (Remove HostPort conflict checking) goes in, the kubelet could still receive pods with conflicting host ports.
  2. file-based pods could conflict with apiserver-based pods.

Right now, the kubelet can detect a conflict via the code in syncLoop in pkg/kubelet/kubelet.go, where it calls filterHostPortConflicts(). However, two things need to change in that loop.

First, instead of just ignoring one of the two conflicting pods, the kubelet should immediately set the pod's status to Failed and record an event with a useful Reason. It should do this even for pods that requested restarting.

Second, it should do this in a deterministic way. Right now, the choice depends on the order of the items as they come out of the config code. Perhaps it should iterate through the pods by creation timestamp.
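
A rough sketch of what that could look like is below. This is not the actual kubelet code: the Pod struct, failPod, and the host port bookkeeping are simplified stand-ins, just to illustrate "sort by creation timestamp, keep the first claimant of each host port, fail the rest".

```go
// Rough sketch only: Pod, failPod, and the port bookkeeping are simplified
// stand-ins for the real types in pkg/api and pkg/kubelet, not actual code.
package main

import (
	"fmt"
	"sort"
	"time"
)

type Pod struct {
	Name              string
	CreationTimestamp time.Time
	HostPorts         []int
}

// filterHostPortConflicts walks the pods in creation-timestamp order so the
// outcome is deterministic, keeps the first claimant of each host port, and
// fails any later pod that re-claims a port instead of silently dropping it.
func filterHostPortConflicts(pods []Pod) []Pod {
	sort.Slice(pods, func(i, j int) bool {
		return pods[i].CreationTimestamp.Before(pods[j].CreationTimestamp)
	})

	claimed := map[int]string{} // host port -> name of the pod that owns it
	var accepted []Pod
	for _, pod := range pods {
		conflict := ""
		for _, port := range pod.HostPorts {
			if owner, ok := claimed[port]; ok {
				conflict = fmt.Sprintf("host port %d already in use by %s", port, owner)
				break
			}
		}
		if conflict != "" {
			// Proposed change: fail the pod with a useful Reason, even if it
			// asked to be restarted, instead of just ignoring it.
			failPod(pod, conflict)
			continue
		}
		for _, port := range pod.HostPorts {
			claimed[port] = pod.Name
		}
		accepted = append(accepted, pod)
	}
	return accepted
}

// failPod stands in for setting status.Phase to Failed and recording an event.
func failPod(pod Pod, reason string) {
	fmt.Printf("failing pod %s: %s\n", pod.Name, reason)
}

func main() {
	pods := []Pod{
		{Name: "api-pod", CreationTimestamp: time.Unix(200, 0), HostPorts: []int{4194}},
		{Name: "file-cadvisor", CreationTimestamp: time.Unix(100, 0), HostPorts: []int{4194}},
	}
	for _, p := range filterHostPortConflicts(pods) {
		fmt.Println("kept:", p.Name)
	}
}
```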

@erictune (Member Author)

This would be a good project for someone who wants to get started in kubelet.

erictune added the area/kubelet and sig/node labels on Feb 19, 2015
@erictune (Member Author)

You could e2e test this as follows. Create a pod in the API that requests the same hostPort as is already used by the file-based cadvisor container. The pod should fail and there should be an appropriate reason.
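
For illustration, the e2e check might be shaped roughly like the sketch below. Everything in it is hypothetical: the 4194 port value and both helpers are stand-in stubs, not real e2e utilities.

```go
// Hypothetical shape of the e2e check described above. The helpers and the
// cadvisor port value are illustrative stubs, not real e2e framework code.
package e2e

import "testing"

// assumed host port of the file-based cadvisor container
const cadvisorHostPort = 4194

type podStatus struct {
	Phase  string
	Reason string
}

// Stubs so the sketch compiles; a real test would create the pod through the
// apiserver and poll its status until it settles.
func createPodWithHostPort(t *testing.T, name string, hostPort int) {
	t.Logf("would create pod %q requesting hostPort %d", name, hostPort)
}

func waitForPodStatus(t *testing.T, name string) podStatus {
	return podStatus{Phase: "Failed", Reason: "HostPortConflict"} // placeholder values
}

func TestHostPortConflictFailsPod(t *testing.T) {
	createPodWithHostPort(t, "conflict-pod", cadvisorHostPort)

	// The kubelet should fail the pod rather than run it, and the status
	// should carry a reason that points at the host port conflict.
	status := waitForPodStatus(t, "conflict-pod")
	if status.Phase != "Failed" || status.Reason == "" {
		t.Errorf("expected pod to fail with a host port conflict reason, got %+v", status)
	}
}
```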

@erictune (Member Author)

@dchen1107 we just talked about this.

@erictune (Member Author)

In the future, we might use this to fail pods due to lack of resources or other setup errors.

@thockin (Member) commented Feb 19, 2015

Might need some way to poison the node from future scheduling decisions - tricky.

@pmorie (Member) commented Feb 19, 2015

+1 to this, do want

dchen1107 added the priority/important-soon label on Feb 20, 2015
dchen1107 added this to the v1.0 milestone on Feb 20, 2015
@erictune (Member Author)

@thockin
there are two cases:

  1. two schedulers racing to start a pod on the same node
    • one pod will get to the node first,
    • the second will be rejected by the kubelet
    • the rc for the second pod makes a new pod
    • there should not be a race the second time because all the schedulers know about the first pod now.
  2. a file-based pod versus an apiserver-based pod on the same node.

@thockin (Member) commented Feb 20, 2015

Oh, I thought this was getting rid of HostPort conflict checking entirely?
Is that ALSO happening somewhere else?


@bgrant0607 (Member)

This is needed in order to eliminate BoundPods.

@yujuhong (Contributor)

The kubelet now sorts the pods before filtering, and also records an event on host port conflict. To actually reject such pods, we will need the kubelet to post pod status back to the apiserver (#4561).

@erictune (Member Author)

The scheduler still checks. This handles parallel scheduler races and file-based pods.

@brendandburns (Contributor)

There should be no parallel scheduler races; we use resource versions and optimistic concurrency to ensure correctness.

@bgrant0607 (Member)

Re. atomicity -- I don't buy that we need it. I commented here: #2483 (comment)

@yujuhong (Contributor) commented Mar 2, 2015

@erictune, do you mean that we should still set the pod status to "failed" so that when the scheduler checks, it can get the accurate pod status back? I had thought about it, but the pod statuses are computed on the fly (by checking the containers) as of now. Without the ability to directly set the status, we need to cache this information somewhere. Since there are people touching the same code/file, I'd rather wait until #4561 is fixed.

@erictune (Member Author) commented Mar 2, 2015

@brendandburns
Perhaps "parallel" was the wrong word, but a scheduler race appears possible: the cache of pods the scheduler uses is not updated atomically with pod bindings. It appears the scheduler could do the following (a rough sketch follows the list):

  1. successfully do a /binding of pod A with hostPort N to host H
  2. read the cache and see A not bound
  3. consider binding B with hostPort N to host H, and not see a conflict.
  4. successfully do a /binding of pod B with hostPort N to host H
  5. later, the cache notices that A is bound to H
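
Here is a toy illustration (not scheduler code) of how a stale cache allows steps 2-4: the binding made in step 1 is not visible to the fit check until the cache is refreshed.

```go
// Hypothetical toy model of the race described above; it is not scheduler
// code, just an illustration of a fit check running against a stale cache.
package main

import "fmt"

type binding struct {
	pod, host string
	hostPort  int
}

type scheduler struct {
	cache  []binding // possibly stale view of what is bound where
	actual []binding // what the apiserver has actually accepted
}

// fitsOnHost runs the hostPort predicate against the (possibly stale) cache.
func (s *scheduler) fitsOnHost(host string, hostPort int) bool {
	for _, b := range s.cache {
		if b.host == host && b.hostPort == hostPort {
			return false
		}
	}
	return true
}

// bind succeeds at the apiserver immediately, but the cache only learns
// about it later, when refreshCache runs.
func (s *scheduler) bind(pod, host string, hostPort int) {
	s.actual = append(s.actual, binding{pod, host, hostPort})
}

func (s *scheduler) refreshCache() {
	s.cache = append([]binding(nil), s.actual...)
}

func main() {
	s := &scheduler{}

	// 1. Successfully bind pod A with hostPort N to host H.
	s.bind("A", "H", 80)

	// 2-3. The cache has not caught up, so B still appears to fit on H.
	if s.fitsOnHost("H", 80) {
		// 4. Successfully bind pod B with the same hostPort to the same host.
		s.bind("B", "H", 80)
	}

	// 5. Only now does the cache notice that A (and B) are bound to H.
	s.refreshCache()
	fmt.Println("bindings accepted for H:", s.actual) // both A and B claim port 80
}
```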

@yujuhong (Contributor) commented Mar 3, 2015

@erictune and I discussed offline, and we decided to create a map to store the rejected pod->status, so that the failed status will be correctly reported when polling happens. With this, the issue is no longer blocked by any other issues.

I will write up a PR to implement this.
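
A minimal sketch of that approach, assuming simplified stand-ins for the kubelet types (the map, rejectPod, and GetPodStatus below are illustrative, not the real pkg/kubelet API):

```go
// Minimal sketch of the map-of-rejected-pods approach; Kubelet, PodStatus,
// and the method names are simplified stand-ins, not the real kubelet code.
package main

import (
	"fmt"
	"sync"
)

type PodPhase string

const Failed PodPhase = "Failed"

type PodStatus struct {
	Phase   PodPhase
	Message string
}

type Kubelet struct {
	mu             sync.Mutex
	rejectedStatus map[string]PodStatus // pod full name -> status of a rejected pod
}

// rejectPod records a Failed status for a pod the kubelet refuses to run,
// e.g. because of a host port conflict.
func (kl *Kubelet) rejectPod(podFullName, reason string) {
	kl.mu.Lock()
	defer kl.mu.Unlock()
	kl.rejectedStatus[podFullName] = PodStatus{Phase: Failed, Message: reason}
}

// GetPodStatus is consulted when the pod status is polled. Rejected pods are
// answered from the map; everything else would fall through to the existing
// on-the-fly computation from container state.
func (kl *Kubelet) GetPodStatus(podFullName string) (PodStatus, bool) {
	kl.mu.Lock()
	defer kl.mu.Unlock()
	status, ok := kl.rejectedStatus[podFullName]
	return status, ok
}

func main() {
	kl := &Kubelet{rejectedStatus: map[string]PodStatus{}}
	kl.rejectPod("default/api-pod", "host port 4194 already in use")
	if status, ok := kl.GetPodStatus("default/api-pod"); ok {
		fmt.Println(status.Phase, "-", status.Message)
	}
}
```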
