Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Catch kubelet-master hostname mismatch during health check #4497

Merged
merged 1 commit into from Feb 26, 2015
Merged

WIP: Catch kubelet-master hostname mismatch during health check #4497

merged 1 commit into from Feb 26, 2015

Conversation

ghost
Copy link

@ghost ghost commented Feb 17, 2015

Hello all,

I have seen this bug a few times and run into it myself. For a variety of reasons a minion ends up with an incorrect hostname override. So it thinks its hostname is one thing while the master thinks it is something else. 'get minions' then shows the node as ready but pods never start on it.

This patch adds a hostname comparison as part of the node health check. @eparis had suggested using pkg/healthz.go although the health check code as changed since then.

Is this approach ok ? (expading the /healthz repsonse, checking against 'host' in 'HealthCheck')

@googlebot
Copy link

Thanks for your pull request.

It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA) at https://cla.developers.google.com/.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check the information on your CLA or see this help article on setting the email on your git commits.

Once you've done that, please reply here to let us know. If you signed the CLA as a corporation, please let us know the company's name.

Hostname string
Health string
}
var healthResponse HealthResponse
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this approach is okay this struct can be generalized and moved somewhere where it can be shared by kubelet and master code. I think one of the types.go files from what I have read sofar

@mikedanese
Copy link
Member

Is this an illegal condition? See https://github.com/GoogleCloudPlatform/kubernetes/blob/8b25b43039f03e97ca930b2f0c8099253948ec5f/pkg/kubelet/server/server.go#L131

Edit: Ahh never mind, I see you've addressed this in your comment. A better question is when is this an illegal condition and when is it legal?

@eparis
Copy link
Contributor

eparis commented Feb 18, 2015

@swagiaal Is a Red Hatter. That should cover his CLA requirements.

@ghost
Copy link
Author

ghost commented Feb 18, 2015

@mikedanese so for example in one situation if you do not provide a hostnames override the pod thinks its hostname is hostname -fwhile the master reaches it by a different hostname/ip address. Another example is the machine is known by a static external ip which you provide to the master and the minion ends up with the machine's dynamic IP. I will dig more into why the minion cannot accept work in this state but the health check above at least exposes that there is a problem

@eparis
Copy link
Contributor

eparis commented Feb 18, 2015

In the case where the master thinks the machines hostname is host.example.com but the minion thinks it's hostname is localhost.localdomain the apiserver will schedule work for host.example.com but the minion will look for work for localhost.localdomain. Thus the minion will never pick up work to do!

@mikedanese
Copy link
Member

Okay. That makes sense.

@ghost
Copy link
Author

ghost commented Feb 18, 2015

@eparis gotcha!

@ghost
Copy link
Author

ghost commented Feb 19, 2015

@mikedanese I have updated the patch to use the hostname in the header of the request

@mikedanese
Copy link
Member

I like this a lot better. Looks like you will have to send a Host header with curl here as well https://github.com/GoogleCloudPlatform/kubernetes/blob/master/hack/lib/util.sh#L21. It would also be worthwhile to investigate downstream effects (if any) of this change.

@ghost
Copy link
Author

ghost commented Feb 19, 2015

@mikedanese curl sets the Host header but I had to add a --hostname_override="127.0.0.1" to match it. What do you mean by downstream effects BTW ?

I am looking into the test failure...

@mikedanese
Copy link
Member

@swagiaal particularly I was thinking about monit monitoring on that healthcheck with a localhost url, which would cause that check to fail. From a skim, I don't think that was an issue as I couldn't find any monit monitoring in the salt configs for kubelet.

@ghost
Copy link
Author

ghost commented Feb 20, 2015

@mikedanese Hmmm.. isnt that a valid use case though ? I am starting to think relying on the Host header might be a bit presumptuous.

@ghost
Copy link
Author

ghost commented Feb 20, 2015

@mikedanese I like what you are proposing in #4646. If we have that then we can use the Host header more comfortably since consumer of healthz which only want to do a simple ping would have the option of using healthz/ping

During the kubelet's /healthz responce check to see if the
hostname used by the master matches the hostname the kubelet
knows itself by. If not fail the health check.

Signed-off-by: Sami Wagiaalla <swagiaal@redhat.com>
@ghost
Copy link
Author

ghost commented Feb 26, 2015

Ping... Can someone review this for me

@brendandburns
Copy link
Contributor

LGTM. Merging.

brendandburns added a commit that referenced this pull request Feb 26, 2015
WIP: Catch kubelet-master hostname mismatch during health check
@brendandburns brendandburns merged commit 0d23875 into kubernetes:master Feb 26, 2015
@ghost
Copy link
Author

ghost commented Feb 27, 2015

@brendandburns thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants