-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kubelet: lookup node address for external provider if none is set #75229
kubelet: lookup node address for external provider if none is set #75229
Conversation
ca88493
to
7473ed1
Compare
/sig cloudprovider |
/priority important-longterm |
pkg/kubelet/nodestatus/setters.go
Outdated
// then we return early because provider set addresses should take precedence. | ||
// Otherwise, try to look up the node IP and let the cloud provider override it later | ||
// This should alleviate a lot of the bootstrapping issues with out-of-tree providers | ||
if externalCloudProvider && len(node.Status.Addresses) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were just in an externalCloudProvider if statement. Why not fold this into the end of that if statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed!
7473ed1
to
59eedac
Compare
the only issue I can think of would be if the serving certificate approver process refuses to approve the self-detected addresses. in that case, the kubelet could be waiting for approval for the old SANs for a long time ( |
@mtaufen can I get a review from you please? |
is there a pointer i can reference for how external cloud provider with self hosted control planes are recommended for bootstrapping? |
I've only tried it with |
If you know the API server advertise address ahead of time I think you can work around it but it kind of defeats the purpose of self-hosting if you need to manually resync that value rather than relying on what Kubernetes says is the node's address. |
/hold we need to ensure that we have a clear bootstrapping flow identified before proceeding on this imo. is there a kep that we can iterate or extend upon for this as part of overall external cloud provider transition? |
fwiw I think the bootstraping flow is well identified for the internal cloud provider case already, it just breaks for the external case because of it's assumptions around node addresses. Will let kubeadm maintainers comment on this one though.
We have a KEP but it does not address self-hosting since it's still considered an alpha feature and wasn't part of the initial work on external providers. |
@andrewsykim given that there are conformant k8s distributions that run self-hosted, i think we need to capture a recommendation on how those distributions should proceed. i am happy to help rally resources to work through this topic if its not yet defined, but i think we need to account for it as part of the transition. |
@derekwaynecarr that's good to know, thank you! I'm curious if conformance actually covers the bootstrapping of self-hosted clusters also. I assume it only covers post-bootstrap which this PR shouldn't change - either way happy to dig into this if needed |
hi @andrewsykim ,
if by self-hosting we mean running the kubeadm created control-plane as DaemonSets (and not the popular internet meaning which is running static-pods), then sadly this feature is pretty much unsupported at this point in kubeadm. it remained in alpha for a long time due to a set of major caveats: reading back, the conclusion was that self-hosting and external CPs will not work in kubeadm:
given my comment above, perhaps kubeadm is not the right tool to base evaluations of this change upon.
i might be lacking context on the external CP case. so once the control plane is up the external CP can provide a node address at which point the control plane has to be restarted? does that mean that a new api server serving certificate has to be recreated to include a new advertise address?
kubeadm does not have e2e tests for self-hosting at this point. i remember seeing self-hosting tests in the suite, but i don't think these are part of conformance. |
i am particularly concerned about using static pods with external cloud controller manager. is there a concern with that approach? |
Apologies if this wasn't clear - yes I was only referring to kubeadm self-hosting where the control plane is bootstrapped into a DaemonSet. Static Pods with external cloud providers works and is adopted. |
@andrewsykim thanks for clarifying. is there a document/kep that describes how a static pod approach is enabled? |
This PR was specifically open to address the kubeadm self-hosting case, but it also addresses general trouble shooting issues for the external CP case. More specifically, if a node fails to register with an external cloud provider, it's generally hard to trouble shoot pods running on cluster cause you can't fetch any logs without node addresses being set. With this change, node's have more functionality prior to registration with the cloud provider. |
It's not any different from how you would run static pods previously aside from changing some flags (mainly --cloud-provider=external) which we document here https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller. And the external add-on for the cloud provider should be documented by the cloud provider. |
/hold cancel apologies for initial confusion. this makes sense. /approved |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andrewsykim, derekwaynecarr The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Review the full test history for this PR. Silence the bot with an |
Is this in a release yet? |
yes, this is included in v1.16 |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This changes the node address lookup in the kubelet for external providers. If
--cloud-provider=external
and the kubelet has no node address set already, we attempt to fill its node address with what we can find at runtime (hostname + internal IP). In most cloud providers, whatever hostname/internal IP the kubelet sets at startup will get overwritten later.This should alleviate some of the common bootstrapping problems we see with external providers because the kubelet hosting the control plane does not have node addresses set.
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?: