
Separate instance name from node name #61042

Conversation

akshaymankar

What this PR does / why we need it:

This change allows the node's instance name to be provided through the cloud config file when using the GCE cloud provider. That allows the node's hostname to differ from the GCE instance name, which lets an operator bring their own TLS certs for nodes.

Special notes for your reviewer:

Release note:

This change is fully backwards compatible: if an operator wants to keep using the GCE instance name as the hostname and node name, no extra configuration is required.

However, if the operator wants to use different hostnames and bring their own TLS, they'd need to:

1. Add `instance-name` to the GCE cloud config
2. Start the kubelet with `--hostname-override` set to whatever hostname they would like the kubelet to be addressed by
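As a sketch, the resulting configuration might look like this (the `instance-name` field is the one added by this PR; `vm-foo`, `node-1`, and the config path are illustrative values, and only the relevant kubelet flags are shown):

```ini
# GCE cloud config (gcfg format); instance-name is the field added by this PR
[global]
instance-name = vm-foo
```

with the kubelet then started along the lines of `kubelet --cloud-provider=gce --cloud-config=/etc/gce.conf --hostname-override=node-1`.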

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 12, 2018
@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 12, 2018
@akshaymankar
Author

Fixes #58084

@akshaymankar
Author

Hi @thockin @mtaufen,

As discussed in #58084, we've separated the hostname and instance name for GCE. Please take a look and let us know if you have any feedback.

@akshaymankar
Author

/assign @bowei

@mtaufen
Contributor

mtaufen commented Mar 19, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 19, 2018
@mtaufen
Contributor

mtaufen commented Mar 19, 2018

I think I like the idea of including the instance name in the cloud config file, since it's specific to the cloud provider anyway. @thockin wdyt?

@akshaymankar akshaymankar force-pushed the separate-instance-name-from-node-name branch from 9cc3439 to 618afdd on March 21, 2018 at 10:52
@akshaymankar
Author

/retest

@akshaymankar akshaymankar force-pushed the separate-instance-name-from-node-name branch from 618afdd to 85eceeb on March 21, 2018 at 11:41
@akshaymankar
Author

/retest

@akshaymankar
Author

/test pull-kubernetes-integration

@akshaymankar
Author

/test pull-kubernetes-e2e-gce

Akshay Mankar and others added 5 commits March 22, 2018 16:25
This change allows the node's instance name to be provided through the
cloud config file. That allows the node's hostname to be
different from GCE instance name, which helps an operator to bring their
own TLS certs for nodes.

Due to the fact that the controller-manager cannot just assume node name
is the instance name, it must keep a mapping of node names to instance
names.

Signed-off-by: James Myers <jfmyers9@gmail.com>
Signed-off-by: James Myers <jmyers@pivotal.io>
Signed-off-by: Akshay Mankar <amankar@pivotal.io>
Signed-off-by: James Myers <jfmyers9@gmail.com>
Signed-off-by: James Myers <jmyers@pivotal.io>
Signed-off-by: Iain Sproat <isproat@pivotal.io>
Signed-off-by: Akshay Mankar <amankar@pivotal.io>
Signed-off-by: Iain Sproat <isproat@pivotal.io>
@akshaymankar akshaymankar force-pushed the separate-instance-name-from-node-name branch from 85eceeb to 6d24f28 on March 22, 2018 at 16:40
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 22, 2018
@mtaufen
Contributor

mtaufen commented Mar 25, 2018

This is tangentially related: #61410

@liggitt
Member

liggitt commented Mar 25, 2018

cc @mikedanese

Is this making the kubelet the authority on which cloud provider instance it is running on? I expected any mapping like that to come from the cloud node controller side, not to be based on a field reported by the kubelet that cannot be verified to be accurate. (The current way the kubelet sets this field could be moved to a cloud node controller, since it is based on information available via the cloud provider's API; if that source changes to a config file on the node… not so much.)

@mtaufen
Contributor

mtaufen commented Mar 25, 2018

How does this PR play with the provider-id flag? #44258

@akshaymankar
Author

Hi @liggitt,

I am not quite sure what you mean by "making the kubelet the authority on which cloud provider instance it is running on?".

As far as I understand, the kubelet takes two flags, --cloud-provider and --cloud-config, to decide which cloud provider to use. This makes sure that the kubelet reports the right spec.providerID; the controllers, in turn, have their own cloud config provided to the kube-controller-manager. With this understanding, all we've done is remove the assumption that the node name of the kubelet is the instance name of the VM it is running on.

I am not so sure about cloud node controller, can you please point me to a resource/code where I can learn more about it?

@akshaymankar
Author

@mtaufen We tested the build from this PR with --provider-id and the cloud-controller-manager. We found that it doesn't work if you set --hostname-override to anything other than the instance name of the node VM. This is because of the same assumption that node names are instance names in these places:

We will work on changing these places and try to figure out whether we have missed any other places.

Thanks for the heads up!

For this to work, we needed to set informers on the cloud provider in the cloud controller manager. This makes sure that the instanceNameMap in GCECloud is up to date when the controllers want to know the instance name of a node.

Signed-off-by: Urvashi Reddy <ureddy@pivotal.io>
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: akshaymankar
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: wlan0

Assign the PR to them by writing /assign @wlan0 in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@akshaymankar
Author

We've fixed the three places mentioned above, and the load balancer got created by the cloud-controller-manager. We will do some more exploring around the cloud-controller-manager to make sure we haven't missed any places where the nodename=instancename assumption exists.

To make it work in the cloud-controller-manager, we had to call SetInformers on the cloud object; I'm not sure if that was the right place to do it. @mtaufen can you please provide any suggestions on this?

@akshaymankar
Author

/assign @wlan0

defer gce.instanceNameMapLock.Unlock()

if newNode != nil {
_, _, instanceName, err := splitProviderID(newNode.Spec.ProviderID)
Member


This is changing the cloud controllers to pull information from the ProviderID field of a Node API object, correct? Doesn't that make the kubelet the authority on which cloud provider instance a given Node API object corresponds to?
@mikedanese ^

Author


The only thing consumed from the ProviderID field here is the instance name, which the cloud controllers assumed to be the node name before this change. So the kubelet was, and still is, the authority in defining the node name, which the cloud controllers assumed to be the instance name.
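For context, the providerID parsing being discussed can be sketched roughly like this. It is a simplified stand-in for the GCE provider's splitProviderID helper, assuming a providerID of the form gce://<project>/<zone>/<instance-name>; it is not the exact upstream code.

```go
package main

import (
	"fmt"
	"regexp"
)

// Matches providerIDs of the assumed form gce://<project>/<zone>/<instance-name>.
var providerIDRE = regexp.MustCompile(`^gce://([^/]+)/([^/]+)/([^/]+)$`)

// splitProviderID extracts project, zone, and instance name from a Node's
// spec.providerID. This is a sketch; the real helper lives in the GCE
// cloud provider package.
func splitProviderID(providerID string) (project, zone, instanceName string, err error) {
	m := providerIDRE.FindStringSubmatch(providerID)
	if m == nil {
		return "", "", "", fmt.Errorf("error splitting providerID %q", providerID)
	}
	return m[1], m[2], m[3], nil
}

func main() {
	project, zone, name, err := splitProviderID("gce://my-project/us-central1-a/vm-foo")
	fmt.Println(project, zone, name, err)
}
```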

Member


The kubelet does not get to choose any node metadata.name it wants. It is only allowed to modify the one that corresponds to its kube API credentials.

If this is changing cloudprovider code to be driven by the provider id field instead, we first have to ensure a compromised kubelet cannot claim to be another instance (by setting the providerID field) and get the cloud controller (or other controllers) to take actions against that instance

Author

@akshaymankar akshaymankar Mar 29, 2018


Hmm, I get it now. I was assuming that kubelets were trusted as long as they were authenticated. I am not sure how to solve this problem; I will try to dig into other cloud providers to see how they solve it. Meanwhile, if you have any suggestions, please let me know.

Author

@akshaymankar akshaymankar Mar 29, 2018


I checked other cloud providers (AWS, OpenStack, Azure, and vSphere); it seems like only vSphere relies on ProviderID and hence is insecure against a compromised node.

After chatting with a few people, an idea that seems to work is labelling nodes in the IaaS to make sure the kubelet is not telling the cloud controllers which nodes it corresponds to. For example, say node-1 runs on instance vm-foo. In this case vm-foo would be expected to have a label, say kubernetes-node-name: node-1. When node-1 joins the cluster, the cloud provider will look for the VM with that label and find the instance name; from that point on, the cloud provider can keep a mapping of node name to instance name.

This seems to solve the trust issue between the master components and the kubelet, assuming the labels can be trusted. But it will require the operator to do extra configuration. I am not sure if that is acceptable. WDYT?

Author


With labels, the code for addNodeToInstanceMap would look like this:

	if newNode != nil {
		_, zone, instanceName, err := splitProviderID(newNode.Spec.ProviderID)
		if err != nil {
			return err
		}
		instance, err := gce.Compute().Instances().Get(context.TODO(), &meta.Key{Name: instanceName, Zone: zone})
		if err != nil {
			return err
		}
		allowedNodeName, ok := instance.Labels["kubernetes-node-name"]
		if !ok {
			return errors.Errorf("Could not find label: 'kubernetes-node-name' on instance '%s'", instanceName)
		}

		if allowedNodeName != newNode.Name {
			return errors.Errorf("Instance '%s' is only allowed to have node name '%s'", instanceName, allowedNodeName)
		}

		gce.instanceNameMap[allowedNodeName] = instanceName
	}

Member


an idea that seems to work is labelling nodes in IAAS to make sure kubelet is not telling cloud controllers which nodes it corresponds to

Having data living in the cloud provider as the source of truth is much closer to what I expected.

With that information living there, is it actually still necessary to have the ProviderID in the Node object? It's tough to ensure that every component that makes use of that field does a verification against a central source of truth that the mapping is actually correct.

Author


I can think of two reasons for ProviderID to be on the Node Object:

  1. It helps the cloud provider look at just one node and verify if the information is correct.
  2. If somebody wants to recreate the worker while the previous one is still around, they can smoothly switch over by creating the new worker and then killing the previous one.

it's tough to ensure that every component that makes use of that field does a verification against a central source of truth that the mapping is actually correct

No component (except for the addNodeToInstanceMap method) should be using ProviderID directly; they should always use the mapNodeNameToInstanceName method to get the instance name, and the method above will make sure that the mapping is always verified.
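To illustrate the intended lookup path, here is a hypothetical sketch; the type, field, and method names are illustrative rather than the actual GCECloud implementation. Controllers never read ProviderID directly: they go through a map that addNodeToInstanceMap has already populated and verified.

```go
package main

import (
	"fmt"
	"sync"
)

// gceCloud is a hypothetical sketch of the provider object described above.
type gceCloud struct {
	instanceNameMapLock sync.RWMutex
	instanceNameMap     map[string]string // node name -> verified instance name
}

// mapNodeNameToInstanceName returns the verified instance name for a node,
// or an error if no verified mapping exists yet. Controllers call this
// instead of reading Node.Spec.ProviderID directly.
func (g *gceCloud) mapNodeNameToInstanceName(nodeName string) (string, error) {
	g.instanceNameMapLock.RLock()
	defer g.instanceNameMapLock.RUnlock()
	instanceName, ok := g.instanceNameMap[nodeName]
	if !ok {
		return "", fmt.Errorf("no verified instance mapping for node %q", nodeName)
	}
	return instanceName, nil
}

func main() {
	gce := &gceCloud{instanceNameMap: map[string]string{"node-1": "vm-foo"}}
	name, err := gce.mapNodeNameToInstanceName("node-1")
	fmt.Println(name, err)
}
```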

Member

@liggitt liggitt Mar 29, 2018


No component (except for the addNodeToInstanceMap method) should be using ProviderID directly, they should always use the mapNodeNameToInstanceName method to get the instance name and the method above will make sure that the mapping is always verified.

Sure, but that's in one cloud provider. Having a field like that with implicit "must verify externally before use" semantics will be a source of confusion at best, and vulnerabilities at worst.

Is it possible to enumerate cloud provider instances or search by tag value to avoid needing the field on the Node object? I guess that would depend on the cloud provider API.

Author


I agree, it is weird to have that implicit assumption. Maybe the two points I listed above could be handled in other ways.

It seems to be a little tricky to list VMs, because the GCE API to list VMs only lists VMs in a given zone. So either we'll need to make the Zones field mandatory in the cloud config or look for the instance across all the zones of GCE, which sounds like overkill.
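The cost being described can be sketched with a hypothetical zone-scoped lister; the types and helper names below are illustrative stand-ins, not the real compute API. Finding a labelled instance takes one list call per zone, which is why an unbounded zone list gets expensive:

```go
package main

import "fmt"

// Instance is a minimal stand-in for a GCE instance (hypothetical type).
type Instance struct {
	Name   string
	Labels map[string]string
}

// listInstances stands in for the zone-scoped GCE list API.
func listInstances(zone string, byZone map[string][]Instance) []Instance {
	return byZone[zone]
}

// findInstanceForNode searches every configured zone for the instance
// labelled with the given node name: one list call per zone.
func findInstanceForNode(nodeName string, zones []string, byZone map[string][]Instance) (string, error) {
	for _, z := range zones {
		for _, inst := range listInstances(z, byZone) {
			if inst.Labels["kubernetes-node-name"] == nodeName {
				return inst.Name, nil
			}
		}
	}
	return "", fmt.Errorf("no instance labelled for node %q", nodeName)
}

func main() {
	byZone := map[string][]Instance{
		"us-central1-a": {{Name: "vm-foo", Labels: map[string]string{"kubernetes-node-name": "node-1"}}},
	}
	name, err := findInstanceForNode("node-1", []string{"us-central1-b", "us-central1-a"}, byZone)
	fmt.Println(name, err)
}
```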

@mikedanese
Member

This change allows the node's instance name to be provided through the cloud config file when using the GCE cloud provider. That allows the node's hostname to be different from GCE instance name, which helps an operator to bring their own TLS certs for nodes.

Can you give a full explanation of the problem you are facing?

@mikedanese mikedanese assigned mikedanese and unassigned bowei Mar 28, 2018
@akshaymankar
Author

@mikedanese

The problem we are facing is generating certificates for kubelets so we can secure communication between the apiserver and the kubelet. We were planning to use --hostname-override on kubelets to set the hostnames for apiserver -> kubelet communication to node1, node2, node3, etc. This has a side effect: the node name changes to the hostname of the node. But we also discovered that the GCE cloud provider assumes the node name is the GCE instance name, so --hostname-override needs to be set to the instance name. This makes it impossible to generate certificates for the kubelets before spinning up the VMs.

So, after some discussion on Slack, we decided it made sense to break this assumption, and we created the issue and this PR.

@akshaymankar
Author

/test pull-kubernetes-bazel-test
/test pull-kubernetes-typecheck
/test pull-kubernetes-integration

@k8s-ci-robot
Contributor

@akshaymankar: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-typecheck af31c39 link /test pull-kubernetes-typecheck
pull-kubernetes-bazel-test af31c39 link /test pull-kubernetes-bazel-test
pull-kubernetes-integration af31c39 link /test pull-kubernetes-integration

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@k8s-ci-robot
Contributor

@akshaymankar: PR needs rebase.


@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 14, 2018
@liggitt
Member

liggitt commented Jun 2, 2018

FYI, I'm documenting the current state of cloud provider node name computation and requirements in kubernetes/website#8873.

once this issue is resolved, please update the GCP section of that document to describe the new method, any configuration requirements, etc.

@@ -179,6 +184,8 @@ type ConfigGlobal struct {
// located in (i.e. where the controller will be running). If this is
// blank, then the local zone will be discovered via the metadata server.
LocalZone string `gcfg:"local-zone"`
// InstanceName specifies the GCE instance name of the node
InstanceName string `gcfg:"instance-name"`
Member


Is this the source of truth for the instance name, expected to be consumed by the kubelet and registered in the Node object for use by the cloud controller? Is there any way for the controller to verify, from information in the instance, that Node X really does correspond to it?

Author

@akshaymankar akshaymankar Jun 10, 2018


Yes, this is the source of truth for the instance name, and it is expected to be consumed by the kubelet. I cannot think of any straightforward way to verify that node X really does correspond to it, which is why I was thinking I should close this PR.

We could do something similar to AWS with labels, as that would be verifiable by the cloud controller. Please let me know what you think about this.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 8, 2018
@akshaymankar
Author

/close

@k8s-ci-robot
Contributor

@akshaymankar: Closing this PR.

In response to this:

/close

