
kubelet with --kubeconfig=... still requires --api-servers=... #30515

Closed
errordeveloper opened this issue Aug 12, 2016 · 30 comments
Labels
area/kubelet priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@errordeveloper
Member

errordeveloper commented Aug 12, 2016

For the work we are doing at @kubernetes/sig-cluster-lifecycle (#30360), it would make things much easier if the kubelet could wait for the kubeconfig file when it is not yet present, and read the API server URL from that file, rather than expecting --api-servers=... as it does right now.

@errordeveloper
Member Author

cc @mikedanese

@k8s-github-robot k8s-github-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Aug 12, 2016
@mikedanese
Member

Yeah, I noticed this once and it annoyed me. I think you mean --kubeconfig and not --config?

@mikedanese mikedanese changed the title kubelet with --config=... still requires --api-servers=... kubelet with --kubeconfig=... still requires --api-servers=... Aug 12, 2016
@mikedanese
Member

mikedanese commented Aug 12, 2016

The problem with fixing this is that len(--api-servers) == 0 is overloaded to mean run the kubelet in standalone mode. That assumption no longer holds if apiservers can instead be specified in the kubeconfig. cc @kubernetes/goog-node
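The overloading described here can be sketched as follows (a hypothetical illustration, not the kubelet's actual code): an empty --api-servers list is the only signal for standalone mode, so apiservers found in a kubeconfig cannot be honored without changing that contract.

```go
package main

import "fmt"

// decideMode sketches the overloaded check: an empty --api-servers list
// is the kubelet's only signal for standalone mode, so there is no way
// to distinguish "standalone" from "read the server URL from kubeconfig".
func decideMode(apiServers []string) string {
	if len(apiServers) == 0 {
		return "standalone"
	}
	return "api-client"
}

func main() {
	fmt.Println(decideMode(nil))                               // standalone
	fmt.Println(decideMode([]string{"https://10.0.0.1:6443"})) // api-client
}
```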

@smarterclayton
Contributor

I just noticed this in #29216 and I agree - clients of the API should use kubeconfig if they are part of our codebase.


@mtaufen
Contributor

mtaufen commented Aug 13, 2016

I guess one thing we could do is add a --standalone flag, and if the flag is not provided we use the existing behavior of 0-len --api-servers, but if it is provided, --api-servers is ignored. So if you want to just use the URLs from the config file, you can explicitly pass --standalone=false. And eventually we move to deprecating --api-servers, setting --standalone=false by default, and always using URLs from the config file unless --standalone=true.

@errordeveloper
Member Author

Certainly we need a new flag. It seems like adding --standalone would sacrifice backwards compatibility, if this is a popular feature... I've briefly thought of something like --wait-for-kubeconfig, but maybe we don't need a flag for it, and it could just be the new behaviour. What do people think about waiting for the kubeconfig file? Of course, there is one downside that the Config type only allows you to set one Server, but it would be reasonable to leave that outside the scope, so we can just get the basic non-HA bootstrap UX implemented.

@dgoodwin
Contributor

Going to try to take a look at fixing this.

Does it sound ok to do the following:

  • if --kubeconfig specified:
    • if it doesn't yet exist wait for it
    • error out if --api-servers was also defined
    • change assumption for standalone behavior to: if len api-servers == 0 and --kubeconfig not specified
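The decision table proposed above could be sketched like this (names and signatures are illustrative, not the kubelet's actual API; the caller is expected to keep polling while "wait" is reported):

```go
package main

import (
	"errors"
	"fmt"
)

// resolveMode sketches the proposal: if --kubeconfig is given, reject
// --api-servers and wait for the file to appear; standalone mode applies
// only when neither flag is given.
func resolveMode(kubeconfig string, kubeconfigExists bool, apiServers []string) (string, error) {
	if kubeconfig != "" {
		if len(apiServers) > 0 {
			return "", errors.New("--api-servers may not be combined with --kubeconfig")
		}
		if !kubeconfigExists {
			return "wait", nil // file not there yet: caller polls until it appears
		}
		return "kubeconfig", nil
	}
	if len(apiServers) == 0 {
		return "standalone", nil // standalone only when neither flag is given
	}
	return "api-servers", nil
}

func main() {
	mode, _ := resolveMode("/var/lib/kubelet/kubeconfig", false, nil)
	fmt.Println(mode) // wait
}
```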

@lukemarsden
Contributor

@dgoodwin please note that there is already code which aims to fix this at #30616 – it might be better if you could review that code rather than writing new code?

@lukemarsden
Contributor

Or maybe you can explain how the proposal you just made is better than #30616? Thanks!

@dgoodwin
Contributor

My mistake, thought we needed a fix here. Looking now.

@errordeveloper
Member Author

@dgoodwin I think what you are suggesting is cleaner than adding a new flag, but it's mostly backwards compatibility that I was trying to address in #30616. I'd be happy to break this rather odd behaviour, if folks agree to do so.

@smarterclayton
Contributor

smarterclayton commented Aug 17, 2016

Ok, to summarize:

  • We cannot break the backwards compatibility of len(apiservers) == 0 meaning standalone mode. That means we cannot just start reading the apiserver from kubeconfig by itself, or fail on error
    • However, in the future, we should introduce --standalone or similar to make that behavior explicit.
  • We never reload kubeclient, so a client that starts with kubeconfig specified but is broken will never reload (and users likely don't expect or want that behavior)
  • We default KubeConfig in kubelet to a value, but users may have simply copied that default behavior onto a flag and expect the behavior.

To fix this, we need to introduce a new flag. I would generally prefer we crash-loop with a very succinct and clean message, because we use that across the project consistently, and it is predictable and consistent with other tools in the space.

Here are some options:

  • --require-kubeconfig - if specified, we error out if the kubeconfig is missing or invalid. Would be set by cluster bootstrap. Would not require changes to kubeconfig. The vast majority of users want this flag.
  • --apiserver-kubeconfig - overrides --kubeconfig if set, and if set, requires the kubeconfig to exist. Deprecate --kubeconfig and remove in 1.5 or later.

Either way, we'd also want the error checking to print a nice, succinct error. I lean slightly towards the latter flag: if the vast majority of users want a specific behavior (error out if my config is wrong), then having to specify two flags means users will forget the second one and silently not get the behavior they expect. If we instead teach them to use the new flag, they always get the behavior they expect.

@derekwaynecarr
Member

I think I prefer --apiserver-kubeconfig

@lukemarsden
Contributor

Thanks @smarterclayton! So to be clear --apiserver-kubeconfig means "read the API server information from the kubeconfig file, exiting cleanly if the kubeconfig file doesn't exist"? That SGTM.

@smarterclayton
Contributor

Yes I think so.

@smarterclayton smarterclayton added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Aug 17, 2016
@smarterclayton
Contributor

Bumping to p1 because this blocks cluster lifecycle from having a path forward.

@smarterclayton
Contributor

Since I reviewed @erictune's original code here and I consider this my fault I'm opening a PR.

@smarterclayton
Contributor

@mtaufen this probably intersects with your changes - was that slated for merge today?

@mikedanese
Member

Are you planning on doing this for all components?

@smarterclayton
Contributor

No, just for the kubelet. Controller manager and scheduler already exit if the kubeconfig is invalid (can double check the others).

@smarterclayton
Contributor

Kubelet is the only one that must remain backwards compatible.

@mikedanese
Member

Can we just do an action-required release note and a kubernetes-announce post, and actually fix this? The kubelet should handle kubeconfig like all other components, including a consistent command-line flag API. The fact that this is not the case is a bug. I imagine this behavior is little traveled, and this is gross.

Controller manager and scheduler already exit if the kube config is invalid

There is no "invalid". Using the default loader you get an http://localhost:8080 kubeconfig.

@smarterclayton
Contributor

I thought we fixed it so that the NonInteractive loader did not have that value? That's terrible if so.

@mikedanese
Member

$ kube-scheduler 
W0817 10:31:26.182035   42084 client_config.go:391] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
W0817 10:31:26.182091   42084 client_config.go:396] error creating inClusterConfig, falling back to default config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
E0817 10:31:26.183182   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:410: Failed to list *api.ReplicationController: Get http://localhost:8080/api/v1/replicationcontrollers?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.183218   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:399: Failed to list *api.PersistentVolume: Get http://localhost:8080/api/v1/persistentvolumes?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.183220   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:395: Failed to list *api.Node: Get http://localhost:8080/api/v1/nodes?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.183310   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:415: Failed to list *extensions.ReplicaSet: Get http://localhost:8080/apis/extensions/v1beta1/replicasets?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.183517   42084 leaderelection.go:253] error retrieving endpoint: Get http://localhost:8080/api/v1/namespaces/kube-system/endpoints/kube-scheduler: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.183626   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:400: Failed to list *api.PersistentVolumeClaim: Get http://localhost:8080/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.183691   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:389: Failed to list *api.Pod: Get http://localhost:8080/api/v1/pods?fieldSelector=spec.nodeName%3D%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.183773   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:392: Failed to list *api.Pod: Get http://localhost:8080/api/v1/pods?fieldSelector=spec.nodeName%21%3D%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:26.184695   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:405: Failed to list *api.Service: Get http://localhost:8080/api/v1/services?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:27.184139   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:410: Failed to list *api.ReplicationController: Get http://localhost:8080/api/v1/replicationcontrollers?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:27.184141   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:395: Failed to list *api.Node: Get http://localhost:8080/api/v1/nodes?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:27.184307   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:399: Failed to list *api.PersistentVolume: Get http://localhost:8080/api/v1/persistentvolumes?resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused
E0817 10:31:27.184635   42084 reflector.go:216] k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:389: Failed to list *api.Pod: Get http://localhost:8080/api/v1/pods?fieldSelector=spec.nodeName%3D%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&resourceVersion=0: dial tcp [::1]:8080: getsockopt: connection refused

@smarterclayton
Contributor

Very useful...

@errordeveloper
Member Author

errordeveloper commented Aug 17, 2016

Yes, I recall there was a bit of mixed behaviour, but couldn't remember what it was exactly. Adding a --poll-kubeconfig sort of flag to all components would be nice, IMO... Or maybe we could treat an empty file in a special way or something. Alternatively, perhaps on systemd we could use ConditionPathExists (as @dgoodwin suggested) in kubelet.service and add a kubelet.path unit to watch for it and kick kubelet.service when it gets created. Let's make sure to discuss later, once we get a fix to this LGTM'ed.
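For reference, the systemd approach suggested above could look roughly like this (unit names and the kubeconfig path are illustrative, not something the project ships):

```ini
# /etc/systemd/system/kubelet.path — start kubelet.service once the
# kubeconfig appears
[Path]
PathExists=/var/lib/kubelet/kubeconfig

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/kubelet.service.d/wait.conf — drop-in so the
# service itself refuses to start before the file exists
[Unit]
ConditionPathExists=/var/lib/kubelet/kubeconfig
```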

@smarterclayton
Contributor

Here's the first draft:

$ kubelet --require-kubeconfig
failed to run Kubelet: invalid kubeconfig: stat /var/lib/kubelet/kubeconfig: no such file or directory

@smarterclayton
Contributor

If people would like a better message for the "hot-loop" case please let me know

@lukemarsden
Contributor

LGTM @smarterclayton

@mtaufen
Contributor

mtaufen commented Aug 17, 2016

@smarterclayton I have higher priorities today, so if #30798 merges first no worries. I'll rebase in the morning if necessary. Don't block on me.

k8s-github-robot pushed a commit that referenced this issue Aug 18, 2016
Automatic merge from submit-queue

Allow a flag that forces kubelet to have a valid kubeconfig

`--require-kubeconfig` forces the kubelet to use the kubeconfig for all
APIserver communication, and exit cleanly.  Allows cluster lifecycle to loop waiting for config to be available.

Fixes #30515

A follow up PR will handle the issue discovered where the DefaultCluster rules applied to kubeconfig allow a malicious party who can bind to localhost:8080 to take advantage of an admin misconfiguration.

@lukemarsden @mikedanese



```release-note
The Kubelet now supports the `--require-kubeconfig` option which reads all client config from the provided `--kubeconfig` file and will cause the Kubelet to exit with error code 1 on error. It also forces the Kubelet to use the server URL from the kubeconfig file rather than the `--api-servers` flag. Without this flag set, a failure to read the kubeconfig file would only result in a warning message.

In a future release, the value of this flag will be defaulted to `true`.
```