Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kubelet] Optionally consume configuration from <node-name> named config maps #30090

Merged
merged 4 commits into from
Aug 23, 2016

Conversation

mtaufen
Copy link
Contributor

@mtaufen mtaufen commented Aug 4, 2016

This extends the Kubelet to check the API server for new node-specific config, and exit when it finds said new config.

/cc @kubernetes/sig-node @mikedanese @timstclair @vishh

Release note:

Extends Kubelet with Alpha Dynamic Kubelet Configuration. Please note that this alpha feature does not currently work with cloud provider auto-detection.

This change is Reviewable

@mtaufen mtaufen added area/kubelet do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. labels Aug 4, 2016
@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Aug 4, 2016
@mtaufen mtaufen force-pushed the dynamic-kubelet-restart branch from 3f73635 to c6a56a6 Compare August 4, 2016 19:29
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 4, 2016
@kfox1111
Copy link

kfox1111 commented Aug 4, 2016

Couple of things.

  1. It looks like its looking in the default namespace. It probably should be looking in kube-system. This probably applies only if the next thing isn't done.
  2. Doing per node configmaps is quite flexible, it will make for a large number of configmaps to manage.

Can it be changed to do the following:
Look for a node attribute on the given node's attributes named "kubelet.kubernetes.io/config".
If it doesn't exist, the configmap name to use is "kube-system/kubelet-default"
If it does exist, the value of the attribute is used as the configmap name.

This allows all or some kubelets to share configmaps in groups, or each individual kubelet to have its own.

@mtaufen
Copy link
Contributor Author

mtaufen commented Aug 5, 2016

@vishh and I recently talked about our options in this area. We would rather stick to per-node config internally, and build external automation to manage updating the per-node configs based on the attributes, etc. on a given node. Global config (and shared config is just a smaller case of global config) is dangerous because the Kubelets will almost immediately take up new config when they see it change. If the Kubelets are directly looking for a shared config, they will all change to that new config as soon as the configmap changes. In this scenario, if someone makes a bad change to that shared config, they could easily knock out a huge chunk of a cluster (or the whole thing, if some shared config ends up being global).

There's nothing wrong with having a configmap that acts as the source of config for several nodes, but per-node config forces people to think about building rollout automation if they want shared config objects, and this is a good thing.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 5, 2016
@mtaufen
Copy link
Contributor Author

mtaufen commented Aug 8, 2016

@kfox1111 you're spot on about kube-system being the right namespace for these. Just updated this PR to reflect that.

@kfox1111
Copy link

kfox1111 commented Aug 8, 2016

@mtaufen I see what you mean about the per node config. But, I think you can get the same affect with less storage by following a pattern of uploading the new config with a unique name, then updating the attributes on the nodes one at a time to point to it.

@vishh
Copy link
Contributor

vishh commented Aug 8, 2016

If global configs are highly desirable, then we need to add some more logic
into config maps.
This idea was suggested by @thockin.
Each config map can contain an old and a new configuration and a rollout
percentage. Nodes will self-sort themselves based on a consistent hash. As
the rollout percentage increases from 0 - 100%, nodes will self-update
themselves based on the percentage number that the hashing scheme puts them
in.
This has the advantage of working for most users who typically do not have
diverse deployments that require custom kubelet configurations. We also
avoid storing a lot of duplicate config objects.
The disadvantage is that we are building a new concept on top of config
maps and we will need additional tooling and automation to make the
experience seamless for end users. I suspect that we might be better off
adding a top-level kubelet config object instead.
Also if we are going down the route of tooling and automation, the burden
of managing per-node configs can also be handled the same way. As for
duplicate storage, may be we could add something similar to
LocalObjectReference to ConfigMaps to re-use config objects instead of
duplicating them.

@kfox1111
Copy link

kfox1111 commented Aug 8, 2016

@mtaufen thanks for fixing the namespace issue. it looks good now.

@thockin
Copy link
Member

thockin commented Aug 9, 2016

@mtaufen talked me out of a single global map.

@thockin thockin removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Aug 9, 2016
@mtaufen mtaufen force-pushed the dynamic-kubelet-restart branch from 036bc46 to d82f1a8 Compare August 10, 2016 17:20
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 10, 2016
@mtaufen
Copy link
Contributor Author

mtaufen commented Aug 10, 2016

@kfox1111 that is a very good point.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 11, 2016
@mtaufen mtaufen force-pushed the dynamic-kubelet-restart branch from 80379a5 to e7f5ecb Compare August 11, 2016 18:29
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 11, 2016
@mtaufen mtaufen force-pushed the dynamic-kubelet-restart branch from c759f46 to bd90e1a Compare August 11, 2016 20:52
@mtaufen mtaufen force-pushed the dynamic-kubelet-restart branch from 709945c to 82cca35 Compare August 16, 2016 18:01
@vishh
Copy link
Contributor

vishh commented Aug 23, 2016

:lgtm:


Review status: 0 of 4 files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


cmd/kubelet/app/server.go, line 333 [r4] (raw file):

Previously, mtaufen (Michael Taufen) wrote…

I'm digging into it now, and it looks like there's a chicken-egg problem between building the Kubelet object and getting auto-detected cloud provider information. cAdvisor does the auto-detection (kl.cadvisor.MachineInfo() in GetCachedMachineInfo() in pkg/kubelet/kubelet_cadvisor.go), and cAdvisor appears to be started after the Kubelet object is constructed (initializeRuntimeDependentModules(), in pkg/kubelet/kubelet.go, is a Kubelet method). This is a problem because, in order to use new remote config, the Kubelet needs to download it before construcing the Kubelet object.

Even if we could feed the cloud provider information back into the config sync loop after cAdvisor comes up, that would only allow us to detect updates to config. We still wouldn't be able to bring down that new config after restarting. So such a feedback mechanism would just result in a crash loop.

If we can't find a solution by Monday, we might have to just document that dynamic Kubelet configuration doesn't presently work with auto-detect cloud provider, until we figure out a solution. I will keep noodling on this.

Let's document this for now.

Comments from Reviewable

@mtaufen mtaufen force-pushed the dynamic-kubelet-restart branch from 05ac5cd to 76810b4 Compare August 23, 2016 05:46
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2016
@mtaufen mtaufen added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Aug 23, 2016
@mtaufen
Copy link
Contributor Author

mtaufen commented Aug 23, 2016

cmd/kubelet/app/server.go, line 333 [r4] (raw file):

Previously, vishh (Vish Kannan) wrote…

Let's document this for now.

Added a release note.

Comments from Reviewable

@mtaufen mtaufen added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 23, 2016
@mtaufen mtaufen force-pushed the dynamic-kubelet-restart branch from 76810b4 to e780bb5 Compare August 23, 2016 14:42
@mtaufen mtaufen added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 23, 2016
@k8s-bot
Copy link

k8s-bot commented Aug 23, 2016

GCE e2e build/test passed for commit e780bb5.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Aug 23, 2016

GCE e2e build/test passed for commit e780bb5.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit fe808ec into kubernetes:master Aug 23, 2016
@j3ffml j3ffml mentioned this pull request Aug 23, 2016
@vishh
Copy link
Contributor

vishh commented Aug 23, 2016

Yayyy!!!

On Tue, Aug 23, 2016 at 9:26 AM, Kubernetes Submit Queue <
notifications@github.com> wrote:

Merged #30090 #30090.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#30090 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AGvIKOFpZPBawZPIgeYE0AOiiGp1vZD_ks5qix8sgaJpZM4JdAvp
.

@lavalamp
Copy link
Member

kubelet test went red after this merged.

mikedanese added a commit to mikedanese/kubernetes that referenced this pull request Aug 23, 2016
…let-restart"

This reverts commit fe808ec, reversing
changes made to f297ea9.
mtaufen added a commit to mtaufen/kubernetes that referenced this pull request Aug 24, 2016
k8s-github-robot pushed a commit that referenced this pull request Aug 24, 2016
Automatic merge from submit-queue

Revert revert 30090 with fix

This reverts #31297 (which originally reverted #30090) and applies a fix to stop the fd leak that was exposed by #30090.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubelet lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.