-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: add proposal for ConfigMap management #31701
docs: add proposal for ConfigMap management #31701
Conversation
Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test". This message will repeat several times in short succession due to jenkinsci/ghprb-plugin#292. Sorry. |
1 similar comment
Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test". This message will repeat several times in short succession due to jenkinsci/ghprb-plugin#292. Sorry. |
new ConfigMap into the Deployment, or 2) edit the original ConfigMap and | ||
trigger the deployment controller to start a new ReplicaSet (either by | ||
adding a dummy environment value in the pod template of the Deployment or | ||
by including the update as part of another update) losing the ability to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they change the config map name, they have to update the deployment pod template spec, which will trigger the deployment.
revisions. | ||
|
||
Currently, in order to apply updated configuration into a running app, users | ||
need to 1) either do it manually: copy the ConfigMap that holds their old |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kargakis What's the problem with (1)? When using kubectl apply, it's trivial to do, and it supports rolling update and rollback fairly nicely. Once we finish GC, the old ConfigMaps would be automatically deleted when the ReplicaSets referencing them were deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- sounds reasonable. User needs to manually create another ConfigMap,
kubectl apply
the Deployment to mount the new ConfigMap. We just need to let the existing controllers to set ConfigMap's ownerReference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should lay out clearly the conditions that would make a controller set an ownerRef on a config map, since that has the potential to delete user-created data unintentionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kargakis What's the problem with (1)? When using kubectl apply, it's trivial to do, and it supports rolling update and rollback fairly nicely. Once we finish GC, the old ConfigMaps would be automatically deleted when the ReplicaSets referencing them were deleted.
I thought we wanted a mechanism for managing ConfigMaps. By doing (1) (which is already happening, we just need to add owner refs) we just solve garbage collection of CMs. Users still need to manually update their Deployments. Is this what we really want for #22368? If we added the CM template into the Deployment then we would do something similar to what I am proposing here, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be really careful about the relationship here. A configmap is unlikely to be owned by a deployment in the classic sense (a primary configmap - the one that an admin defines that contains config that is used by many components). It's very likely instead that the configmap is intended to flow out to deployments, and so the deployments are subordinate to the configmap. That is why I don't think ownerRef is appropriate on the deployment.
I think it's more useful to think about this the other direction - "how do I flow configmap/secret changes out to 1..N entities" and preserve safe updates. That's the underlying use case, and so the steps need to reflect it. I want to change my config declaratively. The question about whether it should be automated (whether the complexity is worth it) is
a) how many times would an average user do this (change the config)
b) is there a better, simpler flow (apply from a checked in config)
c) is that better simpler flow "safer"?
I think my worry with the proposed 1 is that it is not safe - a naive user is likely to change the configmap and get burned when their app magically updates without a deployment. Can we better direct users to handle this?
Also, with apply+purge there's another possibility for 1 being messed up - the user renames the config map in their config directory from config-1 to config-2, which means apply+purge deletes config-1, which means that pods in the old deployment are now broken. The user has to leave config-1 in their source config until it is no longer used (which is a hard determination). A particular version of a configmap is something that needs to be available and immutable until it is no longer referenced by a pod, and I can't see a way to do that easily today. We would need ownerRef on that config to be owned by an instance of the deployment, not by the deployment.
EDIT: so one aspect of this proposal would be, how does a deployment guarantee that a correct version of a configmap hangs around until the old deployment is gone. Should the deployment itself be the one doing the copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would need ownerRef on that config to be owned by an instance of the deployment, not by the deployment.
That's already in the proposal. The original CM is owned by the Deployment, copies are owned by the underlying replica sets. Is there a better way to establish the relationship between the original CM and the Deployment without API changes? Should we start off by using an annotation instead of an owner reference?
I included immutable CM copies as part of the Future section. Is it something we should really address in this proposal? I agree on principle, just don't see it as a hard req.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think much of the argument is because people are looking at two different use cases:
- The config map is part of an app deployment (like flags). It is static during the lifetime of the pod. For example, an environment configuration ("this is a dev environment, use table foo-dev instead of foo-prod"). When updating the app, the deployment needs to create a new config map and new RC, and delete the old one. Users would never interact with the config map directly.
- The config map is a control surface. Running pods need to grab new config from the map. Users would update and roll back interactively.
The question is: Which of these problems are you trying to solve? Because the good solutions are much different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to address 1) - the config map is part of the deployment. We don't want running pods to pick up configuration different from the one they started with.
This seems like a lot of complexity for little gain. |
Complexity for end-users or us? From a user POV, how is the updateFile-createNewMap-editOrApplyDeployment flow less complex than a simple updateConfigMap flow? What's happening now: $ kc get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
registry 1 1 1 1 2m
$ kc get cm
NAME DATA AGE
registry 1 6m
# Need to update config file
$ vim config.yml
# Need to create new config map
$ kc create configmap registry-updated --from-file=config.yml
configmap "registry-updated" created
$ kc get cm
NAME DATA AGE
registry 1 7m
registry-updated 1 1m
# Need to edit deployment manually
$ kc edit deploy/registry What's proposed: # Assuming owner reference is set in cm/registry pointing to deployment/registry
$ kc get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
registry 1 1 1 1 20m
$ kc get rs
NAME DESIRED CURRENT READY AGE
registry-2863914628 0 0 0 16m
registry-3042501016 1 1 1 9m
$ kc get cm
NAME DATA AGE
registry 1 18m
registry-f74hh04 1 10m
# All I need is to edit the config map I created
$ kc edit cm/registry
configmap "registry" edited
# and a new replica set with the updated configuration is rolled out automatically
$ kc get rs
NAME DESIRED CURRENT READY AGE
registry-2863914628 0 0 0 16m
registry-3042501016 0 0 0 9m
registry-3279569508 1 1 0 5s
$ kc get cm
NAME DATA AGE
registry 1 18m
registry-f74hh04 1 10m
registry-n6g3s90 1 20s
We can argue if we want users to set owner references or not. The new controller should be able to facilitate this kind of flow for any kind of resource (petset, ds, ...) that needs to be triggered in response to an update in another resource (cm, secret, pvc, ...). |
Updated the proposal to use annotations in the triggered resource (Deployment) rather than owner references in ConfigMaps for establishing the relationship between the two. |
A couple of downstream issues asking about this: |
PoC that runs oc observe inside a pod at https://github.com/kargakis/configmap-rollout |
|
||
A new controller needs to watch for Deployments and ConfigMaps. Whenever it | ||
spots a Deployment with a `triggered-by` annotation and a mounted ConfigMap, | ||
it will create a copy of the ConfigMap and mount the copy in the Deployment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you programmatically mount a configmap into a deployment when the pod's containers specifically need to indicate where a configmap is mounted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the controller has to do is find out which configmap volume it needs to update. The easiest way I can think of is to have kubectl set trigger
take a container name and store it as a second annotation. Then the controller inspects all configmap volumes and updates the one that points to the container with the given name. The container should already have the volumeMount in place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest though I don't like the idea of a second annotation and I am trying to think for alternatives that don't involve api changes. Maybe the container name can be a part of the first annotation. Maybe the controller can identify by itself what's the right volume that needs to be updated by inspecting all the mounted config maps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move the resource type that serves as a trigger into the annotation key. Then kubectl set trigger
would set an annotation in the form of:
{resourceA}.kubernetes.io/triggered-by-{resourceB}: {containerAName}={resourceBName}
In our specific case a Deployment that has a configMap "foo" mounted in a container named "bar-container" would need the following annotation:
deployment.kubernetes.io/triggered-by-configmap: bar-container=foo
which would be set by
kubectl set trigger deploy/bar configmap/foo -c bar-container
Note that the annotation value format is influenced by kubectl set image
where we do something similar with containerName=imageName.
I'm unclear on the intent of the change and it seems to be getting pulled in a few different directions. For our user's we have 4-5 use-cases which require a similar model in terms of both re-use and deployment model to how env vars are implemented today. As in they have low re-use and they want to be deployed by spinning up a new pod with the changed file contents. We can't use env-vars because a lot of software wants its configuration to come through files. The existing kube implementation of configmap doesn't make this very easy to do. We can juggle configMap names, but it would be a whole lot easier just to slave these configMaps to the deployment both in terms of our manifests and change control mental model. |
GCE e2e build/test passed for commit 73af019. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick intro: I'm Steve, Google Cloud Seeding SRE in Munich. We've had the same discussion internally, and I'd love to help to make the "config map is part of app deployment" proposal happen.
deployment "bar" updated | ||
``` | ||
|
||
## Issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race conditions on update. When the data is part of an app deployment, the user would typically roll out the config map and the pods at the same time as part of a release. In this scheme, the user would update the config map and the deployment (probably in short succession by a script), and it's a race between the config map controller and the deployment controller for recreating pods. In the best-case scenario, we get two deployments (data + code) that run after each other in an indeterminate order. Compatibility issues between the two would be much worse than needed. In the worst case, Kubernetes just crashes because of the competing deployments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In such a case you need to pause your Deployment, that's why we have that feature.
Pause, update config, update image, perform any other pod template tweaks you may want, resume. In the future we want to start one-off deployments while they are paused so you will run truly manual deployments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re compatibility issues, you shouldn't use Rolling deployments if incompatibilities exist between your versions. In that case, you need Recreate deployments, which supposedly never runs two versions of a deployment at the same time. In a Recreate scenario with incompatibilities between versions, you don't need atomic updates since the controller should make sure that it runs one version at any point in time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also note that the config map controller is not creating pods. It just updates the deployment. The deployment controller is the single creator of a new replica set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"pause, update config, update image, resume" is not better than "create config map, update deployment, delete old config map". If I'm going to write that level of automation, I don't need the system proposed here.
Regarding incompatibilities: I think we're talking about two kinds of incompatibilities. You're probably thinking of pods that have a different external interface. I am considering the interface between pod and config map. In my view, the config map is an integral part of a version of the software, and rolled out together with it. This doesn't need to be a backwards- or forwards-compatible interface.
Let me explain a bit of background: We had a long and painful history with command-line flags at Google. During pushes, a new flag is introduced but not set correctly, or an old flag is removed but still set. We fixed this with the concept of "config MPM", where the flags would be bundled together with the software and rolled out as a unit. As config maps are Kubernetes' answer to flags (i.e., information that's not built into the image but injected by the deployment system), I think we should follow similar best practices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly, bundling the CM into the Deployment would suit your need well since you can atomically update and run a new deployment but I don't consider this solution optimal long-term because we would need to do the same thing for secrets, and I am worried about the usability of Deployments once we overload them with all configuration objects there are. Are you more worried about the garbage collection of old config maps or the atomicity that is missed if we don't bundle? In the latter case, I would expect manual deployments to be the best option eg. you always have a paused Deployment, run all the updates you want (new image, updated cm, ...), and start one-off rollouts (pausing and resuming is not a solution to manual deployments really).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an end user I need the ability to treat changes to configMaps, secrets as deployed changes so I get all of the goodness out of liveness probes, readiness probes and stopped deployments when I make changes to them. If you don't provide the facility in a usable way at the kube level then I'm forced to figure out how to manage it on top of kube.
Kube has historically been very good at treating pods as cattle not pets which has made for a very good experience for our stateless services (the bulk of our activity). I'd love to see this trend continue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mostly worried about the atomicity. The pause-update-unpause workflow doesn't really cut the cake because if I build that amount of workflow on top, I might as well skip using config maps altogether and use legacy systems like flags or building the config files into the image. I'd really love this design to succeed, and a simple workflow is paramount for success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't provide the facility in a usable way at the kube level then I'm forced to figure out how to manage it on top of kube.
That's the problem here. If we can't find a usable way of offering automatic deployments on configuration changes, we may need to decide if we really want to do it on the Kube level or let users build on top (that's why we also want feedback on tools like the observe
command: #5164).
Did you try the PoC I have in place for this proposal? I would like some feedback on it. There seem to be two different cases for configuration changes between deployments. 1) You turn on a flag which means only ConfigMap changes. My PoC covers that case. 2) You rollout a new version of your app and turn on/off flags appropriately ie. both CM and Deployment changes. Bundling the CM into the Deployment would offer an atomic way of doing 2 but I am not sure that moving app configuration as part of the Deployment will be usable long-term. Eventually, trying to edit a mass JSON/YAML file is not something we want users to be doing. Maybe small scoped changes either on the Deployment or in app configuration (ConfigMap, Secret) and the ability to run one-off deployments is the best solution for Kubernetes to offer at this point.
|
||
### Alternative solutions | ||
|
||
* Embed ConfigMap as a template in the Deployment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not so bad. We have two types of resources in Kubernetes: "first-level" resources like replica sets, config maps, secrets, etc., and "controlling" resources like PetSet, Deployment, and DaemonSet. We'd need to have a single message that templates all first-level resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want a solution that can be re-used across "controlling" resources w/o duplicating logic across API objects. Imagine that you are editing a Deployment with a PodTemplate, a ConfigMap that holds your configuration, and a Secret that holds your certificates, all inlined in the Deployment. From a user POV, this will be cumbersome to parse and maintain. You also can't easily share common configuration once it's inlined in a particular "controlling" resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a user's POV (and I am a Kubernetes user), the most confusing thing about K8s is the multitude of controllers that are trying to do magic in the background.
I understand your concerns about inlining and templating. Can we move this information into its own resource? e.g. a "ConfigMapTemplate"? Anything that doesn't have new magic connected with its update process would be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With respect to inlining, how are we going to solve multiple configs for multiple containers in a single Deployment? Multiple ConfigMapTemplates or the ConfigMapTemplate contains the config for every container? There are various patterns for mutlicontainer Deployments and furthermore some users already run embarassingly a lot of containers in single deployments (and we definitely cannot stop them from doing so). Config management for them would be a nightmare if we inlined everything.
I have been thiking about a separate object since the conception of this proposal but the bar for new API objects is pretty high at this point. My main concern is that we may end up with a solution for Deployments and down the road PetSets and DaemonSets will come up with different behavior for handling configuration updates. That's why I picked a new controller and an annotation instead of introducing new API, even though the annotation is essentially new API. It is much easier to deprecate though in case we decide we want something else before Deployments are ready for GA.
That being said, if we can't agree on a new controller, or inlining, or a new API object, I would prefer if we just handled the ConfigMap garbage collection in this proposal.
cc: @bprashanth @mikedanese @smarterclayton @bgrant0607 for opinions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @steve-wolter stated it pretty eloquently before that the two different uses around ConfigMaps, flags as files vs control surface, are distinct enough that they need different treatments as the capabilities for one won't solve the other.
At Box (in a separate system from kube) we lumped the two together and it has caused a tremendous amount of pain in the form of a non-unified canary mechanism for configuration and a pretty ineffective control surface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. I'd think that multiple ConfigMapTemplates would be fine from an API standpoint, since the semantics are much the same as for a single ConfigMapTemplate.
If adding top-level objects is the problem, maybe we can get away with using ConfigMap as a ConfigMapTemplate. This would work very much like your suggestion, but instead of creating an annotation on the updated object (ConfigMap), we add a list of objects to be copied to the deployment. In this case, at least the semantics would be clear and we wouldn't need to build the semantics into annotations. On the other hand, I'm not sure whether this would work with K8s' consistency model.
If you need any organizational support for helping to make the case, I think we could mobilize Google-on-GCP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steve-wolter the annotation goes into the Deployment and not the ConfigMap. You can think of these annotations as lists of objects. FWIW there are many examples of API that started off as annotations.
On the other hand, I'm not sure whether this would work with K8s' consistency model.
I am not sure why not. We already run image change triggers in OpenShift (same model as the one proposed in this PR) and while it took us a while to get them right, they work fine and give us the ability to provide automatic deployments in case of image changes.
Regarding pushing this forward, a solution for the issue at question (#22368) is not a high priority for most interested parties atm - you can join the sig-apps call (every Monday: https://github.com/kubernetes/community/tree/master/sig-apps) where we discuss all related work if you want to make a case for pushing it forward.
For the record, I am opposed to both new API types and to embedding ConfigMap in other resources, such as Deployment. Essentially the only thing this proposal would automate is the update of the reference from the pod spec to the new copy of the ConfigMap. Taking a step back, are users more likely to create ConfigMap yaml by hand, or using |
This is exactly what my proposal is about, albeit with an additional annotation in the Deployment for enabling the update.
Given that |
Most of box's configMaps are created by hand (copy paste) and then applied via an automated applier. @bgrant0607 Any chance I can understand the thought process behind not wanting to embed a configMap like object in deployments (more like directly in their volume) for use-cases that just want some env-var/flags equivalent that gets written out to files? All I really want is a clear path to configMap derived files being treated like an env/var/deployment change as opposed to be updated within an existing pod. |
It seems that you haven't understood the current proposal. We are not planning on making existing pods change configuration. I thought I pointed it out somewhere in my proposal but it seems I need to make it more explicit. |
I re-read it after all of the discussion and it seems clearer to me now. Is there any example of the json I would write to link the configMap to a deployment via a trigger? I assume it's just this annotation? deployment.kubernetes.io/triggered-by: configmap/foo |
Yes. You also need to have the configmap already mounted in the Deployment. |
@huggsboson you can use my PoC controller with any kind of Deployment-ConfigMap. Just add the annotation in the Deployment and try to update the ConfigMap. |
It seems like there are two ways to update volumes created from configMaps, the update in place and creating a new pod and tearing down the old one. It seems like the default behavior for most pods managed by Deployments should be create new, tear down old. Whereas most petsets would want to default to the update-in-place mechanism. With this proposal as it is we're asking users to do, admittedly small, work to get the correct default behavior for most use-cases I've seen. Can you help me understand if I'm missing something? |
There was some discussion in #13610 and #30716. The TL;DR is that an inline representation would add complexity to the API and to any tooling dealing with configmap, as well as to the implementation of controllers. We made ConfigMap a separate type deliberately. To achieve the apply flow you described, only 2 additional lines need to be changed in order to create a new ConfigMap and roll it out: the name of the ConfigMap being updated and the reference to it in the Deployment that consumes it. (Note that there must be some change to the Deployment spec in order to trigger a rollout and produce a new unique name for the new ReplicaSet.) The only thing missing from that approach is garbage collection of old ConfigMap objects when referencing ReplicaSets are all deleted. I'd like to use the server-side GC implementation to solve that problem. For someone generating a configmap from a file of a different syntax (e.g., K=V), I'd expect them to regenerate the configmap from a modified original file, much as they would for secrets. I agree it would reduce friction a little to provide a way to automatically generate the new configmap name and to update references to it, but I believe we have bigger fish to fry at the moment, such as finishing GC and controllerRef. |
For the record, having garbage collection for config maps would solve our problem, I think. We'd instruct our rollout automation to create a static config map per version, and rely on the garbage collection to clean the old versions up. |
cc @zefciu |
this proposal LGTM. Any sense of target milestone for it? |
This PR hasn't been active in 30 days. It will be closed in 59 days (Jan 26, 2017). You can add 'keep-open' label to prevent this from happening, or add a comment to keep it open another 90 days |
@Kargakis Closing this proposal. We should facilitate GC of config maps by adding ownerReferences for the appropriate ReplicaSets. We may need to make that optional somehow. |
A proposal for #22368
PoC: https://github.com/kargakis/configmap-rollout
@kubernetes/deployment @kubernetes/sig-apps
This change is