Configuration reconciliation (aka kubectl apply) #1702
Comments
/cc @jasonkuhrt
@bgrant0607 Thanks. I may contribute some thoughts primarily regarding use-cases at littleBits as we continue to experiment with Kubernetes/GKE. For now, it looks like my first pass at creating an issue in this space is well covered already!
I've been looking at implementing this and have a question on step 4, "Diff the specified object with an existing object in apiserver." I am at the point where I have two versioned objects to compare.

The naive approach is to just convert both objects to pkg/api objects (from the versioned v1beta* objects) and then either marshal to JSON and diff the two JSON documents, or compare the two Go objects directly. That brings us to our first issue: what about fields that are provided by the server but not expected to be locally available, such as Status (as opposed to Spec)? How do we make sure we compare the fields that are relevant, instead of those that are server-only or temporary? We could focus on the distinction made in https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/api-conventions.md and design diff to only compare fields named Spec and to ignore any objects that don't have a Spec field. But will third-party API object designers know to use a field named Spec if they want any diffing or reconciliation to happen? If we go this route, it seems that we would want to call that out a little more explicitly in api-conventions.md.

The second issue is what to recommend once a diff is produced. The reconciliation step could recommend a PUT to modify the resource, but many resource types may be immutable and/or have specific fields that are immutable. Ideally in that case the reconciliation could recommend the right action (PUT vs. DELETE/POST), but it would somehow need access to metadata which indicates the mutability properties of each type and field.

In theory both problems could be solved with some kind of metadata struct tag that indicates fields to reconcile and their mutability, but it would need to be generic, applied to all kube objects, and applicable to all custom API objects as well. If there already has been work on this, please point me in the right direction and I'll be happy to pick it up.
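A minimal sketch of the naive approach described here, comparing only the spec portions of two objects after generic JSON parsing (the function and payloads are illustrative, not the actual kubectl implementation):

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// specOf parses an object into a generic map and extracts its "spec"
// field, ignoring server-populated fields such as "status".
func specOf(obj []byte) (interface{}, error) {
	var m map[string]interface{}
	if err := json.Unmarshal(obj, &m); err != nil {
		return nil, err
	}
	return m["spec"], nil
}

func main() {
	local := []byte(`{"spec":{"replicas":2},"status":{"replicas":0}}`)
	live := []byte(`{"spec":{"replicas":3},"status":{"replicas":3}}`)

	a, _ := specOf(local)
	b, _ := specOf(live)

	// Only the specs are compared; status is irrelevant to reconciliation.
	fmt.Println("in sync:", reflect.DeepEqual(a, b)) // false: replicas 2 vs 3
}
```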
Fantastic! We shouldn't be converting objects to the internal representation in kubectl (#3955). We need to record the set of fields we want to merge in an annotation (#1201). Lots of things will be setting spec fields, not to mention metadata and status fields, so we just need to keep track of what the user cares about specifying declaratively.

Ideally we'd record the fields after applying general metadata propagation, such as namespace, apiVersion, labels, annotations, and name prefixes. Unfortunately, we currently do that after converting to the internal representation. However, since all objects support uniform metadata, we should be able to transform the metadata without fully parsing the rest of the object into specialized structs, though changes to pkg/runtime/types.go are likely required, since it currently only factors out TypeMeta.

If we parsed the JSON into a generic structure, as with enscope, we could traverse it and record a list of JSON paths. The union of the currently specified JSON paths and those recorded in the annotation on the object would indicate the set of fields to merge. The annotation is required in order to facilitate declarative field nullification. If we wanted to punt on that initially, it would just require that users take an extra step to nullify fields before deleting those fields from their configs entirely.

We'd then construct a patch set. The annotation containing fields to merge would be among the fields patched. Ideally, we'd literally just use patch, though we'd probably need to resolve the issues it currently has (#4008). For diff (for a reconcile dry run), we could use a similar approach as patch. I would also support server-side support for patch (as in HTTP PATCH) and diff.

As for which fields are mutable: we should appropriately annotate all the fields (#2970). kubectl should then be able to extract the information from the swagger description. Swagger 1.2 doesn't yet support readonly, but we can either extend the 1.2 spec in a non-standard way or embed the info in a standard field.

Note that you can let specific errors drive the process, too. For example, we could first POST, assuming the object didn't exist. If that failed because the object already existed, we could then try PUT. If that failed because it tried to change an immutable field, we could then DELETE, then POST.
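A sketch of the path-recording idea, assuming the config has been parsed into a generic map (the traversal and dotted-path format below are illustrative; no field-tracking format had been settled at this point):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// recordPaths walks a generically parsed JSON value and appends the
// dotted path of every leaf field the user specified.
func recordPaths(prefix string, v interface{}, out *[]string) {
	obj, ok := v.(map[string]interface{})
	if !ok || len(obj) == 0 {
		*out = append(*out, prefix)
		return
	}
	for k, child := range obj {
		path := k
		if prefix != "" {
			path = prefix + "." + k
		}
		recordPaths(path, child, out)
	}
}

func main() {
	config := []byte(`{"spec":{"replicas":3,"selector":{"app":"web"}}}`)
	var parsed map[string]interface{}
	if err := json.Unmarshal(config, &parsed); err != nil {
		panic(err)
	}

	var paths []string
	recordPaths("", parsed, &paths)
	// The union of these paths and those stored in the object's
	// annotation would form the set of fields to merge.
	fmt.Println(paths) // e.g. [spec.replicas spec.selector.app]
}
```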
BTW, I put a list of things that may mutate object fields here: #1502 (comment).
BTW, @ghodss, in case you missed it, I created a CLI roadmap: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/cli-roadmap.md

Let me know if you have any questions or feedback.
On rollingupdate: it should be possible to indicate declaratively that the user wants a rolling update instead of an in-place update for a replication controller. The user would change the name of the replication controller, update the uniquifying deployment-specific label, and change whatever else they wanted updated (e.g., a container's image tag). A rolling-update annotation on the new replication controller would specify the replication controllers to replace (e.g., via label selector) and the parameters accepted by rollingupdate: updatePeriod, pollInterval, and timeout (see the sketch below). We could also consider a separate pseudo-object rather than an annotation, but then we'd need to pair it with the corresponding replication controller before deciding how to update it.
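A sketch of what such an annotation might carry; the annotation key and field names are invented for illustration, since no format had been settled:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rollingUpdateSpec mirrors the parameters accepted by rollingupdate.
// Both this struct and the annotation key below are hypothetical.
type rollingUpdateSpec struct {
	Replaces     string `json:"replaces"` // label selector for the RCs to replace
	UpdatePeriod string `json:"updatePeriod"`
	PollInterval string `json:"pollInterval"`
	Timeout      string `json:"timeout"`
}

func main() {
	spec, _ := json.Marshal(rollingUpdateSpec{
		Replaces:     "deployment=v1",
		UpdatePeriod: "1m0s",
		PollInterval: "3s",
		Timeout:      "5m0s",
	})
	annotations := map[string]string{
		"kubectl.kubernetes.io/rolling-update": string(spec), // hypothetical key
	}
	fmt.Println(annotations)
}
```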
The original reason I wanted to indicate which fields were diff'able was the mutability problem, but that seems well handled by #2970. There is another idea here (stemming primarily from #1178) which involves using annotations to indicate which fields should be considered in a merge, to accommodate fields that may be changed outside of the tooling that's doing the diff. But instead of storing the set of fields to merge in an annotation, I believe we should just give the user a way to indicate which fields they care about comparing when they go to do the diff. It's a property of the diff action itself, not a property of the object. I suggest accomplishing this indication with explicit null values. Here's a quick example. Let's say we have an object that represents a very simplified replication controller type, along the lines of the sketch below.
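A minimal sketch of these semantics, using an invented simplified object: a field explicitly set to null in the config means "compare and clear," a field with a value means "compare against this value," and an absent field is ignored entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// The user's config: replicas should be 3, image should be cleared,
	// and any other fields on the live object are none of our business.
	config := []byte(`{"replicas": 3, "image": null}`)
	live := []byte(`{"replicas": 2, "image": "nginx:1.7", "labels": {"run": "web"}}`)

	var want, got map[string]interface{}
	if err := json.Unmarshal(config, &want); err != nil {
		panic(err)
	}
	if err := json.Unmarshal(live, &got); err != nil {
		panic(err)
	}

	for field, desired := range want {
		current := got[field]
		switch {
		case desired == nil && current != nil:
			fmt.Printf("- %s: %v (explicitly nulled, would be cleared)\n", field, current)
		case desired != nil && fmt.Sprint(current) != fmt.Sprint(desired):
			fmt.Printf("~ %s: %v -> %v\n", field, current, desired)
		}
	}
	// "labels" never appears in the config, so it is not diffed at all.
}
```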
Now let's say the user edits some of those fields in their config and asks for a diff. To summarize, the steps I imagine being taken to do a diff: parse the user's config generically, treat explicitly null fields as fields to compare (and clear), fetch the live object from the apiserver, and compare only the fields present in the config, ignoring everything else.
Please let me know what you think of this approach. I know this conversation has been going on for some time (#1502, #1178), but I think this approach gives us the right separation of concerns: we put it on the user to make sure all their tools are working properly together, and we just provide the functionality of comparing whatever their tooling wants to verify at any given time.
@bgrant0607 Any thoughts on this approach? Am I missing anything in terms of use cases?
This sounds amazing. This layer of interaction would provide a huge boost in confidence when trying to "update" resources in Kubernetes.

@ghodss @bgrant0607 How do we know that, during the time the user is viewing this diff, the resource has not changed behind their back? Is a locking mechanism required, wherein Kubernetes locks the resource while waiting for the diff to be confirmed by the user (or, for that matter, by other tools/machines built on top, etc.)?

Edit: on a quick second thought, naive locking doesn't make sense; you can't "lock" a live, constantly reconciled resource. Maybe I'm over-complicating things, but it might be relevant if the diff is based on HTTP SSE technology, wherein, should any significant resource changes occur while the diff is being reviewed, a new diff can be presented to the user. Furthermore, upon "confirming" the diff, Kubernetes would still have to ensure that the accepted diff is still the diff in effect by the time the user's response has reached the Kubernetes API. If Kubernetes detects that a resource changed in that sliver of time, it should re-present the new diff (or just error if the resource change is fatal, like a crashed pod).
@jasonkuhrt Kube already has resourceVersion, which supports optimistic concurrency: an update can be rejected if the object changed since it was read.
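A sketch of how resourceVersion-based optimistic concurrency could guard an apply, assuming a plain HTTP client against the apiserver (the URL and payload are illustrative): the update echoes the resourceVersion observed when the diff was computed, and a 409 Conflict means the object changed underneath, so the diff should be recomputed rather than any resource locked.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// putIfUnchanged sends an update whose body embeds the resourceVersion
// read when the diff was computed. The apiserver rejects the write with
// 409 Conflict if the object has since changed.
func putIfUnchanged(client *http.Client, url string, body []byte) error {
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusConflict {
		return fmt.Errorf("object changed since the diff was computed; re-diff and retry")
	}
	if resp.StatusCode >= 300 {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	obj := []byte(`{"metadata":{"resourceVersion":"42"},"spec":{"replicas":3}}`)
	err := putIfUnchanged(http.DefaultClient, "http://localhost:8080/api/v1beta3/namespaces/default/replicationcontrollers/web", obj)
	fmt.Println(err)
}
```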
@ghodss Cool, so what is your vision for refreshing or re-presenting diffs when there is some sort of underlying conflict/change?
A new method to compute a strategic merge patch from an original configuration, a modified configuration, and the current (live) configuration, using delete decorator semantics, has been submitted as #13820. The remaining steps are to store user-supplied configurations as annotations on the affected objects, then add the apply verb to kubectl, using the new method to compute the patch to be applied, with the configuration stored in the annotation as the original.

cc @bgrant0607, @ghodss
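A much-simplified sketch of the three-way idea for flat JSON objects (the real method in #13820 is strategic and type-aware, and also consults the current configuration for conflict detection; none of that is reproduced here): keys changed between original and modified are patched, and keys deleted from modified are emitted as explicit nulls.

```go
package main

import (
	"fmt"
	"reflect"
)

// threeWayPatch computes the fields to send: changed keys carry their
// new values, and keys removed from the modified config become explicit
// nulls so the server deletes them.
func threeWayPatch(original, modified map[string]interface{}) map[string]interface{} {
	patch := map[string]interface{}{}
	for k, v := range modified {
		if !reflect.DeepEqual(original[k], v) {
			patch[k] = v
		}
	}
	for k := range original {
		if _, ok := modified[k]; !ok {
			patch[k] = nil // delete marker
		}
	}
	return patch
}

func main() {
	original := map[string]interface{}{"replicas": 2, "image": "nginx:1.7"}
	modified := map[string]interface{}{"replicas": 3}

	fmt.Println(threeWayPatch(original, modified))
	// map[image:<nil> replicas:3] — bump replicas, delete image
}
```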
To finish implementing kubectl apply, I need a way to reference the previous configuration of an object, as supplied by the user (i.e., specifically not including changes made to the configuration by automated processes). One avenue that has been suggested is to store the configuration in an annotation. I've been somewhat naively assuming that we'd go this route, as noted in the previous comment. Thinking about the relative merits of the suggestion, however, and about how to implement it, a number of questions come to mind.
Some alternatives to using annotations also come to mind.
Ultimately, this problem exists because we don't have Configuration as a first-class resource. If we did, then it would be easy to watch for changes, and to implement creates, deletes, and updates using the controller pattern. The user would push a new Configuration, the Configuration controller would notice the change, compute the delta from the previous configuration using the new method from #13820, and then modify the resources accordingly (a sketch of that loop follows below).

For now, at least, the best approach is probably to focus first on using the annotation. If we find an appropriate annotation, we use it. If not, we fail. Once that's working correctly, we can think some more about where to put hooks to create the annotation automatically. Worst case, we hook the command-specific call sites to the helper methods cited above.
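A sketch of the controller pattern described here, with invented types (Configuration was never a real resource at this point, so everything below is hypothetical):

```go
package main

import "fmt"

// Configuration is a hypothetical first-class resource holding a set of
// desired objects keyed by name.
type Configuration struct {
	Objects map[string]string
}

// reconcile computes the delta between the previous and new Configuration
// and reports the actions a controller would take on each object.
func reconcile(prev, next Configuration) {
	for name, desired := range next.Objects {
		if current, ok := prev.Objects[name]; !ok {
			fmt.Println("create", name)
		} else if current != desired {
			fmt.Println("update", name) // e.g., via a three-way merge patch
		}
	}
	for name := range prev.Objects {
		if _, ok := next.Objects[name]; !ok {
			fmt.Println("delete", name)
		}
	}
}

func main() {
	prev := Configuration{Objects: map[string]string{"web-rc": "replicas: 2", "db-rc": "replicas: 1"}}
	next := Configuration{Objects: map[string]string{"web-rc": "replicas: 3"}}
	reconcile(prev, next) // update web-rc; delete db-rc
}
```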
The user can shoot themselves in the foot many ways. Deleting an annotation is just one of them.
That's certainly true, but it doesn't make it sit much easier in my mind. I'm looking forward to having a first-class Configuration resource, and to rewiring apply to take advantage of it.
I've submitted #14621 and #14626, which implement the kubectl apply command, and add calls to automatically set the annotation it uses to most call paths used to update configuration. See #14626 for details regarding the exceptions. With these pull requests, this issue should now be addressed, except for conflict detection in kubectl/pkg/util/strategicmergepatch/patch.go, which has a TODO and a clean insertion point.

cc @bgrant0607, @ghodss
If we don't find an appropriate annotation, we shouldn't fail. The only purpose of the annotation is to find fields that were formerly specified by config but have now been removed. If you don't have the annotation, you can simply assume that the only fields you want to control with config are the ones that are currently in the patch (not an unreasonable assumption, and likely the most common case), so you can just store the patch as that annotation. If there does happen to be a field that is left "over-specified," in the sense that it was specified in the previous config, is no longer in the current config, and is not getting reset back to its default state, a user can go clean that up manually.

Honestly, the whole annotation thing is just a UX enhancement and a nice-to-have. Puppet, for example, which manages a system's state through declarative config, just assumes that if you remove a reference to something (e.g., some RPM that you want installed on a machine), that RPM's state is simply no longer controlled by Puppet, and it leaves the RPM installed on the machine. If you actually want to clean up the RPM, you have to uninstall it manually, or refresh the machine's state with Puppet from scratch. The purpose of this is to allow you to have stuff on the machine that isn't controlled by Puppet (which is very similar to our rationale as well). We're implementing a nicety here by trying to track old fields you've removed, but really it should just be a "best effort" approach and not something considered vital in the config reconciliation process.
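A sketch of that fallback, assuming the annotation key kubectl eventually settled on (kubectl.kubernetes.io/last-applied-configuration): if the annotation is missing, the current config stands in as the original, so only currently specified fields are managed and no deletions are inferred.

```go
package main

import "fmt"

const lastApplied = "kubectl.kubernetes.io/last-applied-configuration"

// originalConfig returns the config to use as the "original" in the
// three-way patch. Without the annotation, the current config is used,
// so nothing is deleted — the best-effort behavior described above.
func originalConfig(annotations map[string]string, current []byte) []byte {
	if prev, ok := annotations[lastApplied]; ok {
		return []byte(prev)
	}
	return current
}

func main() {
	current := []byte(`{"spec":{"replicas":3}}`)

	withAnnotation := map[string]string{lastApplied: `{"spec":{"replicas":2,"paused":true}}`}
	fmt.Printf("%s\n", originalConfig(withAnnotation, current)) // previous config: "paused" would be deleted

	fmt.Printf("%s\n", originalConfig(nil, current)) // best effort: no deletions inferred
}
```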
@bgrant0607 Given that we now can and do update the annotation in other code paths via #14626, it would seem a shame to back out that behavior. Conflicts can be detected when the patch is computed, per the TODO in the CreateThreeWayMergePatch method, by diffing the patch against the current object with no additions. It's a small change and calls existing methods; I just didn't have time to get to it. Will add it shortly.
@ghodss That's how it behaves. It goes ahead and applies the patch, with no deletions.
I'm fine with updates to the annotation if it's already done and doesn't cause problems for people not using apply.
kubectl apply has been implemented. Yeah! Closing in favor of more specific issues.
A concrete example was discussed here: |
Forked from #1007 and #1325.
We want to support management of services via declarative configuration.
The configuration files written by the user may need to undergo multiple transformations before submitting them to the API, as discussed in #1694, but the result of these steps should be concrete objects. These objects may need to be created, updated, and/or deleted.
On an object-by-object basis, it should be possible via kubectl (via a client library) to diff a specified object against the existing object in the apiserver, and then to create, update, or delete the object so that it matches the configuration.
The configured fields should be recorded in the object via annotations, as discussed in #1178 and #1201. Only the previously and newly configured fields, labels, and annotations should be diff'ed and/or updated. Other fields from the existing object should be ignored/merged, and the set of configured fields should be updated with each update.
We ultimately do want to automate most of the orchestration/workflow involved in deployment. However, this is fairly complex, and we can't necessarily automate all of it in a domain-independent, use-case-independent manner. In general, a full deployment would likely be composed of multiple files, each containing multiple objects. These objects may have inter-dependencies (e.g., right now services must be created before their clients), rolling updates require special treatment (#1353), and application-specific rollout logic may be required. Therefore, I recommend we keep the more general deployment workflow separate from the basic primitive mechanism that reconciles a single object, and ensure that the two are composable. I'll file another issue for the workflow part.
Among other things, composition means that the invoker of the tool and/or library needs to be able to distinguish noop (e.g., object already exists and matches config), success, retryable failure, and permanent failure, so the overall workflow can decide what to do after each step.
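A sketch of the outcome classification that would make reconciliation composable (the type, constants, and policy function are invented for illustration):

```go
package main

import "fmt"

// Outcome classifies the result of reconciling a single object so a
// surrounding workflow can decide how to proceed.
type Outcome int

const (
	Noop             Outcome = iota // object already exists and matches config
	Success                         // object was created, updated, or deleted
	RetryableFailure                // e.g., a transient apiserver error or write conflict
	PermanentFailure                // e.g., validation failure or immutable-field change
)

// next encodes one plausible workflow policy for each outcome.
func next(o Outcome) string {
	switch o {
	case Noop, Success:
		return "proceed to the next object"
	case RetryableFailure:
		return "back off and retry this object"
	default:
		return "abort the workflow and surface the error"
	}
}

func main() {
	for _, o := range []Outcome{Noop, Success, RetryableFailure, PermanentFailure} {
		fmt.Println(next(o))
	}
}
```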
/cc @ghodss