-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for re-architecting apply #1028
Conversation
/assign |
@kubernetes/sig-cli-proposals Also cc: @kubernetes/sig-apps-proposals and @kubernetes/helm-maintainers .. I know there is some disparity between the way a |
/sub |
and optionally be able to show this to a user, whereas this conflict may not be detect with a PATCH | ||
operation. | ||
|
||
#### New approach |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not move apply to the server via a formal API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bgrant0607 asked a similar question.
My hope as that the steps in this proposal will make moving the logic to an API easier by reducing the complexity of the existing solution and making it a modular package that does not depend on cobra or other kubectl libraries to diff & update objects.
I'd be open to exposing the re-architected libraries as an API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because that's how we got strategic merge :P In the general case, the server can't know whether particular operations should be "replace" or "merge", and trying to capture all that flexibility in the API has created exactly the sort of complexity that motivated this doc. The alternative is just to do it client-side, where naturally there is more context available to be able to implement the exact semantics intended by the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any examples in the API schema where it shouldn't be obvious? In the few ordered lists we have (e.g., command, args), those need to be replaced. All other cases should be merged. In the case of no conflict between the configured state and the live state, a merge and a replacement should be the same.
Yeah library components are good. I just wanted to see it mentioned in the
doc so when the "apply on server" proposal gets thrown together I can point
back to that line.
…On Mon, Sep 11, 2017 at 3:36 PM, Phillip Wittrock ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In contributors/design-proposals/sig-cli/apply_v2.md
<#1028 (comment)>:
> +- Patch syntax is not human readable, making it hard for humans to evaluate the end state
+ that will result from the patch. `--dry-run` is nearly useless for understanding the
+ end state that will result from apply, and implementing a diff operation has been exceedingly
+ copmlex.
+- Generating a patch to make a change is more complicated than simply making the change to
+ the object read from the server. The apply command reads the live object from the server
+ and diffs it against the local configuration. Building a patch containing special merge
+ directives is a more complicated approach than modifying the object read from the server
+ and updating it with a PUT.
+- Patches provide weaker optimistic locking guarantees than PUT. When encountering an optimistic
+ lock failure in apply, it is easy to re-read the object and try to update it again. If the
+ apply command tries to modify a field that was modified since apply read the object, it will fail
+ and optionally be able to show this to a user, whereas this conflict may not be detect with a PATCH
+ operation.
+
+#### New approach
@bgrant0607 <https://github.com/bgrant0607> asked a similar question.
My hope as that the steps in this proposal will make moving the logic to
an API easier by reducing the complexity of the existing solution and
making it a modular package that does not depend on cobra or other kubectl
libraries to diff & update objects.
I'd be open to exposing the re-architected libraries as an API.
—
You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub
<#1028 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABG_pw1_1djB_9GgUcmUOF9q9MQFrHpZks5shYu6gaJpZM4PQjIn>
.
|
Server-side apply issue is kubernetes/kubernetes#17333 |
Also relevant are uses of both apply and patch in the declarative app mgmt doc: https://goo.gl/T66ZcD Umbrella apply issue: kubernetes/kubernetes#35234 Original apply proposal: kubernetes/kubernetes#1702 |
Original original issue: kubernetes/kubernetes#1178 |
Previous umbrella issue with design description: kubernetes/kubernetes#15894 |
+100. I've been slowly building up evidence to justify why I need to implement a new client-side "patch" as part of kubecfg. I'm happy to see that other people are sharing my pain :) |
#### New approach | ||
|
||
- Read the live object | ||
- Compare the live object to last-recorded and local files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you still plan to record last-applied as an annotation?
discussion of the versioning issues (apply v1alpha1, time passes, want to apply v1beta2) would be good as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this proposal does not change what is stored for any of the objects, and works exactly the same with the exception that it uses PUT instead of PATCH to perform the update. It might best to solve the issues around versioning in a server-side API. I imagine the server side API would type convert all 3 objects (recorded from annotation, local from request and remote from live-lookup) to the same version before performing the diff-merge-update.
|
||
- Read the live object | ||
- Compare the live object to last-recorded and local files | ||
- Update the fields on the live object that was read |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using what merge strategies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The merge strategy / behavior would be exactly the same as the current version of apply, though more easily extended due to code structuring.
@smarterclayton Good point. I will add a section about that in the doc and the opportunity to integrate from the server-side on the future. |
Create on PUT: kubernetes/kubernetes#2114 |
Also relevant: discriminated unions: kubernetes/kubernetes#35345 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion about PATCH vs. PUT, but it seems to me that this doesn't solve what I see as the 2 biggest problems:
- Using compiled-in API types in kubectl
- Client-side apply
version skew, as the client will depend on the server supporting a patch semantic. There | ||
is currently no versioning of the patch semantics, so the client cannot reliably know what | ||
patch semantics the server supports. | ||
- Patch syntax is not human readable, making it hard for humans to evaluate the end state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you expand on the readability? Is this discussing strategic merge patch? Is there special syntax in the typical case other than for deletions? Where has the deletion syntax been made unreadable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is strategic merge patch. I think there are 2 main problems, there are several different directives now (delete, delete-primitive, ordering), and how the patch will be applied is communicated out of band. There are still a number of cases that are not supported by strategic patch that may require additional directives to support if done client side (e.g. how do we change the merge key for a field)
Directives:
- deletion of complex objects from lists
- deletion of primitive objects from lists (looks different)
- absolute ordering of items in a list
Additional issues with readability:
- merge keys for lists are not contained in the patch itself - unclear from looking at a patch whether it is adding an item to a list that looks similar to another item, or if it is updating the existing item in a list
- merge strategy not clear - are items in a list being replace, or are they being merged, (or using retain keys)?
- server silently drops directives that it doesn't recognize, so they may or may-not be used by the server - e.g. ordering may be randomized even with a directive specifying the ordering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
retainKeys
is also a directive that you didn't mentioned above.
merge directives to the patch format on the server, and invoking them from the client. | ||
- Having to update both client and server to fix bugs introduces challenges to supporting | ||
version skew, as the client will depend on the server supporting a patch semantic. There | ||
is currently no versioning of the patch semantics, so the client cannot reliably know what |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would version it by adding new content types, which I believe we also do with proto encodings.
But discoverability of supported versions is a good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re. discoverability: It's possible to specify supported content types for each verb X resource via OpenAPI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add something about versioning the patch API in the future.
is currently no versioning of the patch semantics, so the client cannot reliably know what | ||
patch semantics the server supports. | ||
- Patch syntax is not human readable, making it hard for humans to evaluate the end state | ||
that will result from the patch. `--dry-run` is nearly useless for understanding the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is dry run useless? Because it is also buggy and incomplete? Because it is implemented in the client rather than in the server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't show the user what they really want to know - what will be the new object in the cluster after apply is run. It somewhat shows the changes being made, but without the context of what they are being made on.
- I would expect apply users care less about the patch format itself, and care more about the object in the cluster that will result from the patch.
- The patch output obfuscates validation errors caused by the values set by patch conflicting with other values set on the live object missing from the config (e.g. rolloutStrategy for Deployment)
- The patch itself is buggy and broken cases are hard to see just by looking at a patch (e.g. container ports merge key being incorrect)
- The server may apply the patch differently than the client, so running the patch locally against the file may result in something different than what the server does (e.g. the server doesn't support optional directives because it is skewed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I don't understand what apply's current patch shows me when I do --dry-run -o json
. It looks like it is just showing me the local object without applying the patch.
to update an object and frequently has to be expanded with new side-directives describing | ||
how the server should apply the patch. Updates are done in the form of adding new | ||
merge directives to the patch format on the server, and invoking them from the client. | ||
- Having to update both client and server to fix bugs introduces challenges to supporting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Links to exemplary bugs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me: links to bugs are below.
- Generating a patch to make a change is more complicated than simply making the change to | ||
the object read from the server. The apply command reads the live object from the server | ||
and diffs it against the local configuration. Building a patch containing special merge | ||
directives is a more complicated approach than modifying the object read from the server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True.
However:
- My impression is that most of the complexity is not in the directives but in the API schema and defaulting behaviors, as I mentioned in the server-side apply proposal.
- The schema needs to be accurately communicated to the client from the server (e.g., via extended OpenAPI) for client-side patch operations in order to support API aggregation and CRD, which re-introduces versioning and skew issues until the syntax stabilizes, but actually would improve skew behavior once it does.
- We need to support these patch behaviors independently of apply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression is that most of the complexity is not in the directives but in the API schema and defaulting behaviors, as I mentioned in the server-side apply proposal.
Constructing a set of directives to order and delete items from a list results in more complicated code than just deleting the objects from the list.
The schema needs to be accurately communicated to the client from the server (e.g., via extended OpenAPI) for client-side patch operations in order to support API aggregation and CRD, which re-introduces versioning and skew issues until the syntax stabilizes, but actually would improve skew behavior once it does.
Agreed. This proposal should also work independently of whether it is done on the client or server. The focus of this proposal is on rewriting the 3-way diff code to be maintainable. The inputs will be the 3 versions of the object and the output will be the new object.
We need to support these patch behaviors independently of apply.
Yes, but why do we need to use them in apply?
and diffs it against the local configuration. Building a patch containing special merge | ||
directives is a more complicated approach than modifying the object read from the server | ||
and updating it with a PUT. | ||
- Patches provide weaker optimistic locking guarantees than PUT. When encountering an optimistic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that's the case. I thought resourceVersion could be provided in patches.
- Compare the live object to last-recorded and local files | ||
- Update the fields on the live object that was read | ||
- Send a PUT to update the modified object | ||
- If encountering optimistic lock failure, retry after re-reading the object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically, GET, merge, and then try to PUT again.
testing challenges. We should instead modularize discrete responsibilities - collating the | ||
object values and updating the target object. | ||
|
||
#### Phase 1: Parse last-recorded, local, live objects and collate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
last-applied-configuration?
|
||
Bad: | ||
|
||
- Larger request sizes to the server than when sending a patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had also discussed sending the full configured resource decorated with deletion directives, which would have been slightly larger.
- are spread across 1000+ lines of existing code | ||
- Changes that add new directives require updates in multiple locations - create patch + apply patch | ||
|
||
### PRs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want overlays to work, so I think all of these problems still need to be solved for strategic merge patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. We should talk offline about this, since the requirements for overlays may be a subset of what is need for apply to work. The semantics of apply are a bit different (though there is a lot of overlap) since apply is merging 3 modified versions of the same object to produce an updated object. Whereas overlays do work in this context.
I'm not opposed to implementing a custom merge operation for apply if that's simpler than diff and patch, but I disagree that it would address most of the difficulties discussed in the motivation. |
- Introduce logic for new a merge strategy by defining a new *Visitor* implementation | ||
- Introduce logic on structuring of a field by updating the parsing function for that field type | ||
|
||
If the apply diff logic was redesigned, most of the preceding PRs could be implemented by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this would still change user-visible behavior. While technically bug fixes, we should consider how to communicate such changes to users, whether we need to retain prior behavior, and how we should transition to implementations with different behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this makes sense.
In some cases the fixes purely fix something that was broken or unspecified (e.g. maintaining ordering for lists when merging is strictly more correct than randomizing the ordering). In other cases I think it is pretty clearly an improvement, but the change should be communicated (e.g. deleting an item from a primitive lists now deletes it).
The cases more related to your point would be fixing an unintended / broken merge-key or merge-strategy.
|
||
Bad: | ||
|
||
- Larger request sizes to the server than when sending a patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be worth to point out, the size will be doubled, but I think it's fine.
Before: object in annotaion (1) + patch (delta) ~ 1
After: object in annotation (1) + new object (1) = 2
The motivation then didn't do a good job of conveying that the primary problem this is trying to solve is the maintainability and health of the codebase, and how the proposed changes are intended to improve this area. There seems to be consensus with the @apelisse and @mengqiy that the new architecture will be simpler to maintain. It might be simpler to explain in person with the code structure pulled up in an IDE. |
@bgrant0607 @smarterclayton I've tried to address your comments. LMK if there is anything else. |
/assign @apelisse |
@apelisse LMK if there is anything I should change since you are reviewing the code. |
@pwittrock Did you intend to push new updates? I don't see any commits newer than about a week old |
@bgrant0607 I not sure what changes to this proposal you are requesting before it is merged. |
`kubectl apply` reads a file or set of files, and updates the cluster state based off the file contents. | ||
It does a couple things: | ||
|
||
1. Create / Update / (Delete) the resource based on the resources |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the live resources based on the file contents
It does a couple things: | ||
|
||
1. Create / Update / (Delete) the resource based on the resources | ||
2. Only set / delete fields defined in the file either currently or in during last update from the file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update currently and previously configured fields, without clobbering fields set by other means, such as imperative kubectl commands, other deployment and management tools, admission controllers, initializers, horizontal and vertical autoscalers, operators, and other controllers.
1. Create / Update / (Delete) the resource based on the resources | ||
2. Only set / delete fields defined in the file either currently or in during last update from the file | ||
- Retain updates to fields made by controllers that were never managed in the file | ||
3. Merge fields based on the field name, merge lists based on a field contained within the list elements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this below. It's a detail of field updates.
3. Merge fields based on the field name, merge lists based on a field contained within the list elements | ||
|
||
The apply logic has been difficult to maintain from a contributor perspective, and difficult | ||
to understand from a user perspective. Much of this pain has comes from the `apply` command out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much of this pain has come from trying to paper over API schema complexity in an ad hoc incremental fashion using imperative patch directives.
The apply logic has been difficult to maintain from a contributor perspective, and difficult | ||
to understand from a user perspective. Much of this pain has comes from the `apply` command out | ||
growing its implementation. In order to continue to support apply and provide better user visibility | ||
into what it is doing, the command must be re-architected to break client dependencies on the server-side |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we agreed that client-server was not the main issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to make apply sufficiently maintainable and extensible to new API types, as well as to make its behavior more intuitive for users, the merge behavior, including how it is specified in the API schema, must be systematically redesigned and more thoroughly tested.
to update an object and frequently has to be expanded with new side-directives describing | ||
how the server should apply the patch. Updates are done in the form of adding new | ||
merge directives to the patch format on the server, and invoking them from the client. | ||
- Having to update both client and server to fix bugs introduces challenges to supporting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me: links to bugs are below.
|
||
Issues with updating the object by calculating and sending a patch to the server | ||
|
||
- Patch API is not fully featured enough to represent all of the operations required |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has been frequently expanded
It doesn't have to be this way.
- Patch API is not fully featured enough to represent all of the operations required | ||
to update an object and frequently has to be expanded with new side-directives describing | ||
how the server should apply the patch. Updates are done in the form of adding new | ||
merge directives to the patch format on the server, and invoking them from the client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, please rephrase. The schema MUST come from the server, in order to properly work with version skew, API aggregation, CRDs, etc.
merge directives to the patch format on the server, and invoking them from the client. | ||
- Having to update both client and server to fix bugs introduces challenges to supporting | ||
version skew, as the client will depend on the server supporting a patch semantic. There | ||
is currently no versioning of the patch semantics, so the client cannot reliably know what |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add something about versioning the patch API in the future.
@bgrant0607 I've updated the doc to be focussed only on the code refactoring. I added some bugs and called out the need to fix issues with the schema semantics as a separate item. |
- Compare the live object to last-applied and local files | ||
- Update the fields on the live object that was read | ||
- Send a PUT to update the modified object | ||
- If encountering optimistic lock failure, retry after re-reading the object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
retry back to step 1?
|
||
In order to make apply sufficiently maintainable and extensible to new API types, as well as to make its | ||
behavior more intuitive for users, the merge behavior, including how it is specified in the API schema, | ||
must be systematically redesigned and more thoroughly tested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was Brian :)
No further comments from me |
- define interfaces for encapuslating logic single-purpose in structs - use PUT instead of PATCH so we don't need to generate PATCH directives
Thanks. |
Automatic merge from submit-queue. |
No description provided.