Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for re-architecting apply #1028

Merged
merged 1 commit into from
Oct 4, 2017

Conversation

pwittrock
Copy link
Member

No description provided.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 8, 2017
@mengqiy
Copy link
Member

mengqiy commented Sep 8, 2017

/assign

@cblecker
Copy link
Member

cblecker commented Sep 8, 2017

@kubernetes/sig-cli-proposals

Also cc: @kubernetes/sig-apps-proposals and @kubernetes/helm-maintainers .. I know there is some disparity between the way a helm upgrade works and the way a kubectl apply does when modifying existing manifests. I wonder if this is an opportunity to more closely align them within the kubernetes ecosystem :). If not, I apologize for the spam.

@k8s-ci-robot k8s-ci-robot added sig/cli Categorizes an issue or PR as relevant to SIG CLI. kind/design Categorizes issue or PR as related to design. sig/apps Categorizes an issue or PR as relevant to SIG Apps. labels Sep 8, 2017
@shiywang
Copy link

shiywang commented Sep 9, 2017

/sub

and optionally be able to show this to a user, whereas this conflict may not be detect with a PATCH
operation.

#### New approach
Copy link
Contributor

@smarterclayton smarterclayton Sep 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not move apply to the server via a formal API?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bgrant0607 asked a similar question.

My hope as that the steps in this proposal will make moving the logic to an API easier by reducing the complexity of the existing solution and making it a modular package that does not depend on cobra or other kubectl libraries to diff & update objects.

I'd be open to exposing the re-architected libraries as an API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because that's how we got strategic merge :P In the general case, the server can't know whether particular operations should be "replace" or "merge", and trying to capture all that flexibility in the API has created exactly the sort of complexity that motivated this doc. The alternative is just to do it client-side, where naturally there is more context available to be able to implement the exact semantics intended by the client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any examples in the API schema where it shouldn't be obvious? In the few ordered lists we have (e.g., command, args), those need to be replaced. All other cases should be merged. In the case of no conflict between the configured state and the live state, a merge and a replacement should be the same.

@smarterclayton
Copy link
Contributor

smarterclayton commented Sep 12, 2017 via email

@bgrant0607
Copy link
Member

bgrant0607 commented Sep 12, 2017

Server-side apply issue is kubernetes/kubernetes#17333

@bgrant0607
Copy link
Member

Also relevant are uses of both apply and patch in the declarative app mgmt doc: https://goo.gl/T66ZcD

Umbrella apply issue: kubernetes/kubernetes#35234

Original apply proposal: kubernetes/kubernetes#1702

@bgrant0607
Copy link
Member

Original original issue: kubernetes/kubernetes#1178

@bgrant0607
Copy link
Member

Previous umbrella issue with design description: kubernetes/kubernetes#15894

@anguslees
Copy link
Member

+100. I've been slowly building up evidence to justify why I need to implement a new client-side "patch" as part of kubecfg. I'm happy to see that other people are sharing my pain :)

#### New approach

- Read the live object
- Compare the live object to last-recorded and local files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you still plan to record last-applied as an annotation?
discussion of the versioning issues (apply v1alpha1, time passes, want to apply v1beta2) would be good as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this proposal does not change what is stored for any of the objects, and works exactly the same with the exception that it uses PUT instead of PATCH to perform the update. It might best to solve the issues around versioning in a server-side API. I imagine the server side API would type convert all 3 objects (recorded from annotation, local from request and remote from live-lookup) to the same version before performing the diff-merge-update.


- Read the live object
- Compare the live object to last-recorded and local files
- Update the fields on the live object that was read
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using what merge strategies?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merge strategy / behavior would be exactly the same as the current version of apply, though more easily extended due to code structuring.

@pwittrock
Copy link
Member Author

@smarterclayton Good point. I will add a section about that in the doc and the opportunity to integrate from the server-side on the future.

@bgrant0607
Copy link
Member

Create on PUT: kubernetes/kubernetes#2114

@bgrant0607
Copy link
Member

Also relevant: discriminated unions: kubernetes/kubernetes#35345

Copy link
Member

@bgrant0607 bgrant0607 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion about PATCH vs. PUT, but it seems to me that this doesn't solve what I see as the 2 biggest problems:

  1. Using compiled-in API types in kubectl
  2. Client-side apply

version skew, as the client will depend on the server supporting a patch semantic. There
is currently no versioning of the patch semantics, so the client cannot reliably know what
patch semantics the server supports.
- Patch syntax is not human readable, making it hard for humans to evaluate the end state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on the readability? Is this discussing strategic merge patch? Is there special syntax in the typical case other than for deletions? Where has the deletion syntax been made unreadable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is strategic merge patch. I think there are 2 main problems, there are several different directives now (delete, delete-primitive, ordering), and how the patch will be applied is communicated out of band. There are still a number of cases that are not supported by strategic patch that may require additional directives to support if done client side (e.g. how do we change the merge key for a field)

Directives:

  • deletion of complex objects from lists
  • deletion of primitive objects from lists (looks different)
  • absolute ordering of items in a list

Additional issues with readability:

  • merge keys for lists are not contained in the patch itself - unclear from looking at a patch whether it is adding an item to a list that looks similar to another item, or if it is updating the existing item in a list
  • merge strategy not clear - are items in a list being replace, or are they being merged, (or using retain keys)?
  • server silently drops directives that it doesn't recognize, so they may or may-not be used by the server - e.g. ordering may be randomized even with a directive specifying the ordering

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retainKeys is also a directive that you didn't mentioned above.

merge directives to the patch format on the server, and invoking them from the client.
- Having to update both client and server to fix bugs introduces challenges to supporting
version skew, as the client will depend on the server supporting a patch semantic. There
is currently no versioning of the patch semantics, so the client cannot reliably know what
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would version it by adding new content types, which I believe we also do with proto encodings.

But discoverability of supported versions is a good point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. discoverability: It's possible to specify supported content types for each verb X resource via OpenAPI

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add something about versioning the patch API in the future.

is currently no versioning of the patch semantics, so the client cannot reliably know what
patch semantics the server supports.
- Patch syntax is not human readable, making it hard for humans to evaluate the end state
that will result from the patch. `--dry-run` is nearly useless for understanding the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is dry run useless? Because it is also buggy and incomplete? Because it is implemented in the client rather than in the server?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't show the user what they really want to know - what will be the new object in the cluster after apply is run. It somewhat shows the changes being made, but without the context of what they are being made on.

  • I would expect apply users care less about the patch format itself, and care more about the object in the cluster that will result from the patch.
  • The patch output obfuscates validation errors caused by the values set by patch conflicting with other values set on the live object missing from the config (e.g. rolloutStrategy for Deployment)
  • The patch itself is buggy and broken cases are hard to see just by looking at a patch (e.g. container ports merge key being incorrect)
  • The server may apply the patch differently than the client, so running the patch locally against the file may result in something different than what the server does (e.g. the server doesn't support optional directives because it is skewed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I don't understand what apply's current patch shows me when I do --dry-run -o json. It looks like it is just showing me the local object without applying the patch.

to update an object and frequently has to be expanded with new side-directives describing
how the server should apply the patch. Updates are done in the form of adding new
merge directives to the patch format on the server, and invoking them from the client.
- Having to update both client and server to fix bugs introduces challenges to supporting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Links to exemplary bugs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me: links to bugs are below.

- Generating a patch to make a change is more complicated than simply making the change to
the object read from the server. The apply command reads the live object from the server
and diffs it against the local configuration. Building a patch containing special merge
directives is a more complicated approach than modifying the object read from the server
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True.

However:

  • My impression is that most of the complexity is not in the directives but in the API schema and defaulting behaviors, as I mentioned in the server-side apply proposal.
  • The schema needs to be accurately communicated to the client from the server (e.g., via extended OpenAPI) for client-side patch operations in order to support API aggregation and CRD, which re-introduces versioning and skew issues until the syntax stabilizes, but actually would improve skew behavior once it does.
  • We need to support these patch behaviors independently of apply.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My impression is that most of the complexity is not in the directives but in the API schema and defaulting behaviors, as I mentioned in the server-side apply proposal.

Constructing a set of directives to order and delete items from a list results in more complicated code than just deleting the objects from the list.

The schema needs to be accurately communicated to the client from the server (e.g., via extended OpenAPI) for client-side patch operations in order to support API aggregation and CRD, which re-introduces versioning and skew issues until the syntax stabilizes, but actually would improve skew behavior once it does.

Agreed. This proposal should also work independently of whether it is done on the client or server. The focus of this proposal is on rewriting the 3-way diff code to be maintainable. The inputs will be the 3 versions of the object and the output will be the new object.

We need to support these patch behaviors independently of apply.

Yes, but why do we need to use them in apply?

and diffs it against the local configuration. Building a patch containing special merge
directives is a more complicated approach than modifying the object read from the server
and updating it with a PUT.
- Patches provide weaker optimistic locking guarantees than PUT. When encountering an optimistic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's the case. I thought resourceVersion could be provided in patches.

- Compare the live object to last-recorded and local files
- Update the fields on the live object that was read
- Send a PUT to update the modified object
- If encountering optimistic lock failure, retry after re-reading the object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, GET, merge, and then try to PUT again.

testing challenges. We should instead modularize discrete responsibilities - collating the
object values and updating the target object.

#### Phase 1: Parse last-recorded, local, live objects and collate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last-applied-configuration?


Bad:

- Larger request sizes to the server than when sending a patch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had also discussed sending the full configured resource decorated with deletion directives, which would have been slightly larger.

- are spread across 1000+ lines of existing code
- Changes that add new directives require updates in multiple locations - create patch + apply patch

### PRs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want overlays to work, so I think all of these problems still need to be solved for strategic merge patch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We should talk offline about this, since the requirements for overlays may be a subset of what is need for apply to work. The semantics of apply are a bit different (though there is a lot of overlap) since apply is merging 3 modified versions of the same object to produce an updated object. Whereas overlays do work in this context.

@bgrant0607
Copy link
Member

I'm not opposed to implementing a custom merge operation for apply if that's simpler than diff and patch, but I disagree that it would address most of the difficulties discussed in the motivation.

- Introduce logic for new a merge strategy by defining a new *Visitor* implementation
- Introduce logic on structuring of a field by updating the parsing function for that field type

If the apply diff logic was redesigned, most of the preceding PRs could be implemented by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this would still change user-visible behavior. While technically bug fixes, we should consider how to communicate such changes to users, whether we need to retain prior behavior, and how we should transition to implementations with different behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this makes sense.

In some cases the fixes purely fix something that was broken or unspecified (e.g. maintaining ordering for lists when merging is strictly more correct than randomizing the ordering). In other cases I think it is pretty clearly an improvement, but the change should be communicated (e.g. deleting an item from a primitive lists now deletes it).

The cases more related to your point would be fixing an unintended / broken merge-key or merge-strategy.


Bad:

- Larger request sizes to the server than when sending a patch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth to point out, the size will be doubled, but I think it's fine.
Before: object in annotaion (1) + patch (delta) ~ 1
After: object in annotation (1) + new object (1) = 2

@pwittrock
Copy link
Member Author

pwittrock commented Sep 26, 2017

@bgrant0607

I'm not opposed to implementing a custom merge operation for apply if that's simpler than diff and patch, but I disagree that it would address most of the difficulties discussed in the motivation.

The motivation then didn't do a good job of conveying that the primary problem this is trying to solve is the maintainability and health of the codebase, and how the proposed changes are intended to improve this area.

There seems to be consensus with the @apelisse and @mengqiy that the new architecture will be simpler to maintain. It might be simpler to explain in person with the code structure pulled up in an IDE.

@pwittrock
Copy link
Member Author

@bgrant0607 @smarterclayton I've tried to address your comments. LMK if there is anything else.

@pwittrock
Copy link
Member Author

/assign @apelisse

@pwittrock
Copy link
Member Author

@apelisse LMK if there is anything I should change since you are reviewing the code.

@bgrant0607
Copy link
Member

@pwittrock Did you intend to push new updates? I don't see any commits newer than about a week old

@pwittrock
Copy link
Member Author

@bgrant0607 I not sure what changes to this proposal you are requesting before it is merged.

`kubectl apply` reads a file or set of files, and updates the cluster state based off the file contents.
It does a couple things:

1. Create / Update / (Delete) the resource based on the resources
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the live resources based on the file contents

It does a couple things:

1. Create / Update / (Delete) the resource based on the resources
2. Only set / delete fields defined in the file either currently or in during last update from the file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update currently and previously configured fields, without clobbering fields set by other means, such as imperative kubectl commands, other deployment and management tools, admission controllers, initializers, horizontal and vertical autoscalers, operators, and other controllers.

1. Create / Update / (Delete) the resource based on the resources
2. Only set / delete fields defined in the file either currently or in during last update from the file
- Retain updates to fields made by controllers that were never managed in the file
3. Merge fields based on the field name, merge lists based on a field contained within the list elements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this below. It's a detail of field updates.

3. Merge fields based on the field name, merge lists based on a field contained within the list elements

The apply logic has been difficult to maintain from a contributor perspective, and difficult
to understand from a user perspective. Much of this pain has comes from the `apply` command out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much of this pain has come from trying to paper over API schema complexity in an ad hoc incremental fashion using imperative patch directives.

The apply logic has been difficult to maintain from a contributor perspective, and difficult
to understand from a user perspective. Much of this pain has comes from the `apply` command out
growing its implementation. In order to continue to support apply and provide better user visibility
into what it is doing, the command must be re-architected to break client dependencies on the server-side
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we agreed that client-server was not the main issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to make apply sufficiently maintainable and extensible to new API types, as well as to make its behavior more intuitive for users, the merge behavior, including how it is specified in the API schema, must be systematically redesigned and more thoroughly tested.

to update an object and frequently has to be expanded with new side-directives describing
how the server should apply the patch. Updates are done in the form of adding new
merge directives to the patch format on the server, and invoking them from the client.
- Having to update both client and server to fix bugs introduces challenges to supporting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me: links to bugs are below.


Issues with updating the object by calculating and sending a patch to the server

- Patch API is not fully featured enough to represent all of the operations required
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has been frequently expanded

It doesn't have to be this way.

- Patch API is not fully featured enough to represent all of the operations required
to update an object and frequently has to be expanded with new side-directives describing
how the server should apply the patch. Updates are done in the form of adding new
merge directives to the patch format on the server, and invoking them from the client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, please rephrase. The schema MUST come from the server, in order to properly work with version skew, API aggregation, CRDs, etc.

merge directives to the patch format on the server, and invoking them from the client.
- Having to update both client and server to fix bugs introduces challenges to supporting
version skew, as the client will depend on the server supporting a patch semantic. There
is currently no versioning of the patch semantics, so the client cannot reliably know what
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add something about versioning the patch API in the future.

@pwittrock
Copy link
Member Author

@bgrant0607 I've updated the doc to be focussed only on the code refactoring. I added some bugs and called out the need to fix issues with the schema semantics as a separate item.

- Compare the live object to last-applied and local files
- Update the fields on the live object that was read
- Send a PUT to update the modified object
- If encountering optimistic lock failure, retry after re-reading the object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retry back to step 1?


In order to make apply sufficiently maintainable and extensible to new API types, as well as to make its
behavior more intuitive for users, the merge behavior, including how it is specified in the API schema,
must be systematically redesigned and more thoroughly tested.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was Brian :)

@smarterclayton
Copy link
Contributor

No further comments from me

- define interfaces for encapuslating logic single-purpose in structs
- use PUT instead of PATCH so we don't need to generate PATCH directives
@bgrant0607
Copy link
Member

Thanks.
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 4, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue.

@k8s-github-robot k8s-github-robot merged commit 361b627 into kubernetes:master Oct 4, 2017
danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/cli Categorizes an issue or PR as relevant to SIG CLI. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.