API should differentiate between lists to merge and lists to replace in reconciliation #4889

ghodss · 2015-02-27T07:13:00Z

As part of #1702, we would like to use declarative configuration to maintain the state of a Kubernetes cluster. We do this by issuing the local objects to the server as JSON HTTP PATCH requests. As per the JSON merge patch spec, all maps are merged together, keys that are omitted from the patch are left untouched in the document, and if you want to delete a key you simply use key: null. This is exactly what we want in config reconciliation. However, lists present a special challenge in that the spec says they should always be replaced. Often that isn't what we want.

Let's say we start with the following Pod:

spec:
  containers:
    - name: nginx
      image: nginx-1.0

...and we POST that to the server. Then a controller comes along and adds another container to this pod before it starts up. (Not possible today but likely a scenario we will want to support.)

spec:
  containers:
    - name: nginx
      image: nginx-1.0
    - name: log-tailer
      image: log-tailer-1.0

If you try to apply the local config as a JSON PATCH to the remote config, you should get a no-op: the container that we've specified (nginx) still exists in the server, and extra containers inside the containers field are not of concern to us (and we should not clobber them). However, there is no way to programatically see if this is a list that should be merged versus a list that should not be merged. Here is an example of a list that should not be merged (the ExecAction API object), and instead should be fully replaced when it is modified:

command:
  - nginx
  - -t
  - -c
  - /etc/nginx.conf

If the remote server were to look like this...

command:
  - httpd
  - -X
  - -f
  - /etc/httpd.conf

...then if I were to patch the remote server with the local config, it should NOT merge these lists together - it should actually replace the command field wholesale with the one on disk.

The semantic knowledge of which lists should be merged like maps versus which lists should be replaced completely belongs in the API server. There are two ways to accomplish this.

Change mergeable lists to maps

If a list is mergeable, we should make it a map in the API server of a key to the objects. See a thorough discussion of this alternative to lists in #2004 (comment). The above example with pod spec would transform to this:

spec:
  containers:
    nginx:
      image: nginx-1.0

And the remote side would look like this:

spec:
  containers:
    nginx:
      image: nginx-1.0
    log-tailer:
      image: log-tailer-1.0

This enables us to merge the maps properly. If you wanted to actually delete the nginx container, you would put nginx: null in the local patch. (We have a design which will automatically generate the correct nulls of things you've deleted based on an annotation containing your previously submitted config - this design will be written into #1702.)

The big advantage of this method though is that it's really obvious what gets merged and what doesn't (lists are always replaced, maps are always merged) and we can use a standard JSON PATCH library to accept patches on the server and we can directly conform to the JSON merge patch spec. A side benefit is that the list is readily accessible in a meaningful way (indexed by primary key) which avoids manually indexing lists all the time.

The downside (as outlined in @jbeda's comment) is that the field to use as the key for the map is not always obvious. We can implement it as a swagger annotation to mitigate that. The bigger issue is that if a struct has no other required fields besides the map field, it can get awkward. For example, the PublicIP's field of the service spec is likely something we would want to merge. It's a []string which looks like this:

spec:
 publicIPs:
    - 67.1.1.10
    - 67.1.1.11

If we made it a mergeable map (since you may want someone to inject a public IP into a service), then we'd have to change it to look like something like this:

spec:
 publicIPs:
    67.1.1.10:
      ip: 67.1.1.10
    67.1.1.11:
      ip: 67.1.1.11

There is an optimization we could do where IP would be automatically filled by API server as a default, which would make it look like this:

spec:
 publicIPs:
    67.1.1.10: {}
    67.1.1.11: {}

But in any case it's still a bit ugly. There may be other ways to beautify it and/or solve this problem - very open to suggestions here.

Swagger annotation to tell API server which lists should be merged

The alternative is to do the merge determination more dynamically. When API server gets a PATCH, it introspects into the lists and checks against a Swagger annotation to see where the list should be replaced or merged. If it's one that should be merged, it will need a mergeKey which tells it what key to use to do the merging. In the above case of containers, it would be name, but in the publicIP's case it would just uniqify and merge the values themselves. While this lets us not change the API, this requires us to implement and maintain a custom JSON PATCH implementation which will not be fun. It also goes against POLA when your PATCH request will not behave as the spec defines.

What should we do?

Neither of these options are fantastic, but I think we need to implement this distinction as part of the API. Please discuss and detail other options if you can think of any.

@bgrant0607 @smarterclayton @derekwaynecarr @mikedanese

The text was updated successfully, but these errors were encountered:

bgrant0607 · 2015-02-27T07:25:21Z

Another example: Defaulted / finalized / auto-set fields in structs in lists would get blown away. It's not just an issue with the spec. We need some generic mechanism to identify which elements in the config correspond to which elements in the live object.

@thockin, in case he's interested, since map vs. lists has come up in other issues

bgrant0607 · 2015-02-27T07:51:46Z

Exhaustive list of lists from v1beta3:

ExecAction.Command and Container.Command: Should not be merged, so should remain as is.
Capabilities.Add and Capabilities.Drop: Likely we will want to merge this.
ServiceSpec.PublicIPs: Definitely defaulting if empty should be done, but it's possible we may want to merge this.
Container.Ports: Needs to be merged. Existing key is name, but we're planning to change that to a description. The port itself could be a key.
Container.Env: Needs to be merged. The name field would be the key.
Container.VolumeMounts: Needs to be merged. The name field would be the key.
PodSpec.Volumes: Needs to be merged. The name field would be the key.
PodSpec.Containers: Needs to be merged. The name field would be the key.
PodStatus.Conditions and NodeStatus.Conditions: The Kubelet and Node Controller can do a schema-aware merge, so it could remain as is, though if we decide to open the floodgates to maps we might want to reconsider.
PodStatus.Info: Is a map, but PodInfo should be a list, not a map #3622 proposes to make it a list. Similar to conditions, above.
LimitRangeSpec.Limits: Probably needs to be merged. The type field would be the key.

Maps in the v1beta3 API:

Labels
Annotations
*Selector
PodInfo
ResourceList (which is misnamed)
Secret.Data

thockin · 2015-02-27T18:30:32Z

More and more I've come to believe that maps of string -> object are OK in
a lot of places and probably cleaner that lists of objects which hold their
own primary key.

I think we need to consider what happens when you pass around a copy or
pointer to one of these now-unnamed structs - is it OK that an object
doesn't know its own name? I think the answer in most cases is yes, but I
have not done deep-dive on this.

On Thu, Feb 26, 2015 at 11:52 PM, Brian Grant notifications@github.com
wrote:

Exhaustive list of lists from v1beta3:

ExecAction.Command and Container.Command: Should not be merged, so
should remain as is.

Capabilities.Add and Capabilities.Drop: Likely we will want to merge
this.

ServiceSpec.PublicIPs: Definitely defaulting if empty should be
done, but it's possible we may want to merge this.

Container.Ports: Needs to be merged. Existing key is name, but we're
planning to change that to a description. The port itself could be a key.

Container.Env: Needs to be merged. The name field would be the key.

Container.VolumeMounts: Needs to be merged. The name field would be
the key.

PodSpec.Volumes: Needs to be merged. The name field would be the key.

PodSpec.Containers: Needs to be merged. The name field would be the
key.

PodStatus.Conditions and NodeStatus.Conditions: The Kubelet and Node
Controller can do a schema-aware merge, so it could remain as is, though if
we decide to open the floodgates to maps we might want to reconsider.

PodStatus.Info: Is a map, but PodInfo should be a list, not a map #3622
PodInfo should be a list, not a map #3622
proposes to make it a list. Similar to conditions, above.

LimitRangeSpec.Limits: Probably needs to be merged. The type field
would be the key.

Maps in the v1beta3 API:

Labels

Annotations

*Selector

PodInfo

ResourceList (which is misnamed)

Secret.Data

Reply to this email directly or view it on GitHub
#4889 (comment)
.

bgrant0607 · 2015-02-27T19:14:31Z

In this case, I think we can have our cake and eat it, too, by defaulting the subobject names from their map keys.

thockin · 2015-02-27T22:00:23Z

I'm not sure that is required, but yes - absolutely possible.

On Fri, Feb 27, 2015 at 11:14 AM, Brian Grant notifications@github.com
wrote:

In this case, I think we can have our cake and eat it, too, by defaulting
the subobject names from their map keys.

Reply to this email directly or view it on GitHub
#4889 (comment)
.

ghodss · 2015-03-01T22:01:41Z

There seems to be a counter-case as well, where something makes sense as a a map but actually should be replaced and therefore should be a list. Selector is an example that should probably be replaced, not merged - if a post-processing step changes the selector of a replication controller, that fundamentally changes the meaning of the replication controller, and a PATCH with a Selector should fully overwrite that field. Currently Selector is map[string]string so it would get merged, which I would argue is not the right behavior. And we can't use the null technique because there is an infinite set of possible selectors that could be added.

In this case what you really want is something like:

selector:
  *: null
  my-label: my-value

But despite this requiring an extension to the PATCH implementation which we are so earnestly trying to avoid, it's still not the approach you want because this distinction should be part of the API, not part of the PATCH operation.

So, if you go with the "Change mergeable lists to maps" technique as written above, the right way to do it is to have Selector be represented as a list of Selectors so they get replaced:

selector:
  - key: my-key
    value: my-value
  - key: my-key-2
    value: my-value-2

Not too intuitive at first when most other things are changing to maps, but it's easy to follow given API documentation and as a client you get the right merging behavior for free.

mikedanese · 2015-03-01T23:43:48Z

There is a use case for updating a single label in a selector with a merge patch as well. Say I have a service selector that has labels for environment, application, and version. I may want to update just version during a deployment. I would consider this fundamentally the same service and I would like my patch to look like {"selector":{"v":"2"}}.

Any modification is possible with merge patch assuming the patcher knows the state of the resource to be patched, but this requires complicated patch generation logic in the client. We should aim to design the api so that a patcher can work with intent and as little prior knowledge as possible (key is part of intent where array index is not). I also don't want to restrict the talk on PATCH to the Merge Patch algorithm. It will still be possible to replace entire subtrees of JSON using RFC6902 replace operation. The important distinction to make is map elements are always going to be much easier to partially modify.

ghodss · 2015-03-02T01:30:34Z

We can easily support RFC6902 as well. The question here is how do we make Merge Patch work in such a way that most graciously enables the declarative configuration workflow described in #1702 (comment).

I do want to ensure @jbeda's concern is represented here: converting from lists to maps makes the config a little less intuitive, especially if you're copy/pasting things around. The full argument is at #2004 (comment).

ghodss · 2015-03-02T03:00:10Z

There is an issue that in some ways disqualifies the swagger annotation approach: it is impossible to use a null value to delete a list item. In #1702 we were intending to use the client to look at the previous config object that was submitted (which would have been stored in an annotation on the object) to insert null values for fields that were deleted from the config before submitting the PATCH. This works for maps, but for lists, there's no way to structure the JSON such that a list item would get deleted on PATCH. For illustration purposes:

# v1 config
spec:
  controllers:
    - name: nginx
      image: nginx-1.0
    - name: stats-daemon
      image: stats-daemon-1.0

# v2 config
spec:
  controllers:
    - name: nginx
      image: nginx-1.0

There's no way to create a JSON Merge Patch which will actually remove stats-daemon, even with fancy list merging logic on the server. You'd have to regenerate the entire controllers list with data from the server and send that, which negates the benefits of having a server-side PATCH solution.

ghodss · 2015-03-02T03:34:04Z

The more I think about it, the more I am against changing lists to maps in the API. I think maps cause a lot of issues in terms of forcing the choice (or worse, creation) of a key, forcing using an object where there shouldn't be (e.g. the publicIP's example above), and the complexities of always auto-filling (and -unfilling?) the field which is used for the map key. It makes our API more awkward to express very simple concepts just to express a default merging behavior. I say "default" because, even in cases where merging seems appropriate, there will be scenarios where someone WILL want to be strict with their comparison, e.g. this pod should ONLY have these containers. If we go the map route, it's unclear how we would enable users to designate certain fields of certain objects to replace instead of merge.

The swagger annotation approach has nearly as many issues, which include astonishing users by breaking the Merge Patch spec and not handling list item deletions well.

So, I'm reaching the conclusion that this merging and reconciling functionality should be entirely out of the API server and that the API server should primarily receive complete PUT's. Instead, the client (in this case kubectl reconcile, but it could also be a reconciler service that you submit your local object to) should do a GET from the server, intelligently merge in the local object based on an annotation and merging rules defined somewhere in the client, and then submit the new object (catching any intermediate changes with resourceVersion).

Finalizers will work the same way (i.e. using the GET-modify-PUT flow), because they will likely have subtleties of what they will want to merge versus replace. They should do the work themselves and then submit a PUT to the API server with the exact changes they want.

I will start a PR for #1702 where kubectl reconcile does a GET from the server, uses client-centric rules for a merge with the local object, then PUT's that complete object back to the server.

mikedanese · 2015-03-02T04:10:49Z

Why does it have to be all or none? I think containers keyed by names is reasonable. I think maps for pulicIPs is less reasonable. As a client, I would take understandable and well documented merge semantics over having arrays in the api and lot's of custom merge rules.

ghodss · 2015-03-02T04:19:30Z

Because if there's even a single case where you want to merge but it's a list (or want to replace but it's a map), then you can no longer rely on the server, and you need to implement the whole operation yourself anyways. IMO, our current convention is actually quite consistent and usable, and introducing special case maps will introduce confusion.

ghodss · 2015-03-02T04:31:55Z

Keep in mind Merge Patch and RFC6902 can still both be supported on the server, they just wouldn't be used for config reconciliation, and clients would need to be cautious when dealing with lists or maps.

smarterclayton · 2015-03-02T04:39:11Z

This discussion reminds me of the various convolutions mongodb update syntax went through in order to support mutating a nested child item in the list. They resorted to custom operators and the $ operator to represent mapping between locator (query) and position. It's challenging to make this work predictably.

On Mar 1, 2015, at 11:32 PM, Sam Ghods notifications@github.com wrote:

Keep in mind Merge Patch and RFC6902 can still both be supported on the server, they just wouldn't be used for config reconciliation, and clients would need to be cautious when dealing with lists or maps.

—
Reply to this email directly or view it on GitHub.

ghodss · 2015-03-03T00:41:57Z

Note that the approach outlined above will require PUTs from kubectl reconcile to include the Status field (otherwise it would get wiped out on a reconciling PUT). This should be fine - we'll just leave it re-PUT the status exactly as we found it in the GET - but if users lose permission to re-PUT the Status field this could be an issue. If that happens, we could put a workaround where if the Status field is in the PUT but doesn't actually result in a change, let the rest of the PUT through.

smarterclayton · 2015-03-03T01:20:38Z

Status isn't allowed to be set anymore, so you can pass it or not. Reconciliation is spec focused, right?

On Mar 2, 2015, at 7:42 PM, Sam Ghods notifications@github.com wrote:

Note that the approach outlined above will require PUTs from kubectl reconcile to include the Status field (otherwise it would get wiped out on a reconciling PUT). This should be fine - we'll just leave it re-PUT the status exactly as we found it in the GET - but if users lose permission to re-PUT the Status field this could be an issue. If that happens, we could put a workaround where if the Status field is in the PUT but doesn't actually result in a change, let the rest of the PUT through.

—
Reply to this email directly or view it on GitHub.

ghodss · 2015-03-03T01:57:25Z

If I can pass it and it'll get ignored, that works great (although it might not be the greatest UX for other use cases?).

Yes, reconciliation is spec focused, but it's hard to generically distinguish between a field that I simply didn't specify locally and a field that's server-side only. It's easiest to just include both sets of fields from the server in my PUT's.

bgrant0607 · 2015-03-06T00:11:21Z

Sorry for the delay, @ghodss.

Ignoring the IP list at the moment, which we're not even sure we'll need to merge...

These lists are effectively maps, maps are commonly used by other systems (e.g., Docker Compose), and a number of users have asked for the env. vars. and containers lists in particular to be converted to maps. Maps also have the benefit of making the merge behavior obvious, since it would be derived from the structure of the objects.

We could document which attribute to use as the key in the map's field description, which is represented in the swagger json. Duplicating the key as a field in the subobject is analogous to representing Kind and ApiVersion in metadata (since they are also used as "keys" in the apiserver and etcd). As for the confusion about the duplication, yes, we've also seen that with Kind and ApiVersion, as well as with names vs. labels and subobject names in general (#3394, #3605). My opinion is that this can be addressed with documentation, proper examples that aren't unnecessarily confusing, tooling, and rigorous consistency in the API. The alternative would be custom map and set representations, which would also require documentation, examples, tooling, and consistency.

Implementing some amount of support in the server has the advantages that it works for multiple languages (e.g., Java, Ruby, Python, Javascript), works with API extensions, and works where the server is newer than the client.

Regardless, it's not acceptable for the client to need to have ad hoc per-object, per-field merge rules, and it seems like it would be very error-prone to require the user to specify merge behavior in every yaml file. Additionally, in json, there's no way to represent such information, AFAIK.

bgrant0607 · 2015-03-06T02:11:47Z

Others have also pointed out that maps work better with off-the-shelf json tools, such as jq, that we need to be able to canonicalize order of the lists/maps generically, and that most tools/systems need to build maps to actually do anything with the information, anyway.

Coming back to the IP lists: I'm not convinced that we want to merge them, but we probably need a canonical set representation anyway. Explaining that sets are encoded using degenerate maps to empty structs seems straightforward.

bgrant0607 · 2015-03-06T02:13:47Z

Furthermore, lists should have the requirement that order actually matters, as in commands, and/or that duplicates are allowed (i.e., no keys).

bgrant0607 · 2015-03-06T02:23:21Z

The inability to null a list value, as mentioned in #4889 (comment), also adds to the list of problems with lists.

bgrant0607 · 2015-03-06T02:27:07Z

srvexpand example using map:
#1980 (comment)

bgrant0607 · 2015-03-06T02:30:12Z

Maps work better for specifying paths: #1980 (comment)

bgrant0607 · 2015-03-06T02:42:26Z

Fig vs. k8s: https://gist.github.com/proppy/8e714bf41b6b0978ab0e#fig2kube

smarterclayton · 2015-03-06T02:57:20Z

Not having an order on containers makes it harder for naive clients to introspect them and determine relative importance (the first one is overwhelmingly more likely to be important). That's the biggest disadvantage to Go's random map order. Same for ports. Not the case for volumes or env vars.

On Mar 5, 2015, at 9:14 PM, Brian Grant notifications@github.com wrote:

Furthermore, lists should have the requirement that order actually matters, as in commands, and/or that duplicates are allowed (i.e., no keys).

—
Reply to this email directly or view it on GitHub.

bgrant0607 · 2015-03-06T03:29:41Z

As for specifying merge vs. replace, we'll probably need that, regardless of the representation.

I think the 3-way diff should properly address most cases of removal, such as a previously configured container that was removed.

bgrant0607 · 2015-03-06T03:31:56Z

@smarterclayton We're eliminating service port defaulting. In what other situations do you need to determine the "most important" container? It seems like explicit information about that would be better.

There will be multiple ways to infer importance, however, such as which containers have readiness probes, which ones are higher quality of service, etc.

smarterclayton · 2015-03-06T04:11:29Z

Most of the cases are around offering intelligent upgrade or introspection suggestions from a UI based on a pod (you probably want to create a service by exposing the first containers first port). Regardless of whether it's a map or list, the users ordering is significant. Even though we in our code don't make decisions based on that, it's still a reasonable inference to draw in many simple scenarios that allows tools to better guide users.

I would hazard that if you canvassed a set of multitier fig yml files, the web tier would be the first listed container in a large number of them. Anecdotally, the ordering of components specified via the Openshift create app cli usually was "rhc create-app "

I don't think that reason alone outweighs the other arguments, I just don't think the order is unimportant to an author or observer.

It's unfortunate that Go's default json serialization would leave us without that information. The "natural order" of containers and ports isn't alphabetical, it's going to be the order specified by an author.

On Mar 5, 2015, at 10:32 PM, Brian Grant notifications@github.com wrote:

@smarterclayton We're eliminating service port defaulting. In what other situations do you need to determine the "most important" container? It seems like explicit information about that would be better.

There will be multiple ways to infer importance, however, such as which containers have readiness probes, which ones are higher quality of service, etc.

—
Reply to this email directly or view it on GitHub.

bgrant0607 · 2015-03-06T04:39:59Z

Thanks for the examples @smarterclayton. So, we're faced with re-deriving order/importance, or re-deriving correspondence (i.e., the map key). Order for a UI totally seems like something that could be automatically annotated without understanding the full object schema. For material importance, I'd imagine quality-of-service specifications, not unlike those provided by lmctfy:
https://github.com/google/lmctfy/blob/master/include/lmctfy.proto#L181

smarterclayton · 2015-03-06T19:26:35Z

Yeah, we've identified another area where the golang json parser limits us.

On Mar 5, 2015, at 11:40 PM, Brian Grant notifications@github.com wrote:

Thanks for the examples @smarterclayton. So, we're faced with re-deriving order/importance, or re-deriving correspondence (i.e., the map key). Order for a UI totally seems like something that could be automatically annotated without understanding the full object schema. For material importance, I'd imagine quality-of-service specifications, not unlike those provided by lmctfy:
https://github.com/google/lmctfy/blob/master/include/lmctfy.proto#L181

—
Reply to this email directly or view it on GitHub.

ghodss · 2015-03-10T00:55:27Z

As for specifying merge vs. replace, we'll probably need that, regardless of the representation.

This is the crux of the whole matter. Merge vs. replace should not be determined by the data type of the collection, it should be determined by the nature of the field or the PATCH action itself. The Selector example above is a good example of when a merging behavior conflicts with the natural data type.

I was looking for precedent, and I think the best one is actually a SQL database. A list of objects where the primary key is held inside each object is the right way to canonically represent data. It allows you to represent objects consistently across all platforms: etcd, the versioned APIs, YAML files, etc. Moving the key outside of the object causes inconsistency in the object's construction by both client and server.

For example, if an object that is typically in a subobject list, but then can exist outside of that list (e.g. imagine being able to fetch a container's spec directly), you will now have two ways of constructing the object, one with the primary key (name in this case) inside and one outside. This has to be managed both when objects are coming into the system as well as going out, e.g. in GET requests you'll need to make sure the primary key is written out in certain cases and not in others.

Additionally, converting subobject lists to maps can result in a clumsy API. As of right now, Port is the most egregious case of this. Currently we have:

ports:
- name: www
  containerPort: 80

@jbeda in the past has used this as an example:

ports:
  www:
    containerPort: 80

But I don't think this is correct. name is currently optional on ports, and I think it should continue to be. It's also not the "primary key" of the Port object - containerPort is. So then you need to have containerPort be the map key:

ports:
  80:
    name: www

And if you don't specify name:

ports:
  80: {}  # could be mistaken for deleting or nullifying the port

IMO it gets pretty ugly. And this is a case that happens to have a primary key. In the future we'll probably have subobject lists where the subobject does not have a natural key (e.g. a list of events) and we'll be in a tough spot. This may be why Docker Compose uses a map approach only for the top level and uses lists for nearly everything else.

I've come around to thinking that the best approach is the Swagger annotation approach listed originally, with the PATCH verb on the server dynamically merging the specified lists. This results in a PATCH verb on the server that "does the right thing" for the vast majority of use cases. I've prototyped the custom PATCH implementation and it works quite well. We can annotate a subobject list as "mergeable" with a designated merge key. This "mergeable list" has specific properties such as irrelevant ordering, a unique primary key, and merging behavior by default.

In the future we can allow specific PATCH requests to override certain fields, perhaps through a URL argument. The only people that will need to know about this concept are those building tools that use PATCH to modify server state. Everyone else can use the PATCH verb transparently and get the right behavior.

Once we have custom ways to handle PATCHes, nullifying and strictness comparisons become possible. For example, nullifying could be done with a special value:

- name: 1
  containerPort: 80
- patchOperation: true
  op: delete
  containerPort: 8080  # delete the value that has containerPort 8080

And strictness could be handled with a different value:

- name: 1
  containerPort: 80
- patchOperation: true
  op: deleteOthers  # delete all other values besides the ones provided above

Not the most elegant solutions in the world, but they can be consistent across the entire API and can be clearly documented.

Re: standard tooling accessing these values, you can access an element of a list based on name in jq with
.desiredState.manifest.containers[] | select(.name=="nginx") and in JSONPath with $.desiredState.manifest.containers[?(@.name==nginx)]. Again, it's not quite as elegant, but finding an element of a list based on a key is built-in to both major ways of accessing JSON documents. In our code when dealing with such objects, we could make a generic function which would index a collection into a map based on some key.

In conclusion, both solutions are lame and have huge drawbacks. :) @bgrant0607 if after this you still are hard on the side of moving to maps, I'm okay with that and will implement it. I do however believe it's worth the pain to stick with lists. The vast majority of users will let the standard tooling handle merging transparently for them, and having a cleaner and more consistent way to represent subobjects is preferable to forcing maps.

bgrant0607 · 2015-03-10T17:59:36Z

Thanks for the analysis, @ghodss, and for the jq/jsonpath examples.

FWIW, we're getting rid of port names. :-) I agree the port is really the key in that case.

How big is your custom patch implementation?

Thoughts on how to encode mergeability and the merge key in the swagger?

I do agree we should implement patch in the server, and we should support a "dry run" mode that just returns the patched object to the client. This approach wouldn't need to block on the swagger representation, since the field tags would be available directly in the server implementation. At the moment, they are available in kubectl, also, since it is (incorrectly, IMO) decoding and encoding API objects using API source it was built with.

I'm fine with trying your approach -- you're implementing it, and you'd be using it. As you say, neither solution is ideal. Anyway, done is better than perfect, and we can always revisit in v2beta2 if we're not happy with it. :-)

mikedanese · 2015-03-10T18:48:49Z

If we do want to support custom patching server side, I'd still like to support standard patches. I'd want to inject the merge function depending on the MIME type. "application/merge-patch+json" should apply the patch per RFC7386, "application/json-patch+json" should apply the patch per RFC6902, and maybe "application/k8s-merge-patch+json" can apply the custom merge patch required by kubectl. We can pick one to be default.

bgrant0607 · 2015-03-10T19:06:29Z

Using content type to differentiate is a good idea.

ghodss · 2015-03-11T04:59:27Z

I like the content type idea.

How big is your custom patch implementation?

A few hundred lines - I'll have a PR soon.

Thoughts on how to encode mergeability and the merge key in the swagger?

Actually I am not yet too familiar with swagger. I was imagining them as struct tags on the versioned objects - can swagger pick up and expose arbitrary tags, or only ones that it knows about? How do I get a tag exposed through swagger? How do I query swagger for a tag?

On second though, this information may not even need to be available to the client. The client can just generate nulls for everything that's changed in all maps and lists based on the most previously submitted config, and the server-side patch can ignore those nulls for fields that are of type replace.

ghodss · 2015-03-11T05:07:20Z

This design gave me an idea for how to structure the special case list items, and if we extend it to maps, we can have a pretty good way to express all kinds of intent in patches.

For example, as mentioned above, if you want to your local reconcile command to ask that a list item be deleted due to a change from a previous config, it can submit a PATCH that looks like this:

- name: 1
  containerPort: 80
- patch.kubernetes.io: true  # reserved keyword
  op: delete
  containerPort: 8080  # delete the value that has containerPort 8080

The good thing about this is that you can generate the same nullify request for lists that should be merged OR replaced, and the server can just drop the special element from those that should be replaced, and replace the list with the remaining elements.

If you wanted strictness for a given list (e.g. ONLY run these containers), you could add this manually yourself:

- name: 1
  containerPort: 80
- patch.kubernetes.io: true
  op: deleteOthers  # delete all other values besides the ones provided above

The best part is that we can apply this to maps too. Let's say you want to indicate in your config that a pod should ONLY have some certain labels:

- labels:
    foo: 1
    bar: 2
    patch.kubernetes.io:
      op: deleteOthers

I'm going to play with this to see if it works, but lmk if you can think of any issues with it.

bgrant0607 · 2015-03-11T06:03:55Z

We're using swagger 1.2:
https://github.com/swagger-api/swagger-spec/tree/master/versions

Swagger has a way to specify a list is a set (uniqueItems), but not to specify map keys. There's a little bit of a discussion about it in an issue, though:
OAI/OpenAPI-Specification#289

It's probably worth raising this issue with them.

Getting the support into go-restful is another step.

bgrant0607 · 2015-03-11T07:01:01Z

The proposed representation looks powerful enough to represent the patches we want to perform.

However, I think that generating the patch spec is harder than serializing and applying it. I'm fine with driving it from Go struct field tags until we figure out the swagger representation, but it does impose a barrier for non-Go clients, of which we have several (at least Java, Ruby, Python, Javascript).

For completeness more than anything else:

One alternative could be to send the 2 versions of the object (newly updated and previously configured) to the apiserver and let it perform the 3-way diff (with the live state) and merge. That way, the patch wouldn't have to be explicitly represented, and the server could exploit its full understanding of the object schemas.

Or, we could revisit my shadow-value/overlay proposal from #1007. The overlay represents the declaratively configured fields and the underlay represents all of the non-declaratively-configured fields to merge with: default values, late-initialized values, automatically set values, imperatively updated values, etc. With that approach, we wouldn't even need to explicitly diff, patch, or merge during updates. We could just overwrite the overlay, which would then be automatically merged with the underlay.

Anyway, custom patching would probably feel less exotic to users, so I'm fine with going ahead with that approach provided that it's consistent enough to be reimplemented in other languages without too much trouble.

ghodss · 2015-04-05T09:22:49Z

tl;dr: To resolve this issue, we settled on something most close to the "Swagger annotation" approach originally described, elaborated on further in #4889 (comment) and #4889 (comment). The final implementation was done in #6027.

bgrant0607 mentioned this issue Feb 27, 2015

Document why we don't use maps in the API in api-conventions.md #2004

Closed

bgrant0607 self-assigned this Mar 10, 2015

This was referenced Mar 10, 2015

PodInfo should be a list, not a map #3622

Closed

add PATCH verb to apiserver #4706

Merged

bgrant0607 mentioned this issue Mar 24, 2015

Make kubectl awesome #5275

Closed

ghodss mentioned this issue Mar 26, 2015

Implement Strategic Merge Patch in apiserver #6027

Merged

bgrant0607 closed this as completed in #6027 Apr 3, 2015

ghodss mentioned this issue Apr 5, 2015

Configuration reconciliation (aka kubectl apply) #1702

Closed

ghodss mentioned this issue May 22, 2015

Adds a diff command #8662

Closed

shyamjvs mentioned this issue Jun 20, 2017

Endpoints api object needs change to allow useful merge patching #47787

Closed

API should differentiate between lists to merge and lists to replace in reconciliation #4889

API should differentiate between lists to merge and lists to replace in reconciliation #4889

Comments

ghodss commented Feb 27, 2015

Change mergeable lists to maps

Swagger annotation to tell API server which lists should be merged

What should we do?

bgrant0607 commented Feb 27, 2015

bgrant0607 commented Feb 27, 2015

thockin commented Feb 27, 2015

bgrant0607 commented Feb 27, 2015

thockin commented Feb 27, 2015

ghodss commented Mar 1, 2015

mikedanese commented Mar 1, 2015

ghodss commented Mar 2, 2015

ghodss commented Mar 2, 2015

ghodss commented Mar 2, 2015

mikedanese commented Mar 2, 2015

ghodss commented Mar 2, 2015

ghodss commented Mar 2, 2015

smarterclayton commented Mar 2, 2015

ghodss commented Mar 3, 2015

smarterclayton commented Mar 3, 2015

ghodss commented Mar 3, 2015

bgrant0607 commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

smarterclayton commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

smarterclayton commented Mar 6, 2015

bgrant0607 commented Mar 6, 2015

smarterclayton commented Mar 6, 2015

ghodss commented Mar 10, 2015

bgrant0607 commented Mar 10, 2015

mikedanese commented Mar 10, 2015

bgrant0607 commented Mar 10, 2015

ghodss commented Mar 11, 2015

ghodss commented Mar 11, 2015

bgrant0607 commented Mar 11, 2015

bgrant0607 commented Mar 11, 2015

ghodss commented Apr 5, 2015