Proposal: API support for diff'ing desired state #1178

bgrant0607 · 2014-09-04T17:10:38Z

Broken out from #1007, which was unwieldy.

A configuration system needs to be able to diff a new desired state generated from the configuration with the current desired state registered with the system, in a generic, extensible manner, in order to determine which objects require updates (as well as potentially what kinds of updates). In order to do this, it needs uniform structure across all API objects, clean separation of input and output values, and clean differentiation of user-provided values and automatically determined values, including default values.

Currently, our API objects contain:

(all, even lists, which is questionable) JSONBase, which includes Kind, ID, CreationTimestamp, SelfLink, ResourceVersion, and APIVersion
(Pod, ReplicationController, Service) Labels
(Pod, ReplicationController) DesiredState
(Pod only) CurrentState

In JSONBase, only ID is explicitly provided by the client and only ResourceVersion is mutable.

In Pod, DesiredState and CurrentState use the same schema, mixing input and output fields (Manifest, RestartPolicy, Host, HostIP, PodIP, Info).

Service lacks DesiredState and CurrentState.

ReplicationController lacks CurrentState (though that may change with #736).

Proposals:

Nest JSONBase in a named field, such as Metadata, to more coarsely bucket them within the objects.
Move Labels to JSONBase, so that they're in the same position in every object, together with other generic object metadata. Split Labels into FinalLabels and DynamicLabels. FinalLabels could only be specified at creation time, and would not be mutable. DynamicLabels would be applied at run time and would be ignored in the reconciliation diff. If people thought it would be useful, we could introduce a convenience field (e.g., Labels) that presented a merged view. An alternative would be to represent finality as a bit on each label.
Duplicate DesiredState into PrimaryDesiredState, FallbackDesiredState, and DesiredState, and add these to every object. PrimaryDesiredState would be the state set from configuration, and would be the only part of the objects diff'ed by the configuration reconciliation process. FallbackDesiredState would be set by automated processes, such as default values provided by configuration and deployment tools, default values initialized by apiserver itself, values filled in by other automation systems such as auto-scalers (e.g., the replicas field in replicationController), and so on. The PrimaryDesiredState values would always override the FallbackDesiredState values (called shadow values in Proposal: Configuration #1007). DesiredState would present a merged view for components that needed to act upon the full DesiredState. Again, an alternative would be to keep track of whether values should be diff'ed with a bit on each field. That has at least one disadvantage, in that automatically initialized values would be lost in the case that they were overwritten by configuration that was then later reverted. However, it may be simpler for components that don't care about the distinction and may be more space efficient, depending on our internal representation.
Add CurrentState to all objects that are lacking it.
Use different representations for desired and current state in Pod, appropriately divide the fields between them, and merge ContainerManifest into the new PodDesiredState type.
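
A minimal sketch of the merged view described above, assuming objects are plain nested dictionaries. The function name `merge_desired_state` and the field values are hypothetical and only illustrate that PrimaryDesiredState overrides FallbackDesiredState:

```python
from copy import deepcopy


def merge_desired_state(primary, fallback):
    """Return the merged view: primary values override fallback values."""
    merged = deepcopy(fallback)
    for key, value in primary.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so that nested fields are overridden one at a time.
            merged[key] = merge_desired_state(value, merged[key])
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical values, for illustration only.
primary = {"manifest": {"image": "nginx"}}  # set from configuration
fallback = {"replicas": 3, "manifest": {"restartPolicy": "Always"}}  # defaults, auto-scaler
desired = merge_desired_state(primary, fallback)
```

Because the fallback values are stored separately and never overwritten, reverting a configuration override restores them, which is the property the per-field bit alternative would lose.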

bgrant0607 · 2014-09-04T17:17:58Z

Bonus points:

Add a mechanism to GET just part of an object, such as just the DesiredState or just the CurrentState, such as using URL parameters to specify the filter (?field=DesiredState&field=Metadata). That would reduce overhead, as well as being convenient.
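
As a rough illustration of such a filter, assuming repeated `field` query parameters select top-level fields of a dictionary-shaped object (the helper `filter_fields` and the sample pod are hypothetical, not an existing API):

```python
from urllib.parse import parse_qs, urlsplit


def filter_fields(obj, url):
    """Keep only the top-level fields named by repeated ?field= parameters."""
    wanted = parse_qs(urlsplit(url).query).get("field", [])
    if not wanted:
        # No filter given: return the whole object.
        return dict(obj)
    return {name: obj[name] for name in wanted if name in obj}


pod = {
    "Metadata": {"ID": "foo"},
    "DesiredState": {"Host": ""},
    "CurrentState": {"PodIP": "10.0.0.1"},
}
partial = filter_fields(pod, "/pods/foo?field=DesiredState&field=Metadata")
```

Here `partial` contains only `DesiredState` and `Metadata`; `CurrentState` is left out of the response.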

jbeda · 2014-09-04T17:38:05Z

I'm really not a fan of the PrimaryDesiredState/FallbackDesiredState/DesiredState split. Are we sure this will be enough? Why not a general stack of DesiredState? This will turn into CSS. If we need to deal with the "replica count" issue, this seems to me to be a feature of the config system. It should support a "don't modify" value for certain fields so it doesn't clobber other systems.

I am a fan of using different schemas for desired vs. current state. While there is a lot duplicated across these, having fields that only apply to one but not the other is really confusing.

bgrant0607 · 2014-09-04T18:03:18Z

@jbeda We don't need a stack. How about just the bit? I think a systematic approach is better than ad hoc addition of secondary shadow fields, some of which are visible in the API and some of which aren't, sentinel values interpreted as UNSET, and other non-uniform approaches. It's not just replicas, but also resources, fields with default values, perhaps ports, ...

If this information were only maintained by config clients, then the addition of new automatically set fields would break existing configurations.

smarterclayton · 2014-09-04T21:42:26Z

Another option might be to use the same object for Primary / Desired, but instead make those alterations of the way you fetch an object. So instead of {desiredState: ..., primary: ...} make the act of fetching the resource alter what you get:

GET /pods/foo # or a param ?state=fallback
Accept: application/json;state=fallback
{
  "foo": "default_value"
}

I base this on the use case of getting the fallback being a fairly uncommon operation that UIs might use - you're trying to get an alternate representation of the object for a certain type of user scenario.

smarterclayton · 2014-09-04T21:45:16Z

Is "ID" metadata or primary data? I feel like it's primary, whereas everything else in metadata could be optional (even kind, because in theory you know the kind when you asked for it from the server).

bgrant0607 · 2014-09-04T22:55:41Z

@smarterclayton Overloading the object/field would make it impossible to specify both some primary and fallback fields in the same POST or PUT. I agree that GET would likely only want to fetch one view, which is why I proposed filters for that purpose.

IMO, name, namespace, ID, labels, creationTimestamp, kind, resourceVersion, etc. are all identifying metadata, used to find and manipulate objects, but aren't primary configuration data, nor current operational state. Additionally, once created, name and ID are immutable, so they might as well be left out of desired state.

smarterclayton · 2014-09-04T23:11:28Z

From a practical perspective I can't say I've ever seen an API that managed to do this well (or really ever tried to model it as part of the standard REST resource). That gives me a bit of pause when considering making the resource be a composite of these multiple layers. I agree with the general design point of shadow values, but at the same time they seem much less of a common use case than the simple "hey, I want to write a simple client to introspect the actual value". If most clients don't need shadow values (or can get away with being naive about it), is the extra complexity good or bad?

bgrant0607 · 2014-09-04T23:32:34Z

For all practical purposes, internally we have 2 types of workload: services and batch (e.g., mapreduce). The overwhelmingly vast majority of services, even simple ones, are managed using declarative configuration. It's a pretty important pattern to support well.

How about just the 1 bit per field to indicate primary or not? It's somewhat less expressive, but is simpler and less onerous to implement, and could be emulated in client-side glue to support APIs that aren't compliant.

An even more minimal approach would be to allow clients to attach non-identifying key-value metadata to our API objects. The config system could then use that to store the primary field info. We'll likely want that, anyway, though I feel using it for this would create an informal API between automation systems, such as between the config system and auto-scalers.

thockin · 2014-09-05T00:02:29Z

Why a named field for JSONBase?

Lists have JSONBase because the encode/decode logic requires it.

Re: automatically assigned "default" values. Why is it important to know that they were default values? By not specifying them, the caller asked for the default. Diffing them seems legit?

@smarterclayton 'Is "ID" metadata or primary data?' I am also wrestling a bit with JSONData. My understanding as of today is that JSONData describes the REST resource. Given "kind" and "id" (which should perhaps be "name") you can sort of piece together the REST resource you are looking at. The real question is whether ID (as in UUID, not name) belongs in JSONBase.

Re: Labels in JSONBase. If JSONBase describes the REST object, rather than the actual object, do Labels really apply there?

My mind keeps driving towards a metaphor of inodes and dirents. JSONData describes the dirent.

Brian, can you get a bit more detailed about "1 bit per field"? How would that look/be implemented?

bgrant0607 · 2014-09-05T02:10:04Z

@thockin Why a named field for JSONBase? To more easily segregate it from desired and current state, to filter or select its fields, and to treat the collection of fields uniformly across API objects collectively, even as we add new metadata fields. Why NOT a named field? Anyway, I'm not hard set on this one.

Re. operations other than GET of a single object: Maybe we need RESTObjectBase, which includes JSONBase.

Re. 1 bit per field: The minimum information that needs to be tracked is which fields should be diff'ed (as mentioned above, keeping track of which ones should not be diff'ed instead is problematic). The representation depends on the complexity of the structures of the types we care the most about, such as resource requests. I suppose it could be as simple as an array of the field names. A map of bools would also work, though there's not much point in specifying the false case, unless we simply wanted to mirror the full DesiredState schema. The configuration mechanism would produce a representation (whichever we choose) of all the fields it has user-provided values for, union that with the representation in the server, and diff the resulting set of fields.
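
One way to picture the field-set variant, as a sketch only: it assumes flat dictionaries, tracks the primary fields as a list of names, and uses a hypothetical helper `diff_desired_state` with made-up field values.

```python
def diff_desired_state(config_values, server_values, server_primary_fields):
    """Diff only the fields that configuration owns now or owned before."""
    changes = {}
    # Union of the fields the config provides and the fields recorded
    # on the server as previously provided by the config.
    for field in set(config_values) | set(server_primary_fields):
        if field in config_values:
            if field not in server_values or server_values[field] != config_values[field]:
                changes[field] = ("set", config_values[field])
        elif field in server_values:
            # Previously config-owned, now absent from the config: unset it.
            changes[field] = ("unset", None)
    return changes


# Hypothetical values, for illustration only.
config = {"image": "nginx:1.7", "cpu": 500}
server = {"image": "nginx:1.6", "replicas": 5, "memory": "1Gi"}
server_primary = ["image", "memory"]  # fields the config set last time
changes = diff_desired_state(config, server, server_primary)
```

In this example `replicas` is never part of the diff, so a value written by an auto-scaler survives the configuration push, while `memory` is unset because the configuration no longer provides it.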

smarterclayton · 2014-09-05T15:18:39Z

In the case where we have the three facets of the object's properties (effective desired, user desired, fallback desired), the user desired properties must be nullable. That means that all clients have to deal with that complexity (naive clients will set all of them, maybe, and defeat the purpose of the fallbacks) which adds a burden to client authors. I don't believe it's sufficient (I think you mentioned this in the config proposal) to just do a diff because when values are equal intent is still not preserved.
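
The point about equal values can be made concrete with a small sketch. It assumes flat dictionaries and a hypothetical `effective` helper; the `replicas` values are invented for illustration.

```python
def effective(primary, fallback):
    """Merged view: user-desired values override fallback values."""
    return {**fallback, **primary}


fallback = {"replicas": 3}
explicit = {"replicas": 3}  # the user deliberately asked for 3
implicit = {}               # the user left the field unset

# A diff of the effective values cannot tell the two intents apart.
same_view = effective(explicit, fallback) == effective(implicit, fallback)

# Later an automated process raises the fallback value.
fallback["replicas"] = 5
pinned = effective(explicit, fallback)["replicas"]    # stays at the user's 3
floating = effective(implicit, fallback)["replicas"]  # follows the fallback, 5
```

The two objects look identical until the fallback changes, which is why the user-desired layer has to be able to represent "not set" rather than only a value.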

Approaching this from the perspective of encouraging client authors to do the right thing - it should be easy (trivial?) to write a client that conveys user desired intent, and to preserve shadow values. It should be possible to atomically update shadow values and user intent values in the same request. It should be clear to a client author how to map user input into those values (both for the naive "i'm scripting this in curl" and the advanced "i'm building the uber ui for this").

Desired, Fallback, and Effective seem like the same struct viewed differently. They are different representations of the same "object" (but maybe not the same "resource"). By default, GET really wants to show Effective. If I GET then PUT, a RESTful client should not clobber my fallback values. Fallback should (?) be sparse - the attributes listed in Fallback should represent values set in preference to the system defaults.

As per the discussion about platform supplied defaults, validation of incoming Desired needs to take Fallback into account. Platform supplied values comprise a hidden fourth struct - the defaults applied by the platform underneath fallback, which are not persisted into objects in the store.

bgrant0607 · 2014-09-05T16:20:54Z

@smarterclayton To paraphrase what I think you're suggesting:

POST and PUT to the vanilla object URLs sets/modifies user desired intent (aka primary desired state)
GET of the vanilla object URLs returns effective desired state (aka merged view)
Config clients would use special POST/PUT/GET URLs (or maybe just parameters) to access the full view (primary and fallback)

Is that correct? That works for me.

Yes, you're right that diff'ing just the fields currently and/or previously set isn't very robust, and not setting a field in the configuration should typically unset that field in the corresponding system object.

However, this implies that modifying primary desired state of config-managed objects through a non-config-aware client will produce changes that will show up in the next diff with configuration, and likely would be clobbered by the next config push. That's Working as Intended, IMO.

An automation client that wants changes to stick should be setting fallback values.
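
The behavior paraphrased in this comment can be sketched with a toy in-memory store. Everything here is an assumption for illustration: the `ObjectStore` class, the `layer` and `view` parameters, and the flat dictionaries stand in for whatever URLs or parameters the real API would use.

```python
class ObjectStore:
    """Toy store that keeps primary and fallback desired state separately."""

    LAYERS = ("primary", "fallback")

    def __init__(self):
        self._layers = {layer: {} for layer in self.LAYERS}

    def put(self, name, values, layer="primary"):
        # A vanilla PUT replaces the primary layer; automation uses "fallback".
        if layer not in self.LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self._layers[layer][name] = dict(values)

    def get(self, name, view="effective"):
        primary = self._layers["primary"].get(name, {})
        fallback = self._layers["fallback"].get(name, {})
        if view == "effective":
            return {**fallback, **primary}  # vanilla GET: merged view
        if view == "full":
            return {"primary": dict(primary), "fallback": dict(fallback)}
        raise ValueError(f"unknown view: {view}")


store = ObjectStore()
store.put("frontend", {"image": "nginx"})                 # config push
store.put("frontend", {"replicas": 5}, layer="fallback")  # auto-scaler
before = store.get("frontend")

store.put("frontend", {"image": "nginx", "replicas": 2})  # non-config-aware client
during = store.get("frontend")

store.put("frontend", {"image": "nginx"})                 # next config push
after = store.get("frontend")
```

The ad hoc `replicas: 2` written to the primary layer disappears on the next configuration push, while the auto-scaler's fallback value of 5 sticks.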

jbeda · 2014-09-05T17:02:00Z

Honestly, I think we should leave all of this out of the API itself. We shouldn't assume there is only one config system. Instead, we should set the patterns that config systems should only tweak/compare the parameters that they know about. Anything else should be left unmolested and blank.

If we must put this in the API I would support:

Having a general stack of "config layers" that get composed together for the desired state.
Having those layers be named with the config system that is tweaking them
Having this all be optional and disappear for the user that doesn't want to deal with this stuff.
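
A sketch of what such a general stack might look like, assuming flat dictionaries and an ordered list of named layers with the lowest precedence first. The function `compose_layers` and the layer names are hypothetical.

```python
def compose_layers(layers):
    """Compose named layers, lowest precedence first.

    Returns the composed desired state and, per field, the name of the
    layer that supplied the winning value.
    """
    desired, owner = {}, {}
    for name, values in layers:
        for field, value in values.items():
            desired[field] = value
            owner[field] = name
    return desired, owner


# Hypothetical layer names and values, for illustration only.
layers = [
    ("apiserver-defaults", {"restartPolicy": "Always", "replicas": 1}),
    ("autoscaler", {"replicas": 5}),
    ("config", {"image": "nginx"}),
]
desired, owner = compose_layers(layers)
```

A user who does not care about layers would only ever see `desired`; the `owner` map is what a config system would consult to decide which fields it may touch.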

I'm 90% sure that we'll regret having a singleton overlay layer -- we should either do this whole hog (with the complexity and confusion in the API that it implies) or we should leave it for higher layers.

bgrant0607 · 2014-09-05T19:09:46Z

@jbeda I agree that we should allow for multiple implementations of config systems, but any given deployment should only be managed by one, since there needs to be a single authoritative source of the desired state. I also agree that users/clients that don't want to deal with it shouldn't need to worry about it. I think @smarterclayton's proposal addressed that.

What counts as "parameters they know about"? If only explicitly set fields, then how would one unset a field? A separate database of fields that should be diff'ed or not diff'ed could create circular dependencies (if the database were run on k8s), as well as being difficult to keep up to date and consistent.

We've considered N>2 config layers before. That would allow sophisticated support for overrides represented as overlays. I see the appeal, but flattening, re-expanding, and otherwise managing the layers can be quite complex, and many use cases wanted to expand a single representation into many flattened objects, which is something that should not be part of the core API.

The proposal(s) here is intended to facilitate interoperation of a configuration system and other automated systems that would want to set desired state fields without updating the authoritative revision-controlled configuration, default values and auto-scaling being the canonical examples. It's intended to be independent of the method(s) used by the configuration system of choice to produce the literal API objects, and independent of the method(s) used to determine the default values and automatically updated values. Obviously, more coordination is required in order to make these systems play nicely together -- for example, an auto-scaler would need to be configured and granted authority -- but I think that problem is also separable from this one. I really don't think we need an arbitrary number of layers for this.

The ad hoc alternatives I've seen in the past were fragile and not general-purpose, requiring special-case code to figure out which values were set by whom at each translation layer, such as code that hardcodes which fields are set to which default values by the system (e.g., that hostPort is sometimes set from containerPort), or how to interpret particular sentinel values in different fields (e.g., -1 implies the system will choose a value for that field), or which fields are set by a particular generator, or which fields are set by the auto-scaler when another configured object is present, or which fields take precedence over which other fields, or which different fields have to be read vs. the ones set, etc. We really want an approach more robust and more uniform than that. The 2-layer approach would have been sufficient to handle what our users have been doing for the past 10 years.

brendandburns · 2014-09-05T19:28:52Z

I have a couple of thoughts:

The default values should be a property of the type, not the instance. I don't really like the "fallback" being present in the object itself. Instead, the API should set up an endpoint (/api/v1beta1/pods/default?) and we should be able to query the API for the const default instance.
I really think that low-level objects like pods should be 100% concrete. If we really think that defaulting is a problem, then rather than establishing a default part in the object, we should eliminate any default values. Defaults are user conveniences, not fundamental parts of the API. Given that, they should be placed in systems like config or tooling, which are oriented around user convenience, not in the lower-level cluster management system.

Now, that said, I see the value in having consistent defaults across a variety of tools/configs/etc. so I think a reasonable compromise is establishing a const default object that is accessible from the API, but isn't a part of each instance of a type.

bgrant0607 · 2014-09-05T20:27:45Z

@brendandburns Default values are frequently not fixed values, as in the example where hostPort is set from the containerPort value. I also see a universal default template as being problematic in multi-tenant scenarios.
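
To illustrate why a single constant default object does not cover this case, here is a sketch of a computed default. It assumes a flat port dictionary and a hypothetical `apply_defaults` helper; the rule shown is a simplification of the hostPort example.

```python
def apply_defaults(port):
    """Fill in hostPort from containerPort when the client did not set it."""
    result = dict(port)
    result.setdefault("hostPort", result["containerPort"])
    return result


web = apply_defaults({"containerPort": 8080})
cache = apply_defaults({"containerPort": 6379})
explicit = apply_defaults({"containerPort": 8080, "hostPort": 80})
```

The defaulted `hostPort` differs between `web` and `cache`, so it cannot be read from one shared default instance; it depends on the other values in each object.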

Lack of any default values in the system would exacerbate the problem we're trying to solve, since it would push setting of more values to clients that would have no explicit representation in the configuration source.

Without direct API support, what I'd then recommend is storing the primary desired state, before any defaults or automatically set values were applied, in a "database". The new desired state after a configuration update would be diff'ed with the state in the database, rather than with the API objects. FWIW, if we're doing that, the rigorous separation of desired and current state isn't buying us much.

With that approach, arbitrary key-value metadata and/or a pass-through storage API (which is also needed for API plug-ins) would be useful so that users wouldn't need to run their own databases, deal with bootstrapping/dependency issues, etc.
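
A sketch of the database-backed alternative, assuming the "database" is just a dictionary keyed by object ID that holds the last pushed primary desired state. The helper `diff_against_stored` and the sample values are hypothetical.

```python
def diff_against_stored(stored_primary, new_primary):
    """Diff new configuration against the last stored primary desired state."""
    to_set = {
        field: value
        for field, value in new_primary.items()
        if field not in stored_primary or stored_primary[field] != value
    }
    # Fields that configuration used to provide but no longer does.
    to_unset = sorted(field for field in stored_primary if field not in new_primary)
    return to_set, to_unset


# Hypothetical stored state, recorded before defaults or automation ran.
database = {"frontend": {"image": "nginx:1.6", "cpu": 500}}
new_config = {"image": "nginx:1.7"}

to_set, to_unset = diff_against_stored(database["frontend"], new_config)
database["frontend"] = dict(new_config)  # record what was pushed
```

The diff never consults the live API object, so values added there by defaulting or by an auto-scaler cannot show up as spurious changes.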

bgrant0607 · 2014-09-08T23:33:02Z

Assuming the original proposal is dead. Already filed more specific issues for changes/features that appear to have more support.

smarterclayton · 2016-06-23T14:44:03Z

I regret not having something like this. It's complex, but I still regret it.

bgrant0607 added design labels Sep 4, 2014

bgrant0607 mentioned this issue Sep 4, 2014

Proposal: Isolate kubelet from etcd #860

Merged

bgrant0607 mentioned this issue Sep 4, 2014

Inconsistent usage of ID vs Name #1135

Closed

bgrant0607 mentioned this issue Sep 5, 2014

API plugin design thread #991

Closed

This was referenced Sep 5, 2014

Proposal: Configuration #1007

Closed

Proposal: API desired state / current state cleanup #1200

Closed

Document annotations and guidelines regarding when to use annotations vs labels #1201

Closed

bgrant0607 closed this as completed Sep 8, 2014

This was referenced Sep 26, 2014

GET/LIST subset of object fields #1459

Closed

Can we get consistent defaulting for API? #1502

Closed

This was referenced Oct 7, 2014

Proposal: scaling interface #1629

Closed

Configuration reconciliation (aka kubectl apply) #1702

Closed

bgrant0607 mentioned this issue Apr 3, 2015

Add PodSpec.NodeFailurePolicy = {Reschedule, Delete, Ignore} #6393

Closed

bgrant0607 mentioned this issue Sep 3, 2015

Add a method to generate a strategic merge patch #13007

Merged

This was referenced Oct 19, 2015

UpdateApplyAnnotation stores large annotation on every kubectl operation #15878

Closed

kubectl apply umbrella issue #15894

Closed

Mirror pods in a delete create loop in version skewed cluster (1.1 master, 1.0 node) #15960

Closed

bgrant0607 mentioned this issue Nov 16, 2015

v2 API proposal "desired vs actual" #17333

Open

bgrant0607 mentioned this issue Jun 23, 2016

kubectl apply deployment -f doesn't accept label/selector changes #26202

Closed

bgrant0607 mentioned this issue Nov 1, 2016

Server side defaulting of values prevents diff patches from working #34292

Open

bgrant0607 mentioned this issue Sep 12, 2017

Proposal for re-architecting apply kubernetes/community#1028

Merged
