Can we get consistent defaulting for API? #1502

thockin · 2014-09-30T05:06:09Z

In #1458, Brendan introduced a pattern for accessing and defaulting values: Rather than actually setting the value in the struct during validation (which we do for some other fields), he wrapped the comparisons in functions that captured the defaulting.

This pattern has a distinct advantage of working in tests, where input validation has not run. It has the distinct disadvantage that not all fields are so self-contained - some set their default based on another field's value.
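To make the two patterns concrete, here is a minimal Go sketch. The `Container` type, the `PullPolicy` constant, and both helpers are simplified stand-ins invented for illustration; they are not the code from #1458.

```go
package main

import "fmt"

// PullPolicy and Container are hypothetical, reduced stand-ins for the API types.
type PullPolicy string

const PullAlways PullPolicy = "PullAlways"

type Container struct {
	Image           string
	ImagePullPolicy PullPolicy
}

// Pattern 1: the accessor captures the default. The struct is never modified,
// so this also works on objects that have not been through validation.
func IsPullAlways(c *Container) bool {
	return c.ImagePullPolicy == "" || c.ImagePullPolicy == PullAlways
}

// Pattern 2: the default is written into the struct once, up front.
// Code that runs afterwards can compare the field directly.
func SetDefaults(c *Container) {
	if c.ImagePullPolicy == "" {
		c.ImagePullPolicy = PullAlways
	}
}

func main() {
	c := &Container{Image: "nginx"}
	fmt.Println(IsPullAlways(c), c.ImagePullPolicy == PullAlways) // true false

	SetDefaults(c)
	fmt.Println(IsPullAlways(c), c.ImagePullPolicy == PullAlways) // true true
}
```

With pattern 1 every reader has to go through `IsPullAlways`; with pattern 2 every object has to pass through `SetDefaults` before it is read.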

We should choose a pattern and apply it liberally. I'd like to be consistent as much as possible.

bgrant0607 · 2014-09-30T07:31:18Z

Other disadvantages of the comparison approach:

- Clients can't introspect default behavior
- Creates a subtle API version dependency: Changing default behavior is a breaking API change. If we were to ever change such defaults, we'd need to factor out the code into a version-specific location. Furthermore, if we were to allow objects to survive API version changes (which we probably will eventually), we'd want to record the original default in the object.

brendandburns · 2014-09-30T18:23:47Z

I'd argue rather strongly for my approach.

The main argument in favor is that it respects the user's request. If we automatically add stuff when it comes into validation, then we will be modifying the thing that the user stores, and they will then be surprised when they see a field they didn't set come back in the response.

wrt the version dependency, this is easy to validate with a unit test: just test that the predicate you expect to match does match on an API object with the default value (e.g. IsPullAlways(v1beta1.Container{}) == true). Then any change to the behavior will break the unit test, and any such change will require the user to update the test, which should set off red flags.

Regarding client introspection, this is exactly what documentation is for.

--brendan

bgrant0607 · 2014-09-30T19:05:22Z

As discussed in #1178, values will be initialized through a variety of means without the user directly specifying them: configuration generators, client tools/libraries (e.g., kubecfg), name uniquification, active controllers (e.g., replication controller), auto-scalers, ... I don't think it buys us much to forbid this one narrow case, and it would result in less transparency. Additionally, the code would need to be duplicated or linked in to every component and client looking at the desired state of the object. That approach has been highly problematic in the past, in my experience.

The track record of people who change the API actually documenting their changes makes me doubt that as an effective mechanism, and the proposals to auto-generate the documentation from the Go types wouldn't solve this problem, either.

Unit tests are a good idea, regardless of the approach we settle on.

brendandburns · 2014-09-30T19:15:15Z

Hrm, I think we violently agree in the first case.

Rather than doing code substitution anywhere, I'm arguing for having a very
specific interpretation of "" (or whatever the default value is for the type)
in the code that handles the field, and treating that as a value
with meaning, just like "PullAlways" or whatever.

I think this is preferable to writing code anywhere (client or otherwise)
that looks for defaults and writes in a data value to replace the default
with the "correct" value, for all of the reasons you mention.

bgrant0607 · 2014-09-30T20:32:44Z

I don't think we are in agreement.

The code that would need to be duplicated is the comparison code.

If the field were actually initialized, then all consumers would see the initialized value and not need to reason about the default. Protocol buffer default values are intended to provide similar behavior, though, AFAICT, Go's json library doesn't support defaults.

thockin · 2014-10-01T00:55:39Z

I don't think anyone would be confused if they write an object and omit
optional fields, and find those fields populated with default values upon
read-back.

It requires a certain amount of rigor to ALWAYS call validation logic
before using objects in tests. But it also takes some rigor to ALWAYS call
IsFoo(value) functions, rather than compare value == Foo.sum.

Unfortunately Go does not have "auto-settable by JSON but
read-encapsulated" fields.

bgrant0607 · 2014-10-01T01:22:30Z

If our code is structured such that we can only create legal objects when they are submitted through the API, then we should fix that problem.

thockin · 2014-10-01T02:36:11Z

We either chokepoint early and do fixups (validation) so callers see valid
structures or we abstract access (the PR in question) so callers see valid
fields or we scatter knowledge of default values for myriad fields all over
the place.

I know how much pain the last option causes.

bgrant0607 · 2014-10-01T03:17:43Z

Abstracting access across multiple components, multiple usage scenarios (e.g., printing the effective behavior in a UI), multiple API versions, multiple repos (e.g., Openshift), (eventually) multiple languages, etc. would require implementing something automatic, like the proto compiler.

brendandburns · 2014-10-01T19:34:11Z

I'd like to argue more stridently for my approach. Humans may be able to discern when something was defaulted in, but programs often won't.

Imagine I write a loop that does the following:

    for {
        updateConfig(newConfig)
        conf := getConfig()
        if conf == newConfig {
            break
        }
    }

This loop goes forever, and it's not even super clear how a user could successfully write a program that understood it should stop.
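A self-contained simulation of that loop, as a rough sketch: `fakeServer` is a hypothetical in-memory stand-in for the apiserver that fills in a default on write, and the loop is bounded by `maxAttempts` so the program terminates.

```go
package main

import "fmt"

// Config is a hypothetical object with one defaultable field.
type Config struct {
	Image      string
	PullPolicy string
}

// fakeServer is an invented stand-in for the apiserver: it writes the
// default into the stored object whenever the field is left unset.
type fakeServer struct{ stored Config }

func (s *fakeServer) updateConfig(c Config) {
	if c.PullPolicy == "" {
		c.PullPolicy = "PullAlways"
	}
	s.stored = c
}

func (s *fakeServer) getConfig() Config { return s.stored }

// converges reports whether the read-back object ever equals the one that
// was written, within maxAttempts iterations of the update/read loop.
func converges(s *fakeServer, newConfig Config, maxAttempts int) bool {
	for i := 0; i < maxAttempts; i++ {
		s.updateConfig(newConfig)
		if s.getConfig() == newConfig {
			return true
		}
	}
	return false
}

func main() {
	s := &fakeServer{}
	// The client omits PullPolicy, so the read-back never matches.
	fmt.Println(converges(s, Config{Image: "nginx"}, 5)) // false
	// A client that sets every field explicitly does see its own value back.
	fmt.Println(converges(s, Config{Image: "nginx", PullPolicy: "PullAlways"}, 5)) // true
}
```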

bgrant0607 · 2014-10-01T22:19:40Z

@brendandburns It's not that simple, and your suggested approach doesn't solve a sufficiently large part of the problem to be useful. In the case of simultaneous updates from an automated component, this approach is not going to work. An intelligent merge needs to be done with the updated desired state, by tracking which fields should be set/unset by the configuration. See discussion in #1178, #1007, and #1201. If you don't care about clobbering other changes, you can drop the preconditions from the update and keep performing it until the operation succeeds -- no need to get desired state back from the apiserver.

bgrant0607 · 2014-10-01T22:54:10Z

Clarification of #1502 (comment): We should create a factory method for each object that accepts json, parses into an object, performs validation checks, clears fields that shouldn't be set, initializes fields that are set automatically, such as UID, sets default values, and produces a valid object. The API methods should do auth checks, logging, response generation, etc., but should rely on the same factory methods for object creation.

Updates should similarly be factored out.

No significant business logic should be directly implemented in the API methods themselves.
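One way such a factory could look, as a sketch only: the `Pod` fields, the `NewPodFromJSON` name, and the injected `newUID` function are all assumptions made for this example, not existing Kubernetes code. The steps run in the order listed above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Pod is a hypothetical, heavily reduced object.
type Pod struct {
	ID         string `json:"id"`
	UID        string `json:"uid,omitempty"`
	Image      string `json:"image"`
	PullPolicy string `json:"pullPolicy,omitempty"`
	Status     string `json:"status,omitempty"` // not settable by clients
}

// NewPodFromJSON is an invented factory: parse, validate, clear, initialize
// automatic fields, default. newUID is injected to keep the sketch deterministic.
func NewPodFromJSON(data []byte, newUID func() string) (*Pod, error) {
	var p Pod
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, fmt.Errorf("decode: %w", err)
	}
	// Validation checks.
	if p.Image == "" {
		return nil, errors.New("image is required")
	}
	// Clear fields that shouldn't be set on input.
	p.Status = ""
	// Initialize fields that are set automatically.
	p.UID = newUID()
	// Set default values.
	if p.PullPolicy == "" {
		p.PullPolicy = "PullAlways"
	}
	return &p, nil
}

func main() {
	fixedUID := func() string { return "uid-123" }
	p, err := NewPodFromJSON([]byte(`{"id":"web","image":"nginx","status":"Running"}`), fixedUID)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q %q %q\n", p.UID, p.PullPolicy, p.Status) // "uid-123" "PullAlways" ""
}
```

An API handler, a test, or another component could all call the same function, which is the point of keeping this logic out of the API methods.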

thockin · 2014-10-02T04:05:50Z

Devil's advocate on this proposal:

We receive a blob of versioned JSON. We look up which factory to call by the version & path. We call factory.Create(blob). Either we have version-specific factories or each factory.Create() has to switch on version. It has to decode JSON into a struct, then filter out not-valid-on-input fields, then apply defaults and assign managed fields. At last we have a usable object (without any real encapsulation because, hey, it's Go).

That's a lot of work to do for 49 unique structs * 3 API versions (yes, we have 49 structs in pkg/api, 58 in v1beta3). Can it be slimmed down? Thought experiment.

We receive a blob of versioned JSON. We auto-decode that into a version-specific struct, then almost-auto-convert that to the internal struct (as today). Now we have an internal struct that has not-valid-on-input fields, missing to-be-defaulted fields, and missing managed fields.

We could make a ClearNonInputFields() method on runtime.Object, which each type would have to implement and Decode() would call. Maybe we could even use a struct tag "nodecode" and have Decode() do that transparently? We could make a SetDefaults() method which Decode would call, but this is maybe a stretch for struct tags. We could maybe make Validate() be a method, but there is an argument to be made that validation is contextual, and we don't know that at Decode time. Likewise managed fields, they have to be set by context-specific code.
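A rough sketch of the method-based variant. The `Object` interface here is a reduced, hypothetical stand-in for runtime.Object, and `Decode` is a plain function written for this example rather than the real codec.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Object is a hypothetical interface carrying only the two hooks discussed.
type Object interface {
	ClearNonInputFields()
	SetDefaults()
}

// Pod is an invented, reduced type with one output-only field.
type Pod struct {
	Image      string `json:"image"`
	PullPolicy string `json:"pullPolicy,omitempty"`
	Status     string `json:"status,omitempty"` // not valid on input
}

func (p *Pod) ClearNonInputFields() { p.Status = "" }

func (p *Pod) SetDefaults() {
	if p.PullPolicy == "" {
		p.PullPolicy = "PullAlways"
	}
}

// Decode unmarshals into obj and then runs the per-type hooks, so callers
// never see an object that skipped clearing or defaulting.
func Decode(data []byte, obj Object) error {
	if err := json.Unmarshal(data, obj); err != nil {
		return err
	}
	obj.ClearNonInputFields()
	obj.SetDefaults()
	return nil
}

func main() {
	var p Pod
	if err := Decode([]byte(`{"image":"nginx","status":"Running"}`), &p); err != nil {
		panic(err)
	}
	fmt.Printf("%q %q %q\n", p.Image, p.PullPolicy, p.Status) // "nginx" "PullAlways" ""
}
```

Validation and managed fields are left out on purpose, since both are described above as context-specific.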

We could have a THIRD type (not api internal, not api versioned) that had a real constructor like you suggest here, and convert from api internal to this "stronger" type. Construction implies clearing and defaulting fields. But this is just like C++ - it's still easy to add a field and forget to add it to the constructor init. Is it really much better?

We could have different structs for input and output, where there is no such thing as a field that has to be ignored on input. For example, v1beta3 still has POST receive a Pod, which has a Status field. We would instead have a PodInput struct which just has metadata and spec, and a Pod struct which has the additional Status. This leads to a lot more structs, often overlapping, but at least it is self-documenting. We still need to call SetDefaults() on it.

Better ideas?

bgrant0607 · 2014-10-02T06:07:07Z

Current code is:

    case "POST":
        if len(parts) != 1 {
            notFound(w, req)
            return
        }
        body, err := readBody(req)
        if err != nil {
            errorJSON(err, h.codec, w)
            return
        }
        obj := storage.New()
        err = h.codec.DecodeInto(body, obj)
        if err != nil {
            errorJSON(err, h.codec, w)
            return
        }
        out, err := storage.Create(ctx, obj)
        if err != nil {
            errorJSON(err, h.codec, w)
            return
        }
        op := h.createOperation(out, sync, timeout, curry(h.setSelfLinkAddID, req))
        h.finishReq(op, req, w)

Right now we do validation and initialize default values in Create(), but we do the conversion in DecodeInto. If we convert between versions, we probably need to do validation twice, once in the original version and again in the target version.

Translations between API versions implies that we will need to fill in at least some default fields, to record version-specific defaults.

thockin · 2014-10-02T06:43:47Z

> Right now we do validation and initialize default values in Create(), but
> we do the conversion in DecodeInto. If we convert between versions, we
> probably need to do validation twice, once in the original version and
> again in the target version.

Objects only exist in the versioned form in transit to the internal form.
I agree it's POSSIBLE that we would need to validate before conversion, but
I don't see that actually happening.

> Translations between API versions implies that we will need to fill in at
> least some default fields, to record version-specific defaults.

That's a fair and fugly point. If the default value changes across
versions, we'll totally get that wrong right now.

bgrant0607 · 2014-10-02T19:20:54Z

The case for filling in default values:

Pros:

- Could be done in just one place (per object, per version, of course)
  - In particular, clients and desired-state consumers wouldn't need to hardcode knowledge of default values
  - Would force some object-creation discipline, which is a good thing in my experience, since it helps to ensure we test what the system is doing
- Straightforward cross-version object translation
  - In particular, if we want objects to survive schema changes, this solves the problem of remembering what the defaults were in the API version used to create the object, including the case where we introduce a new default behavior that didn't even exist in prior API versions
- Transparent, introspectable, explicit
- Compatible with the config proposal (Proposal: API support for diff'ing desired state #1178, Document annotations and guidelines regarding when to use annotations vs labels #1201, Proposal: Configuration #1007)
- Consistent with other means of automatically setting metadata and desired-state fields, such as UID generation, Name uniquification, label attachment (e.g., by replication controller), auto-sizing and auto-scaling, and fields populated by client convenience utilities

Cons:

Consumes a little more bandwidth and/or memory for fields that would otherwise be unset

smarterclayton · 2014-10-29T16:00:40Z

One other con of filling in default values:

Requires a migration in the event the default value is implemented in code incorrectly.

bgrant0607 · 2014-10-29T17:03:04Z

@smarterclayton I'm not sure I understand. Whether or not a default value is explicitly filled in, changing default behavior, whether the original behavior was intended or not, could break clients, in general.

smarterclayton · 2014-10-29T17:36:16Z

I was referring to an internal migration - today you can roll out a fix to the code that corrects a bad default value, but if the values were persisted at creation you have to roll out a migration. If the default value was non client impactful the latter is more expensive.

I don't think that's a significant con, but I wanted to highlight it.

lavalamp · 2014-10-29T18:04:27Z

Let me suggest: Translate defaults at conversion time.

Like, if "" means PullAlways in a certain field in a certain version, then when serialized in that version, one should see "" there, but in memory one sees in the unversioned field "PullAlways".

This has the advantage that defaults are versioned, and that you can add support for legacy defaults by adding a new setting, e.g. "PullAlwaysLikeWeUsedToInv1beta1".
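A sketch of what translating the default at conversion time might look like. The type names and the two conversion functions are hypothetical; the real conversion machinery is not shown in this thread.

```go
package main

import "fmt"

type PullPolicy string

const PullAlways PullPolicy = "PullAlways"

// V1Beta1Container is an invented versioned (wire) form in which "" means PullAlways.
type V1Beta1Container struct {
	ImagePullPolicy string
}

// Container is an invented internal form in which the intent is always explicit.
type Container struct {
	ImagePullPolicy PullPolicy
}

// ToInternal expands the version-specific default.
func ToInternal(in V1Beta1Container) Container {
	if in.ImagePullPolicy == "" {
		return Container{ImagePullPolicy: PullAlways}
	}
	return Container{ImagePullPolicy: PullPolicy(in.ImagePullPolicy)}
}

// ToV1Beta1 collapses the value back to this version's representation of the
// default. Assumption of this sketch: an explicit "PullAlways" sent by a
// client also comes back as "", since the internal form cannot tell them apart.
func ToV1Beta1(in Container) V1Beta1Container {
	if in.ImagePullPolicy == PullAlways {
		return V1Beta1Container{ImagePullPolicy: ""}
	}
	return V1Beta1Container{ImagePullPolicy: string(in.ImagePullPolicy)}
}

func main() {
	internal := ToInternal(V1Beta1Container{})
	fmt.Printf("%q\n", internal.ImagePullPolicy) // "PullAlways"

	wire := ToV1Beta1(internal)
	fmt.Printf("%q\n", wire.ImagePullPolicy) // ""
}
```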

Downsides to this: Need to make it super easy to write a field conversion that only changes one field; right now you have to write a bunch of other code. I want to do this anyway, so not that much of a con.

I'm kind of morally opposed to having more than one way to represent any given desired behavior; that makes everything hard. We should independently consider the questions "What's the best way for the user to specify their intent?" and "What's the best way to present intentions to the system?". Filling in defaults at random places scattered throughout the code spreads user-intent-deduction code throughout the system, an anti-pattern. No user intent deductions should be required after an object is submitted to the system.

(One could also argue that config stuff should fill in defaults as its last step; I can live with that, too-- but if that's our answer then we should not have defaults in our versioned structs--leaving something unset should just be an error. This would be a PITA if you're not using the/a config system.)

bgrant0607 · 2014-10-29T19:05:31Z

@smarterclayton Sounds like more of a feature than a bug. Changing default behavior is a breaking change that should require a version change.

@lavalamp Punting the problem to clients would mean that we couldn't add fields without breaking clients. We don't want to do that.

Besides, it's unrealistic to think that there will just be one place/tool/system/whatever that sets all fields not explicitly set by the user. We need a composable solution.

I agree with your moral opposition, though. Sprinkling interpretation of defaults all throughout the code would make the system much harder to understand and evolve.

I think the only question is whether clients should be able to inspect the default values. I argue that they should (provided that the fields are visible in the version of the API they are using).

lavalamp · 2014-10-29T21:36:41Z

> Punting the problem to clients would mean that we couldn't add fields without breaking clients.

Good point; I will double-down on my assertion that the versioned <-> unversioned struct conversion code is the place for defaults to be expanded.

bgrant0607 · 2014-11-16T00:29:39Z

As shown in #2388, we need the code that validates and initializes defaults in pods to be callable by Kubelet as well as by apiserver, for pods that don't come from the apiserver.

Currently, the validation logic validates fields in an object and supplies default values wherever applicable. This change factors out defaulting to a set of defaulting callback functions for decoding (see kubernetes#1502 for more discussion).

- This change is based on pull request 2587.
- Most defaulting has been migrated to defaults.go, where the defaulting functions are added.
- validation_test.go and converter_test.go have been adapted to no longer test the default values.
- Fixed all tests that create invalid objects in the absence of defaulting logic.

bgrant0607 · 2015-02-14T05:42:52Z

This is mostly done. I did find a stray default:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/registry/service/rest.go#L133

bgrant0607 · 2015-02-14T05:43:23Z

We should also document the approach in docs/api-conventions.md.

ghodss · 2015-02-16T23:58:53Z

@bgrant0607 Can you summarize the approach taken for how defaults are handled?

thockin · 2015-02-17T04:16:21Z

@ghodss Defaulting is now done in a separate per-version pass during conversion from versioned APIs to internal structs.
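As a rough illustration of a per-version defaulting pass, here is a sketch in which each versioned type has its own defaulting function that runs before conversion to the internal struct. All names are hypothetical, and the differing defaults are invented purely to show why the pass is per-version; nothing in this thread says a real default differs between versions.

```go
package main

import "fmt"

// Invented versioned types and an invented internal type.
type V1Beta1Container struct{ ImagePullPolicy string }
type V1Beta3Container struct{ ImagePullPolicy string }
type Container struct{ ImagePullPolicy string }

// Each API version owns its defaults (values here are made up).
func defaultV1Beta1(c *V1Beta1Container) {
	if c.ImagePullPolicy == "" {
		c.ImagePullPolicy = "PullAlways"
	}
}

func defaultV1Beta3(c *V1Beta3Container) {
	if c.ImagePullPolicy == "" {
		c.ImagePullPolicy = "PullIfNotPresent"
	}
}

// Conversion runs the version's defaulting pass first, so the internal
// struct always carries an explicit value. in is a copy, so the caller's
// versioned object is left untouched.
func ConvertV1Beta1(in V1Beta1Container) Container {
	defaultV1Beta1(&in)
	return Container{ImagePullPolicy: in.ImagePullPolicy}
}

func ConvertV1Beta3(in V1Beta3Container) Container {
	defaultV1Beta3(&in)
	return Container{ImagePullPolicy: in.ImagePullPolicy}
}

func main() {
	fmt.Println(ConvertV1Beta1(V1Beta1Container{}).ImagePullPolicy) // PullAlways
	fmt.Println(ConvertV1Beta3(V1Beta3Container{}).ImagePullPolicy) // PullIfNotPresent
}
```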

bgrant0607 · 2015-02-17T06:14:36Z

@ghodss An example is here:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/api/v1beta1/defaults.go#L27

bgrant0607 · 2015-02-24T02:03:05Z

I think this is done.

bgrant0607 added the area/api Indicates an issue on api area. label Sep 30, 2014

bgrant0607 added kind/design Categorizes issue or PR as related to design. area/app-lifecycle labels Oct 1, 2014

bgrant0607 added this to the v1.0 milestone Oct 4, 2014

bgrant0607 mentioned this issue Oct 27, 2014

Respect a requested PortalIP if possible #1982

Merged

bgrant0607 mentioned this issue Nov 7, 2014

Capture application termination messages/output #2225

Merged

thockin mentioned this issue Nov 22, 2014

Sketch: per-api-version defaults #2541

Closed

yujuhong mentioned this issue Jan 28, 2015

Migrate API defaulting to a centralized location #3854

Merged

goltermann removed this from the v1.0 milestone Feb 6, 2015

thockin removed this from the v1.0 milestone Feb 6, 2015

bgrant0607 mentioned this issue Feb 14, 2015

Configuration reconciliation (aka kubectl apply) #1702

Closed

bgrant0607 assigned yujuhong and unassigned thockin Feb 14, 2015

This was referenced Feb 17, 2015

Remove obsolete defaulting in service/rest.go #4493

Merged

Add a brief description on how we handle defaults #4495

Merged

bgrant0607 closed this as completed Feb 24, 2015

bgrant0607 mentioned this issue Feb 28, 2015

Clean up validation code #2238

Closed

bgrant0607 mentioned this issue Apr 3, 2015

Add PodSpec.NodeFailurePolicy = {Reschedule, Delete, Ignore} #6393

Closed

bgrant0607 mentioned this issue Sep 23, 2015

Pod-level security context proposal #12823

Merged

bgrant0607 mentioned this issue Oct 19, 2015

PodSecurityContext with inline fields #14705

Merged

dobsonj pushed a commit to dobsonj/kubernetes that referenced this issue Mar 28, 2023

Merge pull request kubernetes#1502 from openshift-cherrypick-robot/ch…

06e8c46

…erry-pick-1498-to-release-4.13 [release-4.13] OCPBUGS-8412: Fix mounted volume expansion tests
