-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Design Server Side Apply #4758
Design Server Side Apply #4758
Conversation
controllers Signed-off-by: joshvanl <vleeuwenjoshua@gmail.com>
Signed-off-by: joshvanl <vleeuwenjoshua@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, thanks for putting it together.
It'd be good to see an expanded section on a migration plan too, e.g. if a user turns this feature on, is there any risk of old fields being deleted? What if they disable it again after?
design/20220118.server-side-apply.md
Outdated
|
||
To reach graduation, the cert-manager authors should consider the server side | ||
apply implementation for cert-manager to be safe for end users. Graduation | ||
should ideally be done over no more than one or two releases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we validate it is safe? Are there metrics we can use? (e.g. "we expect to see a decrease in the number of resource version conflicts"?
Will we directly solicit feedback from users?
Graduation should ideally be done over no more than one or two releases.
I think it's great to timebox this, however if we don't have confidence, we should not proceed with graduation. (and I think it's worth noting that here too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this is something we should define. I'll add a note about soliciting user feedback, but we should keep thinking of other ideas (if there is anything else we can do).
Indeed, this was meant as a rfc SHOULD
rather than MUST
. I'll make that more clear
Signed-off-by: joshvanl <vleeuwenjoshua@gmail.com>
Signed-off-by: joshvanl <vleeuwenjoshua@gmail.com>
design/20220118.server-side-apply.md
Outdated
API server to revoke management of that field from the previous owner, overwrite | ||
the field, and change owner ship to the new client. | ||
one controller (issuing and trigger Certificate controllers, and cmctl), and as | ||
such, will need to make use of the `force` option in their API calls. This |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: will there be any cases where force
should not be used?
I guess the assumption in this design is that all conflicts that we get are caused by controllers modifying different fields of the same resource (I assume that we cannot end up in a state where a controller has an older version of some resource (i.e due to some slow operations) and ends up force overwriting field X that was also modified by another controller in a newer version of the same resource).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will have a think, but I don't think there is any case where we do not want to use force
. Getting a conflict error is an opportunity for the client to make a decision on whether they want to overwrite, co-own, or drop management of that value- all of our controllers own their managed fields and will want to overwrite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused about using 'force' - this makes me think it is potentially overwriting other controller's progress.
Can we expand in the doc what force does, when it's appropriate to use, when we do use it, and why it's okay that we use it where we do/any special considerations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add more comments to the doc to clear up what force
does, and how it relates to the management. From the Kube docs;
Conflicts can be forced, in which case the value will be overridden, and the ownership will be transferred.
https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts
Overwrite value, become sole manager: If overwriting the value was intentional (or if the applier is an automated process like a controller) the applier should set the force query parameter to true and make the request again. This forces the operation to succeed, changes the value of the field, and removes the field from all other managers' entries in managedFields.
Don't overwrite value, give up management claim: If the applier doesn't care about the value of the field anymore, they can remove it from their config and make the request again. This leaves the value unchanged, and causes the field to be removed from the applier's entry in managedFields.
Don't overwrite value, become shared manager: If the applier still cares about the value of the field, but doesn't want to overwrite it, they can change the value of the field in their config to match the value of the object on the server, and make the request again. This leaves the value unchanged, and causes the field's management to be shared by the applier and all other field managers that already claimed to manage it
fwiw, I think force
isn't a great name 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the only case where it could overwrite other controllers progress could be something like- controller X and controller Y both pick up resource foo
at version N where cluster state is T and resource state is Z. X would be writing to field A if cluster state was T and resource state was Z, Y would be writing to field A if cluster state was K and resource state was Z. X writes to the field, so the resource state is no longer Z. Meanwhile, cluster state changes to K for unrelated reasons and Y who still has foo
at state Z and has been doing some slow operation now overwrites the field A.
Signed-off-by: joshvanl <vleeuwenjoshua@gmail.com>
Looks good to me, thank you for replying to all the comments! Adding an lgmt with an assumption that this will be feature gated and that graduation steps will be easier worked out once the feature is in use and a hold in case more folks have comments /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: irbekrm, JoshVanL The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: joshvanl <vleeuwenjoshua@gmail.com>
Hi! This is a notice and does not require any action. As part of the effort of migrating from Bazel to Make, I wanted to let you know that we have added two new Prow jobs |
I think that this could now be merged, given that we have already merged most of the actual implementation and I see that all outstanding comments have been resolved /lgtm |
/retest |
/test pull-cert-manager-e2e-v1-23 |
/retest |
This PR contains a design document for implementing Server Side Apply for all cert-manager controllers. Discussion is very welcome 🙂
Some related issues/PRs:
#4638
#4598
#4552