Document API architectural approach for soundness and consistency #41954

Open
bgrant0607 opened this issue Aug 16, 2016 · 28 comments
Labels
language/en Issues or PRs related to English language lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/service-catalog Categorizes an issue or PR as relevant to SIG Service Catalog. wg/api-expression Categorizes an issue or PR as relevant to WG API Expression.

Comments

@bgrant0607
Member

bgrant0607 commented Aug 16, 2016

There are a number of distributed-systems challenges with our API, which is:

  • eventually consistent (no guarantee about when a mutation will be observed),
  • weakly consistent (no guarantee that mutations will be observed in order), and
  • transactional only for individual resources.

Cases:

  • I modify resource A and watch A. How can I tell when I've observed my update, assuming I'm not the only actor?
  • I modify resource A and resource B. How can I tell when I've observed both updates, assuming I'm not the only actor?
  • I modify resource A and resource B. How can I tell when the controller managing resource B has observed the update to resource A?
  • I create resources A, B, ..., Z. How can I tell when I've observed the creation of all of those resources, assuming I'm not the only actor (e.g., some resources might quickly be deleted by another agent)?
  • More concrete: I'm the ReplicaSet controller. How can I ensure that I update ReplicaSet status with the most up-to-date pod status in a HA, master-elected configuration, and am not?

There are good reasons for the weak consistency semantics of the API, such as composability with add-on controllers, federated APIs, sharded storage, multiple layers of caches, etc.

The typical means of providing strong consistency is to provide all clients direct access to the database. That's not a viable approach for Kubernetes.

However, we probably have zero implementations of sound clients at the moment.

Examples of mechanisms we've discussed and/or partially implemented that would help:

But we should think about the problem holistically.

This is a prereq to kubernetes/kubernetes#1957

cc @lavalamp @pwittrock @erictune

@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Aug 16, 2016
@lavalamp
Member

lavalamp commented Aug 16, 2016

Actually it's better than I expected.

Cases:

  • I modify resource A and watch A. How can I tell when I've observed
    my update, assuming I'm not the only actor?

The modification gives you back a ResourceVersion, which you can compare
(via equality) with RVs coming down your watch. Unfortunately you'd have to
expand this to compare via < to handle the general case where you have to
restart your watch--we currently claim that you cannot do this comparison,
although in practice it is currently safe if you confine yourself to a
single resource type.
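
A minimal sketch of the equality check described above, not the real client API: `WatchEvent` and `observed_own_write` are hypothetical names, and the event stream is modelled as a plain list. The ResourceVersion is treated as an opaque string and only ever compared for equality, which is why this version cannot cope with a restarted watch that has already moved past the write.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class WatchEvent:
    """Hypothetical, simplified stand-in for one event delivered on a watch."""
    name: str
    resource_version: str  # opaque; only compared for equality


def observed_own_write(events: Iterable[WatchEvent], name: str, written_rv: str) -> bool:
    """Return True once the watch delivers the event produced by our write.

    written_rv is the ResourceVersion returned by the modification.
    Because the check is equality only, a watch that restarts after the
    write may never deliver a matching event.
    """
    for event in events:
        if event.name == name and event.resource_version == written_rv:
            return True
    return False
```

With several resources, the same check would be repeated with one recorded ResourceVersion per resource.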

  • I modify resource A and resource B. How can I tell when I've
    observed both updates, assuming I'm not the only actor?

Same answer, just tracking RV per-resource.

  • I modify resource A and resource B. How can I tell when the
    controller managing resource B has observed the update to resource A?

No general way at the moment.

  • I create resources A, B, ..., Z. How can I tell when I've observed
    the creation of all of those resources, assuming I'm not the only actor
    (e.g., some resources might quickly be deleted by another agent)?

Same as first two answers.

  • More concrete: I'm the ReplicaSet controller. How can I ensure that
    I update ReplicaSet status with the most up-to-date pod status in a HA,
    master-elected configuration, and am not?

I'm not 100% sure I follow the question, I think the ending is garbled?
But if you meant something like what I expect, I think we can extend the
precondition concept to support this.

@adohe-zz

adohe-zz commented Aug 16, 2016

/subscribe

@bgrant0607 bgrant0607 added the sig/service-catalog Categorizes an issue or PR as relevant to SIG Service Catalog. label Sep 29, 2016
@caesarxuchao
Member

/sub

@bgrant0607
Member Author

Somewhat related: kubernetes/kubernetes#34363

@lavalamp
Member

lavalamp commented Oct 7, 2016

Fixed my email-garbled comment above.

@smarterclayton
Contributor

One more - preconditions on deletion and other actions. A resourceVersion precondition on delete, for example.

@ash2k
Member

ash2k commented Aug 3, 2017

> resourceVersion precondition on delete

Would be very useful. Right now there is a race between delete and any other operation that updates the object.
E.g. a controller that owns an object (has a controller owner reference pointing to it) cannot safely delete it because something else may change ownership concurrently. I don't know whether this happens in practice in the Kubernetes codebase, but it is an issue for Custom Resource controllers.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2018
@bgrant0607
Member Author

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jan 9, 2018
@nikhita
Member

nikhita commented Mar 4, 2018

/remove-lifecycle stale

@warmchang
Contributor

Hi, is there any plan for this? Thx!

@lavalamp
Member

lavalamp commented Aug 6, 2019

> One more - preconditions on deletion and other actions. A resourceVersion precondition on delete, for example.

For those following along at home, we have this now.

@lavalamp
Member

lavalamp commented Aug 6, 2019

I'd like to update my answers above considering the large amount of change the system has undergone.

  • I modify resource A and watch A. How can I tell when I've observed
    my update, assuming I'm not the only actor?

Record the metadata.generation returned; when you observe an update with a generation >= that one, you've observed your change.
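
As a rough illustration of that bookkeeping (a sketch, not client code): `unobserved_writes` is a hypothetical helper, watch events are reduced to `(name, generation)` pairs, and it assumes every write in question actually bumps `metadata.generation`.

```python
from typing import Dict, Iterable, Tuple


def unobserved_writes(pending: Dict[str, int],
                      events: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Return the writes that the event stream has not yet confirmed.

    pending maps a resource name to the generation returned by our write.
    A write counts as observed once an event for that resource carries a
    generation greater than or equal to the recorded one.
    """
    remaining = dict(pending)
    for name, generation in events:
        target = remaining.get(name)
        if target is not None and generation >= target:
            del remaining[name]
        if not remaining:
            break  # everything we wrote has been observed
    return remaining
```

An empty result means every recorded write has been seen; anything left over is still outstanding.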

  • I modify resource A and resource B. How can I tell when I've
    observed both updates, assuming I'm not the only actor?

Same answer, just tracking RV per-resource.

  • I modify resource A and resource B. How can I tell when the
    controller managing resource B has observed the update to resource A?

We rely on the controller author to do something useful like record the observed generation.
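
A sketch of what a client-side check could look like if the controller cooperates. It assumes the controller publishes `status.observedGeneration`, the field name some built-in workload resources use; whether a given controller does so is exactly the assumption stated above. The object is modelled as a plain nested mapping, and `controller_has_caught_up` is a hypothetical helper name.

```python
from typing import Any, Mapping


def controller_has_caught_up(obj: Mapping[str, Any]) -> bool:
    """True when the controller reports having seen the current generation.

    Returns False when either field is missing, since nothing can be
    concluded about a controller that does not record what it observed.
    """
    generation = obj.get("metadata", {}).get("generation")
    observed = obj.get("status", {}).get("observedGeneration")
    if generation is None or observed is None:
        return False
    return observed >= generation
```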

  • I create resources A, B, ..., Z. How can I tell when I've observed
    the creation of all of those resources, assuming I'm not the only actor
    (e.g., some resources might quickly be deleted by another agent)?

We don't offer a way to detect an "after" relationship with a deletion. But we do now offer both UID and RV deletion preconditions, so folks doing the deletion no longer risk losing a change accidentally.
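
To make the precondition idea concrete, here is a toy in-memory model, not the real API server: `ToyStore`, `StoredObject` and `Conflict` are hypothetical names. The delete only goes through if the caller's UID and ResourceVersion still match what is stored, so a concurrent update causes the delete to fail instead of silently discarding the change.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class Conflict(Exception):
    """Raised when a precondition does not match the stored object."""


@dataclass
class StoredObject:
    uid: str
    resource_version: str


class ToyStore:
    """In-memory stand-in for the API server, for illustration only."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, name: str, obj: StoredObject) -> None:
        self._objects[name] = obj

    def delete(self, name: str, uid: Optional[str] = None,
               resource_version: Optional[str] = None) -> None:
        """Delete name only if every supplied precondition still holds."""
        current = self._objects[name]
        if uid is not None and current.uid != uid:
            raise Conflict("uid precondition failed")
        if resource_version is not None and current.resource_version != resource_version:
            raise Conflict("resourceVersion precondition failed")
        del self._objects[name]
```

A caller that read the object at one ResourceVersion and then loses a race with another writer gets `Conflict` and can re-read before deciding whether the delete is still appropriate.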

  • More concrete: I'm the ReplicaSet controller. How can I ensure that
    I update ReplicaSet status with the most up-to-date pod status in a HA,
    master-elected configuration, and am not?

I have unpublished drafts on individual object locking for controllers-- it involves an annotation with an "I hold lock X" assertion + a webhook which does a consistent read of the lock object to confirm.

@bgrant0607
Member Author

Thanks for the updates.

Is metadata.generation updated for all resource types?

@lavalamp
Member

lavalamp commented Aug 6, 2019

It is not 100% automated, so it's possible for an individual resource to do it wrong, but we would treat that as an important bug.

@thockin
Member

thockin commented Aug 19, 2022

Closing old issues that are unlikely to be useful any further.

@thockin thockin closed this as completed Aug 19, 2022
@logicalhan
Member

> Closing old issues that are unlikely to be useful any further.

I actually believe this issue is still useful. We need documentation on the guarantees we provide for each API call, and on which combinations of API calls a client needs to make to preserve data integrity at a given level.

@logicalhan
Member

To some extent, this issue served as documentation.

@thockin
Member

thockin commented Aug 19, 2022

Can we turn it into documentation? Or re-open it as a request to write such documentation?

@logicalhan
Member

> Can we turn it into documentation? Or re-open it as a request to write such documentation?

Yeah that makes sense.

@sftim
Contributor

sftim commented Jul 9, 2023

For some more context, read: Life Beyond Distributed Transactions: An apostate's opinion (https://queue.acm.org/detail.cfm?id=3025012)

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Jul 9, 2023
@k8s-ci-robot
Contributor

@sftim: Reopened this issue.

In response to this:

> For some more context: https://queue.acm.org/detail.cfm?id=3025012
>
> /reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.


@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 9, 2023
@sftim
Contributor

sftim commented Jul 9, 2023

/transfer website
/sig docs
/kind documentation

@k8s-ci-robot k8s-ci-robot added the sig/docs Categorizes an issue or PR as relevant to SIG Docs. label Jul 9, 2023
@k8s-ci-robot k8s-ci-robot transferred this issue from kubernetes/kubernetes Jul 9, 2023
@sftim
Contributor

sftim commented Jul 9, 2023

/retitle Document API architectural approach for soundness and consistency

Some parts of this work might end up in https://k8s.dev/docs/

@k8s-ci-robot k8s-ci-robot changed the title Make it possible to write a sound client from a distributed-systems perspective Document API architectural approach for soundness and consistency Jul 9, 2023
@sftim
Contributor

sftim commented Jul 9, 2023

/language en
/wg api-expression
/triage accepted
/lifecycle frozen
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added language/en Issues or PRs related to English language wg/api-expression Categorizes an issue or PR as relevant to WG API Expression. triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 9, 2023
@sftim
Contributor

sftim commented Jul 9, 2023

/remove-priority backlog

@k8s-ci-robot k8s-ci-robot removed the priority/backlog Higher priority than priority/awaiting-more-evidence. label Jul 9, 2023
@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jul 8, 2024