Document API/system design principles and patterns #6133

Closed
bgrant0607 opened this issue Mar 28, 2015 · 6 comments · Fixed by #6933
Labels
kind/design Categorizes issue or PR as related to design. kind/documentation Categorizes issue or PR as related to documentation. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@bgrant0607
Member

Related to #5476, #2003, #1007, #5620, and other issues.

The PR creation rate and number of contributors are both growing. In order to scale the effort, we need to better document the patterns we want additions to the system and API to follow, so that we can better parallelize the design and review effort. I'm filing this issue to solicit important principles and patterns to document. I volunteer to create the actual document and/or presentation slides.

What do you think isn't sufficiently covered by existing design docs and/or isn't sufficiently discoverable? What principles/patterns do you find yourself explaining over and over again? What properties would you be unhappy to have violated/degraded/changed by new PRs?

I'm thinking more about reasonably general issues rather than implementation-oriented invariants that apply to very specific/narrow parts of the code.

Examples (some areas, some principles):

  • Declarative APIs, desired state and current state
  • Level-based operation: Functionality must be level-based, meaning the system must operate correctly given the desired state and the current/observed state, regardless of how many intermediate state updates may have been missed. Edge-based behavior must be just an optimization.
  • Status must be 100% reconstructable by observation. Any history kept must be just an optimization and not required for correct operation.
  • Phase, Reason, Message, Conditions, etc. vs. a single state machine with lots of states
  • Composable, a la carte functionality
  • Transparent, no glass to break
  • More generally, Eric Raymond's 17 UNIX rules
  • Only the apiserver should communicate with etcd/store, and not other components (scheduler, kubelet, etc.).
  • Compromising a single node shouldn't compromise the cluster (not that we've achieved that yet)
  • Components should continue to do what they're told in the absence of new instructions.
  • Open world assumption: continually verify assumptions and gracefully adapt to external events and/or actors (example: we allow users to kill pods under control of a replication controller; it just replaces them)
  • Don't assume that a component's decisions will not be overridden or rejected, or that the component will always understand why. For example, etcd may reject writes. Kubelet may reject pods. The scheduler may not be able to schedule pods.
  • When to use labels vs. annotations vs. object references
  • Watch vs. polling
  • Bootstrapping-related principles
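
To make the level-based bullet above concrete, here is a minimal Go sketch. The `Action` type and `Reconcile` function are hypothetical names invented for illustration, not code from the Kubernetes tree. The point is that the corrective step is computed only from the desired and observed levels, so it stays correct no matter how many intermediate updates were missed.

```go
package main

import "fmt"

// Action is a hypothetical description of one corrective step.
type Action struct {
	Create int // number of replicas to start
	Delete int // number of replicas to stop
}

// Reconcile derives the next action purely from the desired and the
// observed replica counts. It keeps no history and consumes no events,
// so missed intermediate updates cannot make it wrong.
func Reconcile(desired, observed int) Action {
	switch {
	case observed < desired:
		return Action{Create: desired - observed}
	case observed > desired:
		return Action{Delete: observed - desired}
	default:
		return Action{} // already converged, nothing to do
	}
}

func main() {
	// A user killed two of three pods while the controller was down.
	fmt.Println(Reconcile(3, 1)) // {2 0}
	// Something outside the controller started two extra pods.
	fmt.Println(Reconcile(3, 5)) // {0 2}
	// Desired and observed state agree.
	fmt.Println(Reconcile(3, 3)) // {0 0}
}
```

An edge-based variant would instead react to individual "pod deleted" notifications. Under the principle above, that is acceptable only as an optimization layered on top of a loop like this one.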

Let's brainstorm first, and argue and prioritize afterward.

@smarterclayton @derekwaynecarr @thockin @erictune @lavalamp

@bgrant0607 bgrant0607 added kind/design Categorizes issue or PR as related to design. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/documentation Categorizes issue or PR as related to documentation. team/any labels Mar 28, 2015
@bgrant0607 bgrant0607 self-assigned this Mar 28, 2015
@lavalamp
Member

  • Master operations must be constant time.
  • Cluster-wide invariants are difficult to get right. Try not to add them. If you must have them, don't enforce them atomically in master components: that is contention-prone and provides no recovery path if a bug allows the invariant to be violated. Instead, provide a series of checks to reduce the probability of a violation, and make every component involved able to recover from an invariant violation. Example: scheduler assigning pods to kubelet.
  • Self-healing components. For example, if you must keep some state (e.g., a cache), its content needs to be refreshed periodically, so that if an item is erroneously stored or a deletion event is missed, the error is soon fixed, ideally on timescales shorter than what will attract attention from humans.
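
A rough Go sketch of the self-healing cache idea, under stated assumptions: the `Cache` type, its methods, and the `list` callback are all hypothetical and stand in for whatever authoritative listing a real component would use. Incremental updates are treated as an optimization, and a full resync replaces the cache content so that a missed deletion does not persist.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical local cache of objects keyed by name.
type Cache struct {
	mu    sync.Mutex
	items map[string]string
}

// NewCache returns an empty cache.
func NewCache() *Cache {
	return &Cache{items: map[string]string{}}
}

// Set records one incremental update (the edge-based fast path).
func (c *Cache) Set(name, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[name] = value
}

// Has reports whether the cache currently holds the named item.
func (c *Cache) Has(name string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.items[name]
	return ok
}

// Resync discards the cached content and rebuilds it from a full listing
// of the source of truth, so stale or erroneous entries cannot survive.
func (c *Cache) Resync(list func() map[string]string) {
	fresh := list()
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items = make(map[string]string, len(fresh))
	for k, v := range fresh {
		c.items[k] = v
	}
}

func main() {
	cache := NewCache()
	cache.Set("pod-a", "running")
	cache.Set("pod-b", "running") // pod-b is later deleted, but the event is missed

	// The authoritative listing no longer contains pod-b.
	truth := func() map[string]string {
		return map[string]string{"pod-a": "running"}
	}

	fmt.Println(cache.Has("pod-b")) // true: the cache is stale
	cache.Resync(truth)
	fmt.Println(cache.Has("pod-b")) // false: the resync healed it
}
```

In a long-running component, `Resync` would presumably be called on a timer (for example from a `time.Ticker` loop), with the interval chosen to be shorter than the time it takes a human to notice the inconsistency.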

@bgrant0607
Member Author

Status and spec principles: https://groups.google.com/forum/#!topic/kubernetes-dev/HKg8OD_PNfw

@bgrant0607
Member Author

Other things I noticed missing from api-conventions.md:

  • adequate documentation of watch
  • subresources
  • delete options, deletionTimestamp, and graceful termination
  • list options
  • object references
  • selfLink
  • events

@bgrant0607
Member Author

And pod-states.md is woefully out of date (e.g., it references "PodStatus" rather than "PodPhase").

@bgrant0607
Member Author

Working on API-visible conventions first.

bgrant0607 added a commit to bgrant0607/kubernetes that referenced this issue Apr 15, 2015
bgrant0607 added a commit that referenced this issue Apr 15, 2015
Updated API conventions and other details, per #6133.
@bgrant0607 bgrant0607 reopened this Apr 15, 2015
@bgrant0607
Member Author

Another principle: all relevant API info should be represented in JSON. Relevant HTTP header fields should mirror the JSON fields. One example that came up just now: status codes (which are represented in Status). We've previously discussed using ETags and If-Match as well, and we're going in this direction for URL query parameters, too.

bgrant0607 added a commit to bgrant0607/kubernetes that referenced this issue Apr 16, 2015
…rnetes#4182.

# *** ERROR: *** docs are out of sync between cli and markdown
# run hack/run-gendocs.sh > docs/kubectl.md to regenerate

#
# Your commit will be aborted unless you regenerate docs.
    COMMIT_BLOCKED_ON_GENDOCS
xingzhou pushed a commit to xingzhou/kubernetes that referenced this issue Dec 15, 2016
…rnetes#4182.

2 participants