API plugin design thread #991

Closed
lavalamp opened this issue Aug 21, 2014 · 9 comments

Labels
area/api Indicates an issue on api area. area/extensibility kind/design Categorizes issue or PR as related to design. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@lavalamp
Member

I'm thinking a binary (for example, replication controller) does this to set itself up as a plugin:

  1. On startup, registers itself with apiserver.
    a. POST to /api/<version>/plugins
    b. Need following information: plugin name, object type, object version, api call destination, more?
    c. We use something similar to Brendan's master election code to store all this in etcd. key is the resource name (e.g., replicationController). We set ttl to something shortish, so the plugin has to dial in to an apiserver periodically to verify that it's still around.
  2. The plugin binary runs its own apiserver to serve its plugin objects.
    a. Need to think about where it stores them. Do we expose a resource from the main apiserver to allow plugins to store their objects in etcd? IMO we should not give plugins direct access to etcd.
  3. The plugin binary also runs whatever goroutines are necessary to support its activities.
    a. For example, watches, polls, etc...

We'd try to provide enough high level support that an individual plugin can be very concisely written.

Thoughts?

@smarterclayton
Contributor

For 1, I'm hesitant to define service discovery but I can see how it might be useful. As a client in a given environment you need to be able to look somewhere to find where to get the things - maybe we could punt and read an API backed by a config file or etcd value (with no ability to set via API) and let the administrator register their own plugins for use in an environment. Later on we could come back and do a further impl. This may not be simple enough, but we could easily define this:

GET /api
{
  "resources": [
    {"name": "pods", "versions": {"v1beta1": "/api/v1beta1/pods"}},
    {"name": "builds", "versions": {"v1": "https://foo.bar.com/builds/v1/builds"}}
  ]
}
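
To illustrate how a client might consume a document of that shape, here is a rough Go sketch that decodes it and resolves a resource name and version to a URL. The `Discovery`, `Resource`, `ParseDiscovery`, and `Resolve` names are hypothetical; only the JSON field names come from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Resource mirrors one entry of the "resources" array.
type Resource struct {
	Name     string            `json:"name"`
	Versions map[string]string `json:"versions"` // version -> URL or path
}

// Discovery mirrors the proposed GET /api response body.
type Discovery struct {
	Resources []Resource `json:"resources"`
}

// ParseDiscovery decodes a discovery document.
func ParseDiscovery(body []byte) (Discovery, error) {
	var d Discovery
	err := json.Unmarshal(body, &d)
	return d, err
}

// Resolve returns the location serving the named resource at the
// given version, or false if either is unknown.
func (d Discovery) Resolve(name, version string) (string, bool) {
	for _, r := range d.Resources {
		if r.Name == name {
			url, ok := r.Versions[version]
			return url, ok
		}
	}
	return "", false
}

// sample is the example document from the comment, as valid JSON.
const sample = `{
  "resources": [
    {"name": "pods", "versions": {"v1beta1": "/api/v1beta1/pods"}},
    {"name": "builds", "versions": {"v1": "https://foo.bar.com/builds/v1/builds"}}
  ]
}`

func main() {
	d, err := ParseDiscovery([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if url, ok := d.Resolve("builds", "v1"); ok {
		fmt.Println("builds v1 is served at", url)
	}
	if _, ok := d.Resolve("builds", "v2"); !ok {
		fmt.Println("builds v2 is not registered")
	}
}
```

Relative paths such as `/api/v1beta1/pods` would still need to be joined with the address of the server that returned the document; that step is left out here.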

For 2a, I'd say that many plugins will need their own data store, and it's silly to require multiple etcds, esp. if the administrator is the one deciding which plugins can be run. It should be possible for a client to run their own data store and their own plugin and register themselves - maybe that's a step 2 (dynamic plugin registration) down the road. I think there are two different use cases here - plugins an admin / deployer chooses, and plugins that are layered on by less privileged users. The former are likely to be configured against the main etcd (if that's the preferred store) by an admin, and the latter are configured against, e.g., a kube service defined in the infrastructure. But in both cases the plugins can be configured to point wherever they need. I think one critical thing is that a plugin should not be using direct etcd access to another component's data store unless the two are versioned together and intentionally coupled. Clients should be using the API to fulfill their pod/controller/service needs.

When admin/deployers choose plugins they're probably going to bundle them together and I suspect at small scales they would prefer deploying them together in the same binary/process. At larger scales, that uncoupling becomes more important (where I scale component X and Y differently).

@smarterclayton
Contributor

As far as service discovery, it's probably worth noting that the dominant mechanism in different environments may be out of the client author's control.

OpenStack is an example of a system where a well-behaved client is expected to follow a certain pattern. I do think it's possible that we'll do a Kube-OpenStack integration in the future - to do so we'd at least need to leverage some of their discovery, perhaps via a call like NewOpenStackDiscovery("https://openstack.myco.com/v3/endpoints", "kube", "openshift") (just an example).

I also assume that the larger a deployment gets the more likely that complex discovery becomes, while for many smaller deployments discovery should be trivial. I.e., a provider wishing to expose highly scoped subsets of Kube resources might not even have a static global registry, but instead want to delegate a client based on their credentials into a specific subset of APIs. I hesitate to speculate on whether that could be accomplished by the simple example above or not.
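
One way to picture both cases is a small discovery interface with interchangeable implementations: a fixed registry for small deployments, and a wrapper that narrows the result by credential. This is only a sketch; `Discoverer`, `StaticDiscovery`, and `ScopedDiscovery` are hypothetical names, and the token-to-resource table stands in for whatever a real provider would use to scope clients.

```go
package main

import (
	"fmt"
	"sort"
)

// Discoverer is a hypothetical abstraction over discovery
// mechanisms. Endpoints returns resource name -> URL for the
// caller identified by token.
type Discoverer interface {
	Endpoints(token string) map[string]string
}

// StaticDiscovery is a fixed global registry; it ignores the
// token and returns a copy of everything it knows.
type StaticDiscovery map[string]string

func (s StaticDiscovery) Endpoints(token string) map[string]string {
	out := make(map[string]string, len(s))
	for name, url := range s {
		out[name] = url
	}
	return out
}

// ScopedDiscovery wraps another Discoverer and limits each token
// to an allowed subset of resource names.
type ScopedDiscovery struct {
	Inner   Discoverer
	Allowed map[string][]string // token -> resource names (assumed policy source)
}

func (s ScopedDiscovery) Endpoints(token string) map[string]string {
	all := s.Inner.Endpoints(token)
	out := make(map[string]string)
	for _, name := range s.Allowed[token] {
		if url, ok := all[name]; ok {
			out[name] = url
		}
	}
	return out
}

func main() {
	var d Discoverer = ScopedDiscovery{
		Inner: StaticDiscovery{
			"pods":   "/api/v1beta1/pods",
			"builds": "https://foo.bar.com/builds/v1/builds",
		},
		Allowed: map[string][]string{"team-a-token": {"builds"}},
	}

	eps := d.Endpoints("team-a-token")
	names := make([]string, 0, len(eps))
	for name := range eps {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output
	for _, name := range names {
		fmt.Println(name, "->", eps[name])
	}
}
```

An OpenStack-backed implementation could in principle satisfy the same interface, which is the point of keeping the mechanism out of the client author's code.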

@bgrant0607
Member

We're definitely going to need something in this area. It's better to design for this sort of extensibility up front than to let it creep into the system organically in an inconsistent, ad hoc fashion. There are many useful API extensions: build/test/batch/workflow jobs (e.g., #503), cron, sharding controller, auto-scaler, deployment manager, etc.

What these things need is:

  • Seamless integration into auth mechanisms, SDKs, libraries, tools, config mechanisms, etc.
  • Creation trigger w/ request payload. This could be a call from the apiserver as described here, or a call from a config library as described in the config.md PR, or by watching an event or data stream.
  • Storage system to store the extension requests/objects.
  • Auth. delegation to observe and/or modify the entity the extension is creating/managing/monitoring.
  • Watch events relevant to the managed object.
  • Object lifetime management tied to associated object, or possibly general GC.
  • Logging, monitoring, analytics.

@smarterclayton
Contributor

Some thoughts:

Seamless integration into auth mechanisms, SDKs, libraries, tools, config mechanisms, etc.

Config is really the driving feature here - to make config effective you need an easy way to map declaration to a remote server call. You either mandate homogeneous auth from the client, or make it easy to wrap the auth checking around the client. You have to do some form of service discovery, which either means a global registry or local client registration, neither of which is perfect.

Creation trigger w/ request payload. This could be a call from the apiserver as described here, or a call from a config library as described in the config.md PR, or by watching an event or data stream.

Do you enforce strong REST consistency across these resources (including error conditions and subtle things like versioning) or require client glue code? The former is hard to manage as these things mature, the latter isn't centralized.

Storage system to store the extension requests/objects.

I'm assuming every system can choose to use its own store, but in practice there are benefits at small scales to reusing a standard / common store.

Auth. delegation to observe and/or modify the entity the extension is creating/managing/monitoring.

Requires resources to define somehow the list of extensions that are enabled to monitor them (as per pod templates), if I understand you correctly.

Watch events relevant to the managed object.

Across all objects? Or just local?

Logging, monitoring, analytics.

Analytics of the objects, the extensions providing the objects, or both? Monitoring seems like something that fits into core health concepts at some level.

@bgrant0607
Member

@smarterclayton Config is just one example of a higher-level orchestration and meta-programming system that would benefit from some uniformity among the set of target APIs. Building ad hoc glue everywhere is hard enough that in the absence of an extensible approach, there will be a lot of pressure to cram a lot of functionality into apiserver, which would then become a monolithic behemoth. As proposed in #1178, I do think we need a standard REST convention for compliant APIs.

Re. watch: I envision both private and multi-tenant APIs. So, both all (at least the ones where permissions were explicitly granted) and local should be possible.

Logging, monitoring, analytics: It would be useful to be able to log all mutations to BigQuery, for example. Making these data pipelines reliable is non-trivial, so allowing system extensions to leverage them would significantly reduce the work necessary to produce a production-quality API extension.

@smarterclayton
Contributor

Copied from another thread


In order to make config work, eventually the client needs a way to take an arbitrary chunk of JSON (which it is not expected to understand) and to find, only from "kind" and "apiVersion", a compatible client server it can post that value to.

Problems that have to be avoided:

  • a config object shouldn't have to compile in support for every version of every api object ever created (and all possible api objects), which means it shouldn't necessarily decode the values that come in config
  • it should be possible to do service discovery outside of code and get a map of (kind, apiVersion) -> (server, path_of_resource).
  • the config client needs to be able to know what operation to perform against the server (POST) and interpret the response, but not necessarily in a specific way

@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Dec 1, 2014
@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Dec 10, 2014
@bgrant0607 bgrant0607 added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Feb 28, 2015
@bgrant0607 bgrant0607 added this to the v1.2-candidate milestone Sep 12, 2015
@bgrant0607 bgrant0607 removed this from the next-candidate milestone Nov 4, 2016
@bgrant0607 bgrant0607 added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed team/cluster priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Nov 4, 2016
@bgrant0607
Member

@smarterclayton @lavalamp I think this has been superseded by apiserver federation for the most part. Is there any reason to keep this issue open?

I never had time to look at OpenStack's APIs in detail. Is there anything worth learning from them in this area?

@smarterclayton
Contributor

We've mostly followed a similar path: discovery is hosted natively here, whereas in OpenStack it is hosted in Keystone, with some level of service doc per service. Keystone provides authentication integration by each service delegating tokens, while we're considering multiple options including an auth proxy. Having native Swagger docs also helps in terms of describing generic resources. We have the concept of server standard behavior that allows generic clients to work across multiple backends. I think this can be closed.


@lavalamp
Member Author

I agree we're not going to solve this as listed in the OP.


6 participants