-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Our API code is hard to understand #2306
Comments
For example, why does api/conversion.go exist? |
Isn't api/conversion.go there to add/register the conversion functions? In general, I agree the API code is hard to understand. There are too many interfaces, and I am not sure they are all that necessary. A little background and/or higher-level overview on the API code would definitely help a newbie like me. |
@yujuhong Which conversion functions? vs. v1betaN/conversion.go files? |
Originally it was to facilitate containermanifest -> podspec, since kubelet continued to support legacy paths.
|
It seems like a temporary place to hold conversion functions. All of them can and should be moved to v1beta1/v1beta2/v1beta3, etc eventually. |
There will always be internal objects you want to convert across, although it should be a few (due to reuse of common structs.
|
It also hosts "copy constructors" that have been externalized from objects If we were not allergic to methods we could make the same-type conversion On Wed, Jan 28, 2015 at 3:47 PM, Clayton Coleman notifications@github.com
|
aachoo ----- Original Message -----
|
@smarterclayton I'm increasingly concerned about the API code complexity. Could you please sketch what API layers/modules we have in the client and server, what they do, and why they exist? I very much want to preserve extensibility and testability, but would like to find a way to reduce the number of levels of indirection, or at least rationalize and streamline the organization. |
Yes agree. ----- Original Message -----
|
Some new people joining the team have also mentioned the API code complexity. When I joined the team a couple of months ago I also found it baffling (and largely still do, even though I wrote a fairly large CL that modified it). At the very least we need to document what we have (both in the code and in a separate design doc). It would also be nice if we could simplify it. |
What do I include in "API code"?
Really, there are 3 overlapping sets of code:
|
FWIW I think the badness situation is basically 2 is the worst, then 1, then 3. |
Can you define badness in terms of actionable cleanup for 2? ----- Original Message -----
|
The actionable cleanup for (1) and (3) are most clear to me. As discussed in #1451 (1) and #2726 (3), the apiserver should be entirely generic. Status should be pushed into etcd via calls to apiserver by controllers and not computed on the fly by apiserver. Validation and defaulting code should live with the registry objects. The individual registry objects shouldn't have any custom etcd code. Mainly what I wanted to be the topic of this bug was (2), and I don't yet understand that code well enough to prescribe a solution. |
I think we have very clear paths for 1 that can be described today (based on iterative steps Dan and I had been taking). 3. Definitely can be simplified to fit more closely to helpers around gorestful, but that's been mostly small changes made at each point, and as you say it's in the best shape. For 2, I can describe a fair amount of what is there now in the context of trying to tie external versions to go structs. The Scheme is a bit of a catch all object that could be more cleanly factored into a smaller object. Encoding and decoding are above the scheme, because they can use the go structs directly, except for some coupling to the object structure themselves which we can work around. Some of the complexity in the code is an attempt to be moderately efficient (although encode/decode/conversion are anything but today), which has a lot of low hanging fruit. I'll try to draft up something on that specifically soon (and #1, because from where we are today to where we want to be should be moderately mechanical refactoring).
|
Next steps for simplifying RESTStorage, registry, generic registry, and etcd behavior (1 above):
The current path implementation:
Example:
It's assumed that writing stores to new backends is relatively uncommon, and that the test suite would be easily adapted to new storage types. So we preserve the behavior that PodEtcd implements RESTStorage, and that the apiserver can check whether RESTStorage implements a method via a cast to an interface. This makes PodEtcd a very thin layer around generic etcd, and someone can write GenericPostgreSQL or GenericConsul in the future. Remaining todos:
|
Thanks for the writeup @smarterclayton. Will try to digest and reconcile with previous discussion in #1451. I don't yet understand the division between pkg/conversion and pkg/runtime, which contain marshaling/unmarshaling, conversion, defaulting, some generic metadata accessors, some registry-like functionality (per-kind functionality, collections of kinds), support for extensions, and probably some other stuff. |
I think I like 99% of what's written here. The part I want clarity on is how widespread knowledge of etcd needs to be? I was expecting to see something along the lines of #2594 wherein the kernel of the API server knows that etcd is how we implement the generic Storage interface. All other REST object code simply uses the generic Storage. More succinctly - the word "etcd" should not appear in the codebase except in the implementation of the storage interface and the code that constructs the concrete instance. |
FWIW I think this would be a great topic for the contributors meeting--describe the current partitioning of this functionality and the proposal. |
Ok, I think I understand this now. However, I second @thockin's question about etcd. There is currently business logic in pkg/registry/etcd/etcd.go that doesn't belong there, such as status field initialization upon create or update. It's horrible that the business logic is spread between individual registry objects (e.g., pkg/registry/pod/rest.go), etcd.go, pod_cache.go, and the responsible components (controller-manager, kubelet, etc.). And that doesn't even include the validation, conversion, defaulting, and fuzzing logic. A primary goal has to be to consolidate object-centric mutation logic. Making the apiserver framework totally generic and the etcd layer totally generic are also important goals. Objects shouldn't care about details like what keys are used to store them in etcd -- that should be generically derived from Kind. We should solve field selection #1362 generically, also. Preconditions should all be reducible to resourceVersion-based preconditions. The BoundPods stuff will go away with BoundPods. |
|
See also #1451 |
Started a proposal / summary here #7111 |
It's still getting worse, not better, along the understandable metric. |
cc @mml |
|
The majority of the refactors have been executed in 1.6, although more work in documentation is certainly helpful. We actually have clean separation of core packages, layering, and stronger control on what goes where. |
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with an If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or |
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
I've been digging through our API code in order to automatically document the API, and it's hard to understand.
The division between master and apiserver isn't clean, and the API machinery is spread out between (at least) api, runtime, conversion, version, and registry, with a seemingly arbitrary decomposition into files and a bewildering proliferation of interfaces (storage, codec, scheme, ...). Additionally, it's not obvious which code is used by the server, by clients, or both. Also, ideally, the generic machinery would be cleanly separated from Kubernetes's specific API object implementations.
After we get some API plugins working, I suggest we look hard at how to simplify and reorganize this.
/cc @lavalamp @smarterclayton
The text was updated successfully, but these errors were encountered: