kube: wrapper around informer #43870
Conversation
This PR introduces a new abstraction around a Kubernetes Informer, Lister, and Client. Benefits:

* Automatic handling of HasSynced for handlers. Normally, HasSynced only covers the *store*; handlers are not guaranteed to have been called. Kubernetes recently added a way to check for handlers syncing, but using it is verbose.
* Easier use - Lister Get and List cannot actually fail but still return errors; removing those makes calling code simpler. Direct use of the underlying Indexer, Store, or Controller, which are even harder to use correctly, is also no longer required.
* Better typing - less use of `any` requiring casting.
* Less verbose - for example, `s.client.NetworkingV1().Ingresses(currIng.Namespace).UpdateStatus(context.TODO(), currIng, metav1.UpdateOptions{})` becomes `s.ingresses.UpdateStatus(currIng)`.
* A common type for everything - this allows us to implement higher-level abstractions like CreateOrUpdate, a test wrapper (which t.Fatal's any errors), etc.
* Easier to properly remove handlers from informers, which is very desirable for some cases.
* Easier to create an informer on a LabelSelector or FieldSelector, which previously took roughly 15 lines just to create one.
* Easier to use the DiscoveryNamespaceFilter; filtered usage shares the same type as unfiltered usage, making interop easier.
* SetWatchErrorHandler is handled from a centralized location, removing the need to handle it in every controller and removing HasStopped.
* Introduces a fully commented example of writing a proper controller in Istio.
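As a concrete illustration of the API shape described above, here is a minimal, self-contained Go sketch. The names (`Cached`, `Object`, `UpdateStatus`) mirror the PR description, but the bodies are toy stand-ins for illustration only, not the actual implementation in pkg/kube/client.

```go
package main

import "fmt"

// Object is a toy stand-in for controllers.Object; the real one is a
// Kubernetes runtime object with full metadata.
type Object interface {
	GetName() string
}

// Cached is an illustrative, trimmed-down version of the typed, cached
// client the PR introduces: reads come from a local store and never
// return spurious errors.
type Cached[T Object] struct {
	store map[string]T
}

func NewCached[T Object]() *Cached[T] {
	return &Cached[T]{store: map[string]T{}}
}

// Get returns the zero value when the object is absent; unlike the raw
// Lister API, there is no error return to handle.
func (c *Cached[T]) Get(name string) T {
	return c.store[name]
}

// UpdateStatus replaces the verbose typed-client call chain quoted above.
func (c *Cached[T]) UpdateStatus(obj T) {
	c.store[obj.GetName()] = obj
}

// Ingress is a toy resource type used for the demo.
type Ingress struct{ Name, Status string }

func (i *Ingress) GetName() string { return i.Name }

func main() {
	ingresses := NewCached[*Ingress]()
	ingresses.UpdateStatus(&Ingress{Name: "web", Status: "Ready"})
	fmt.Println(ingresses.Get("web").Status) // Ready
}
```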
pkg/kube/client/erased.go (outdated):

```go
	"istio.io/pkg/log"
)

// Erased is a Cached with
```
nit: It is not clear what Erased does.
It's basically a non-generic type, used in places where we don't know the types. I am very open to changing the names; I don't love them.
Also looks like I forgot to finish the comment here..
I see. You mean something similar to type erasure? I agree naming is a little hard here. Now I understand the difference between this and Cached. I will think about naming a bit. I will review in detail tomorrow.
I think we could also just make it a Cached[controllers.Object] if that's clearer... will probably do that.
I renamed to Untyped and just made it an alias for Cached[controllers.Object]
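For reference, an alias of an instantiated generic type is legal Go, so the rename costs nothing at runtime. A self-contained sketch (the type and method bodies here are simplified stand-ins, not the real ones):

```go
package main

import "fmt"

// Object mirrors the minimal surface of controllers.Object for illustration.
type Object interface{ GetName() string }

// Cached stands in for the PR's generic interface.
type Cached[T Object] interface {
	Get(name string) T
}

// Untyped is the rename described above: not a new type, just an alias for
// the interface instantiated at the most general object type, so typed and
// untyped callers interoperate without casting.
type Untyped = Cached[Object]

// memCache is a toy implementation backing the demo.
type memCache struct{ objs map[string]Object }

func (m memCache) Get(name string) Object { return m.objs[name] }

type pod struct{ name string }

func (p pod) GetName() string { return p.name }

func main() {
	var u Untyped = memCache{objs: map[string]Object{"p": pod{name: "p"}}}
	fmt.Println(u.Get("p").GetName()) // p
}
```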
Just some minor nits on the new interfaces. Will take a look at the rest of the code.
pkg/kube/client/cached.go (outdated):

```go
	ShutdownHandlers()
}

type Cached[T controllers.Object] interface {
```
WDYT about naming this Wrapped/Wrapper?
I don't really like the name Wrapped, since it's not descriptive and exposes implementation details. Users don't need to know it's wrapping something; they just need to know the functionality. An informer itself is also a wrapper of a bunch of things wrapping others.
pkg/kube/client/cached.go (outdated):

```go
// Cached wraps a Kubernetes client providing cached read access and direct write access.
// This is based on informers, so most of the same caveats to informers apply here.
type CachedRead[T controllers.Object] interface {
```
Similarly ReadWrapper?
pkg/kube/client/cached.go (outdated):

```go
// NewCached returns a Cached for the given type.
// Internally, this uses a shared informer, so calling this multiple times will share the same internals.
func NewCached[T controllers.Object](c kube.Client) Cached[T] {
```
NewWrapped/NewWrapper?
pkg/kube/client/cached.go (outdated):

```go
// Warning: currently, if filter.LabelSelector or filter.FieldSelector are set, the same informer will still be used.
// This means there must only be one filter configuration for a given type using the same kube.Client (generally, this means the whole program).
// Use with caution.
func NewCachedFiltered[T controllers.Object](c kube.Client, filter Filter) Cached[T] {
```
NewFilteredWrapper?
PR looks good - this comment is more about future changes and the limits of the current design.

We keep in memory all Pods, Endpoints, Services, and a dozen other object types. I suspect in most clusters this is the bulk of the data Kubernetes maintains (yes, there are many CRDs we don't watch, but most people create pods/services/etc.). I don't know if the API server keeps all CRs in memory (a full cache) or relies on etcd/sqlite/etc. and only loads and sends, acting as a wrapper rather than a full memory cache. But it doesn't feel right to keep a large percentage of the dataset in memory - not only for one cluster, but for ALL clusters. We actually only use a few fields from Pod/Endpoint/etc. At some point it may make sense to drop the cached objects, and instead compute our indexes on list(Pod) (calling all Istio components that need pods), and also call all 'onChange' callbacks, but without caching the full object.

I would go one step further - while in small meshes watching ALL other API servers and keeping in RAM most of the data of not just one but ALL the clusters is OK, for very large meshes this may not scale well. We still have the XDS federation code in Istiod and expose the subset of info we need - maybe watching Istiod in other clusters for the processed subset of data we need may scale better?

Most of this is not relevant for the short term - but in terms of API and best practices I don't think this PR is ideal (still better than what we have, so +1 on merging it). Hiding the 'pods' cache and providing onChange(pod) and onList(pods) or even onListChunk(pods) callbacks would allow dropping the cache at some point and scaling better. It would also work nicely with getting data from XDS.
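A rough sketch of the callback-based shape this comment proposes, with all names invented for illustration: consumers register onChange callbacks and maintain their own narrow indexes, so the library never needs to retain full objects.

```go
package main

import "fmt"

// Pod is a stand-in for the Kubernetes Pod type; only the fields Istio
// actually consumes are modeled.
type Pod struct {
	Name string
	IP   string
}

// PodWatcher is a hypothetical shape for the interface the comment proposes:
// consumers register callbacks and build their own indexes, so the library
// never has to cache the full objects.
type PodWatcher struct {
	onChange []func(Pod)
}

// OnChange registers a callback invoked for every pod event.
func (w *PodWatcher) OnChange(f func(Pod)) {
	w.onChange = append(w.onChange, f)
}

// Receive simulates an update arriving from the API server (or, per the
// comment, from another cluster's Istiod over XDS): callbacks run and
// nothing is cached by the watcher itself.
func (w *PodWatcher) Receive(p Pod) {
	for _, f := range w.onChange {
		f(p)
	}
}

func main() {
	w := &PodWatcher{}
	// Consumer-owned index over only the fields it needs.
	byIP := map[string]string{}
	w.OnChange(func(p Pod) { byIP[p.IP] = p.Name })
	w.Receive(Pod{Name: "app-1", IP: "10.0.0.5"})
	fmt.Println(byIP["10.0.0.5"]) // app-1
}
```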
Generally agree, but FWIW we don't actually keep the *full* object in memory. We strip fields we know we won't use before persisting them.
You mean client.Cached has fields stripped? Not sure I've seen the code doing this - or how it would work if another user of the informer needs the fields.
other changes LGTM
Agree, an interface name should describe what it does - not how it is implemented. But Cached is in the same category as 'wrapper', factory, and builder...

K8SStore, ConfigStore, K8SConfigs, etc. may be more descriptive. As I mentioned earlier, it may be good long term to have an interface that abstracts all the configs (well, we have a few...) and applies identically to the API server (List/Watch) and XDS (subscribe) - both behave the same despite the implementation difference: get a collection of configs and a notification on each change. From the client side it shouldn't matter much how this is implemented - or even whether the configs are cached internally or not.
I agree. That was my intent in suggesting a name change - especially since Cached does not go well with the create/update/delete calls. I also agree Wrapped has the same problem. K8SConfigs/K8sResources also seem like good alternatives.
```go
@@ -157,6 +157,9 @@ type Client interface {

	// Shutdown closes all informers and waits for them to terminate
	Shutdown()

	// ClusterID returns the cluster this client is connected to
	ClusterID() cluster.ID
```
Why do we need to know the clusterID from the k8s client? We can already get it from the controller.
This is used for SetWatchErrorHandler, which is now automatically handled by the client. That removes the need for each controller to implement SetWatchErrorHandler, which involves a lot of boilerplate and complexity. Now we don't have to deal with that at all (or with HasStarted). So we only need it in one centralized place instead of in every single controller.
pkg/kube/client/cached.go (outdated):

```go
@@ -0,0 +1,256 @@
// Copyright Istio Authors
```
At first glance, this seems like kube controller-runtime, which does implement a cached client for all kinds of resources.
sigs.k8s.io/controller-runtime/pkg/client/interfaces.go
Yes, it is similar. However, controller-runtime has a (very large) number of issues that make it unsuitable for use in Istio (I know we use it in the operator, but that has a much lower bar and different requirements). https://github.com/istio/istio/blob/bdc51cb8e13c94c545d73cd406d2914d6c301d06/pilot/pkg/config/kube/gateway/deploymentcontroller.go#L60-L66 describes a few.

So this is inspired a bit by that library, but takes Istio's needs into account; for example, we embed our StripUnusedFields and SetWatchErrorHandler logic here.
Yes, that lib is too abstract and does not consider performance at all. For every resource that is fetched, it starts up an informer - many, many side effects. Will have a careful look at the new one here today.
SGTM. The approach here will have no performance impact; it's just a small wrapper around what we already did, for consistency and ease of use.
pkg/kube/client/untyped.go (line 38 at efcc80f):
It's a mechanism in the client-go informer. It does NOT work if another user needs the fields. Given we have shared informers, this relies on global mutual agreement. The same is the case with filters (label selector, etc.). Not ideal, but it works...

I am going to add some logic to detect if two callers register informers with conflicting settings. Today, this is ~impossible; this PR is a prerequisite for doing that type of check.
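The mechanism being referenced is an informer-level transform hook that rewrites objects before they enter the shared cache, which is exactly why it relies on global agreement: every consumer of the shared store sees the stripped object. A self-contained sketch with simplified stand-in types (not the real client-go API):

```go
package main

import "fmt"

// Pod is a toy resource; ManagedFields stands in for the large metadata
// blobs Istio never reads.
type Pod struct {
	Name          string
	IP            string
	ManagedFields string
}

// TransformFunc rewrites an object before it is cached, mirroring the
// transform hook client-go informers expose.
type TransformFunc func(Pod) Pod

// store is a toy shared informer cache with a transform applied on insert.
type store struct {
	transform TransformFunc
	objs      map[string]Pod
}

// Add applies the transform before caching. Every consumer of this shared
// store sees the stripped object - hence the need for mutual agreement on
// which fields are safe to drop.
func (s *store) Add(p Pod) {
	if s.transform != nil {
		p = s.transform(p)
	}
	s.objs[p.Name] = p
}

func main() {
	s := &store{
		transform: func(p Pod) Pod { p.ManagedFields = ""; return p },
		objs:      map[string]Pod{},
	}
	s.Add(Pod{Name: "web", IP: "10.0.0.5", ManagedFields: "huge blob"})
	fmt.Println(s.objs["web"].ManagedFields == "") // true
}
```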
Ok I renamed to just
That was my understanding as well - I would suggest not even allowing two callers to register the filters, but having a single place that registers filtering for all watched objects. As I mentioned earlier, I think the best abstraction here would be close to what we do for xDS watching, hiding far more of the k8s internals. But we can do that later as well; not a blocking comment.
```go
// See the License for the specific language governing permissions and
// limitations under the License.

package kclient
```
The name is a little confusing alongside pkg/kube/client.go. I think this is more of a controller framework, which wraps a kube client and registers event handlers.
I am open to suggestions for alternative names, but this is my 4th name for it, so I am not going to be able to come up with better ones myself.
My bad, I don't have a good idea either.
Ok, we can easily rename later if we come up with a better one. I'll think on it
It's a clear improvement, and I see no reason not to make it - but I still believe a more abstract interface that is not tied to K8S and also works with XDS would be better long term.
Also, this would probably be better as a standalone project that we just use - I am more and more uncomfortable with the amount of NIH in Istio libraries. If this is a better way to write k8s controllers, it would be better to have it in an istio/controllers repo that all other k8s users could use, without so many deps on Istio, or as contributions made upstream. If someone wanted to use this better interface today, they would have to take a go.mod dep on the entirety of Istio.
Agree on all fronts; this is a step in the right direction, not necessarily the long-term end point.
lgtm
@howardjohn a meta point not directly related to this PR... the kube utilities appear to be scattered around the codebase and it's getting hard to understand how it all ties together. It would be good to consolidate them under a single top-level package, with clearer organization of the nuts and bolts. I suspect having them all together might also be a forcing function for better naming.
Ack, I agree. Hoping to improve this further while trying to split things up - this PR is already split out of a bigger PR to keep the diff small (ish - it's still huge).
/retest
This is split out from #43725. It contains the vast majority of that PR but skips the upgrade to client-go v0.27, which is blocked by a controller-runtime PR. Until then, some of the syncing logic is missing (that is, we use the old syncing - not a regression). Once the dependencies are ready (well before Istio 1.18), I will land that PR, which will then be just a ~20-line diff for syncing properly.
Reviewers guide:
This is a big PR. I suggest reviewing in order: