Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kube: wrapper around informer #43870

Merged

Conversation

howardjohn
Copy link
Member

@howardjohn howardjohn commented Mar 9, 2023

This PR introduces a new abstraction around a Kubernetes Informer, Lister, and Client.

Benefits:

  • Automatic handling of HasSynced for Handlers. Normally, HasSynced only covers the store - handlers are not guaranteed to be called. Kubernetes recently added a way to check for informers syncing, but using it is verbose.
  • Easier use - Lister Get and List can not actually fail but return errors; this removes those to make code simpler. Also use of the underlying Indexer or Store or Controller is not required, which are even harder to use correctly.
  • Better typing - less use of any requiring casting
  • less verbose - For example, s.client.NetworkingV1().Ingresses(currIng.Namespace).UpdateStatus(context.TODO(), currIng, metav1.UpdateOptions{}) becomes s.ingresses.UpdateStatus(currIng)
  • Common type for everything - this allows us to implement higher level abstractions like CreateOrUpdate, a test wrapper (t.Fatal's any errors), etc
  • Easier to properly remove handlers from informers, which is very desirable for some cases.
  • Easier ability to create an informer on LabelSelector or FieldSelector, which before was like 15 lines just to create one
  • Easier to use the DiscoveryNamespaceFilter, and usage uses the same type as un-filtered making interop easier
  • Handle SetWatchErrorHandler from centralized location, removing need to handle it on every controller and removing HasStopped
  • Introduced an example that is a fully commented example of writing a proper controller in Istio

This is split out from #43725. This has the vast majority of this PR, but skips an upgrade to client-go v0.27, which is blocked by a controller-runtime PR. Until then, some of the syncing logic is missed (that is, we use the old syncing -- not regressing). Once the dependencies are ready (well before Istio 1.18), I will land that PR (which will then just be a ~20 line diff for syncing properly).


Reviewers guide:

This is a big PR. I suggest reviewing in order:

  • Example controller, showing a canonical controller written with this new usage in mind: pkg/kube/controllers/example_test.go
  • pkg/kube/client - this is the core of the implementation
  • Rest of the changes are refactoring the codebase to use it.

This PR introduces a new abstraction around a Kubernetes Informer, Lister, and Client.

Benefits:
* Automatic handling of HasSynced for Handlers. Normally, HasSynced only covers the *store* - handlers are not guaranteed to be called. Kubernetes recently added a way to check for informers syncing, but using it is verbose.
* Easier use - Lister Get and List can not actually fail but return errors; this removes those to make code simpler. Also use of the underlying Indexer or Store or Controller is not required, which are even harder to use correctly.
* Better typing - less use of `any` requiring casting
* less verbose - For example, `s.client.NetworkingV1().Ingresses(currIng.Namespace).UpdateStatus(context.TODO(), currIng, metav1.UpdateOptions{})` becomes `s.ingresses.UpdateStatus(currIng)`
* Common type for everything - this allows us to implement higher level abstractions like CreateOrUpdate, a test wrapper (t.Fatal's any errors), etc
* Easier to properly remove handlers from informers, which is very desirable for some cases.
* Easier ability to create an informer on LabelSelector or FieldSelector, which before was like 15 lines just to create one
* Easier to use the DiscoveryNamespaceFilter, and usage uses the same type as un-filtered making interop easier
* Handle SetWatchErrorHandler from centralized location, removing need to handle it on every controller and removing HasStopped
* Introduced an example that is a fully commented example of writing a proper controller in Istio
@howardjohn howardjohn added the release-notes-none Indicates a PR that does not require release notes. label Mar 9, 2023
@howardjohn howardjohn requested review from a team as code owners March 9, 2023 21:36
@istio-testing istio-testing added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 9, 2023
"istio.io/pkg/log"
)

// Erased is a Cached with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It is not clear what does Erased do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's basically a non -generic type. Used in places where we don't know the types. The names I am very open to changing, I don't love them

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also looks like I forgot to finish the comment here..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. you mean some thing similar to type erased? I agree naming is little hard here. Now I understand the diff between this and cached. I will think about naming a bit. I will review in detail tomorrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could also just make it a Cached[controllers.Object] if its more clear... will probably do that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed to Untyped and just made it an alias for Cached[controllers.Object]

Copy link
Contributor

@ramaraochavali ramaraochavali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor nits on the new interfaces. Will take a look at rest of the code

ShutdownHandlers()
}

type Cached[T controllers.Object] interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about naming this Wrapped/Wrapper?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the name wrapped since it's not descriptive and exposes implementation details. Users don't need to know it's wrapping something, they just need to know the functionality. An informer itself is also a wrapper of a bunch of things wrapping others


// Cached wraps a Kubernetes client providing cached read access and direct write access.
// This is based on informers, so most of the same caveats to informers apply here.
type CachedRead[T controllers.Object] interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly ReadWrapper?


// NewCached returns a Cached for the given type.
// Internally, this uses a shared informer, so calling this multiple times will share the same internals.
func NewCached[T controllers.Object](c kube.Client) Cached[T] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewWrapped/NewWrapper?

// Warning: currently, if filter.LabelSelector or filter.FieldSelector are set, the same informer will still be used
// This means there must only be one filter configuration for a given type using the same kube.Client (generally, this means the whole program).
// Use with caution.
func NewCachedFiltered[T controllers.Object](c kube.Client, filter Filter) Cached[T] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewFilteredWrapper?

@costinm
Copy link
Contributor

costinm commented Mar 11, 2023

PR looks good - this comment is more about future changes and the limits of the current design.

We keep in memory all Pod, Endpoint, Service and a dozen of other objects. I suspect in most clusters this is the bulk of the data K8S maintain ( yes, there are many CRDs we don't watch - but most people create pods/services/etc). I don't know if the APIserver keeps all CRs in memory ( has a full cache ) or relies on etcd/sqlite/etc and only loads and sends, acting as a wrapper not a full memory cache. But it doesn't feel right to keep a large % of the dataset in memory - not only for one cluster, but for ALL clusters.

We actually only use a few fields from Pod/Endpoint/etc - at some point it may make sense to drop the cached CRs, and instead compute our indexes on the list(Pod) ( calling all Istio components that need pods ), and also call all 'onChange' callbacks, but without caching the full object.

I would go one step further - while in small meshes watching ALL other API servers and keeping in RAM not only most data of one, but of ALL the clusters is ok, for very large meshes this may not scale well. We still have the XDS federation code in Istiod and expose the subset of info we need - maybe watching Istiod in other clusters for the processed subset of data we need may scale better ?

Most of this is not relevant for the short term - but in terms of API and best practices I don't think this PR is ideal ( still better than what we have, so +1 on merging it). Hiding the 'pods' cache and providing a callback onChange(pod) and onList(pods) or even onListChunk(pods) would allow dropping the cache at some point and scale better. Also would work nicely with getting data from XDS.

@howardjohn
Copy link
Member Author

Generally agree but fwiw we don't actually keep the full object in memory. We strip fields we know we won't use before persisting them.

@costinm
Copy link
Contributor

costinm commented Mar 11, 2023 via email

@ramaraochavali
Copy link
Contributor

other changes LGTM

@costinm
Copy link
Contributor

costinm commented Mar 12, 2023 via email

@ramaraochavali
Copy link
Contributor

Agree, interface name should describe what it does - not how it is
implemented. But cached is in the same category with 'wrapper', factory and
builder...

I agree. That was my intent of suggesting name change for it - especially cached does not seem to go well with create/update/delete calls. I also agree wrapped is also like that. May be K8SConfigs/K8sResources also seems like good alternative

@@ -157,6 +157,9 @@ type Client interface {

// Shutdown closes all informers and waits for them to terminate
Shutdown()

// ClusterID returns the cluster this client is connected to
ClusterID() cluster.ID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to know clusterID from k8s client, we can get it from controller already

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used for SetWatchErrorHandler which is now automatically handled by the client, removing the need for each controller to implement SetWatchErrorHandler which has a lot of boilerplate + complexity. Now we don't have to deal with that at all (or HasStarted)

So we only need it at one centralized place instead of in every single controller

@@ -0,0 +1,256 @@
// Copyright Istio Authors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the first glance, it seems like kube controller-runtime, which does implement a cached client for all kinds of resources

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sigs.k8s.io/controller-runtime/pkg/client/interfaces.go

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is similar. However, controller-runtime has a (very large) number of issues that make it unsuitable for usage in Istio (I know we use it in operator, but that has a much lower bar and different requirements). https://github.com/istio/istio/blob/bdc51cb8e13c94c545d73cd406d2914d6c301d06/pilot/pkg/config/kube/gateway/deploymentcontroller.go#L60-L66 describes a few.

So this is inspired a bit by that library, but taking Istio needs into account. for example, we embedd our StripUnusedFields and SetWatchErrorHandler logic here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that lib is too abstract, and does not consider perf at all. For every resource that is fetched it start up a informer, many many side-effects. WIll have a careful look with the new one here today

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. The approach here will have no performance impact, it's just a small wrapper around what we already did for consistency + ease of use

@howardjohn
Copy link
Member Author

You mean client.Cached has fields stripped ? Not sure I've seen the code doing this - or understand how it would work if another user of informer needs the fields.

_ = inf.SetTransform(kube.StripUnusedFields)

Its a mechanism in client-go informer. It does NOT work if another user needs the fields. Given we have shared informers, this relies on global mutual agreement. The same is the case with filters (label selector, etc). Not ideal, but works...

I am going to add some logic to detect if 2 callers register informers with conflicting settings. Today, this is ~impossible; this PR is a prereq for doing that type of check.

@howardjohn
Copy link
Member Author

Ok I renamed to just Client which is hopefully less confusing. I also added a check for conflicting filters.

@costinm
Copy link
Contributor

costinm commented Mar 14, 2023 via email

// See the License for the specific language governing permissions and
// limitations under the License.

package kclient
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is a little confusing with pkg/kube/client.go

I think it is more of a controller framework, which wraps a kube client and register event handler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am open to suggestions about alternative names but this is my 4th name for it so I am not going to be able to come up with better ones myself

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, i donot have a good idea either

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, we can easily rename later if we come up with a better one. I'll think on it

Copy link
Contributor

@costinm costinm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a clear improvement, I see no reason to not make it - but I still believe a more abstract interface that is not tied to K8S and also works with XDS would be better long term.

Also probably would be better as a standalone project that we just use - I am more and more uncomfortable with the amount of NIH in Istio libraries. If this is a better way to write k8s controllers - it would be better to have it in an istio/controllers repo that all other k8s users could use and without so many deps to Istio, or contributions made to upsream. If someone wanted to use this better interface - they would take a go.mod dep on the entire Istio

@howardjohn
Copy link
Member Author

It's a clear improvement, I see no reason to not make it - but I still believe a more abstract interface that is not tied to K8S and also works with XDS would be better long term.

Also probably would be better as a standalone project that we just use - I am more and more uncomfortable with the amount of NIH in Istio libraries. If this is a better way to write k8s controllers - it would be better to have it in an istio/controllers repo that all other k8s users could use and without so many deps to Istio, or contributions made to upsream. If someone wanted to use this better interface - they would take a go.mod dep on the entire Istio

Agree on all fronts, this is a step in the right direction, not necessarily long term end point

Copy link
Member

@hanxiaop hanxiaop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@nmittler
Copy link
Contributor

@howardjohn a meta point not directly related to this PR ... the kube utilities appear to be scattered around the codebase and it's getting hard to understand how it all ties together. It would be good to consolidate them under a single top-level package, with more clear organization of the nuts and bolts. I suspect having them all together might also be a forcing function for better naming.

@howardjohn
Copy link
Member Author

@howardjohn a meta point not directly related to this PR ... the kube utilities appear to be scattered around the codebase and it's getting hard to understand how it all ties together. It would be good to consolidate them under a single top-level package, with more clear organization of the nuts and bolts. I suspect having them all together might also be a forcing function for better naming.

Ack, I agree. Hoping to improve this further, trying to split things up though - this PR is already split out of a bigger PR to keep the diff small (ish - its still huge)

@howardjohn
Copy link
Member Author

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes-none Indicates a PR that does not require release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants