sched: dynamic event handlers registration #101394
Conversation
@Huang-Wei: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
```diff
-	// MaxPDVolumeCountPredicate: since it relies on the counts of PV.
-	informerFactory.Core().V1().PersistentVolumes().Informer().AddEventHandler(
-		cache.ResourceEventHandlerFuncs{
-			AddFunc:    sched.onPvAdd,
-			UpdateFunc: sched.onPvUpdate,
-		},
-	)
-
-	// This is for MaxPDVolumeCountPredicate: add/update PVC will affect counts of PV when it is bound.
-	informerFactory.Core().V1().PersistentVolumeClaims().Informer().AddEventHandler(
-		cache.ResourceEventHandlerFuncs{
-			AddFunc:    sched.onPvcAdd,
-			UpdateFunc: sched.onPvcUpdate,
-		},
-	)
-
-	// This is for ServiceAffinity: affected by the selector of the service is updated.
-	// Also, if new service is added, equivalence cache will also become invalid since
-	// existing pods may be "captured" by this service and change this predicate result.
-	informerFactory.Core().V1().Services().Informer().AddEventHandler(
-		cache.ResourceEventHandlerFuncs{
-			AddFunc:    sched.onServiceAdd,
-			UpdateFunc: sched.onServiceUpdate,
-			DeleteFunc: sched.onServiceDelete,
-		},
-	)
-
-	informerFactory.Storage().V1().StorageClasses().Informer().AddEventHandler(
-		cache.ResourceEventHandlerFuncs{
-			AddFunc: sched.onStorageClassAdd,
-		},
-	)
+	case framework.CSINode:
+		informerFactory.Storage().V1().CSINodes().Informer().AddEventHandler(
+			buildEvtResHandler(at, framework.CSINode, "CSINode"),
+		)
+	case framework.PersistentVolume:
+		// MaxPDVolumeCountPredicate: since it relies on the counts of PV.
+		//
+		// PvAdd: Pods created when there are no PVs available will be stuck in
+		// unschedulable queue. But unbound PVs created for static provisioning and
+		// delay binding storage class are skipped in PV controller dynamic
+		// provisioning and binding process, will not trigger events to schedule pod
+		// again. So we need to move pods to active queue on PV add for this
+		// scenario.
+		//
+		// PvUpdate: Scheduler.bindVolumesWorker may fail to update assumed pod volume
+		// bindings due to conflicts if PVs are updated by PV controller or other
+		// parties, then scheduler will add pod back to unschedulable queue. We
+		// need to move pods to active queue on PV update for this scenario.
+		informerFactory.Core().V1().PersistentVolumes().Informer().AddEventHandler(
+			buildEvtResHandler(at, framework.PersistentVolume, "Pv"),
+		)
+	case framework.PersistentVolumeClaim:
+		// MaxPDVolumeCountPredicate: add/update PVC will affect counts of PV when it is bound.
+		informerFactory.Core().V1().PersistentVolumeClaims().Informer().AddEventHandler(
+			buildEvtResHandler(at, framework.PersistentVolumeClaim, "Pvc"),
+		)
+	case framework.StorageClass:
+		if at&framework.Add != 0 {
+			informerFactory.Storage().V1().StorageClasses().Informer().AddEventHandler(
+				cache.ResourceEventHandlerFuncs{
+					AddFunc: sched.onStorageClassAdd,
+				},
+			)
+		}
```
do we really think that a user will configure a scheduler without registering the volume-based plugins? I feel that those should be treated the same as pod and node.
> do we really think that a user will configure a scheduler without registering the volume-based plugins?

Nope, I think volume-based plugins are always needed. The design intention is that they don't interact with the internal cache (while pod and node do), so they fit the switch cases naturally and can leverage buildEvtResHandler to generalize their event handler instantiation.
We could have the buildEvtResHandler function moved outside so we can still reuse it. The way I am interpreting this switch statement is as deciding whether or not those events are necessary to register. If we are saying the storage-based ones should always be registered, then I think it is better to treat them like pod/node.
(missed this comment earlier)

Another benefit is that we can generalize the registration of add/update/delete events dynamically. For example, the storage-related registrations are currently hard-coded: some register add/update, some register only add, etc. But if we leverage the unioned cluster events, we can register the event handlers dynamically, which will be resilient to further code changes such as adding/removing events. WDYT?
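For readers skimming the thread, here is a minimal sketch of the buildEvtResHandler idea being discussed (the standalone function and its moveFn parameter are illustrative stand-ins; in this PR the helper is a closure inside addAllEventHandlers and calls into the scheduling queue directly):

```go
package scheduler

import (
    "k8s.io/client-go/tools/cache"
    "k8s.io/kubernetes/pkg/scheduler/framework"
)

// buildEvtResHandlerSketch wires up only the add/update/delete callbacks whose
// bits are set in `at` (the union of action types that registered plugins care
// about for this resource). Each callback reports a ClusterEvent via moveFn,
// which stands in for moving unschedulable Pods back to the active/backoff queue.
func buildEvtResHandlerSketch(at framework.ActionType, gvk framework.GVK, shortGVK string,
    moveFn func(framework.ClusterEvent)) cache.ResourceEventHandlerFuncs {
    funcs := cache.ResourceEventHandlerFuncs{}
    if at&framework.Add != 0 {
        evt := framework.ClusterEvent{Resource: gvk, ActionType: framework.Add, Label: shortGVK + "Add"}
        funcs.AddFunc = func(_ interface{}) { moveFn(evt) }
    }
    if at&framework.Update != 0 {
        evt := framework.ClusterEvent{Resource: gvk, ActionType: framework.Update, Label: shortGVK + "Update"}
        funcs.UpdateFunc = func(_, _ interface{}) { moveFn(evt) }
    }
    if at&framework.Delete != 0 {
        evt := framework.ClusterEvent{Resource: gvk, ActionType: framework.Delete, Label: shortGVK + "Delete"}
        funcs.DeleteFunc = func(_ interface{}) { moveFn(evt) }
    }
    return funcs
}
```

With a helper like this, registration for any resource collapses to a single AddEventHandler call, and changing which action types a plugin cares about does not require touching the registration code.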
```go
// options.kubeConfig can be nil in tests.
if options.kubeConfig != nil {
    dynClient := dynamic.NewForConfigOrDie(options.kubeConfig)
    dynInformerFactory = dynamicinformer.NewFilteredDynamicSharedInformerFactory(dynClient, 0, v1.NamespaceAll, nil)
}
```
is there any overhead (specifically memory and traffic between the api-server and scheduler) to instantiating this informer? if so, we should have an explicit enable flag in component config that is false by default.
In terms of the default scheduler, there is no plugin registering CR events, so IMO instantiating a dynInformerFactory won't do any concrete work.
so the dynamic informer cache is populated only if an event is registered?
yes, just like coreInformerFactory, by default the embedded informers are empty.
hmm, so someone needs to initiate an informer then, so that makes it even more necessary to pass this factory via framework handler to the plugins that will use it, right?
Yes, that's your point at #101394 (comment). For now, plugin developers instantiate typed informers and register event handlers accordingly.
It is related, but might be different. If you remember, an informer is initiated only if it is explicitly accessed before starting the informer. Is this the case here? (I don't have experience with dynamic informers, so asking just to make sure.)
> If you remember, an informer is initiated only if it is explicitly accessed before starting the informer.

Yes, it's similar here. For the dynamic informer factory, an informer for CR xyz is only initiated when we fetch the GVR of xyz and call its Informer() function. In this PR, that happens in the default case - i.e., it's only triggered when a custom plugin implements EventsToRegister() to show its interest in CR xyz.

However, the framework doesn't really care about the cache for CRs - the out-of-tree plugins are responsible for building their own cache, if needed. The hook set up here is fairly simple: it just guarantees that upon certain CR events, the relevant unschedulable Pods can be moved properly.

(BTW: I've rethought the necessity of offering the dynInformerFactory in the framework handle. See my other comment #101394 (comment).)
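To make that laziness concrete, here is a rough sketch of the default-case flow (the function name and the nil check are illustrative; the real logic lives in addAllEventHandlers):

```go
package scheduler

import (
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic/dynamicinformer"
    "k8s.io/client-go/tools/cache"
    "k8s.io/kubernetes/pkg/scheduler/framework"
)

// registerCREventHandlerSketch shows what happens for a GVK that is not a core
// API resource: the "<resource>.<version>.<group>" string a plugin declared via
// EventsToRegister() is parsed into a GroupVersionResource, and only then is an
// informer requested from the dynamic shared informer factory. GVKs that never
// reach this point never get an informer, so an unused dynInformerFactory stays empty.
func registerCREventHandlerSketch(dynInformerFactory dynamicinformer.DynamicSharedInformerFactory,
    gvk framework.GVK, handler cache.ResourceEventHandler) {
    gvr, _ := schema.ParseResourceArg(string(gvk))
    if gvr == nil {
        // Not a fully qualified "<resource>.<version>.<group>" string; skip it.
        return
    }
    dynInformerFactory.ForResource(*gvr).Informer().AddEventHandler(handler)
}
```

The underlying list/watch only starts once the factory is started, so the cost scales with the CR GVKs that plugins actually register, not with merely constructing the factory.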
```
@@ -281,10 +286,33 @@ func New(client clientset.Interface,
    sched.StopEverything = stopEverything
    sched.client = client

    addAllEventHandlers(sched, informerFactory)
    // Build dynamic client and dynamic informer factory
    var dynInformerFactory dynamicinformer.DynamicSharedInformerFactory
```
If plugins want to access CRD resources, it is better if they do so through the same informer factory used for registering the events, so we may want to plumb this to the framework handler just like we did for the shared informer factory.
Good point. I need to think about it. I'm a bit concerned that it would add an extra burden for plugin developers, such as fetching the GVR before adding event handlers, etc. I probably need to compare it with how controller-runtime tackles similar problems, and come up with a better abstraction.
I've rethought it a bit. It's better not to force out-of-tree plugins to use dynamic informers, as typed ones are more applicable and easier to manipulate, in both core logic and tests.
We are not forcing them to use it, but I think it is useful to make it available to them since the cache is already populated for the types they registered events for. If they create their own informer for the same type, aren't we wasting memory because we will have two informer factory caches?
> but I think it is useful to make it available to them since the cache is already populated for the types they registered events for

I'm not quite sure the plugin developers would really use it, as typed informers and the other generated code are much easier to manipulate and more performant.

> If they create their own informer for the same type, aren't we wasting memory because we will have two informerFactory caches?

Yes, this is true. But AFAIK, using the dynamic client/informer all the way down can impose overhead, as it stores objects as interfaces under the hood, which causes a lot of reflection operations.
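For concreteness, this is roughly the per-object conversion a plugin would end up doing if it consumed PersistentVolumes from a dynamic informer rather than a typed one (a sketch; the package and function names are made up):

```go
package fooplugin

import (
    "fmt"

    v1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime"
)

// asTypedPV converts an object delivered by a dynamic informer, which arrives
// as *unstructured.Unstructured (a map[string]interface{} under the hood), into
// a typed *v1.PersistentVolume via a reflection-based converter. A typed
// informer delivers *v1.PersistentVolume directly and skips this step.
func asTypedPV(obj interface{}) (*v1.PersistentVolume, error) {
    u, ok := obj.(*unstructured.Unstructured)
    if !ok {
        return nil, fmt.Errorf("expected *unstructured.Unstructured, got %T", obj)
    }
    pv := &v1.PersistentVolume{}
    if err := runtime.DefaultUnstructuredConverter.FromUnstructured(u.Object, pv); err != nil {
        return nil, err
    }
    return pv, nil
}
```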
@ahg-g any further comments on this PR?

/lgtm for squash
- move clusterEventMap to Configurator
- dynamic event handlers registration for core API resources
- dynamic event handlers registration for custom resources
@ahg-g Commits squashed.

@dims Could you take a look at the vendor changes, or cascade to the correct reviewer? Thanks.
/approve the
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, Huang-Wei

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
/retest

/lgtm

/hold cancel

/retest
What type of PR is this?
/kind feature
/sig scheduling
What this PR does / why we need it:
The Kubernetes scheduler now registers event handlers dynamically, rather than hard-coding the event registration logic. Please see more details in the Design Doc.
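As a quick sketch of what this looks like from a plugin author's perspective (the plugin type and the custom-resource name below are made up; EventsToRegister() is the hook discussed in the review thread above):

```go
package fooplugin

import "k8s.io/kubernetes/pkg/scheduler/framework"

// FooPlugin is a hypothetical out-of-tree plugin.
type FooPlugin struct{}

// EventsToRegister declares which cluster events may make Pods rejected by this
// plugin schedulable again. The scheduler takes the union of these declarations
// across all enabled plugins and registers informer event handlers accordingly,
// including dynamic informers for custom resources.
func (pl *FooPlugin) EventsToRegister() []framework.ClusterEvent {
    return []framework.ClusterEvent{
        {Resource: framework.Pod, ActionType: framework.Add | framework.Update},
        // A custom resource, expressed as "<resource>.<version>.<group>".
        {Resource: framework.GVK("foos.v1.example.com"), ActionType: framework.Add},
    }
}
```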
This PR consists of 3 parts:

- move clusterEventMap to Configurator
- dynamic event handlers registration for core API resources (the integration test TestServiceAffinityEnqueue can ensure the test coverage)
- dynamic event handlers registration for custom resources (an integration test TestCustomResourceEnqueue is added to ensure the test coverage)

Which issue(s) this PR fixes:
Part of #100347.
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: