-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creating selectors is expensive in EndpointsController #73527
Comments
IIUC, It sounds like a simple short-term fix can be adding an in-memory cache of selector per service. That would avoid recreating the selector every time. This would be a good PR for new contributor. If anyone wants to send an PR, please assign it to me. I am willing to shepherd it. |
Yeah probably. Though it obviously complicates the code, so my main goal was to ensure that the new API will not have this problem. |
@yashbhutwala - nice to meet you. But to describe in more details:
What @freehan is suggesting that, we may want to cache "service->selector" mapping inside endpoints controller and use that instead of recomputing that every single time. I'm a bit worried that it may complicate the code, but I think I may be too pesimistic here. So it's probably worth trying. So what you would need to do is:
|
Thanks @wojtek-t for laying out an outline for me! I'll shoot my best shot, and put up a PR, perhaps you can guide it along? P.S. what's your username in the slack group? [ignore, for my own reference; important links]: https://github.com/kubernetes/api/blob/master/apps/v1/types.go#L723 |
Hey @yashbhutwala , what's the status of this issue? Still work on that? |
@shinytang6 @wojtek-t @freehan I made a PR here: #75965. Review/feedback/suggestions appreciated! :) |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale |
@wojtek-t
How about passing a flag to AsSelectorPreValidated() which skips the sorting step ? |
AsSelectorPreValidated() is already a bit of a hack by iself (I know I'm the one who added it). And I'm bit reluctant of making it even more strange looking... |
Can SelectorFromValidatedSet skip calling sort ? |
It can't - it would e.g. break the assumption that you can compare them. |
/assign @robscott |
@bowei: Please ensure the request meets the requirements listed here. If this request no longer meets these requirements, the label can be removed In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@wojtek-t it seems selector-cache solution has to rewrite GetPodServices function, is it feasible to change so many code? |
Not necessary - see #73527 |
I meant #75965 |
@wojtek-t Excuse me, I have one concern about the service selector: should cache updated while service selector changed? |
yes |
When profiling scalability tests at large scale, it appears that the function
SelectorFromValidatedSet
can consume even 10% of cpu.In the profiles it's coming purely from the GetPodServices in EndpointsController.
The reason for that is for every single call of GetPodServices:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/endpoint/endpoints_controller.go#L170
for every single service in that namespace we build the selector from scratch:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/listers/core/v1/service_expansion.go#L49
Even though a single such operation is not very expensive, the amount of that is resulting in significant percentage of overall cpu usage.
Longer-term solution is to do exactly what was done e.g. for ReplicaSet:
https://github.com/kubernetes/api/blob/master/apps/v1/types.go#L723
That would mean that store-in-memory services would have the selector already created, so it would eliminate that completely.
I'm wondering if there is something we can do here more short-term.
@kubernetes/sig-network-misc @freehan
@kubernetes/sig-scalability-misc
The text was updated successfully, but these errors were encountered: