Controller-manager sees higher mem-usage when load test runs before density #61041
It hopefully shouldn't be the endpoint controller. Note that we don't start the second test before all namespaces from the previous one are deleted. That means the endpoint controller wouldn't be able to update endpoints objects because of the non-existing namespaces.
I looked into the apiserver logs and can't find any calls that mention 'load' after the first call that mentions 'density'. This could mean that the memory usage is coming from watches (though not those created by the e2e test, as IIUC those are closed after the load test finishes). So maybe watches from kube-proxies or kubelets?
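One way to check could be to dump the apiserver's own watcher metrics right after the load-test namespaces are gone and see who is still holding watches. A minimal sketch, assuming the /metrics endpoint is reachable through a local `kubectl proxy` on 127.0.0.1:8001 and that the metric is still named `apiserver_registered_watchers` (both are assumptions about the setup, not something this job already does):

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// Dump per-resource watcher counts from the apiserver metrics endpoint.
// Assumes `kubectl proxy` is running locally; adjust the URL for your setup.
func main() {
	resp, err := http.Get("http://127.0.0.1:8001/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		// Metric name is an assumption; grep the raw output if it differs.
		if strings.HasPrefix(line, "apiserver_registered_watchers") {
			fmt.Println(line)
		}
	}
	if err := scanner.Err(); err != nil {
		panic(err)
	}
}
```

If the kube-proxy/kubelet related watcher counts look flat between a "load then density" run and a "density only" run, that would point away from watches.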
I have one idea to check whether it is something around services: let's disable them in our load test for this job and see.
That's true. But IIUC it's still possible that the endpoints-controller is using memory, e.g. to process watch events for endpoints updates (coming from the load test's deletion phase)?
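If we wanted a rough number for that, a throwaway watcher like the sketch below could be left running during the deletion phase to count the Endpoints update/delete events the endpoints-controller would have to process. This is plain client-go against the local kubeconfig and is purely illustrative; it's not how the controller itself is wired up:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default local kubeconfig (path is an assumption).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Watch Endpoints in all namespaces and count update/delete events,
	// roughly the stream the endpoints-controller has to chew through
	// while the load test's namespaces are being torn down.
	factory := informers.NewSharedInformerFactory(client, 0)
	informer := factory.Core().V1().Endpoints().Informer()

	var updates, deletes int64
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) { atomic.AddInt64(&updates, 1) },
		DeleteFunc: func(obj interface{}) { atomic.AddInt64(&deletes, 1) },
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)

	// Print running totals every 30s.
	for {
		time.Sleep(30 * time.Second)
		fmt.Printf("endpoints events so far: %d updates, %d deletes\n",
			atomic.LoadInt64(&updates), atomic.LoadInt64(&deletes))
	}
}
```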
I don't think it's watch-related. I don't think we should spend too much time on it now.
Makes sense.
Sure... But this is taking very little time while I'm running the bisection for the pod-startup regression in the background :)
This needs more digging, but AFAIU it's not a regression; I think we've been seeing it for a while already.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
I accidentally turned off our load test in PR #60973. But thanks to that, I observed this pattern in our controller-manager memory usage during the density test:
You can see the jump after run 11479, when I re-enabled the load test. And all subsequent spikes are seen in runs where the density test was preceded by the load test. We were seeing similar issues in the past, but IIRC it was for kube-proxies. My feeling is this is related to the endpoints-controller processing a backlog, but I need to confirm.
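One cheap way to confirm could be to sample the controller-manager's resident memory across the whole load -> density window and line the samples up with the test phases. A rough sketch, assuming kube-controller-manager still serves /metrics on the master's insecure port 10252 and exposes the standard `process_resident_memory_bytes` gauge (port, reachability from where this runs, and metric name are all assumptions about this particular setup):

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// Periodically sample the controller-manager's resident memory so the
// samples can be correlated with the load and density test phases.
func main() {
	const metricsURL = "http://127.0.0.1:10252/metrics" // assumed insecure port on the master

	for {
		resp, err := http.Get(metricsURL)
		if err != nil {
			fmt.Println("scrape failed:", err)
		} else {
			scanner := bufio.NewScanner(resp.Body)
			for scanner.Scan() {
				line := scanner.Text()
				if strings.HasPrefix(line, "process_resident_memory_bytes") {
					fmt.Printf("%s %s\n", time.Now().Format(time.RFC3339), line)
				}
			}
			resp.Body.Close()
		}
		time.Sleep(time.Minute)
	}
}
```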
@wojtek-t - Is it something already observed in the past? Do we consider it WAI, or should we try to fix it?
@kubernetes/sig-scalability-bugs