Handle deleting external load balancers across kube-controller-manager restarts #15203
I would recommend getting the service controller out of kube-controller-manager and into a pod. We know how to handle shutdown for pods (terminationGracePeriod). Of course, there are things we need to do on startup to prevent leaking (or getting wedged) that the pod won't help with.
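For what it's worth, a minimal sketch of the "handle shutdown like a pod" idea: trap SIGTERM and give in-flight load balancer deletions up to the pod's grace period to finish. The finishPendingLBDeletions hook and the 30-second budget are illustrative assumptions, not actual kube-controller-manager code.

```go
package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// finishPendingLBDeletions stands in for whatever cleanup the service
// controller still owes the cloud provider; it is a hypothetical hook,
// not a real kube-controller-manager function.
func finishPendingLBDeletions(ctx context.Context) error {
	// ... drain the work queue of load balancers marked for deletion ...
	return nil
}

func main() {
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, syscall.SIGINT)

	// ... start the controller's normal sync loops here ...

	<-stop // the kubelet sends SIGTERM, then waits terminationGracePeriodSeconds

	// Budget the cleanup to stay inside the pod's grace period (assumed 30s here).
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := finishPendingLBDeletions(ctx); err != nil {
		log.Printf("load balancer cleanup incomplete at shutdown: %v", err)
	}
}
```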
We need a bit of a pow-wow on this. There's value in busting controllers out of kube-controller-manager. We should work up a plan for this across existing L3 balancers and upcoming L7. @markturansky @smarterclayton @saad-ali @davidopp Options: …
Nothing stops users/admins from doing arbitrarily many pods with control loops. @aronchick FYI
@a-robinson add #14443 to your list of potential resource leaks.
@a-robinson the not-yet-deleted load balancers are still implicitly listed in etcd, associated with the relevant services, right? So on startup, the service controller could find them there (via api-server) and repopulate its state? Or is it more complicated than that?
@quinton-hoole - no, the issue is for services that have been deleted via the API (or have been updated to remove their load balancer via the API). The service controller is making the changes asynchronously from the API updates, so unless it can access the old state of the services, it won't know that the service ever existed or had a load balancer, respectively. Services that still exist with their load balancer when the service controller restarts are handled fine if they're later deleted.
@a-robinson My apologies - I did not state my point clearly. Let me try again. Should a deleted service not remain in "deleting" state (in etcd) until it has been completely and successfully deleted (including all GCE resource deletion)? If so, it would remain in etcd, and the process I described above could work?
Ah, sorry for the misunderstanding. If we were to add a "deleting" state for service objects, and only the service controller could move them out of that state, then this would indeed be fixed (and that pattern would also work for other objects that need external resources to be cleaned up asynchronously).
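A toy sketch of that pattern, independent of the real API machinery: the object carries a finalizer-style marker that only the service controller removes once the cloud resources are confirmed gone, so a restarted controller can resume the work from the stored object. All types and names below are illustrative, not the actual service controller code.

```go
package main

import (
	"errors"
	"fmt"
)

const lbCleanupMarker = "example.io/load-balancer-cleanup" // illustrative key

// Service is a toy stand-in for the API object as it would sit in etcd.
type Service struct {
	Name       string
	Deleting   bool     // deletion requested via the API
	Finalizers []string // object stays visible until this is empty
}

// Cloud is a stand-in for the cloud provider's load balancer interface.
type Cloud interface {
	EnsureLoadBalancerDeleted(serviceName string) error
}

// reconcileDeletion holds the service in the "deleting" state until the
// external load balancer is really gone, and only then releases the marker so
// the object can disappear. It is safe to re-run after a controller restart.
func reconcileDeletion(c Cloud, svc *Service) error {
	if !svc.Deleting || !contains(svc.Finalizers, lbCleanupMarker) {
		return nil
	}
	if err := c.EnsureLoadBalancerDeleted(svc.Name); err != nil {
		return fmt.Errorf("%s stays in deleting state: %v", svc.Name, err)
	}
	svc.Finalizers = remove(svc.Finalizers, lbCleanupMarker)
	return nil
}

func contains(ss []string, s string) bool {
	for _, v := range ss {
		if v == s {
			return true
		}
	}
	return false
}

func remove(ss []string, s string) []string {
	out := make([]string, 0, len(ss))
	for _, v := range ss {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// flakyCloud fails once before succeeding, to show the retry behavior.
type flakyCloud struct{ calls int }

func (f *flakyCloud) EnsureLoadBalancerDeleted(string) error {
	f.calls++
	if f.calls == 1 {
		return errors.New("cloud API timeout")
	}
	return nil
}

func main() {
	svc := &Service{Name: "frontend", Deleting: true, Finalizers: []string{lbCleanupMarker}}
	cloud := &flakyCloud{}
	for i := 0; i < 2; i++ {
		if err := reconcileDeletion(cloud, svc); err != nil {
			fmt.Println("retrying later:", err)
		}
	}
	fmt.Println("remaining finalizers:", svc.Finalizers) // empty, so the delete can now complete
}
```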
The deleting state would be graceful deletion. You can implement that …
That said, another way to handle this is via resynchronization - I thought …
Volumes need to be reconciled with the provider: #14443
Maybe we should, like Docker containers, claim a prefix as "assumed for Kubernetes" …
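Tying the resynchronization idea above together with a claimed name prefix, a startup garbage-collection pass could look roughly like the sketch below. The prefix, interface, and function names are assumptions for illustration, not the real cloud provider API.

```go
package lbgc

import "strings"

// ownedPrefix marks resources this cluster is willing to consider "ours":
// anything in the provider whose name starts with it is assumed to have been
// created by Kubernetes, mirroring how Docker claims a container-name prefix.
// The prefix itself is made up for this sketch.
const ownedPrefix = "k8s-lb-"

// Provider is a toy view of the cloud provider.
type Provider interface {
	ListLoadBalancers() ([]string, error)
	DeleteLoadBalancer(name string) error
}

// gcOrphanedLoadBalancers deletes every load balancer that carries the claimed
// prefix but is no longer wanted by any live Service. Run on controller
// startup (and periodically), this finishes deletions that were in flight when
// kube-controller-manager restarted.
func gcOrphanedLoadBalancers(p Provider, wanted map[string]bool) error {
	names, err := p.ListLoadBalancers()
	if err != nil {
		return err
	}
	for _, name := range names {
		if !strings.HasPrefix(name, ownedPrefix) {
			continue // not ours; never touch it
		}
		if wanted[name] {
			continue // still backed by an existing Service
		}
		if err := p.DeleteLoadBalancer(name); err != nil {
			return err
		}
	}
	return nil
}
```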
This is not a 1.1 blocker, right?
@thockin I reversed the order in which I create things in our Provisioner. I now create the PV first (with a faux placeholder ID to pass validation). If that transaction succeeds, then we provision the volume in the provider. We can feed the PV's real name to the EBS/GCEPD/Cinder volume for desc/tags. Recon for volumes will be querying the provider for all volumes and using the PV's name to make sure it's in the cluster. Not sure how it should work for other resources, but this approach is working well for volumes, I think.
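A rough sketch of that ordering for any externally provisioned resource, with invented interface names (this is not the actual provisioner code): record intent in the API first, then create the cloud resource, so a crash in between leaves a record to reconcile against rather than an untracked cloud object.

```go
package provision

import "fmt"

// API and Provider are toy stand-ins for the apiserver and the cloud; none of
// these names come from the real provisioner code.
type API interface {
	CreateRecord(name, placeholderID string) error // e.g. a PV created with a faux ID
	UpdateRecordID(name, realID string) error
}

type Provider interface {
	// ProvisionVolume creates the cloud volume and labels it with the record
	// name so a later reconcile pass can match provider volumes to PVs.
	ProvisionVolume(recordName string) (id string, err error)
}

// provision records intent in the API before touching the cloud. If the
// process dies between the two steps, the leftover record (not an untracked
// cloud volume) is what a reconcile pass finds and completes or cleans up.
func provision(api API, cloud Provider, name string) error {
	if err := api.CreateRecord(name, "placeholder"); err != nil {
		return fmt.Errorf("refusing to create a cloud volume without a record: %v", err)
	}
	realID, err := cloud.ProvisionVolume(name)
	if err != nil {
		return err
	}
	return api.UpdateRecordID(name, realID)
}
```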
@davidopp Yup, not a blocker, IMO.
Yup, not a blocker. This issue is just following up on our conversation earlier this week.
Resolving this for the purpose of deflaking tests alone makes it worthy of P1 IMO. @a-robinson to what extent are leaking network resources still causing e2e test failures? If they're not, I'd suggest dropping priority to P2?
I honestly have no idea at this point. I haven't been watching them other than when I'm build copping since I've been doing so little OSS work lately.
OK, I'll:
cc @ixdy FYI, as you've looked into this a fair bit in the past. @kubernetes/goog-testing
This is marked as 1.2 milestone but it doesn't seem like there will be anything resembling closure on it for 1.2. Is there any piece of it that's critical for 1.2? If not, we can punt it to the next release, and/or work on it after the feature freeze. Suggestions?
It's not critical for 1.2, but it could certainly be considered a test flake issue given that it leads to static IP leaks in our e2e tests.
If we're restarting the controller manager outside of a daemon restart during a test run while it has a live load balancer, that would be a worse bug IMO.
This issue is full of good ideas but nobody had time to implement any of them, so I'm going to move this to the next-candidate milestone...
Can we at least tag the ELBs created by Kubernetes on AWS with the cluster name and service name, so that manual cleanup is possible?
Yeah, I realized what you meant right after making that comment.
@justinsb what do you think about the tagging suggestion? It seems like a good backstop in case #19879 doesn't get merged in time for 1.2.
@davidopp @guoshimin tagging of ELBs is a good request. We recently added some tagging; it should be easy to add any missing tagging. Tracking as #22118.
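To make the manual cleanup path concrete, here is a small sketch of the kind of tag set and orphan check that tagging enables. The KubernetesCluster key is the one discussed in the next comment; the service-name key is illustrative, and none of this is the AWS provider's actual implementation.

```go
package tagging

// buildELBTags assembles the tags a cluster could stamp on every load balancer
// it creates so that orphans are identifiable by hand.
func buildELBTags(clusterName, serviceNamespace, serviceName string) map[string]string {
	return map[string]string{
		"KubernetesCluster":          clusterName,
		"kubernetes.io/service-name": serviceNamespace + "/" + serviceName,
	}
}

// orphanedELBs picks out load balancers tagged as belonging to this cluster
// whose tagged service no longer exists, so an operator can delete them
// manually (for example with the AWS CLI).
func orphanedELBs(tagsByLB map[string]map[string]string, clusterName string, liveServices map[string]bool) []string {
	var orphans []string
	for lbName, tags := range tagsByLB {
		if tags["KubernetesCluster"] != clusterName {
			continue
		}
		if !liveServices[tags["kubernetes.io/service-name"]] {
			orphans = append(orphans, lbName)
		}
	}
	return orphans
}
```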
Oh ... and related: on AWS we already reserved some tags. I screwed up originally and called the primary tag "KubernetesCluster", but otherwise we're now claiming the whole "*.kubernetes.io/..." namespace for our own purposes. Nobody's complained yet, but then these are only tags (not names) and I haven't exactly been broadcasting the fact that you shouldn't allow your tags to collide. But on the other hand, I don't see people getting a collision accidentally.

The biggest problem on AWS is the opposite - people who install without using kube-up invariably forget these tags and find things don't work as they expect. On GCE, though, you have project isolation (that's what I am simulating with KubernetesCluster on AWS), so it doesn't seem unreasonable to grab a name prefix.

On the other hand, having a "Terminating" state would be more consistent with the pod lifecycle...
Ref #4630 |
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with an /lifecycle frozen comment. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Right now, the service controller has no sense of responsibility for the load balancer resources that exist in a cloud user's project. It only deletes things that it knows it owns, meaning that if it gets restarted while deleting a load balancer, it won't finish deleting the load balancer when it restarts (or if it's simply not running when a load balancer gets deleted, it won't even start deleting the load balancer).
Fixing this should reduce resource leaks, especially in any e2e tests that involve restarting (including upgrading) the master.
Cross-referencing various leak-related issues: #13515, #14597, #15083, #15084
@davidopp @quinton-hoole