
Handle deleting external load balancers across kube-controller-manager restarts #15203

Closed
a-robinson opened this issue Oct 7, 2015 · 34 comments
Labels
area/teardown lifecycle/rotten priority/important-soon sig/network

Comments

@a-robinson
Contributor

Right now, the service controller has no sense of responsibility for the load balancer resources that exist in a cloud user's project. It only deletes things that it knows it owns, meaning that if it gets restarted while deleting a load balancer, it won't finish the deletion after the restart (and if it's simply not running when a load balancer gets deleted, it won't even start deleting it).

Fixing this should reduce resource leaks, especially in any e2e tests that involve restarting (including upgrading) the master.
Cross-referencing various leak-related issues: #13515, #14597, #15083, #15084

@davidopp @quinton-hoole

@a-robinson a-robinson added the priority/important-soon and team/control-plane labels on Oct 7, 2015
@bprashanth
Contributor

I would recommend getting the service controller out of kube-controller-manager and into a pod. We know how to handle shutdown for pods (terminationGracePeriod). Of course, there are things we need to do on startup to prevent leaking (or getting wedged) that the pod won't help with.

@thockin
Member

thockin commented Oct 7, 2015

We need a bit of a pow-wow on this. There's value in busting controllers out into pods, but there are also costs. There's cost to putting them all in one monolith, but there's also value.

We should work up a plan for this across existing L3 balancers, upcoming L7 (provisioning, binding), and storage (provisioning, recycling, binding) - all of which have slightly different answers right now.

@markturansky @smarterclayton @saad-ali @davidopp

Options:

  1. put all control loops for supported clouds into the monolithic controller manager
  2. break each individual control loop out into a pod
  3. make a per-cloud-provider pod that monoliths all the control loops for that provider

Nothing stops users/admins from running arbitrarily many pods with control loops, but we need to decide how many we want to have in a default installation on (for example, but not limited to) GCE. We already take flak for "too many components".

@aronchick FYI


@markturansky
Contributor

@a-robinson add #14443 to your list of potential resource leaks.

@ghost

ghost commented Oct 7, 2015

@a-robinson the not-yet-deleted load balancers are still implicitly listed in etcd, associated with the relevant services, right? So on startup, the service controller could find them there (via the api-server) and repopulate its state? Or is it more complicated than that?

@a-robinson
Contributor Author

@quinton-hoole - no, the issue is for services that have been deleted via the API (or have been updated to remove their load balancer via the API). The service controller is making the changes asynchronously from the API updates, so unless it can access the old state of the services, it won't know that the service ever existed or had a load balancer, respectively.

Services that still exist with their load balancer when the service controller restarts are handled fine if they're later deleted.

@ghost

ghost commented Oct 7, 2015

@a-robinson My apologies - I did not state my point clearly. Let me try again.

Should a deleted service not remain in "deleting" state (in etcd) until it has been completely and successfully deleted (including all GCE resource deletion)? If so, it would remain in etcd, and the process I described above could work?

@a-robinson
Contributor Author

Ah, sorry for the misunderstanding. If we were to add a "deleting" state for service objects, and only the service controller could move them out of that state, then this would indeed be fixed (and that pattern would also work for other objects that need external resources to be cleaned up asynchronously).
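
To make that concrete, here's a minimal sketch of a finalizer-style "deleting" state (the type names, the finalizer string, and the cloud interface are made up for illustration; this is not the actual service controller code):

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical finalizer name; anything unique to the service controller would do.
const lbFinalizer = "service.kubernetes.io/load-balancer-cleanup"

// Simplified stand-in for a Service under graceful deletion: the API server
// sets DeletionTimestamp but keeps the object until all finalizers are removed.
type Service struct {
	Name              string
	DeletionTimestamp *time.Time
	Finalizers        []string
}

// Stand-in for the cloud provider's load balancer API.
type Cloud interface {
	EnsureLoadBalancerDeleted(serviceName string) error
}

// syncService is the core of the idea: after a restart, the controller re-lists
// services, finds any that are still "deleting" (DeletionTimestamp set and the
// finalizer present), and finishes tearing down the external load balancer
// before letting the object disappear.
func syncService(c Cloud, svc *Service) error {
	if svc.DeletionTimestamp == nil || !has(svc.Finalizers, lbFinalizer) {
		return nil // not being deleted, or cloud resources already released
	}
	if err := c.EnsureLoadBalancerDeleted(svc.Name); err != nil {
		return err // retried later; the object stays in the "deleting" state
	}
	svc.Finalizers = remove(svc.Finalizers, lbFinalizer)
	// A real controller would now write the updated object back through the API server.
	return nil
}

func has(list []string, s string) bool {
	for _, x := range list {
		if x == s {
			return true
		}
	}
	return false
}

func remove(list []string, s string) []string {
	out := list[:0]
	for _, x := range list {
		if x != s {
			out = append(out, x)
		}
	}
	return out
}

type fakeCloud struct{}

func (fakeCloud) EnsureLoadBalancerDeleted(name string) error {
	fmt.Println("deleting cloud load balancer for", name)
	return nil
}

func main() {
	now := time.Now()
	svc := &Service{Name: "frontend", DeletionTimestamp: &now, Finalizers: []string{lbFinalizer}}
	if err := syncService(fakeCloud{}, svc); err != nil {
		panic(err)
	}
	fmt.Println("remaining finalizers:", svc.Finalizers)
}
```

The key property is that the object only leaves etcd once the external resource is confirmed gone, so a restart in the middle just means the next sync picks the work back up.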

@smarterclayton
Contributor

The deleting state would be graceful deletion. You can implement that today.


@smarterclayton
Contributor

That said, another way to handle this is via resynchronization - I thought we already have controller implementations that could provide the framework to do this synchronization for us (retrieve all load balancers from GCE and compare to the list of services?). This won't be the only case where we have to sync external resources - the deleting state doesn't make the experience better, as long as we can correctly identify which load balancers should no longer exist.
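
A rough sketch of that compare-and-collect resync, with a made-up interface standing in for the cloud provider (not the real controller framework):

```go
package main

import "fmt"

// Made-up stand-in for the provider's load balancer API.
type CloudLBs interface {
	ListLoadBalancers(clusterID string) ([]string, error)
	DeleteLoadBalancer(name string) error
}

// resyncLoadBalancers deletes any load balancer the cloud reports for this
// cluster that no current Service of type LoadBalancer wants, regardless of
// what the controller remembered before a restart.
func resyncLoadBalancers(c CloudLBs, clusterID string, wanted map[string]bool) error {
	existing, err := c.ListLoadBalancers(clusterID)
	if err != nil {
		return err
	}
	for _, name := range existing {
		if wanted[name] {
			continue
		}
		fmt.Printf("load balancer %q has no matching service, deleting\n", name)
		if err := c.DeleteLoadBalancer(name); err != nil {
			return err
		}
	}
	return nil
}

// Tiny in-memory fake so the sketch runs end to end.
type fakeLBs map[string]bool

func (f fakeLBs) ListLoadBalancers(string) ([]string, error) {
	var names []string
	for n := range f {
		names = append(names, n)
	}
	return names, nil
}

func (f fakeLBs) DeleteLoadBalancer(name string) error {
	delete(f, name)
	return nil
}

func main() {
	lbs := fakeLBs{"lb-frontend": true, "lb-orphan": true}
	_ = resyncLoadBalancers(lbs, "cluster-1", map[string]bool{"lb-frontend": true})
	fmt.Println("remaining:", lbs)
}
```

The hard part, as noted, is deciding which of the provider's load balancers this cluster is allowed to consider its own.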


@markturansky
Contributor

> This won't be the only case where we have to sync external resources

Volumes need to be reconciled with the provider: #14443

@thockin
Member

thockin commented Oct 8, 2015

Maybe we should, like Docker containers, claim a prefix as "assumed for kubernetes". The problem is that we're sharing an unmanaged space with non-kubernetes users.


@davidopp
Member

davidopp commented Oct 8, 2015

This is not a 1.1 blocker, right?

@markturansky
Contributor

@thockin I reversed the order in which I create things in our Provisioner. I now create the PV first (with a faux placeholder ID to pass validation). If that transaction succeeds, then we provision the volume in the provider. We can feed the PV's real name to the EBS/GCEPD/Cinder volume for desc/tags.

Recon for volumes will involve querying the provider for all volumes and using each PV's name to make sure it's still in the cluster.

Not sure how it should work for other resources, but this approach is working well for volumes, I think.
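
A compile-and-run sketch of that ordering, with illustrative interfaces rather than the real provisioner or cloud provider APIs:

```go
package main

import "fmt"

// Illustrative interfaces; not the real provisioner or cloud provider APIs.
type PVStore interface {
	CreatePV(name, placeholderVolumeID string) error
	BindVolume(pvName, volumeID string) error
}

type VolumeProvider interface {
	// CreateVolume records the PV name on the cloud volume (description/tags),
	// so an orphaned volume can always be traced back to its PV later.
	CreateVolume(pvName string) (volumeID string, err error)
}

// provision follows the ordering described above: write the PV record first
// (with a placeholder ID to pass validation), and only then create the cloud
// volume. If the API write fails, nothing has been created in the cloud, so
// nothing can leak; if the cloud call fails, the placeholder PV marks the work
// left to retry or clean up.
func provision(store PVStore, prov VolumeProvider, pvName string) error {
	if err := store.CreatePV(pvName, "placeholder"); err != nil {
		return err
	}
	volID, err := prov.CreateVolume(pvName)
	if err != nil {
		return err
	}
	return store.BindVolume(pvName, volID)
}

// In-memory fakes so the sketch runs.
type memStore map[string]string

func (m memStore) CreatePV(name, id string) error      { m[name] = id; return nil }
func (m memStore) BindVolume(name, volID string) error { m[name] = volID; return nil }

type fakeEBS struct{ n int }

func (f *fakeEBS) CreateVolume(pvName string) (string, error) {
	f.n++
	return fmt.Sprintf("vol-%d (tagged for %s)", f.n, pvName), nil
}

func main() {
	store := memStore{}
	if err := provision(store, &fakeEBS{}, "pv-web-data"); err != nil {
		panic(err)
	}
	fmt.Println(store)
}
```

The recon loop then only needs to list the provider's volumes and check the PV name recorded on each one against the cluster.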

@ghost

ghost commented Oct 8, 2015

@davidopp Yup, not a blocker, IMO.

@a-robinson
Contributor Author

Yup, not a blocker. This issue is just following up on our conversation earlier this week.

@a-robinson a-robinson added this to the v1.2 milestone Dec 8, 2015
@ghost

ghost commented Dec 15, 2015

Resolving this for the purpose of deflaking tests alone makes it worthy of P1 IMO. @a-robinson to what extent are leaking network resources still causing e2e test failures? If they're not, I'd suggest dropping priority to P2?

@a-robinson
Contributor Author

I honestly have no idea at this point. I haven't been watching them other than when I'm on build cop duty, since I've been doing so little OSS work lately.

@davidopp davidopp assigned ghost Dec 15, 2015
@ghost

ghost commented Dec 16, 2015

OK, I'll:

  1. try to consolidate all of the related "leaking cloud resources" issues that have been reported in various places (typically in the context of e2e tests failing) into a smaller number of apparent root causes. I anticipate that these will fall into three buckets:
    1. test infrastructure that does not clean up properly after tests (Framework.AfterEach())
    2. cluster teardown ("kube-down.sh") that does not reliably turn down clusters and/or their backend cloud resources.
    3. k8s controllers that do not reliably garbage-collect cloud resources.
  2. Find owners for each of the underlying root causes

cc @ixdy FYI, as you've looked into this a fair bit in the past.

@kubernetes/goog-testing

@a-robinson
Contributor Author

cc @brendandburns

@davidopp
Member

davidopp commented Feb 4, 2016

This is marked as 1.2 milestone but it doesn't seem like there will be anything resembling closure on it for 1.2. Is there any piece of it that's critical for 1.2? If not, we can punt it to the next release, and/or work on it after the feature freeze. Suggestions?

@a-robinson
Contributor Author

It's not critical for 1.2, but it could certainly be considered a test flake issue, given that it leads to static IP leaks in our e2e tests.

@bprashanth
Contributor

If we're restarting the controller manager during a test run (outside of a daemon restart) while it has a live load balancer, that would be a worse bug IMO.

@davidopp
Member

This issue is full of good ideas but nobody had time to implement any of them, so I'm going to move this to the next-candidate milestone...

@davidopp davidopp added this to the next-candidate milestone Feb 12, 2016
@guoshimin
Contributor

Can we at least tag the ELBs created by Kubernetes on AWS with the cluster name and service name, so that manual cleanup is possible?

@guoshimin
Contributor

> On Fri, Feb 26, 2016 at 12:27 PM, Jeff Grafton wrote:
>
> @guoshimin I think we've looked into this before. The problem is that the allowed name is very limited. From http://docs.aws.amazon.com/ElasticLoadBalancing/latest/APIReference/API_CreateLoadBalancer.html:
>
> > This name must be unique within your set of load balancers for the region, must have a maximum of 32 characters, must contain only alphanumeric characters or hyphens, and cannot begin or end with a hyphen.
>
> Requiring uniqueness + a maximum of 32 characters really constrains the ability for a general solution that also includes the cluster/service name.

I don't mean the names of the ELBs, but tags. Kubernetes already tags the security groups: https://github.com/kubernetes/kubernetes/blob/v1.2.0-alpha.8/pkg/cloudprovider/providers/aws/aws.go#L1753. We should tag the ELBs the same way.
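
For reference, a minimal sketch of what that could look like, assuming aws-sdk-go's classic ELB AddTags call (the kubernetes.io/service-name key is just an example here, not necessarily the tag the provider would end up using):

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/elb"
)

// tagELB attaches ownership tags to a classic ELB so orphans can be found and
// cleaned up by hand (or by a sweeper). "KubernetesCluster" is the tag the AWS
// provider already uses elsewhere; the service tag key is illustrative only.
func tagELB(client *elb.ELB, lbName, clusterID, serviceName string) error {
	_, err := client.AddTags(&elb.AddTagsInput{
		LoadBalancerNames: []*string{aws.String(lbName)},
		Tags: []*elb.Tag{
			{Key: aws.String("KubernetesCluster"), Value: aws.String(clusterID)},
			{Key: aws.String("kubernetes.io/service-name"), Value: aws.String(serviceName)},
		},
	})
	return err
}

func main() {
	// Requires AWS credentials and an existing ELB with this name; shown for illustration.
	sess := session.Must(session.NewSession())
	client := elb.New(sess)
	if err := tagELB(client, "a1b2c3d4e5", "my-cluster", "default/frontend"); err != nil {
		log.Fatal(err)
	}
	fmt.Println("tagged")
}
```

Tags avoid the 32-character name limit entirely, and a sweeper (or a human) can filter on them to find orphans.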

@ixdy
Member

ixdy commented Feb 26, 2016

Yeah, I realized what you meant right after making that comment.

@ixdy ixdy closed this as completed Feb 26, 2016
@ixdy ixdy reopened this Feb 26, 2016
@davidopp
Member

@justinsb what do you think about #15203 (comment)?

It seems like a good backstop in case #19879 doesn't get merged in time for 1.2.

@a-robinson
Contributor Author

#19879 won't be going in for 1.2, we went with #22069 instead. And neither one actually addresses this specific issue. They fix other bad behavior, but explicitly don't handle the sort of ownership tracking across controller-manager restarts or graceful deletion state needed for this.

@justinsb
Member

@davidopp @guoshimin tagging of ELBs is a good request. We recently added some tagging; it should be easy to add any missing tagging. Tracking as #22118

@justinsb
Member

Oh ... and related: on AWS we already reserved some tags. I screwed up originally and called the primary tag "KubernetesCluster", but otherwise we're now claiming the whole "*.kubernetes.io/..." namespace for our own purposes. Nobody's complained yet, but then these are only tags (not names) and I haven't exactly been broadcasting the fact that you shouldn't allow your tags to collide. But on the other hand, I don't see people getting a collision accidentally. The biggest problem on AWS is the opposite - people that install without using kube-up invariably forget these tags and find things don't work as they expect.

On GCE though you have project isolation (that's what I am simulating with KubernetesCluster on AWS), so it doesn't seem unreasonable to grab a name prefix like k8sio- or kubernetes-io-; if I install k8s into a project, it's not a huge ask to not choose colliding names in that project. At least that's my understanding of GCE naming rules.

On the other hand, having a "Terminating" state would be more consistent with the pod lifecycle...
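
A tiny sketch of an ownership check along those lines, using the KubernetesCluster tag on AWS and, on GCE, a reserved name prefix (the prefix below is just the example from this comment):

```go
package main

import (
	"fmt"
	"strings"
)

// ownedByCluster illustrates the two conventions discussed above: on AWS, a
// reserved tag ("KubernetesCluster" = cluster name); on GCE, where the project
// already scopes the cluster, a reserved name prefix.
func ownedByCluster(name string, tags map[string]string, clusterID string) bool {
	if v, ok := tags["KubernetesCluster"]; ok {
		return v == clusterID
	}
	return strings.HasPrefix(name, "k8sio-")
}

func main() {
	fmt.Println(ownedByCluster("k8sio-lb-frontend", nil, "mycluster"))                                     // true: reserved prefix
	fmt.Println(ownedByCluster("my-own-lb", map[string]string{"KubernetesCluster": "other"}, "mycluster")) // false: tagged for a different cluster
}
```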

@bgrant0607
Member

Ref #4630

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label on Dec 25, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jan 24, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/close
