Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service controller deletes and re-adds AWS ELB listeners when it starts up #47067

Closed
marshallbrekka opened this issue Jun 6, 2017 · 7 comments
Assignees
Milestone

Comments

@marshallbrekka
Copy link
Contributor

Is this a request for help?: NO

What keywords did you search in Kubernetes issues before filing this one?:

AWS, ELB


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version): 1.4.8 (pretty sure problem exists in master)

Environment:

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): CoreOS 1353.7.0
  • Kernel (e.g. uname -a): 4.9.24
  • Install tools: Custom AMI

What happened:
The service controller deletes and re-adds the ELB listeners when it starts up (or becomes the master).

What you expected to happen:
It should only modify the listeners if they have actually changed.

How to reproduce it (as minimally and precisely as possible):

  • Create a service with type LoadBalancer
  • Restart controller-manager after ELB has been created
  • Search CloudTrail for events for the ELB, and verify there are DeleteLoadBalancerListeners and CreateLoadBalancerListeners events.

Anything else we need to know:

I believe this is caused by this for loop in the ensureLoadBalancer function which compares the desired listener properties to the actual listener properties.

The problem is that K8 uses lowercase strings 1, 2 for the protocols, but AWS returns uppercase protocols when describing the listeners (see the two example outputs below).

I believe the simplest fix is to lowercase the protocols in the for loop when performing the comparison, which I can submit as a PR if others agree with that solution.

From AWS CLI:

$ aws elb describe-load-balancers --load-balancer-names a4fafa3e6f24911e6aac506402a2bc46
{
    "LoadBalancerDescriptions": [
         {
             "ListenerDescriptions": [
                {
                    "Listener": {
                        "InstancePort": 32706, 
                        "LoadBalancerPort": 80, 
                        "Protocol": "TCP", 
                        "InstanceProtocol": "TCP"
                    }, 
                    "PolicyNames": []
                }
            ] 
        }
    ]
}

From Go elb.DescribeLoadBalancers(request)

package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/elb"
)

func main() {
	name := "a4fafa3e6f24911e6aac506402a2bc46"

	sess := session.Must(session.NewSession())
	svc := elb.New(sess)

	request := &elb.DescribeLoadBalancersInput{}
	request.LoadBalancerNames = []*string{&name}

	response, err := svc.DescribeLoadBalancers(request)
	if err != nil {
		log.Fatal(err)
	}

	for _, loadBalancer := range response.LoadBalancerDescriptions {
		log.Printf("ELB: %v\n", loadBalancer)
		return
	}
}

outputs

2017/06/06 11:11:49 ELB: {
  AvailabilityZones: ["us-west-2a","us-west-2b","us-west-2c"],
  CanonicalHostedZoneNameID: "xxx",
  CreatedTime: 2017-03-24 03:35:49.48 +0000 UTC,
  DNSName: "internal-a4fafa3e6f24911e6aac506402a2bc46-2085661947.us-west-2.elb.amazonaws.com",
  HealthCheck: {
    HealthyThreshold: 2,
    Interval: 10,
    Target: "TCP:32706",
    Timeout: 5,
    UnhealthyThreshold: 6
  },
  Instances: [{
      InstanceId: "i-xxx"
    },{
      InstanceId: "i-xxx"
    },{
      InstanceId: "i-xxx"
    }],
  ListenerDescriptions: [{
      Listener: {
        InstancePort: 32706,
        InstanceProtocol: "TCP",
        LoadBalancerPort: 80,
        Protocol: "TCP"
      }
    }],
  LoadBalancerName: "a4fafa3e6f24911e6aac506402a2bc46",
  Policies: {

  },
  Scheme: "internal",
  SecurityGroups: ["sg-xxx"],
  SourceSecurityGroup: {
    GroupName: "k8s-elb-a4fafa3e6f24911e6aac506402a2bc46",
    OwnerAlias: "xxx"
  },
  Subnets: ["subnet-xxx","subnet-xxx","subnet-xxx"],
  VPCId: "vpc-xxx"
}
@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 6, 2017
@k8s-github-robot
Copy link

@marshallbrekka There are no sig labels on this issue. Please add a sig label by:
(1) mentioning a sig: @kubernetes/sig-<team-name>-misc
(2) specifying the label manually: /sig <label>

Note: method (1) will trigger a notification to the team. You can find the team list here and label list here

@marshallbrekka
Copy link
Contributor Author

@kubernetes/sig-aws-misc

@k8s-ci-robot
Copy link
Contributor

@marshallbrekka: Reiterating the mentions to trigger a notification:
@kubernetes/sig-aws-misc.

In response to this:

@kubernetes/sig-aws-misc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@marshallbrekka
Copy link
Contributor Author

/sig aws

@marshallbrekka
Copy link
Contributor Author

I should also add, this caused downtime for us, as we run a decent amount of ELB services, we were getting rate limited on the delete and create listener requests, which resulted in the listeners being deleted for a long time before they were re-added.

$ kubectl describe service our-elb-service
Name:			our-elb-service
Namespace:		default
Labels:			<none>
Selector:		app=our-elb-service
Type:			LoadBalancer
IP:			10.19.128.185
LoadBalancer Ingress:	internal-a4fafa3e6f24911e6aac506402a2bc46-2085661947.us-west-2.elb.amazonaws.com
Port:			<unset>	80/TCP
NodePort:		<unset>	32706/TCP
Endpoints:		10.19.131.152:6000,10.19.132.53:6000
Session Affinity:	None
Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason				Message
  ---------	--------	-----	----			-------------	--------	------				-------
  11m		11m		1	{service-controller }			Warning		CreatingLoadBalancerFailed	Error creating load balancer (will retry): Failed to create load balancer for service default/internal-api: error deleting AWS loadbalancer listeners: Throttling: Rate exceeded
  11m		11m	2	{service-controller }		Normal	CreatingLoadBalancer	Creating load balancer
  11m		11m	1	{service-controller }		Normal	CreatedLoadBalancer	Created load balancer
  9m		9m	1	{service-controller }		Normal	UpdatedLoadBalancer	Updated load balancer with new hosts
  6m		6m	1	{service-controller }		Normal	CreatingLoadBalancer	Creating load balancer
  6m		6m	1	{service-controller }		Normal	CreatedLoadBalancer	Created load balancer
  5m		5m	1	{service-controller }		Normal	CreatingLoadBalancer	Creating load balancer
  5m		5m	1	{service-controller }		Normal	CreatedLoadBalancer	Created load balancer
  3m		3m	1	{service-controller }		Normal	UpdatedLoadBalancer	Updated load balancer with new hosts

@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 6, 2017
@justinsb justinsb added this to the v1.7 milestone Jun 13, 2017
@justinsb justinsb self-assigned this Jun 13, 2017
@justinsb
Copy link
Member

Sorry for the delay @marshallbrekka, and thank you so much for the excellent bug report.

I agree with the approach, and because of the timelines I put up a PR at #47391. If you also want to submit a PR, I'll happily LGTM it though!

Thanks again for finding this!

@marshallbrekka
Copy link
Contributor Author

Your PR looks good, thanks for getting a fix together so quickly!

justinsb added a commit to justinsb/kubernetes that referenced this issue Jun 13, 2017
k8s-github-robot pushed a commit that referenced this issue Jun 13, 2017
Automatic merge from submit-queue (batch tested with PRs 46929, 47391, 47399, 47428, 47274)

AWS: Perform ELB listener comparison in case-insensitive manner

Fix #47067

```release-note
AWS: Avoid spurious ELB listener recreation - ignore case when matching protocol
```
justinsb added a commit to justinsb/kubernetes that referenced this issue Jun 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants