Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

azure: filter load balancer backend nodes to PrimaryAvailabilitySet (if set) #34526

Conversation

colemickens
Copy link
Contributor

@colemickens colemickens commented Oct 11, 2016

What this PR does / why we need it:

  • Adds a new field (PrimaryAvailabilitySetName) to the Azure CloudProvider config struct
  • If the field is set, only machines who are in that availabilitySet are added to the load balancer backend pool.

This is required to:

  • Support more than 100 nodes in Azure (only can have 100 nodes per availability set)
  • Support multiple availability sets per cluster (An Azure L4 LoadBalancer can only be pointed at nodes in a single availability set)

Without this PR, or if the field is not set in a cluster that contains two availabilitysets, then the following is observed:

  • Azure resources are created (LB, LB rules, NSG rules, public IP)
  • Azure throws errors when trying to add nodes from the "other" availability set
  • The service winds up exposed to the outside world (if you manually retrieve the public ip from Azure API)
  • Kubernetes controller-manager's service loop keeps retrying forever because it never finishes fully successfully
  • The "external ip" property field is never updated.

Which issue this PR fixes: Fixes #34293

Unknowns:

  • Naming convention: LoadBalancedAvailabilitySet might be more descriptive than PrimaryAvailabilitySet, but is also a misnomer since kube-proxy will still end up routing requests to all relevant nodes.
  • Is it worth trying to be "smart" about it in the case the user hasn't set this field in the config? Save the first availability set name and try not to add any nodes that aren't also in that one? It may simply be better to just let this fail so the user has to choose the right setting for their use-case.

Release note:

azure: add PrimaryAvailabilitySet to config, only use nodes in that set in the loadbalancer pool

CC: @brendandburns, @anhowe


This change is Reviewable

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Oct 11, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins GCE e2e failed for commit e57eecf1a9033d4c5131a0faee18611d0103b85f. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@brendandburns brendandburns added lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Oct 12, 2016
@colemickens colemickens force-pushed the colemickens-azure-specify-availabilityset branch from e57eecf to 433243c Compare October 12, 2016 19:47
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 12, 2016
@colemickens
Copy link
Contributor Author

(No material change, just rebased to try to avoid hitting test flakes)

@k8s-ci-robot
Copy link
Contributor

Jenkins verification failed for commit 433243c4b7f09b86e755205ea40b9d819427ca39. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@colemickens colemickens force-pushed the colemickens-azure-specify-availabilityset branch from 433243c to 113c5e3 Compare October 12, 2016 22:07
@colemickens
Copy link
Contributor Author

(Updated to fix the lint issue.)

@brendandburns brendandburns added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 12, 2016
@colemickens
Copy link
Contributor Author

@k8s-bot test this

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 3b787c9 into kubernetes:master Oct 13, 2016
k8s-github-robot pushed a commit that referenced this pull request Oct 14, 2016
…availabilityset

Automatic merge from submit-queue

azure: lower log priority for skipped nic update message

**What this PR does / why we need it**: Very minor, just wanted to remove some log noise I introduced in #34526.

I chose `V(3)` since it aligns with the other nicupdate message printed out here, and will be hidden for the usual default of `--v=2`.

**Release note**:
<!--  Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access) 
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`. 
-->
```release-note
NONE
```
@colemickens
Copy link
Contributor Author

@jessfraz Can we tag this for a cherry-pick into 1.4.x? Along with the follow-up PR: #34730

I'm looking at the cherry-pick doc and I can't seem to tell if other action from me is required other than getting these two PRs tagged.

@jessfraz jessfraz added this to the v1.4 milestone Oct 17, 2016
@jessfraz
Copy link
Contributor

got it and ill open the cherrypicks, thanks!

@jessfraz jessfraz added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Oct 18, 2016
k8s-github-robot pushed a commit that referenced this pull request Oct 18, 2016
#34730-origin-release-1.4

Automatic merge from submit-queue

Automated cherry pick of #34526 #34730

Cherry pick of #34526 #34730 on release-1.4.

#34526: azure: filter load balancer backend nodes to
#34730: azure: lower log priority for skipped nic update message
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.4" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this pull request Dec 1, 2016
…ck-of-#34526-kubernetes#34730-origin-release-1.4

Automatic merge from submit-queue

Automated cherry pick of kubernetes#34526 kubernetes#34730

Cherry pick of kubernetes#34526 kubernetes#34730 on release-1.4.

kubernetes#34526: azure: filter load balancer backend nodes to
kubernetes#34730: azure: lower log priority for skipped nic update message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

azure: Gracefully support multiple availability sets
7 participants