Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow use resource ID to specify public IP address in azure_loadbalancer #53557

Merged
merged 6 commits into from
Dec 21, 2017

Conversation

yolocs
Copy link
Contributor

@yolocs yolocs commented Oct 7, 2017

What this PR does / why we need it: Currently the Azure load balancer assumes that a Public IP address is in the same resource group as the cluster. This is not necessarily true in all environments, in addition to accepting a Public IP, we should allow an annotation to the Service object that indicates what resource group the IP is present in.

Which issue this PR fixes: fixes #53274 #52129

Special notes for your reviewer: first time golang user, please forgive the amateurness

Release note

Allow use resource ID to specify public IP address in azure_loadbalancer

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 7, 2017
@dims
Copy link
Member

dims commented Oct 9, 2017

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 9, 2017
Copy link
Contributor

@itowlson itowlson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome! This looks pretty good to me though I have a couple of suggestions and caveats. Main thing is I would suggest that the annotation be a resource group name rather than the resource ID of a Public IP, but I may well be missing something (and Brendan's original issue can be read either way... grin). Open to being talked round though!

// ServiceAnnotationLoadBalancerPublicIPAddressID is the annotation used on the service
// to specify the resource ID of the public IP address,
// which can be used to specify public IP address that's not in the same resource group as the cluster
const ServiceAnnotationLoadBalancerPublicIPAddressID = "service.beta.kubernetes.io/azure-load-public-ip-id"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this is used only to derive a resource group name: the actual Public IP resource is still located by looking up the loadBalancerIP value in the public IPs in that resource group. The code doesn't require that the Public IP resource in the annotation is the one that matches the loadBalancerIP address. Given that, perhaps we should just have the annotation be a resource group name here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotating with just resource group actually makes sense to me. I was carried away by the issue description 'resource id'.

}
}
}
// TODO: follow next link here? Will there really ever be that many public IPs?

return "", fmt.Errorf("user supplied IP Address %s was not found", loadBalancerIP)
return "", resourceGroup, fmt.Errorf("user supplied IP Address %s was not found", loadBalancerIP)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure of the Go idiom here (I'm pretty new to this too) but I don't think consumers are going to use the resource group in an error return, so would it be more idiomatic to return "" for the resource group here?

func (az *Cloud) ensurePublicIPExists(serviceName, pipName string) (*network.PublicIPAddress, error) {
pip, existsPip, err := az.getPublicIPAddress(pipName)
func (az *Cloud) ensurePublicIPExists(serviceName, pipResourceGroup string, pipName string) (*network.PublicIPAddress, error) {
pip, existsPip, err := az.getPublicIPAddress(pipResourceGroup, pipName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might have a problem here if the Public IP in the annotation doesn't exist. If that happens, won't it attempt to create the Public IP in the cluster resource group? I'm guessing the desired behaviour would be either to fail or to create it in the resource group from the annotation. If the latter should it use the ID from the annotation too? Again the story may be simpler if the annotation specifies only the RG and uses address lookup to locate the PIP resource...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I'm going to change the annotation to resource group, here I can try to create the pip in the given resource group. If the creation fails, it will return error anyway.


// parseResourceGroupNameFromID parses the resource group name from a resource ID
func parseResourceGroupNameFromID(resourceID string) (resourceGroupName string, err error) {
reg, err := regexp.Compile(`(?i)(.*?)/resourceGroups/(?P<rgname>\S+)/providers/(.*?)`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a regexp.MustCompile at module level (cf. providerIDRE on line 49 of this file).

@itowlson
Copy link
Contributor

@yolo3301 I think this also fixes #52129 - if so could you add that to the "fixes" list in the PR description please? Thanks!

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 10, 2017
@yolocs
Copy link
Contributor Author

yolocs commented Oct 10, 2017

@itowlson thanks for the review!

Copy link
Contributor

@itowlson itowlson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an asymmetry where we can create Public IPs but not delete them. I'm not quite sure what the preferred behaviour is so it would be good to get feedback from other reviewers on this. Otherwise looks good, just minor cosmetic suggestions.

@@ -102,14 +102,14 @@ func (az *Cloud) CreateOrUpdateLBWithRetry(lb network.LoadBalancer) error {
}

// CreateOrUpdatePIPWithRetry invokes az.PublicIPAddressesClient.CreateOrUpdate with exponential backoff retry
func (az *Cloud) CreateOrUpdatePIPWithRetry(pip network.PublicIPAddress) error {
func (az *Cloud) CreateOrUpdatePIPWithRetry(resourceGroup string, pip network.PublicIPAddress) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to handle deletion if we create the PIP? I believe we would delete it if we created it in the cluster RG, so probably we should delete if we create (but leave alone if the user created the PIP). Would be good to get input from other reviewers on the preferred behaviour (I'm new to this too!).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I added the code to handle deletion. Basically by matching both name and resource group.

// Override the cluster resource group if public IP resource ID presents
resourceGroup := az.ResourceGroup
if publicIPAddressResourceGroup(service) != "" {
resourceGroup = publicIPAddressResourceGroup(service)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's harmless to evaluate twice, but also unnecessary. Can you do something like if pipResourceGroup := publicIPAddressResourceGroup(service); pipResourceGroup != "" { resourceGroup = pipResourceGroup }?

Another possibility could be to make publicIPAddressResourceGroup handle the defaulting to az.ResourceGroup itself.

}
}
}
// TODO: follow next link here? Will there really ever be that many public IPs?

return "", fmt.Errorf("user supplied IP Address %s was not found", loadBalancerIP)
return "", "", fmt.Errorf("user supplied IP Address %s was not found", loadBalancerIP)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to include the searched RG in the error message e.g. "user supplied IP Address %s was not found in resource group %s"? This might help a user diagnose a "doh, I forgot/mis-spelled the annotation" situation.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 15, 2017
Copy link
Contributor

@itowlson itowlson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many apologies for taking so long to get back to this. A few nitpicks and a couple of things that worry me: first, not being sure when the 'resource group' variable is populated and when not, and second, whether create/delete is going to be safe in the face of edit scenarios. (If we can't easily support edit scenarios then it may be that we can drop create/delete - I think the main use case for this is when the user already has a particular IP they want to use and it's in a different RG, though I could be wrong!)

@@ -128,14 +128,14 @@ func (az *Cloud) CreateOrUpdateInterfaceWithRetry(nic network.Interface) error {
}

// DeletePublicIPWithRetry invokes az.PublicIPAddressesClient.Delete with exponential backoff retry
func (az *Cloud) DeletePublicIPWithRetry(pipName string) error {
func (az *Cloud) DeletePublicIPWithRetry(pipResourceGroup string, pipName string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a minor style inconsistency - in CreateOrUpdate this is called resourceGroup, here pipResourceGroup. (I think I slightly prefer the latter but don't feel strongly as long as they're consistent.)

loadBalancerIP := service.Spec.LoadBalancerIP
if len(loadBalancerIP) == 0 {
return getPublicIPName(clusterName, service), nil
return getPublicIPName(clusterName, service), "", nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be safer to return az.getPublicIPAddressResourceGroup() here rather than ""? The current design puts the onus on the calling function to fill in the resource group in certain situations - this could lead to more code paths in the calling functions and more opportunities to miss something during testing. (Longer term perhaps we need a better abstraction around this, such as a PublicIPAddressReference, but not sure about that yet!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified it so that I use az.getPublicIPAddressResourceGroup() as much as possible instead of passing the resource group around.

@@ -435,18 +442,20 @@ func (az *Cloud) cleanupLoadBalancer(clusterName string, service *v1.Service, is

// Public IP can be deleted after frontend ip configuration rule deleted.
if publicIPToCleanup != nil {
// Only delete an IP address if we created it, deducing by name.
// Only delete an IP address if we created it, deducing by name and resource group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the cleanup logic going to work correctly if a user edits a service and changes which resource group its public IP is in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing code cannot handle it too. Mainly because we have no way to know which pip is actually created by k8s. I'm basically restricting the condition to delete public IP by matching name and rg. So in the worse case (the one you described), it didn't cleanup the pip which is not that bad I think. VS. accidentally deleted a pip in the provided rg. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also add tags to k8s-managed public IPs?

func (az *Cloud) ensurePublicIPExists(serviceName, pipName string) (*network.PublicIPAddress, error) {
pip, existsPip, err := az.getPublicIPAddress(pipName)
func (az *Cloud) ensurePublicIPExists(serviceName string, pipResourceGroup string, pipName string) (*network.PublicIPAddress, error) {
pip, existsPip, err := az.getPublicIPAddress(pipResourceGroup, pipName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current design, does this need to handle the case where pipResourceGroup is the empty string, or is there something else that guarantees that will never happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I changed it accept a service object instead and get the right resource group inside this method.

// ServiceAnnotationLoadBalancerPublicIPAddressResourceGroup is the annotation used on the service
// to specify the resource group of the public IP address,
// which can be used to specify public IP address that's not in the same resource group as the cluster
const ServiceAnnotationLoadBalancerPublicIPAddressResourceGroup = "service.beta.kubernetes.io/azure-load-balancer-pip-resource-group"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the 'pip' part in case we want to re-use this for other load balancer objects...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. Are we assuming other load balancer objects will share the same resource group?

@@ -1005,6 +1018,14 @@ func (az *Cloud) ensureHostInPool(serviceName string, nodeName types.NodeName, b
return nil
}

func (az *Cloud) getPublicIPAddressResourceGroup(service *v1.Service) string {
if resourceGroup, ok := service.Annotations[ServiceAnnotationLoadBalancerPublicIPAddressResourceGroup]; ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, I'd prefer found vs ok

@brendandburns
Copy link
Contributor

This basically LGTM, a few nits... @yolo3301 let's get this merged...

--brendan

@yolocs
Copy link
Contributor Author

yolocs commented Nov 3, 2017

/retest

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 14, 2017
@itowlson
Copy link
Contributor

itowlson commented Dec 3, 2017

@yolo3301 do you have cycles to rebase this so we can get it merged? If not I can send a PR to your branch I think.

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 3, 2017
@yolocs
Copy link
Contributor Author

yolocs commented Dec 3, 2017

@itowlson Finally got time to update this PR :)

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 17, 2017
@kubernetes kubernetes deleted a comment from k8s-github-robot Dec 17, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 17, 2017
@yolocs
Copy link
Contributor Author

yolocs commented Dec 17, 2017

@brendandburns shall we get this merged?

@brendandburns
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 21, 2017
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brendandburns, yolo3301

Associated issue: #53274

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 21, 2017
@brendandburns
Copy link
Contributor

@yolo3301 sorry for the delay, please ping me on slack/IM/email in the future :)

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 02be3eb into kubernetes:master Dec 21, 2017
@tapanhalani
Copy link

Is this feature available in 1.8.7 release? If yes, what annotation should I use in my service metadata to make ip's in a different resource group usable by the kubernetes service? Thanks.

@feiskyer
Copy link
Member

@tapanhalani No, it will be included in v1.10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Azure LB should support a resource id reference for public IP addresses...
10 participants