Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GCE: Handle missing subnet for legacy networks and auto networks with unique subnet names #53410

Merged
merged 1 commit into from
Oct 5, 2017

Conversation

nicksardo
Copy link
Contributor

@nicksardo nicksardo commented Oct 3, 2017

Fixes #53409

/assign @bowei

Tested on three GKE clusters with automatic, manual, and legacy networks.

Release note:

GCE: Fixes ILB sync on legacy networks and auto networks with unique subnet names

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 3, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 3, 2017
// https://www.googleapis.com/compute/v1/projects/myproject/regions/us-central1/subnetworks/a
// projects/myproject/regions/us-central1/subnetworks/a
// All return "us-central1"
func getRegionInURL(urlStr string) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test?

return nil, fmt.Errorf("could not retrieve network %v, err: %v", networkURL, err)
}

if len(n.IPv4Range) > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Legacy networks have a non-empty IPv4Range

isLegacyNetwork = true
} else if !n.AutoCreateSubnetworks {
// If manual network, fail because no subnet is specified
return nil, fmt.Errorf("no subnet specified in cloud provider config despite the network %q being manual.", networkURL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"manual networks require specifying a subnet'

@@ -687,6 +729,15 @@ func getZonesForRegion(svc *compute.Service, projectID, region string) ([]string
return zones, nil
}

func findSubnetForRegion(subnetURLs []string, region string) string {
for _, url := range subnetURLs {
if thisRegion, err := getRegionInURL(url); err == nil && thisRegion == region {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are eating err here...

isLegacyNetwork = true
} else {
// Try to find the subnet in the list of subnets
subnetURL = findSubnetForRegion(n.Subnetworks, config.Region)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bowei Originally, I threw an error if no subnet is specified and n.AutoCreateSubets is false. Given what we just saw for auto networks converted to manual networks, I removed the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if you disagree or have a better idea.

@nicksardo nicksardo force-pushed the gce-hack-subnet branch 2 times, most recently from 68e2526 to 5ea3b2c Compare October 4, 2017 17:27
@nicksardo
Copy link
Contributor Author

/cc @MrHohn
Would appreciate more eyes if possible.

@k8s-ci-robot k8s-ci-robot requested a review from MrHohn October 4, 2017 17:28
networkURL = gceNetworkURL(config.ApiEndpoint, netProjID, config.NetworkName)
} else {
glog.Warningf("No network name or URL specified.")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did two things in this area:

  • Reversed the order of conditions.
  • Removed the getNetworkNameViaAPICall invocation because I highly doubt it works.

if n, err := getNetwork(service, netProjID, networkName); err != nil {
// Gracefully fail because kubelet calls CreateGCECloud without any config, and API
// calls will fail coming from minions.
glog.Warningf("could not retrieve network %q in attempt to determine if legacy network or see list of subnets, err %v", networkURL, err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the logic to gracefully fail because we need to handle the kubelet case: network name is known but the minion has no GCE permissions.

@nicksardo
Copy link
Contributor Author

/test pull-kubernetes-unit

networkURL = gceNetworkURL(config.ApiEndpoint, netProjID, config.NetworkName)
} else {
glog.Warningf("No network name or URL specified.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in this case, should we return error? Or is the subnetURL below all we need?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on what the user of this package wants to do. Kubelet inits the cloud provider but only uses a static function that's not related to networking.

I'm not convinced that we can return an error here and not have this bite us in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kubelet's use of cloud provider seems to be unnecessary. But does it make sense to have a cloud provider be initialized but not fully functional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's indeed unnecessary, but we have to live with it for now. Since gce.conf is not passed to kubelet, the cloudprovider is already mostly defunct. CreateGCECloud needs to not throw an error despite the lack of functionality. Meanwhile, ideally we do throw an error if we get into a defunct state for service/ingress controller. I don't think there's a great answer here until we limit the consumers of cloudprovider.

I'm hesitant to return an error here because I'm worried about the case of the metadata API call not returning the network name. This would cause kubelet to not start.

@nicksardo
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-etcd3

Copy link
Member

@MrHohn MrHohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just feel like the logic to get URLs is surprisingly dedicated...

if n, err := getNetwork(service, netProjID, networkName); err != nil {
// Gracefully fail because kubelet calls CreateGCECloud without any config, and API
// calls will fail coming from minions.
glog.Warningf("could not retrieve network %q in attempt to determine if legacy network or see list of subnets, err %v", networkURL, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could -> Could

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@bowei
Copy link
Member

bowei commented Oct 5, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 5, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bowei, nicksardo

Associated issue: 53409

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 51750, 53195, 53384, 53410). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 9af3398 into kubernetes:master Oct 5, 2017
k8s-github-robot pushed a commit that referenced this pull request Oct 9, 2017
…410-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #53410

Cherry pick of #53410 on release-1.8.

#53410: Handle missing subnet for auto networks and legacy networks


**Release note**:
```release-note
GCE: Fixes ILB sync on legacy networks and auto networks with unique subnet names
```
@nicksardo nicksardo deleted the gce-hack-subnet branch January 12, 2018 06:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants