-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GCE: Handle missing subnet for legacy networks and auto networks with unique subnet names #53410
Conversation
// https://www.googleapis.com/compute/v1/projects/myproject/regions/us-central1/subnetworks/a | ||
// projects/myproject/regions/us-central1/subnetworks/a | ||
// All return "us-central1" | ||
func getRegionInURL(urlStr string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unit test?
return nil, fmt.Errorf("could not retrieve network %v, err: %v", networkURL, err) | ||
} | ||
|
||
if len(n.IPv4Range) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Legacy networks have a non-empty IPv4Range
isLegacyNetwork = true | ||
} else if !n.AutoCreateSubnetworks { | ||
// If manual network, fail because no subnet is specified | ||
return nil, fmt.Errorf("no subnet specified in cloud provider config despite the network %q being manual.", networkURL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"manual networks require specifying a subnet'
@@ -687,6 +729,15 @@ func getZonesForRegion(svc *compute.Service, projectID, region string) ([]string | |||
return zones, nil | |||
} | |||
|
|||
func findSubnetForRegion(subnetURLs []string, region string) string { | |||
for _, url := range subnetURLs { | |||
if thisRegion, err := getRegionInURL(url); err == nil && thisRegion == region { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are eating err
here...
isLegacyNetwork = true | ||
} else { | ||
// Try to find the subnet in the list of subnets | ||
subnetURL = findSubnetForRegion(n.Subnetworks, config.Region) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bowei Originally, I threw an error if no subnet is specified and n.AutoCreateSubets is false. Given what we just saw for auto networks converted to manual networks, I removed the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you disagree or have a better idea.
68e2526
to
5ea3b2c
Compare
/cc @MrHohn |
networkURL = gceNetworkURL(config.ApiEndpoint, netProjID, config.NetworkName) | ||
} else { | ||
glog.Warningf("No network name or URL specified.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did two things in this area:
- Reversed the order of conditions.
- Removed the
getNetworkNameViaAPICall
invocation because I highly doubt it works.
if n, err := getNetwork(service, netProjID, networkName); err != nil { | ||
// Gracefully fail because kubelet calls CreateGCECloud without any config, and API | ||
// calls will fail coming from minions. | ||
glog.Warningf("could not retrieve network %q in attempt to determine if legacy network or see list of subnets, err %v", networkURL, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the logic to gracefully fail because we need to handle the kubelet case: network name is known but the minion has no GCE permissions.
/test pull-kubernetes-unit |
networkURL = gceNetworkURL(config.ApiEndpoint, netProjID, config.NetworkName) | ||
} else { | ||
glog.Warningf("No network name or URL specified.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens in this case, should we return error? Or is the subnetURL
below all we need?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on what the user of this package wants to do. Kubelet inits the cloud provider but only uses a static function that's not related to networking.
I'm not convinced that we can return an error here and not have this bite us in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubelet's use of cloud provider seems to be unnecessary. But does it make sense to have a cloud provider be initialized but not fully functional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's indeed unnecessary, but we have to live with it for now. Since gce.conf is not passed to kubelet, the cloudprovider is already mostly defunct. CreateGCECloud
needs to not throw an error despite the lack of functionality. Meanwhile, ideally we do throw an error if we get into a defunct state for service/ingress controller. I don't think there's a great answer here until we limit the consumers of cloudprovider.
I'm hesitant to return an error here because I'm worried about the case of the metadata API call not returning the network name. This would cause kubelet to not start.
/test pull-kubernetes-e2e-gce-etcd3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just feel like the logic to get URLs is surprisingly dedicated...
if n, err := getNetwork(service, netProjID, networkName); err != nil { | ||
// Gracefully fail because kubelet calls CreateGCECloud without any config, and API | ||
// calls will fail coming from minions. | ||
glog.Warningf("could not retrieve network %q in attempt to determine if legacy network or see list of subnets, err %v", networkURL, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could -> Could
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
5ea3b2c
to
995dd32
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bowei, nicksardo Associated issue: 53409 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/retest Review the full test history for this PR. |
Automatic merge from submit-queue (batch tested with PRs 51750, 53195, 53384, 53410). If you want to cherry-pick this change to another branch, please follow the instructions here. |
…410-upstream-release-1.8 Automatic merge from submit-queue. Automated cherry pick of #53410 Cherry pick of #53410 on release-1.8. #53410: Handle missing subnet for auto networks and legacy networks **Release note**: ```release-note GCE: Fixes ILB sync on legacy networks and auto networks with unique subnet names ```
Fixes #53409
/assign @bowei
Tested on three GKE clusters with automatic, manual, and legacy networks.
Release note: