-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GCE provider: Limit Filter calls to regexps rather than large blobs #27741
Conversation
cc @roberthbailey: This needs a v1.2 cherrypick as well, once we get it in |
57085b6
to
6acb5b8
Compare
After testing this on a 1000 node cluster, I found another nasty surprise lurking in the same APIs, so I pushed a new iteration that killed the |
6acb5b8
to
626263e
Compare
@@ -2467,6 +2492,13 @@ func (gce *GCECloud) getInstancesByNames(names []string) ([]*gceInstance, error) | |||
pageToken = res.NextPageToken | |||
for _, i := range res.Items { | |||
name := i.Name | |||
if inst, ok := instances[name]; !ok || inst != nil { | |||
if inst != nil { | |||
glog.Warningf("GCE returned duplicate result for instance %q", name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this really happen? It seems to me a bit paranoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it when was debugging that internal bug. I was thinking there was instability across pages (I.e. with the wrong options it returned a,b,c
then c,d,f
) in some race condition, which would then show up as a missing instance a few lines below (so the idea is that the pair of lines would show the instability). That ended up not being the case, but it doesn't seem harmful to leave in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and took it out. If the other case ever comes up, hopefully it'll be obvious (or non-harmful, e.g. duplicate results but all the results).
I added a minor comment. Other than that LGTM. |
Filters can't exceed 4k, and GET requests against the GCE API are also limited, so these break down in different ways at different cluster counts. Fix it by introducing an advisory node-instance-prefix configuration in the GCE provider that can hint the EnsureLoadBalancer/UpdateLoadBalancer code (and the firewall creation/update code). If it's not there, or wrong (a hostname that's registered violates it), just ignore it and grab the whole project.
626263e
to
dd4dae4
Compare
GCE e2e build/test passed for commit dd4dae4. |
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
if cfg.Global.Multizone { | ||
managedZones = nil // Use all zones in region | ||
} | ||
} | ||
|
||
return CreateGCECloud(projectID, region, zone, managedZones, networkURL, nodeTags, tokenSource, true /* useMetadataServer */) | ||
return CreateGCECloud(projectID, region, zone, managedZones, networkURL, nodeTags, nodeInstancePrefix, tokenSource, true /* useMetadataServer */) | ||
} | ||
|
||
// Creates a GCECloud object using the specified parameters. | ||
// If no networkUrl is specified, loads networkName via rest call. | ||
// If no tokenSource is specified, uses oauth2.DefaultTokenSource. | ||
// If managedZones is nil / empty all zones in the region will be managed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment describing the exact use of nodeInstancePrefix and impact of passing in an empty string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't see this before it merged. To clarify it here (and I'll have a PR later): It's an advisory prefix to help GCE searches for nodes in the cluster. If nodes happen to join that violate the nodeInstancePrefix
, the nodeInstancePrefix
is ignored and the entire GCE project is enumerated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and if it's empty, we enumerate the whole project.
GCE e2e build/test passed for commit dd4dae4. |
Automatic merge from submit-queue |
…41-upstream-release-1.2 Automated cherry pick of #27741
Fix to kubernetes/kubernetes#27821. Borrows the necessary piece of kubernetes/kubernetes#27741 to parse the gce.conf. Also opened kubernetes-retired#1251 to fix Godeps correctly
…ck-of-#27741-upstream-release-1.2 Automated cherry pick of kubernetes#27741
…ck-of-#27741-upstream-release-1.2 Automated cherry pick of kubernetes#27741
Fix to kubernetes/kubernetes#27821. Borrows the necessary piece of kubernetes/kubernetes#27741 to parse the gce.conf. Also opened kubernetes-retired/contrib#1251 to fix Godeps correctly
Fix to kubernetes/kubernetes#27821. Borrows the necessary piece of kubernetes/kubernetes#27741 to parse the gce.conf. Also opened kubernetes-retired/contrib#1251 to fix Godeps correctly
Fix to kubernetes/kubernetes#27821. Borrows the necessary piece of kubernetes/kubernetes#27741 to parse the gce.conf. Also opened kubernetes-retired/contrib#1251 to fix Godeps correctly
Filters can't exceed 4k, and GET requests against the GCE API are also limited, so these break down in different ways at different cluster counts. Fix it by introducing an advisory
node-instance-prefix
configuration in the GCE provider that can hint theEnsureLoadBalancer
/UpdateLoadBalancer code
(and the firewall creation/update code). If it's not there, or wrong (a hostname that's registered violates it), just ignore it and grab the whole project.Fixes #27731