-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch AWS getInstancesByNodeNames calls with FilterNodeLimit #47516
Batch AWS getInstancesByNodeNames calls with FilterNodeLimit #47516
Conversation
This is just a simpler alternate way of fixing the filter limit for #45050 in case, we didn't had a chance to get caching fix merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from name of one constant the PR looks good, however I'd appreciate @justinsb's opinion.
|
||
// Number of node names that can be added to a filter. The AWS limit is 200 | ||
// but we are using a lower limit on purpose | ||
FilterNodeLimit = 150 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not seem to be used externally and thus should be lowercase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
We are going to limit the getInstancesByNodeNames call with a batch size of 150
d1d1798
to
ffa622f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -157,6 +157,10 @@ const ( | |||
createTagInitialDelay = 1 * time.Second | |||
createTagFactor = 2.0 | |||
createTagSteps = 9 | |||
|
|||
// Number of node names that can be added to a filter. The AWS limit is 200 | |||
// but we are using a lower limit on purpose |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. If nothing else, it leaves us room for our own additional filters.
Name: aws.String("private-dns-name"), | ||
Values: names, | ||
} | ||
for i := 0; i < len(names); i += filterNodeLimit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth splitting this into a partition
function, just because it is tricky and it would be great to have a unit test on it. I am surprised there isn't one already, but maybe I'm looking in the wrong place.
tag.Value = aws.String("") | ||
tags := []*ec2.Tag{&tag} | ||
nodeNames := []string{} | ||
for i := 0; i < 200; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - I suppose this has (some) coverage of the partition logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah "some", :-) . Because we still don't test if partitioning correctly splits the results too. I can extract this into a separate function easily enough. Will open a github issue for myself.
So I think this looks good. I actually think the best option here is probably to get the cache logic in for ELBs, and to get this logic in for EBS. For ELBs we really don't need to be doing a lookup by name anyway, now that we have the node object. For EBS the logic is much more complicated, as I'm sure we can all attest to by now. But also getting both this and #47410 into 1.7 means that backporting won't be a nightmare, because hopefully the basic code structure won't then change too much. |
/approve |
I added an AWS prefix to the release note and tagged the markdown block with |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gnufied, justinsb Associated issue: 47271 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/remove-priority P0 |
Automatic merge from submit-queue (batch tested with PRs 47510, 47516, 47482, 47521, 47537) |
Automatic merge from submit-queue Fix AWS instance information fetch for large clusters This is a manually cherry-pick for two PRs - #46463 and #47516 ```release-note AWS cloudprovider plugin: Fix for large clusters (200+ nodes). Also fix bug with volumes not getting detached from a node after reboot. ``` cc @enisoc @justinsb @wongma7
We are going to limit the getInstancesByNodeNames call with a batch
size of 150.
Fixes - #47271