GCE: Allow node count to exceed GCE TargetPool maximums #25178

Merged

@zmerlynn zmerlynn commented May 4, 2016

If the cluster node count exceeds the GCE TargetPool maximum (currently 1000),
randomly select which nodes are members of Kubernetes External Load Balancers.

If we would exceed the TargetPool API maximums, randomly select
a subset of the nodes to include in the TP instead.
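
A minimal sketch of the selection described above, not the code from this PR. The helper name `pickTargetPoolHosts` is hypothetical, and the sketch assumes hosts are plain strings and that the package-level `math/rand` source is acceptable. Only the constant name and value come from the diff in this conversation.

```go
package main

import "math/rand"

// maxInstancesPerTargetPool mirrors the constant added in this PR's diff.
const maxInstancesPerTargetPool = 1000

// pickTargetPoolHosts is a hypothetical helper (not the PR's function).
// It returns hosts unchanged when they fit within max; otherwise it
// returns a uniformly random subset of exactly max hosts. The input
// slice is never modified.
func pickTargetPoolHosts(hosts []string, max int) []string {
	if max < 0 {
		max = 0
	}
	if len(hosts) <= max {
		return hosts
	}
	picked := make([]string, 0, max)
	// rand.Perm yields a random permutation of the indices [0, n);
	// keeping the first max indices selects a random subset.
	for _, i := range rand.Perm(len(hosts))[:max] {
		picked = append(picked, hosts[i])
	}
	return picked
}
```

A caller would pass the full node list and `maxInstancesPerTargetPool`; clusters at or below the limit get every node back, so behavior only changes above 1000 nodes.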

zmerlynn added the team/cluster and release-note labels May 4, 2016
zmerlynn commented May 5, 2016

cc @thockin @cjcullen @wojtek-t

Is this evil, or fine? If we're shooting to test above 1k for 1.3, I don't see a lot of choice in the short term.

k8s-github-robot added the size/M label May 5, 2016
davidopp commented May 5, 2016

cc/ @bprashanth

@@ -68,6 +69,9 @@ const (
// are iterated through to prevent infinite loops if the API
// were to continuously return a nextPageToken.
maxPages = 25

// TargetPools can only support 1000 VMs.
maxInstancesPerTargetPool = 1000
Contributor

@kubernetes/goog-cluster fyi

Member Author

Internal b/28566577, FYI

thockin commented May 5, 2016

I'm OK with this approach until we have real answer(s)

thockin commented May 5, 2016

Would be nice to functionize this logic and have a unit test?

zmerlynn commented May 9, 2016

@thockin: Can do. Frustratingly, the two shuffles operate in slightly different type spaces right now (string vs compute.InstanceReference), otherwise there might be a way to conceptualize this totally together. As it is, I'll probably just pull this out to two functions.
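
For readers on current Go, a hedged sketch of how the two paths could share one implementation. This was not an option in 2016, since type parameters arrived in Go 1.18. `instanceRef` is a hypothetical stand-in for `compute.InstanceReference`, and `randomSubset` is an illustrative name, not something from this PR.

```go
package main

import "math/rand"

// instanceRef is a hypothetical stand-in for compute.InstanceReference.
type instanceRef struct {
	Instance string
}

// randomSubset returns items unchanged when they fit within max;
// otherwise it shuffles a copy and keeps the first max elements.
// It works for any element type, so the string path and the
// instance-reference path can share it.
func randomSubset[T any](items []T, max int) []T {
	if max < 0 {
		max = 0
	}
	if len(items) <= max {
		return items
	}
	// Copy first so the caller's slice keeps its order.
	shuffled := make([]T, len(items))
	copy(shuffled, items)
	rand.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled[:max]
}
```

Calling `randomSubset(hostNames, 1000)` and `randomSubset(instanceRefs, 1000)` would then cover both type spaces with a single function.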

zmerlynn assigned thockin and unassigned mikedanese May 10, 2016
zmerlynn force-pushed the random_max_target_pools branch from b99b8f2 to 9ee563b May 10, 2016 00:17
@zmerlynn

PTAL. Pulled shuffler out to util, added tests for both paths.

k8s-github-robot added the size/L label and removed the size/M label May 10, 2016
},
}
for _, tc := range tests {
rand.Seed(5)
Member

comment on this seed value and the "want" values above?

Member Author

Done. I used prose to comment the test blocks, hopefully that was good.
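
A hedged illustration of why the seed matters for the "want" values: with a fixed seed, the random selection is reproducible, so a test can compare against fixed expectations. The sketch below uses a local `*rand.Rand` rather than the `rand.Seed(5)` call in the diff, because the package-level `rand.Seed` is deprecated in newer Go releases. `pickSeeded` and `sameHosts` are hypothetical names.

```go
package main

import "math/rand"

// pickSeeded is a hypothetical helper: it selects up to max hosts
// using a private generator seeded with seed, so identical inputs
// always produce an identical selection.
func pickSeeded(seed int64, hosts []string, max int) []string {
	if max < 0 {
		max = 0
	}
	if len(hosts) <= max {
		return hosts
	}
	r := rand.New(rand.NewSource(seed))
	picked := make([]string, 0, max)
	for _, i := range r.Perm(len(hosts))[:max] {
		picked = append(picked, hosts[i])
	}
	return picked
}

// sameHosts reports whether a and b hold the same hosts in the same order.
func sameHosts(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}
```

Two calls with the same seed return the same hosts in the same order, which is what lets a table-driven test pin its expected values.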

thockin commented May 10, 2016

LGTM just a couple nits. Please self-LGTM when done

zmerlynn force-pushed the random_max_target_pools branch from 9ee563b to faf0c44 May 10, 2016 04:45
zmerlynn added the lgtm and priority/backlog labels May 10, 2016
wojtek-t added the priority/critical-urgent label and removed the priority/backlog label May 10, 2016
@wojtek-t

Thanks for taking care of that @zmerlynn ! (and thanks for quick review @thockin )

@k8s-github-robot

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

thockin commented May 10, 2016

LGTM

k8s-bot commented May 10, 2016

GCE e2e build/test passed for commit faf0c44.

@k8s-github-robot

Automatic merge from submit-queue

k8s-github-robot merged commit a57876b into kubernetes:master May 10, 2016
zmerlynn added a commit to zmerlynn/kubernetes that referenced this pull request Jun 22, 2016
…rest

Tested with 2000 nodes, this actually meets the GCE API specifications
(which is nutty). Previous PR (kubernetes#25178) was based on a mistaken
understanding of a poorly documented set of limitations, and even
poorer testing, for which I am embarrassed.
k8s-github-robot pushed a commit that referenced this pull request Jun 22, 2016
Automatic merge from submit-queue

GCE provider: Create TargetPool with 200 instances, then update with rest

Tested with 2000 nodes, this actually meets the GCE API specifications (which is nutty). Previous PR (#25178) was based on a mistaken understanding of a poorly documented set of limitations, and even poorer testing, for which I am embarrassed.

Also includes the revert of #25178 (review commits separately).
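
A rough sketch of the create-then-update flow named in the follow-up commit title, under stated assumptions. The figure 200 comes from that title; everything else is hypothetical: the names `maxInstancesOnCreate`, `splitForCreate`, `targetPoolCalls`, and `createThenUpdate`, and the assumption that all remaining instances are added in a single follow-up call. The function fields stand in for whatever GCE API calls the real provider makes.

```go
package main

// maxInstancesOnCreate is a hypothetical name; the value 200 is taken
// from the follow-up commit title.
const maxInstancesOnCreate = 200

// targetPoolCalls holds stand-ins for the real API calls (assumed shape).
type targetPoolCalls struct {
	create func(instances []string) error
	add    func(instances []string) error
}

// splitForCreate returns the hosts to send with the create call and
// the remainder to add afterwards.
func splitForCreate(hosts []string, createMax int) (initial, rest []string) {
	if createMax < 0 {
		createMax = 0
	}
	if len(hosts) <= createMax {
		return hosts, nil
	}
	return hosts[:createMax], hosts[createMax:]
}

// createThenUpdate creates the pool with the first batch, then adds
// the remaining hosts in one follow-up call (an assumption of this sketch).
func createThenUpdate(calls targetPoolCalls, hosts []string) error {
	initial, rest := splitForCreate(hosts, maxInstancesOnCreate)
	if err := calls.create(initial); err != nil {
		return err
	}
	if len(rest) == 0 {
		return nil
	}
	return calls.add(rest)
}
```

Unlike the random-subset approach in #25178, this keeps every node in the pool; only the order of API calls changes.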

shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this pull request Dec 1, 2016
shouhong pushed a commit to shouhong/kubernetes that referenced this pull request Feb 14, 2017
openshift-publish-robot pushed a commit to openshift/kubernetes that referenced this pull request Aug 6, 2020
…herry-pick-25153-to-release-4.5

[release-4.5] Bug 1849340: 4.6: UPSTREAM: 91748: FieldManager: Reset if we receive nil or a list with one

Origin-commit: 002a51f51065024cddd18d373bf7f80aa30a36de