-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
additional backoff in azure cloudprovider #48967
additional backoff in azure cloudprovider #48967
Conversation
EnsureHostInPool() submits a GET to azure API for VM info. We’re seeing this on agent node kubelets and would like to enable configurable backoff engagement for 4xx responses to be able to slow down the rate of reconciliation, when appropriate.
Hi @jackfrancis. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/sig azure |
/assign @brendandburns @colemickens |
/cc @colemickens |
@jdumars: GitHub didn't allow me to request PR reviews from the following users: brendanburns. Note that only kubernetes members can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc @brendandburns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit, makes sense otherwise, lgtm
glog.Errorf("backoff: failure, will retry,err=%v", retryErr) | ||
return false, nil | ||
} | ||
glog.V(2).Infof("backoff: success") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe a little lower, we don't log much at all at 2 elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! You mean I have to bust out the --v
tool?
/lgtm but I strongly suggest adding this backoff to the call in Instances as well. |
also rate limiting the call to az.getVirtualMachine inside az.getIPForMachine
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: brendandburns, colemickens, jackfrancis Associated issue: 48971 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue (batch tested with PRs 49218, 48253, 48967, 48460, 49230) |
Fixes #48971
What this PR does / why we need it:
We want to be able to opt in to backoff retry logic for kubelet-originating request behavior: node IP address resolution and node load balancer pool membership enforcement.
Special notes for your reviewer:
The use-case for this is azure cloudprovider clusters with large node counts, especially during cluster installation, or other scenarios when lots of nodes come online at once and attempt to register all resources with the backend API. To allow clusters at scale more control over the API request rate in-cluster, backoff config has the ability to meaningful slow down this rate, when appropriate.
Release note: