Support specifying custom LB retry period from cloud provider #94021
Conversation
Hi @timoreimann. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Force-pushed from e802d2d to adceb65.
Two resolved review threads (now outdated) on staging/src/k8s.io/cloud-provider/controllers/service/controller.go.
/ok-to-test
@andrewsykim @MrHohn does this proposal warrant a KEP, or would it suffice to move forward with the PR? In the latter case, I'd like to invest additional time to update the tests and bring the code into a mergeable state.
I don't think we need a KEP for this, let's get the tests added and try to merge this for v1.20.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
/remove-lifecycle stale
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to its inactivity rules. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
This change allows cloud providers to specify a custom retry period by returning a RetryError. The purpose is to bypass the work queue-driven exponential backoff algorithm when there is no need to back off. Specifically, this can be the case when a cloud load balancer operation such as a create or delete is still pending and the cloud API should be polled for completion at a constant interval. A backoff algorithm would not always be reasonable to apply here since there is no API or performance degradation warranting an increasing wait time between API requests.
Force-pushed from 3aec5f5 to 0fcf42f.
/retest
// fixed duration (as opposed to backing off exponentially).
type RetryError struct {
	msg        string
	retryAfter time.Duration
}
what should be the interpretation of 0?
There'd be no special interpretation. Instead, the retry would be immediate (see also where the value is used).
The need to retry right away may be uncommon or even rare, but I personally wouldn't want to disallow it. Maybe a user's network is very slow, or there are already some natural / drive-by delays that don't warrant another extra wait at the client side?
I think a zero delay can be legitimate, but let me know if you think differently.
I think a zero delay can be legitimate, but let me know if you think differently.
I think it's ok, I just wanted to double-check we all have the same interpretation.
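To make the zero-delay point concrete, here is a self-contained sketch of the kind of dispatch being discussed. The Error and RetryAfter methods, the requeue helper, and the work-queue wiring are assumptions for illustration, not the PR's code verbatim; the sketch shows that AddAfter with a zero duration enqueues the key right away, i.e. an immediate retry.

package main

import (
	"errors"
	"fmt"
	"time"

	"k8s.io/client-go/util/workqueue"
)

// RetryError mirrors the fields shown in the diff above; the Error and
// RetryAfter methods are assumed here for illustration.
type RetryError struct {
	msg        string
	retryAfter time.Duration
}

func (re *RetryError) Error() string             { return re.msg }
func (re *RetryError) RetryAfter() time.Duration { return re.retryAfter }

// requeue dispatches on the error type: a RetryError re-enqueues the key
// after its fixed duration (zero means immediately), anything else falls
// back to the work queue's exponential backoff.
func requeue(queue workqueue.RateLimitingInterface, key string, err error) {
	var re *RetryError
	if errors.As(err, &re) {
		queue.AddAfter(key, re.RetryAfter()) // 0 => retried right away
		return
	}
	queue.AddRateLimited(key)
}

func main() {
	q := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	requeue(q, "default/my-service", &RetryError{msg: "LB still pending", retryAfter: 0})
	item, _ := q.Get()
	fmt.Println("got:", item) // available immediately because retryAfter was zero
}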
@@ -2281,3 +2362,66 @@ func (l *fakeNodeLister) Get(name string) (*v1.Node, error) {
	}
	return nil, nil
}

type fakeServiceLister struct {
You don't need to create these mocks, you can use a cache:

serviceCache := cache.NewIndexer(cache.MetaNamespaceKeyFunc, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})
serviceLister := v1listers.NewServiceLister(serviceCache)
for i := range test.services {
	if err := serviceCache.Add(test.services[i]); err != nil {
		t.Fatalf("%s unexpected service add error: %v", test.name, err)
	}
}
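For context, a minimal standalone version of that pattern might look like the following; the package, the example Service object, and the main function are purely illustrative assumptions, while the cache and lister calls are the same ones used in the suggestion above.

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	v1listers "k8s.io/client-go/listers/core/v1"
	"k8s.io/client-go/tools/cache"
)

func main() {
	// Indexer-backed cache plus a real lister, instead of a hand-written mock.
	serviceCache := cache.NewIndexer(cache.MetaNamespaceKeyFunc,
		cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})
	serviceLister := v1listers.NewServiceLister(serviceCache)

	svc := &v1.Service{ObjectMeta: metav1.ObjectMeta{Name: "lb-svc", Namespace: "default"}}
	if err := serviceCache.Add(svc); err != nil {
		panic(err)
	}

	// The lister serves the cached object just as it would in the controller.
	got, err := serviceLister.Services("default").Get("lb-svc")
	fmt.Println(got.Name, err)
}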
That's much better -- adjusted, thanks!
@aojea PTAL.
Force-pushed from b0ebdc0 to 2ad2c15.
(Updated the copyright year to 2023 real quick.)
@timoreimann: The following tests failed; say /retest to rerun them. Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/retest
/lgtm
Thanks
LGTM label has been added. Git tree hash: c97695cb43c46f39dd3542af7bbf6e9f964c05d9
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: andrewsykim, aojea, timoreimann. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.
What type of PR is this?
/kind feature
What this PR does / why we need it:
This change allows cloud providers to specify a custom retry period by returning a RetryError. The purpose is to bypass the work queue-driven exponential backoff algorithm when there is no need to back off. Specifically, this can be the case when a cloud load balancer operation such as a create or delete is still pending and the cloud API should be polled for completion at a constant interval. A backoff algorithm would not always be reasonable to apply here since there is no API or performance degradation warranting an increasing wait time between API requests.
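To make the intended usage concrete, here is a hypothetical provider-side sketch. The lbClient interface, the provider type, the "pending" state check, and the NewRetryError constructor (including its import path) are assumptions for illustration; only the RetryError fields shown in the diff are taken from the PR itself.

package example

import (
	"context"
	"time"

	v1 "k8s.io/api/core/v1"
	cloudproviderapi "k8s.io/cloud-provider/api" // assumed home of RetryError/NewRetryError
)

// lbClient stands in for a real cloud load balancer API.
type lbClient interface {
	EnsureLB(ctx context.Context, svc *v1.Service, nodes []*v1.Node) (state string, status *v1.LoadBalancerStatus, err error)
}

type provider struct {
	client lbClient
}

// EnsureLoadBalancer follows the cloudprovider.LoadBalancer method signature.
// While the load balancer is still provisioning, it returns a RetryError so
// the service controller polls again at a constant interval instead of
// applying exponential backoff.
func (p *provider) EnsureLoadBalancer(ctx context.Context, clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error) {
	state, status, err := p.client.EnsureLB(ctx, service, nodes)
	if err != nil {
		return nil, err
	}
	if state == "pending" {
		// Constructor name and package path are assumptions for this sketch.
		return nil, cloudproviderapi.NewRetryError("load balancer is still provisioning", 15*time.Second)
	}
	return status, nil
}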
Which issue(s) this PR fixes:
Fixes #88902
Special notes for your reviewer:
For now, the PR is meant to provide a starting point for discussion, so I have not yet invested in adding tests. Once/if we have consensus on the general direction of this PR, I am going to complete the missing pieces. (Tests have been added.)
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
Not sure if needed. None so far.
/sig cloud-provider
/cc @andrewsykim