Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always retry connection reset error in webhook #53947

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions staging/src/k8s.io/apiserver/pkg/util/webhook/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ go_library(
"//vendor/k8s.io/apimachinery/pkg/runtime:go_default_library",
"//vendor/k8s.io/apimachinery/pkg/runtime/schema:go_default_library",
"//vendor/k8s.io/apimachinery/pkg/runtime/serializer:go_default_library",
"//vendor/k8s.io/apimachinery/pkg/util/net:go_default_library",
"//vendor/k8s.io/apimachinery/pkg/util/wait:go_default_library",
"//vendor/k8s.io/client-go/rest:go_default_library",
"//vendor/k8s.io/client-go/tools/clientcmd:go_default_library",
Expand Down
8 changes: 3 additions & 5 deletions staging/src/k8s.io/apiserver/pkg/util/webhook/webhook.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ import (
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/runtime/serializer"
runtimeserializer "k8s.io/apimachinery/pkg/runtime/serializer"
"k8s.io/apimachinery/pkg/util/net"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/rest"
"k8s.io/client-go/tools/clientcmd"
Expand Down Expand Up @@ -104,17 +105,14 @@ func WithExponentialBackoff(initialBackoff time.Duration, webhookFn func() error
var err error
wait.ExponentialBackoff(backoff, func() (bool, error) {
err = webhookFn()
// these errors indicate a need to retry an authentication check
if apierrors.IsServerTimeout(err) || apierrors.IsTimeout(err) || apierrors.IsTooManyRequests(err) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you remove IsServerTimeout?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's already checked in SuggestsClientDelay. I can return it, if you want the check to be more explicit

// these errors indicate a transient error that should be retried.
if net.IsConnectionReset(err) || apierrors.IsInternalError(err) || apierrors.IsTimeout(err) || apierrors.IsTooManyRequests(err) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't client-go already turn connection errors into internal server errors?

if err != nil {
// "Connection reset by peer" is usually a transient error.
// Thus in case of "GET" operations, we simply retry it.
// We are not automatically retrying "write" operations, as
// they are not idempotent.
if !net.IsConnectionReset(err) || r.verb != "GET" {
return err
}
// For the purpose of retry, we set the artificial "retry-after" response.
// TODO: Should we clean the original response if it exists?
resp = &http.Response{
StatusCode: http.StatusInternalServerError,
Header: http.Header{"Retry-After": []string{"1"}},
Body: ioutil.NopCloser(bytes.NewReader([]byte{})),
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

client-go only retries on GET calls because it doesn't want to replay non-idempotent calls. this code should not assume all webhook calls are idempotent, either. seems like it would be better to be able to indicate a request is idempotent

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if it's a GET, but e.g. request with audit logs is POST

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right... this shouldn't assume all webhook calls are idempotent

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing audit logs is not an idempotent call, but it's still better to write audit logs twice than loose, esp. considering that audit logs have ids in it, that allow deduplication

seems like it would be better to be able to indicate a request is idempotent

Is there really a need to do so? We still cannot guarantee that a request hasn't been processed if e.g. a timeout is returned. Additionally, for all known to me webhooks (auth[nz], audit) I see no benefit in not retrying connection resets

Copy link
Author

@crassirostris crassirostris Oct 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, if we really care about a request being idempotent in this case, we shouldn't unconditionally retry e.g. 5XX. However, nobody complained about this code before, so I think it's safe to assume that retrying is more important than idempotence, WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, if we really care about a request being idempotent in this case, we shouldn't unconditionally retry e.g. 5XX. However, nobody complained about this code before, so I think it's safe to assume that retrying is more important than idempotence, WDYT?

Hmm, that's probably true

return false, nil
}
// if the error sends the Retry-After header, we respect it as an explicit confirmation we should retry.
if _, shouldRetry := apierrors.SuggestsClientDelay(err); shouldRetry {
return false, nil
}
if apierrors.IsInternalError(err) {
return false, nil
}
if err != nil {
return false, err
}
Expand Down