Fix logic of consecutive DELETE calls when gracefull deletion is enabled #27539
Conversation
Failures are strange:
I.e. deletion timestamp was correct, but for some reason we didn't delete anything.
Please add a test case (since I'm pretty sure this is a regression in 1.3).
@@ -69,14 +69,13 @@ func BeforeDelete(strategy RESTDeleteStrategy, ctx api.Context, obj runtime.Obje
// only a shorter grace period may be provided by a user
if options.GracePeriodSeconds != nil {
period := int64(*options.GracePeriodSeconds)
if period > *objectMeta.DeletionGracePeriodSeconds {
return false, true, nil
newDeletionTimestamp := unversioned.NewTime(unversioned.Now().Add(time.Second * time.Duration(*options.GracePeriodSeconds)))
Wait. How is it possible to extend?
Setting grace period seconds should not reset the deletion timestamp.
It should always be possible to shorten the grace period seconds, regardless of where the timestamp is. I.e., a user who does:
DELETE 30s at t1
DELETE 15s at t2
should end up with a resource whose grace period is 15s and whose deletion timestamp is t1 + 30s - 15s.
The logic I would expect here is:
if period > *objectMeta.DeletionGracePeriodSeconds {
	// only a shorter grace period may be applied; ignore anything longer
	return
}
// how many seconds sooner the deletion should now happen
delta := *objectMeta.DeletionGracePeriodSeconds - period
objectMeta.DeletionGracePeriodSeconds = &period
if objectMeta.DeletionTimestamp == nil {
	// set deletion timestamp to now + grace period
	return
}
// move the existing timestamp earlier by delta; never push it out
t := unversioned.NewTime(objectMeta.DeletionTimestamp.Add(-time.Duration(delta) * time.Second))
objectMeta.DeletionTimestamp = &t
Resetting the timestamp should only be done based on the delta, not on a new timestamp.
We've discussed this with @wojtek-t and basically there are two ways of seeing it. Mine was the same as yours (that the second delete is basically an update of the grace period). Wojtek's was that it's a second, completely independent delete, and we should proceed with whichever one will happen sooner. After some consideration I think the latter is better.
That's not the guarantee that we designed the graceful deletion path for, so I'm not in favor of making a change like that right now without a longer discussion. The shorter grace period duration must be recorded because an external client may not observe the first request before starting the second. The timestamp is advisory - the grace period is authoritative.
To make it crystal clear: you're saying that if we have a deletion at t1 with period p1, and a deletion at t2 < t1 + p1 with p2 < t2 - t1, we should delete the object instantly. Right?
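(For illustration, a minimal runnable sketch of that arithmetic under the delta-based shortening described above; the concrete numbers and variable names are made up, not from the PR.)

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// First DELETE at t1 asks for a 30s grace period.
	t1 := time.Now()
	p1 := 30 * time.Second
	deadline := t1.Add(p1) // recorded deletion timestamp: t1 + p1

	// Second DELETE arrives 20s later (t2) asking for 5s.
	t2 := t1.Add(20 * time.Second)
	p2 := 5 * time.Second

	// Delta-based shortening: move the deadline earlier by p1 - p2,
	// which yields t1 + p2.
	deadline = deadline.Add(-(p1 - p2))

	// p2 < t2 - t1, so the shortened deadline is already in the past
	// at the time of the second request: the object goes away instantly.
	fmt.Println(deadline.Before(t2)) // true
}
```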
There are two cases for a user invoking two deletes in order (30s and 15s).
- The owner of graceful deletion (in this case, for a pod, the kubelet) sees the first request, acts on it, and then sees the second
- The owner of graceful deletion sees both requests at the same time (or within the same interval)
In the first case, the graceful deleter should trigger a 30s delete; then, when it sees the shorter interval, it needs to make a decision in its own timeframe (in the causal sense) about whether to terminate or shorten. In the second case, the initial deletion would simply be 15s.
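(For illustration only: a hypothetical sketch of the decision a graceful deleter could make in its own timeframe when it observes a shorter grace period. The package, function name, and signature are invented here, not taken from the kubelet code.)

```go
package graceful

import "time"

// decideOnShorterGrace is a hypothetical helper: given the deadline the
// deleter is already working toward and a newly observed, shorter grace
// period, it returns the deadline to act on and whether to terminate now.
func decideOnShorterGrace(now, currentDeadline time.Time, newGrace time.Duration) (time.Time, bool) {
	newDeadline := now.Add(newGrace)
	if newDeadline.After(currentDeadline) {
		// Never extend: keep the deadline we already committed to.
		return currentDeadline, false
	}
	if !newDeadline.After(now) {
		// The shortened deadline has already passed: terminate immediately.
		return now, true
	}
	// Otherwise just shorten the remaining grace period.
	return newDeadline, false
}
```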
I don't think it's a regression - judging by the git blame it has always been this way...
Needs a test.
It's a regression in intent (introduced by me) :). The change in 02dbb95 that let the DeletionTimestamp get pushed out was incorrect.
OK, I can't argue with that definition of regression :). I'll add a test once we agree on semantics.
I think the value of the DeletionTimestamp is not that critical; the behavior of the kubelet (and I believe other components as well) is driven by the grace period, not by the DeletionTimestamp. I think our components only check whether DeletionTimestamp is nil. @smarterclayton did I get that right?
@caesarxuchao Yes - DeletionTimestamp is advisory for naive clients, but not for the owners of those components (since clients are not expected to share a common timeline). All critical components depend on DeletionGracePeriodSeconds as it applies to their own timeline (like the kubelet and the namespace controller).
So no matter what, we always need to record changes to DeletionGracePeriodSeconds, and the timestamp should be best effort based on what we know at the current time.
So the description of this issue is not correct - I don't see how you can infinitely extend. Why is this a P1?
My impression was that the deletion timestamp was the source of truth. If that's not the case, then the situation I observed was caused by something else. The thing that made me think so was that I was unable to see the kubelet deleting a pod, while at the same time the deletion timestamp was being extended infinitely.
There should be no code in the Kubernetes repo that depends on the value of DeletionTimestamp.
You know which word worries me, right?
Doing a quick search:
I think we're good. |
Commits updated from 7f8b4d6 to 6102d2f.
Added a test and changed the logic according to what @smarterclayton wrote.
Thanks, LGTM. Do we want to risk this for 1.3 or leave it as is? I'd be inclined to punt it.
@k8s-bot test this issue: #IGNORE
GCE e2e build/test passed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.
What's the call here? In or out for 1.3?
My preference is out - the bug it fixes is not material to the operation of a 1.3 system.
Removing the v1.3 milestone - speak up if you disagree.
@lavalamp PTAL as master is open now.
Changes LGTM
GCE e2e build/test passed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.
GCE e2e build/test failed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af. Please reference the list of currently known flakes when examining this failure. If you request a re-test, you must reference the issue describing the flake.
@k8s-bot e2e test this issue: #IGNORE
GCE e2e build/test passed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.
GCE e2e build/test failed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af. Please reference the list of currently known flakes when examining this failure. If you request a re-test, you must reference the issue describing the flake.
No-op rebase. Reapplying LGTM.
GCE e2e build/test passed for commit 2a5914d.
Automatic merge from submit-queue |
Previously it was possible to extend the deletion timestamp indefinitely by issuing repeated DELETE calls. This PR prevents that. Ref. discussion in #27422
cc @wojtek-t @smarterclayton @lavalamp @piosz @dchen1107
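(For illustration, a minimal sketch of the intended semantics after this fix: repeated DELETE calls may only shorten the grace period, never push the deadline out. This is not the actual test added in the PR; the package, type, and helper names here are hypothetical.)

```go
package graceful

import (
	"testing"
	"time"
)

// object is a hypothetical stand-in for ObjectMeta: just the two fields
// the graceful-deletion logic cares about.
type object struct {
	deletionGracePeriodSeconds *int64
	deletionTimestamp          *time.Time
}

// requestDelete applies a DELETE with the given grace period, following the
// delta-based semantics discussed above: the grace period may only shrink,
// and the timestamp may only move earlier.
func requestDelete(o *object, graceSeconds int64, now time.Time) {
	if o.deletionGracePeriodSeconds == nil {
		ts := now.Add(time.Duration(graceSeconds) * time.Second)
		o.deletionGracePeriodSeconds = &graceSeconds
		o.deletionTimestamp = &ts
		return
	}
	if graceSeconds >= *o.deletionGracePeriodSeconds {
		return // never extend
	}
	delta := *o.deletionGracePeriodSeconds - graceSeconds
	ts := o.deletionTimestamp.Add(-time.Duration(delta) * time.Second)
	o.deletionGracePeriodSeconds = &graceSeconds
	o.deletionTimestamp = &ts
}

func TestConsecutiveDeletesNeverExtend(t *testing.T) {
	t1 := time.Now()
	o := &object{}

	requestDelete(o, 30, t1)
	first := *o.deletionTimestamp

	// A second DELETE with a longer grace period must be a no-op.
	requestDelete(o, 60, t1.Add(5*time.Second))
	if o.deletionTimestamp.After(first) {
		t.Fatalf("deadline was extended: %v > %v", o.deletionTimestamp, first)
	}

	// A second DELETE with a shorter grace period moves the deadline earlier.
	requestDelete(o, 15, t1.Add(5*time.Second))
	if want := first.Add(-15 * time.Second); !o.deletionTimestamp.Equal(want) {
		t.Fatalf("got %v, want %v", o.deletionTimestamp, want)
	}
}
```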