
Fix logic of consecutive DELETE calls when graceful deletion is enabled #27539

Merged: 1 commit merged into kubernetes:master from the graceful branch on Jul 5, 2016

Conversation

@gmarek (Contributor) commented Jun 16, 2016

Previously it was possible to extend the deletion timestamp indefinitely by issuing repeated DELETE calls. This PR prevents that. Ref. the discussion in #27422.

cc @wojtek-t @smarterclayton @lavalamp @piosz @dchen1107
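
For illustration, a minimal, self-contained Go sketch of the failure mode described above, assuming the pre-fix behavior of recomputing the deadline from the current time on every DELETE (see the old newDeletionTimestamp line in the diff below); the variable names and the loop here are invented for this sketch and are not the API server code:

package main

import (
	"fmt"
	"time"
)

func main() {
	grace := 30 * time.Second

	// First DELETE: deadline = now + grace period.
	deadline := time.Now().Add(grace)

	// Sketch of the old behavior: each repeated DELETE with an allowed grace
	// period recomputed the deadline from "now", so the advisory deadline kept
	// moving later instead of staying fixed.
	for i := 0; i < 3; i++ {
		time.Sleep(50 * time.Millisecond) // stand-in for time passing between calls
		deadline = time.Now().Add(grace)
	}

	fmt.Println("deadline is still", time.Until(deadline).Round(time.Second), "away after repeated DELETEs")
}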

@gmarek added the priority/important-soon label Jun 16, 2016
@gmarek added this to the v1.3 milestone Jun 16, 2016
@gmarek added the kind/bug and release-note-none labels Jun 16, 2016
@k8s-github-robot added the size/S label Jun 16, 2016
@gmarek (Contributor, Author) commented Jun 16, 2016

Failures are strange:

Jun 16 07:53:42.596: INFO: Pod e2e-tests-job-pl5jn scale-down-6fwje on node e2e-gce-agent-pr-13-0-minion-group-8ljf remains, has deletion timestamp 2016-06-16T07:48:41-07:00
Jun 16 07:53:42.596: INFO: Pod e2e-tests-job-pl5jn scale-down-min8x on node e2e-gce-agent-pr-13-0-minion-group-vefa remains, has deletion timestamp 2016-06-16T07:49:34-07:00

I.e., the deletion timestamp was correct, but for some reason we didn't delete anything.

@smarterclayton (Contributor):

Please add a test case (since I'm pretty sure this is a regression in 1.3).

@@ -69,14 +69,13 @@ func BeforeDelete(strategy RESTDeleteStrategy, ctx api.Context, obj runtime.Object
	// only a shorter grace period may be provided by a user
	if options.GracePeriodSeconds != nil {
		period := int64(*options.GracePeriodSeconds)
		if period > *objectMeta.DeletionGracePeriodSeconds {
			return false, true, nil
		}
		newDeletionTimestamp := unversioned.NewTime(unversioned.Now().Add(time.Second * time.Duration(*options.GracePeriodSeconds)))

Review comment (Contributor):

Wait. How is it possible to extend?

Review comment (Contributor):

Setting grace period seconds should not reset the deletion timestamp.

Review comment from @smarterclayton (Contributor), Jun 16, 2016:

It should always be possible to shorten the grace period seconds, regardless of where the timestamp is. I.e., a user who does:

DELETE 30s at t1
DELETE 15s at t2

should end up with a resource whose grace period is 15s and whose deletion timestamp is t1 + 30s - 15s.

Logic I would expect here is:

if period > *objectMeta.DeletionGracePeriodSeconds {
  return
}
delta := time.Duration(*objectMeta.DeletionGracePeriodSeconds-period) * time.Second
objectMeta.DeletionGracePeriodSeconds = &period
if objectMeta.DeletionTimestamp == nil {
  // set deletion timestamp to now + grace period
  return
}
newDeletionTimestamp := unversioned.NewTime(objectMeta.DeletionTimestamp.Add(-delta))
objectMeta.DeletionTimestamp = &newDeletionTimestamp

Review comment (Contributor):

Resetting the timestamp should only be done based on the delta, not on a new timestamp.

Review comment (Contributor, PR author):

We've discussed this with @wojtek-t, and basically there are two ways of seeing it. Mine was the same as yours (that the second delete is basically an update of the grace period). Wojtek's was that it's a second, completely independent delete, and we should proceed with the one that will happen sooner. After some consideration I think the latter is better.

Review comment (Contributor):

That's not the guarantee that we designed the graceful deletion path for, so I'm not in favor of making a change like that right now without a longer discussion. The shorter grace period duration must be recorded because an external client may not observe the first request before starting the second. The timestamp is advisory - the grace period is authoritative.

Review comment (Contributor, PR author):

To make it crystal clear: you're saying that if we have a deletion at t1 with period p1, and a deletion at t2 < t1 + p1 with p2 < t2 - t1, we should delete the object instantly. Right?

Review comment (Contributor):

There are two cases for a user invoking two deletes in order (30s and 15s).

  1. The owner of graceful deletion (in this case, for a pod, the kubelet) sees the first request, acts on it, then sees the second.
  2. The owner of graceful deletion sees both requests at the same time (or within the same interval).

For the first case, the graceful deleter should trigger a 30s delete; then, when it sees the shorter interval, it needs to make a decision in its own timeframe (in the causal sense) about whether to terminate or to shorten. In the second case, the initial deletion would simply be 15s.
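
As a hedged illustration of the two cases above, here is a small, self-contained Go sketch of an owner of graceful deletion keying off the observed DeletionGracePeriodSeconds on its own clock rather than off the advisory timestamp; the podUpdate and gracefulDeleter types are invented for this sketch and are not kubelet APIs:

package main

import (
	"fmt"
	"time"
)

type podUpdate struct {
	Name                       string
	DeletionGracePeriodSeconds *int64
}

type gracefulDeleter struct {
	deadlines map[string]time.Time // per-pod deadline on the owner's own clock
}

// observe handles one observed update. A later update carrying a shorter grace
// period only ever pulls the local deadline earlier, never later.
func (d *gracefulDeleter) observe(u podUpdate, now time.Time) {
	if u.DeletionGracePeriodSeconds == nil {
		return
	}
	proposed := now.Add(time.Duration(*u.DeletionGracePeriodSeconds) * time.Second)
	if current, ok := d.deadlines[u.Name]; !ok || proposed.Before(current) {
		d.deadlines[u.Name] = proposed
	}
}

func main() {
	d := &gracefulDeleter{deadlines: map[string]time.Time{}}
	t0 := time.Now()
	thirty, fifteen := int64(30), int64(15)

	// Case 1: the owner sees the 30s request first, then the 15s one 2s later.
	d.observe(podUpdate{Name: "pod-a", DeletionGracePeriodSeconds: &thirty}, t0)
	d.observe(podUpdate{Name: "pod-a", DeletionGracePeriodSeconds: &fifteen}, t0.Add(2*time.Second))

	// Case 2: the owner sees only the final state, already carrying 15s.
	d.observe(podUpdate{Name: "pod-b", DeletionGracePeriodSeconds: &fifteen}, t0)

	fmt.Println(d.deadlines["pod-a"].Sub(t0)) // 17s: shortened on the owner's own timeline
	fmt.Println(d.deadlines["pod-b"].Sub(t0)) // 15s
}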

@gmarek (Contributor, Author) commented Jun 16, 2016

I don't think it's a regression; judging by git blame, it has always been this way...

@lavalamp (Member):

Needs a test.

@smarterclayton (Contributor):

It's a regression in intent (introduced by me) :). The change in 02dbb95 to let the DeletionTimestamp get pushed was incorrect.

@gmarek (Contributor, Author) commented Jun 16, 2016

OK, I can't argue with this definition of regression :). I'll add a test once we agree on the semantics.

@caesarxuchao (Member):

I think the value of DeletionTimestamp is not that critical; the behavior of the kubelet (and, I believe, of other components as well) is driven by the grace period, not by DeletionTimestamp. I think our components only check whether DeletionTimestamp is nil. @smarterclayton, did I get that right?

@smarterclayton (Contributor) commented Jun 16, 2016

@caesarxuchao Yes - DeletionTimestamp is advisory for naive clients, but not for the owners of those components (since clients are not expected to share a common timeline). All critical components depend on DeletionGracePeriodSeconds as it applies to their own timeline (like the kubelet and the namespace controller).

@smarterclayton (Contributor):

So no matter what, we always need to record changes to DeletionGracePeriodSeconds, and the Timestamp should be best effort, based on what we know at the current time.
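
For concreteness, a hedged, self-contained Go sketch of that rule on the API-server side: always record the shorter grace period, and only pull the advisory timestamp earlier by the delta. This is the review pseudocode above made runnable, not the PR's actual change; objectMeta here uses plain time.Time instead of unversioned.Time, and applyGracePeriod is an invented stand-in for the logic in pkg/api/rest/delete.go:

package main

import (
	"fmt"
	"time"
)

type objectMeta struct {
	DeletionGracePeriodSeconds *int64
	DeletionTimestamp          *time.Time
}

// applyGracePeriod records the requested grace period. It only ever shortens
// the recorded period, and it moves DeletionTimestamp earlier by the delta
// rather than recomputing it from the current time.
func applyGracePeriod(meta *objectMeta, requested int64, now time.Time) {
	if meta.DeletionGracePeriodSeconds == nil || meta.DeletionTimestamp == nil {
		// First DELETE: record the period and set deadline = now + period.
		meta.DeletionGracePeriodSeconds = &requested
		ts := now.Add(time.Duration(requested) * time.Second)
		meta.DeletionTimestamp = &ts
		return
	}
	if requested >= *meta.DeletionGracePeriodSeconds {
		// A repeated DELETE can never extend the recorded grace period.
		return
	}
	delta := time.Duration(*meta.DeletionGracePeriodSeconds-requested) * time.Second
	ts := meta.DeletionTimestamp.Add(-delta) // pull the advisory deadline earlier
	meta.DeletionTimestamp = &ts
	meta.DeletionGracePeriodSeconds = &requested
}

func main() {
	t1 := time.Date(2016, 6, 16, 7, 48, 0, 0, time.UTC)
	var meta objectMeta

	applyGracePeriod(&meta, 30, t1)                    // DELETE 30s at t1
	applyGracePeriod(&meta, 15, t1.Add(5*time.Second)) // DELETE 15s at t2 = t1 + 5s

	fmt.Println(*meta.DeletionGracePeriodSeconds)            // 15
	fmt.Println(meta.DeletionTimestamp.Format(time.RFC3339)) // t1 + 30s - 15s = t1 + 15s
}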

@smarterclayton (Contributor):

So the description of this issue is not correct - I don't see how you can infinitely extend. Why is this a P1?

@gmarek (Contributor, Author) commented Jun 16, 2016

My impression was that the deletion timestamp was the source of truth. If that's not the case, then the situation I observed was caused by something else. What made me think so was that I couldn't see the Kubelet deleting a pod, while at the same time the deletion timestamp was being extended indefinitely.

@smarterclayton (Contributor):

There should be no code in the Kubernetes repo that depends on DeletionTimestamp (other than printing it out). All components (node controller, namespace controller, and kubelet) use a local clock that begins when they observe the event.

@gmarek (Contributor, Author) commented Jun 16, 2016

You know which word worries me, right?

@smarterclayton (Contributor):

Doing a quick search:

$ grep -R ".DeletionTimestamp" --include=*.go --exclude=generated.pb.go --exclude=*_generated.go --exclude=types.generated.go --exclude=*_test.go * | grep -v "DeletionTimestamp != nil" | grep -v "DeletionTimestamp == nil"
pkg/api/meta/interfaces.go: GetDeletionTimestamp() *unversioned.Time
pkg/api/meta/interfaces.go: SetDeletionTimestamp(timestamp *unversioned.Time)
pkg/api/meta/meta.go:func (a genericAccessor) GetDeletionTimestamp() *unversioned.Time {
pkg/api/meta/meta.go:func (a genericAccessor) SetDeletionTimestamp(timestamp *unversioned.Time) {
pkg/api/meta.go:func (meta *ObjectMeta) GetDeletionTimestamp() *unversioned.Time { return meta.DeletionTimestamp }
pkg/api/meta.go:func (meta *ObjectMeta) SetDeletionTimestamp(deletionTimestamp *unversioned.Time) {
pkg/api/meta.go:    meta.DeletionTimestamp = deletionTimestamp
pkg/api/rest/create.go: objectMeta.DeletionTimestamp = nil
pkg/api/rest/delete.go:         objectMeta.DeletionTimestamp = &now
pkg/api/rest/delete.go: objectMeta.DeletionTimestamp = &now
pkg/api/types.go:   // DeletionTimestamp is the time after which this resource will be deleted. This
pkg/api/types.go:   DeletionTimestamp *unversioned.Time `json:"deletionTimestamp,omitempty"`
pkg/api/v1/types.go:    // DeletionTimestamp is RFC 3339 date and time at which this resource will be deleted. This
pkg/api/v1/types.go:    DeletionTimestamp *unversioned.Time `json:"deletionTimestamp,omitempty" protobuf:"bytes,9,opt,name=deletionTimestamp"`
pkg/api/validation/validation.go:       if !oldMeta.DeletionTimestamp.IsZero() {
pkg/api/validation/validation.go:           newMeta.DeletionTimestamp = oldMeta.DeletionTimestamp
pkg/api/validation/validation.go:       allErrs = append(allErrs, field.Invalid(fldPath.Child("deletionTimestamp"), newMeta.DeletionTimestamp, "field is immutable; may only be changed via deletion"))
pkg/api/validation/validation.go:   if newNamespace.DeletionTimestamp.IsZero() {
pkg/controller/controller_utils.go:// DeletionTimestamp on an object as a delete. To do so consistenly, one needs
pkg/controller/controller_utils.go:             p.Namespace, p.Name, p.Status.Phase, p.DeletionTimestamp)
pkg/controller/namespace/namespace_controller_utils.go: if namespace.DeletionTimestamp.IsZero() || namespace.Status.Phase == api.NamespaceTerminating {
pkg/controller/namespace/namespace_controller_utils.go: estimate, err := deleteAllContent(kubeClient, clientPool, opCache, groupVersionResources, namespace.Name, *namespace.DeletionTimestamp)
pkg/controller/petset/fakes.go:func (f *fakePetClient) setDeletionTimestamp(index int) error {
pkg/controller/petset/fakes.go: f.pets[index].pod.DeletionTimestamp = &unversioned.Time{Time: time.Now()}
pkg/controller/petset/pet.go:   glog.Infof("PetSet %v waiting on pet %v to die in %v", pet.parent.Name, realPet.pod.Name, realPet.pod.DeletionTimestamp)
pkg/controller/replicaset/replica_set.go:   glog.V(4).Infof("Pod %s/%s deleted through %v, timestamp %+v: %+v.", pod.Namespace, pod.Name, utilruntime.GetCaller(), pod.DeletionTimestamp, pod)
pkg/controller/replication/replication_controller.go:   glog.V(4).Infof("Pod %s/%s deleted through %v, timestamp %+v, labels %+v.", pod.Namespace, pod.Name, utilruntime.GetCaller(), pod.DeletionTimestamp, pod.Labels)
pkg/kubectl/describe.go:            fmt.Fprintf(out, "Status:\tTerminating (expires %s)\n", pod.DeletionTimestamp.Time.Format(time.RFC1123Z))
pkg/kubelet/config/config.go:       reflect.DeepEqual(existing.DeletionTimestamp, ref.DeletionTimestamp) &&
pkg/kubelet/config/config.go:   existing.DeletionTimestamp = ref.DeletionTimestamp
pkg/kubelet/kubelet.go:         // handle the work item. Check if the DeletionTimestamp has been
pkg/registry/generic/registry/store.go: accessor.SetDeletionTimestamp(nil)
pkg/registry/namespace/etcd/etcd.go:    if namespace.DeletionTimestamp.IsZero() {
pkg/registry/namespace/etcd/etcd.go:                if existingNamespace.DeletionTimestamp.IsZero() {
pkg/registry/namespace/etcd/etcd.go:                    existingNamespace.DeletionTimestamp = &now
pkg/registry/pod/strategy.go:   newPod.DeletionTimestamp = nil
pkg/runtime/types.go:func (u *Unstructured) GetDeletionTimestamp() *unversioned.Time {
pkg/runtime/types.go:func (u *Unstructured) SetDeletionTimestamp(timestamp *unversioned.Time) {

I think we're good.

@gmarek removed the priority/important-soon label Jun 17, 2016
@gmarek force-pushed the graceful branch 2 times, most recently from 7f8b4d6 to 6102d2f on June 17, 2016 08:17
@gmarek (Contributor, Author) commented Jun 17, 2016

Added test and changed the logic according to what @smarterclayton wrote.

@k8s-github-robot added the size/M label and removed the size/S label Jun 17, 2016
@smarterclayton (Contributor):

Thanks, LGTM. Do we want to risk this for 1.3 or leave it as is? I'd be inclined to punt it.

@gmarek (Contributor, Author) commented Jun 17, 2016

@k8s-bot test this issue: #IGNORE
no test results

@k8s-bot commented Jun 17, 2016

GCE e2e build/test passed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.

@goltermann (Contributor):

What's the call here? In or out for 1.3?

@smarterclayton (Contributor):

My preference is out - the bug it fixes is not material to the operation of a 1.3 system.

@goltermann (Contributor):

Removing v1.3 milestone - speak up if you disagree

@goltermann modified the milestones: next-candidate, v1.3 Jun 20, 2016
@gmarek (Contributor, Author) commented Jul 4, 2016

@lavalamp PTAL, as master is open now.

@smarterclayton (Contributor):

Changes LGTM

@smarterclayton added the lgtm label Jul 5, 2016
@k8s-bot commented Jul 5, 2016

GCE e2e build/test passed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.

@k8s-bot commented Jul 5, 2016

GCE e2e build/test failed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.

Please reference the list of currently known flakes when examining this failure. If you request a re-test, you must reference the issue describing the flake.

@gmarek (Contributor, Author) commented Jul 5, 2016

@k8s-bot e2e test this issue: #IGNORE
github query failed.

@k8s-bot commented Jul 5, 2016

GCE e2e build/test passed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.

@k8s-bot commented Jul 5, 2016

GCE e2e build/test failed for commit 6102d2f27e61647946d4f55a8354f1f7bf3c08af.

Please reference the list of currently known flakes when examining this failure. If you request a re-test, you must reference the issue describing the flake.

@gmarek (Contributor, Author) commented Jul 5, 2016

No-op rebase. Reapplying lgtm.

@gmarek removed and re-added the lgtm label Jul 5, 2016
@k8s-bot commented Jul 5, 2016

GCE e2e build/test passed for commit 2a5914d.

@k8s-github-robot:

Automatic merge from submit-queue

@k8s-github-robot merged commit b37a5de into kubernetes:master Jul 5, 2016
@gmarek deleted the graceful branch August 30, 2016 09:48