Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

c8d: Make sure the content isn't removed while we export #45963

Merged
merged 1 commit into from
Jul 19, 2023

Conversation

rumpl
Copy link
Member

@rumpl rumpl commented Jul 13, 2023

- What I did

This change adds leases for all the content that will be exported, once the image(s) are exported the lease is removed, thus letting containerd's GC to do its job if needed. This fixes the case where someone would remove an image that is still being exported.

This fixes the TestAPIImagesSaveAndLoad cli integration test.

- How I did it

  • Removed the creation of the lease on the context, it didn't do anything, a lease attached to a context is only added to the things that are created during the life of that context
  • We walk all the content that will be exported and add a lease to it, once finished we remove the lease.

- How to verify it

Run make DOCKER_GRAPHDRIVER=overlayfs TEST_FILTER=TestAPIImagesSaveAndLoad TEST_INTEGRATION_USE_SNAPSHOTTER=1 test-integration, the test should now pass.

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah thaJeztah added this to the 25.0.0 milestone Jul 13, 2023
@rumpl
Copy link
Member Author

rumpl commented Jul 13, 2023

@thaJeztah I see you added the 25.0.0 milestone, which is fine but this change is not essential to the 25 release

@rumpl rumpl force-pushed the c8d-image-save-lease branch from a6f6128 to 2591d72 Compare July 13, 2023 17:14
@rumpl rumpl force-pushed the c8d-image-save-lease branch 2 times, most recently from 1757030 to 4fe7bb3 Compare July 13, 2023 17:51
daemon/containerd/image_exporter.go Outdated Show resolved Hide resolved
daemon/containerd/image_exporter.go Outdated Show resolved Hide resolved
@rumpl rumpl force-pushed the c8d-image-save-lease branch from 4fe7bb3 to 54f3f47 Compare July 17, 2023 08:24
daemon/containerd/image_exporter.go Outdated Show resolved Hide resolved
daemon/containerd/image_exporter.go Outdated Show resolved Hide resolved
daemon/containerd/image_exporter.go Outdated Show resolved Hide resolved
@rumpl rumpl force-pushed the c8d-image-save-lease branch 3 times, most recently from 467cff5 to ab81658 Compare July 17, 2023 10:07
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@rumpl rumpl requested a review from thaJeztah July 17, 2023 19:21
Copy link
Member

@laurazard laurazard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just left a couple suggestions/questions

ctx, release, err := i.client.WithLease(ctx)
contentStore := i.client.ContentStore()
leasesManager := i.client.LeasesService()
lease, err := leasesManager.Create(ctx, leases.WithRandomID())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lease, err := leasesManager.Create(ctx, leases.WithRandomID())
lease, err := leasesManager.Create(ctx, leases.WithRandomID(), leases.WithExpiration(1*time.Hour))

For paranoid safety's sake, should we create the lease with an expiration (a la https://github.com/moby/moby/blob/0c6b61665668fa68538a630d341ad56e53817162/daemon/containerd/image_commit.go#L71C67-L71C101) ?

This comment was marked as duplicate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think our best bet would be to add a cleanup routine at startup that would remove all dangling leases. This should only happen if there is a catastrophic failure and the daemon was killed mid-export. Adding a 1 hour lease would make exports that run more than 1h fail if someone removed the content while we were exporting. Both of the cases should be rare and yeah, a cleanup routine at startup would be best I think.

We could even add a check in all our tests checking that there are no dangling leases.

is there a relation between ctx and "expiration"? i.e., should leases also expire if the context is cancelled or timed out?

There isn't, not in the way you describe it no, would be a nice addition for sure. Containerd has only this:

ctx, release, err := i.client.WithLease(ctx)

In this case when release is called and the lease would be removed. This lease only works for content that is created during the life of that context. In the case of exporting we aren't creating any content, we are only reading them, that's why we need to pre-lease all the content and then safely read it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think our best bet would be to add a cleanup routine at startup that would remove all dangling leases.

I'm okay with this, and sounds like a better way to go than arbitrary 1-hour lease. I'll try to write this down to remember to do this somewhere, and when we do that we should go through the rest of the c8d integration code and remove the arbitrary 1-hour expirations in other places (so that an hour long docker commit won't break, even though I hope that's not happening frequently)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also make sure we always set options to the WithLease call, by default it will create a 24h lease.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we have a utility in one place;

func tempLease(ctx context.Context, mgr leases.Manager) (context.Context, func(context.Context) error, error) {
nop := func(context.Context) error { return nil }
_, ok := leases.FromContext(ctx)
if ok {
return ctx, nop, nil
}
// Use an expiration that ensures the lease is cleaned up at some point if there is a crash, SIGKILL, etc.
opts := []leases.Opt{
leases.WithRandomID(),
leases.WithExpiration(24 * time.Hour),
leases.WithLabels(map[string]string{
"moby.lease/temporary": time.Now().UTC().Format(time.RFC3339Nano),
}),
}

But given that we're wrapping the leases manager in our "namespaced" manager, would it make sense to apply defaults there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed! Forgot about that one

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a issue to track this one #46020

daemon/containerd/image_exporter.go Outdated Show resolved Hide resolved
daemon/containerd/image_exporter.go Show resolved Hide resolved
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one silly question on @laurazard's comment, and I agree with all other comments, but other than that, this LGTM

ctx, release, err := i.client.WithLease(ctx)
contentStore := i.client.ContentStore()
leasesManager := i.client.LeasesService()
lease, err := leasesManager.Create(ctx, leases.WithRandomID())

This comment was marked as duplicate.

@corhere
Copy link
Contributor

corhere commented Jul 18, 2023

@thaJeztah scoping a lease to a context will be much cheaper to implement with Go 1.21's context.AfterFunc.

I think it's wise for containerd not to implement a WithScopedOnContext option or similar as it would have to be implemented by retaining a reference to the client used to create the lease. Without cooperation by the application there'd be no way to guarantee that the client is still valid (i.e. has not been closed) when the context becomes done.

This change add leases for all the content that will be exported, once
the image(s) are exported the lease is removed, thus letting
containerd's GC to do its job if needed. This fixes the case where
someone would remove an image that is still being exported.

This fixes the TestAPIImagesSaveAndLoad cli integration test.

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
@rumpl rumpl force-pushed the c8d-image-save-lease branch from ab81658 to f3a6b0f Compare July 19, 2023 07:59
Copy link
Member

@laurazard laurazard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-LGTM

@thaJeztah
Copy link
Member

as it would have to be implemented by retaining a reference to the client used to create the lease. Without cooperation by the application there'd be no way to guarantee that the client is still valid (i.e. has not been closed) when the context becomes done.

Also fair point. It was a "wild blurb" on my side, mostly considering what the proper / cleanest way would be to create a "transaction" (with some work during), which is expected to last for the duration of the context (possibly with a deadline, but I guess that'd be more for the context), but I definitely haven't given it a lot of thought yet.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

5 participants