Add eStargz specification to OCI v1 (support lazy pulling) #815
Description
TL;DR
- Standardize eStargz archive format as an optional extension to OCI Image Spec v1: https://github.com/containerd/stargz-snapshotter/blob/v0.2.0/docs/stargz-estargz.md
- Define the org.opencontainers.image.toc.digest annotation for enabling chunk-level content verification
- No need to introduce a new layer media type, because eStargz is fully compatible with application/vnd.oci.image.layer.v1.tar+gzip
- Though compression methods other than gzip are out of scope, this spec can be smoothly extended to other compression methods in the future (e.g. zstd)
Overview
Pulling is one of the time-consuming steps in the container lifecycle. One root cause is the tar (+gzip) archived layer format, which doesn't allow image consumers (e.g. container runtimes, builders, etc.) to run a container until the entire contents are locally available.
This proposal aims to solve this issue by enabling lazy pulling for OCI images. Lazy pulling here means that image consumers don't download the entire image on the pull operation but fetch the necessary chunks of contents on demand. This reduces the time taken for the pull and lets the container start up quickly.
We propose standardizing the lazily-pullable and OCI-compatible tar.gz extension "eStargz" (https://github.com/containerd/stargz-snapshotter/blob/v0.2.0/docs/stargz-estargz.md), which is developed in the containerd Stargz Snapshotter project. The recent benchmarking result shows the performance improvement on the pull operation (please also see the README for the detailed explanation).
Because eStargz is fully compatible with the current spec,
- it can be lazily pulled without any changes to the registry
- it can still run on eStargz-agnostic runtimes, so the community can adopt the new spec without the risk of breaking their environment
Though this proposal focuses on the extension to the gzip-compressed layer, we believe eStargz can be smoothly extended to other compression methods in the future. Recently, the Podman community has been working to define a zstd version of the lazily-pullable format, zstd:chunked, based on the eStargz spec. Standardizing eStargz will also help standardize zstd:chunked in the future, with a minimum amount of changes to the spec. This consistency of the format across compression methods should also help runtime implementers adopt lazy pulling without unnecessary complexity.
- Implementations: refer to the tracker issue for adoption status (containerd/stargz-snapshotter#258)
  - including: containerd, CRI-O, nerdctl, Podman, BuildKit, Kaniko, ko, go-containerregistry
- Links to talks
  - FOSDEM 2020: Lazy distribution of container images - Akihiro Suda, NTT
  - KubeCon EU 2020: Startup Containers in Lightning Speed with Lazy Image Distribution - Kohei Tokunaga, NTT (YouTube: https://www.youtube.com/watch?v=H4Lbi26CqNU)
  - KubeCon US 2020: Speeding Up Analysis Pipelines with Remote Container Images - Ricardo Rocha & Spyridon Trigazis, CERN (YouTube: https://www.youtube.com/watch?v=j4eIgdDkI9I)
  - FOSDEM 2021: Build and Run Containers With Lazy Pulling - Adoption status of containerd Stargz Snapshotter and eStargz - Kohei Tokunaga, NTT
  - Container Plumbing Days 2021: Starting up Containers Super Fast With Lazy Pulling of Images - Kohei Tokunaga, NTT (Slides, YouTube: https://www.youtube.com/watch?v=r981cUwoD7o)
  - KubeCon US 2021: Faster Container Image Distribution on a Variety of Tools with Lazy Pulling - Kohei Tokunaga, NTT Corporation & Tao Peng, Ant Financial
- Links to other resources
  - eStargz spec: https://github.com/containerd/stargz-snapshotter/blob/v0.2.0/docs/stargz-estargz.md
  - Introductory blog: Startup Containers in Lightning Speed with Lazy Image Distribution on Containerd
  - Blog about BuildKit's eStargz support: Building containers without waiting for pull completion of base images on BuildKit
  - Blog introducing tools that support eStargz-based lazy pulling: Speeding Up Pulling Container Images on a Variety of Tools with eStargz
Thanks @AkihiroSuda for the discussion about this proposal.
Goal
The goal of this proposal is to add support for lazy pulling to the OCI Image Spec by standardizing the eStargz spec (https://github.com/containerd/stargz-snapshotter/blob/v0.2.0/docs/stargz-estargz.md) as an optional extension and by defining an annotation, org.opencontainers.image.toc.digest, for content verification. No changes are needed to the OCI Distribution Spec, because eStargz can be lazily pulled from any registry that supports HTTP Range Requests, which are already included in that spec.
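As a rough illustration of the Range Request mechanism mentioned above, the following Python sketch builds (but does not send) a request for one byte range of a layer blob. The registry host, repository name, and the helper name build_range_request are placeholders invented for this example, and authentication is omitted.

```python
import urllib.request


def build_range_request(registry, repository, digest, offset, length):
    """Build (without sending) a GET for `length` bytes of a blob starting at `offset`.

    The URL follows the blob endpoint of the OCI Distribution Spec:
    /v2/<name>/blobs/<digest>. All argument values are supplied by the caller;
    nothing here is specific to eStargz.
    """
    url = f"{registry}/v2/{repository}/blobs/{digest}"
    last = offset + length - 1  # the end position of an HTTP byte range is inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last}"})


# Hypothetical values, for illustration only.
req = build_range_request(
    "https://registry.example.com", "library/app", "sha256:" + "0" * 64, 1024, 512
)
print(req.full_url)
print(req.get_header("Range"))
```

A registry that honors the range answers with 206 Partial Content and only the requested bytes, which is what lets an image consumer fetch individual chunks instead of the whole layer.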
Proposed Changes
Fig 1. The Structure | Fig 2. Prefetching Support | Fig 3. Content Verification |
---|---|---|
Standardize eStargz archive format as an optional extension to application/vnd.oci.image.layer.v1.tar+gzip (Fig 1 and 2)
eStargz is compatible with application/vnd.oci.image.layer.v1.tar+gzip, so a new Media Type doesn't need to be introduced. Instead, we propose adding the eStargz spec to the OCI Image Spec as an optional extension to +gzip Media Types.
An overview of eStargz follows. For more details, please refer to the eStargz spec.
- Gzip-compressing each tar entry per file (or per chunk if the file is large). This enables the image consumer to decompress each tar entry selectively.
- Adding TOC JSON to the layer tar blob. This contains the metadata and content offsets of all files. This allows image consumers to mount a layer without scanning the entire tar.gz and to extract the necessary contents selectively.
- Adding meta entries for indicating "prioritized" files that SHOULD be prefetched when mounting the layer. This helps image consumers make sure that these files are locally available and avoid network-related overheads when reading these files.
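The first two points above can be illustrated with a deliberately simplified Python sketch. It is not the eStargz format itself: it skips the tar framing, keeps the table of contents as a separate Python list rather than inside the blob, and uses made-up field names (name, offset, csize). It only shows why compressing each file as its own gzip member, plus an offset table, allows one file to be extracted without decompressing the rest.

```python
import gzip
import zlib


def build_blob(files):
    """Concatenate one gzip member per file and record where each member starts.

    `files` maps a file name to its bytes. The returned table is a toy
    stand-in for a TOC; its field names are hypothetical.
    """
    blob = bytearray()
    toc = []
    for name, data in files.items():
        member = gzip.compress(data)  # an independent gzip member per file
        toc.append({"name": name, "offset": len(blob), "csize": len(member)})
        blob += member
    return bytes(blob), toc


def read_file(blob, toc, name):
    """Decompress only the gzip member that holds `name`."""
    entry = next(e for e in toc if e["name"] == name)
    start, end = entry["offset"], entry["offset"] + entry["csize"]
    # In a lazy pull, blob[start:end] would be fetched with a byte-range request.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)  # expect a gzip wrapper
    return decompressor.decompress(blob[start:end]) + decompressor.flush()


blob, toc = build_blob({"bin/app": b"\x7fELF...", "etc/app.conf": b"key=value\n"})
print(read_file(blob, toc, "etc/app.conf"))
```

Because the concatenation of gzip members is itself a valid gzip stream, a consumer that knows nothing about the offset table can still decompress the whole blob in one pass, which is the property that keeps such a layer usable by eStargz-agnostic runtimes.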
Define org.opencontainers.image.toc.digest annotation (Fig 3)
In the current OCI Spec, a layer can be verified by the digest of the layer written in the descriptor in the manifest. However, when a user lazily pulls a layer (i.e. fetches and extracts chunks separately on demand), this verification method cannot be applied because the entire layer contents have not been acquired.
To solve this issue, eStargz verifies the contents at chunk granularity on demand. The digest of each chunk is written in the TOC JSON so that image consumers can verify each chunk separately every time they acquire file contents. The TOC JSON itself is verified by the digest written in a pre-defined annotation on the layer descriptor in the manifest, which is already verifiable with the current spec. More details of this extension are described in the eStargz definition doc.
To enable this, we propose adding the following pre-defined annotation, following OCI's naming convention for annotations.
org.opencontainers.image.toc.digest: OCI Digest of the TOC JSON in the layer
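The two-level verification described above could look roughly like the following Python sketch. The descriptor is modeled as a plain dict, sha256 is assumed as the digest algorithm, and the TOC field names used here (entries, name, chunkOffset, chunkDigest) are illustrative placeholders; the eStargz spec remains the authority on the real TOC layout.

```python
import hashlib
import json

TOC_DIGEST_ANNOTATION = "org.opencontainers.image.toc.digest"


def oci_digest(data: bytes) -> str:
    """Return an OCI-style digest string (algorithm:hex) using sha256."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_toc(descriptor: dict, toc_json: bytes) -> bool:
    """Check the raw TOC JSON bytes against the annotation on the layer descriptor."""
    expected = descriptor.get("annotations", {}).get(TOC_DIGEST_ANNOTATION)
    return expected is not None and expected == oci_digest(toc_json)


def verify_chunk(toc: dict, name: str, chunk_offset: int, chunk: bytes) -> bool:
    """Check one fetched chunk against the digest recorded for it in the TOC."""
    for entry in toc["entries"]:  # hypothetical TOC layout
        if entry["name"] == name and entry["chunkOffset"] == chunk_offset:
            return entry["chunkDigest"] == oci_digest(chunk)
    return False  # a chunk the TOC does not describe is never trusted


# Toy data standing in for a real layer.
chunk = b"file contents"
toc = {"entries": [{"name": "etc/app.conf", "chunkOffset": 0, "chunkDigest": oci_digest(chunk)}]}
toc_json = json.dumps(toc).encode()
descriptor = {
    "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
    "annotations": {TOC_DIGEST_ANNOTATION: oci_digest(toc_json)},
}
print(verify_toc(descriptor, toc_json), verify_chunk(toc, "etc/app.conf", 0, chunk))
```

The chain of trust runs from the manifest (verified as in the current spec) to the annotation, from the annotation to the TOC JSON, and from the TOC JSON to each chunk, so no step requires the full layer to be present.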
Out-of-scope
This proposal focuses on lazy pulling and on standardizing the eStargz spec, which is used in the wild, for OCIv1. Thus some requirements discussed in OCIv2 are out of scope for this proposal, including:
Though OCIv2 is out of scope for this proposal, eStargz doesn't conflict with the OCIv2 discussion.
This proposal focuses on the extension to application/vnd.oci.image.layer.v1.tar+gzip, and other compression methods (e.g. zstd) are out of scope.