-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make image (layer) downloads faster by using pigz #35697
Conversation
Any idea what:
Is due to? |
looks like this is implementing the same as #34788 (looks like that one stalled though) |
Should I hate this behind an environment variable, so you can opt out of
the library? The primary downside I know is that there is a memory overhead
with this library.
…Sent from my iPhone
On Dec 4, 2017, at 21:00, Sebastiaan van Stijn <notifications@github.com> wrote:
looks like this is implementing the same as #34788
<#34788> (looks like that one stalled
though)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#35697 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAtyRNDunSsbTahMlr9qTJ9MRJN9MYsGks5s9M4LgaJpZM4Q1g4j>
.
|
We preferred to use the external binary instead of using the Go implementation (see the other pull request); unless there's compelling reasons that you know of. |
7c2c30f
to
c84bb1a
Compare
@thaJeztah Switched out. It looks like LZMA / xz was already doing this, so I took this opportunity to share code between the two of them. |
Once this is in, I'll add pigz as a recommendation to the packages here: https://github.com/docker/docker-ce-packaging. |
@thaJeztah I don't see an easy way to install pigz in the container for CI. Any ideas? |
@thaJeztah PTAL. |
CC: |
pkg/archive/archive.go
Outdated
|
||
unpigzPath, err := exec.LookPath("unpigz") | ||
if err != nil { | ||
logrus.Debug("Cannot find unpigz") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly change the error to be something like "Unpigz binary not found in PATH, falling back to regular gzip" (better suggestions welcome 😅 )
Can you add the package to the Dockerfiles, so that it can be run in CI? (Or is it already there)? |
ping @AkihiroSuda @unclejack @stevvooe PTAL |
I don't think we should add an environment variable. I suggest introducing a daemon flag func DecompressStream(archive io.Reader, opts DecompressOpt...) (io.ReadCloser, error) |
@AkihiroSuda Exposing that knob seems like overkill, and introduces a deprecation cycle if we ever want to revert to using the Golang parallel gzip library. |
@AkihiroSuda If we do anything, I'd suggest (part of a different PR), |
What is gzip here? Do you mean executing /bin/gzip? |
@AkihiroSuda correct. |
How /bin/gzip is useful? |
This change is in response to moby/moby#35697 It adds pigz to the recommended binaries that should be installed with docker-ce. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Upstream-commit: 1ca014b Component: packaging
Wow I thought nobody liked my pigz proposal but then I just noticed a RHEL 7 install failed with pigz as required prerequisite: ec2-user@ip- ~> sudo yum install pigz I love that pigz is now used but my PR had no packaging dependency and would fall back to regular gzip if pigz was unavailable. D'oh! Anybody in packaging please remove pigz from deps... RHEL 7 doesn't have a pigz in repos. Thanks guys! |
@jboero Agree having it as a requirement is not right, however there is no version of Docker for RHEL that has pigz support yet. |
At least for the Debian packaging, it just falls back.
In Debian, they have the idea of “recommends” — does RHEL not?
…Sent from my iPhone
On Apr 23, 2018, at 06:21, Brian Goff <notifications@github.com> wrote:
@jboero <https://github.com/jboero> Agree having it as a requirement is not
right, however there is no version of Docker for RHEL that has pigz support
yet.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#35697 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAtyRG7MtMu4keQ8oc4AyGeqUl_262oFks5trdVGgaJpZM4Q1g4j>
.
|
No, there is no support for "weak depdendencies" until rather recently on
RPM, and not supported on RHEL.
Docker CE is for CentOS, which does have a pigz package.
On Mon, Apr 23, 2018 at 10:48 AM Sargun Dhillon <notifications@github.com>
wrote:
… At least for the Debian packaging, it just falls back.
In Debian, they have the idea of “recommends” — does RHEL not?
Sent from my iPhone
On Apr 23, 2018, at 06:21, Brian Goff ***@***.***> wrote:
@jboero <https://github.com/jboero> Agree having it as a requirement is
not
right, however there is no version of Docker for RHEL that has pigz support
yet.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#35697 (comment)>, or mute
the thread
<
https://github.com/notifications/unsubscribe-auth/AAtyRG7MtMu4keQ8oc4AyGeqUl_262oFks5trdVGgaJpZM4Q1g4j
>
.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#35697 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAwxZssdmd4oX3vzPvgNl1efMnjKiiUMks5trenPgaJpZM4Q1g4j>
.
|
Ah of course. OK thanks guys. I'm glad they pushed for using external pigz instead of trying to rewrite everything from scratch in go. Well done. |
Workaround - pigz is included in EPEL which works with RHEL (and CentOS) |
It's super not clear how to actually make use of this, do we need to ensure that |
@lox correct |
This change is in response to moby/moby#35697 It adds pigz to the recommended binaries that should be installed with docker-ce. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Upstream-commit: 1ca014b Component: packaging
Hey, how do you guys make package decompression parallel? dockerinfo Client:
Debug Mode: false
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 7
Server Version: 19.03.13
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 5.0.2-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.757GiB
Name: master
ID: KGRO:7J7A:7TKD:UNUO:ZQ7A:QYEJ:DJK4:OK4S:7JBS:U5IG:LBS3:2RJ7
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled after run |
@GitHubDiom this PR makes each individual "extract" run faster if |
Can docker pipeline the pull process for a specific layer? For instance, extracting the layer may start immediately after the beginning of the layer image has been decompressed. |
Theres some comments about decompress/extract in #21814 (comment) (and further) |
Can increase Docker image extraction speed significantly: moby/moby#35697
and introduce pigz for layer extract speed up: moby/moby#35697 Signed-off-by: Jingkai He <jingkai@hey.com>
* bash script fixing Signed-off-by: Jingkai He <jingkai@hey.com> * use the latest focal image as the base image Signed-off-by: Jingkai He <jingkai@hey.com> * removed unnecessary dependencies and introduce pigz for layer extract speed up: moby/moby#35697 Signed-off-by: Jingkai He <jingkai@hey.com> * bring net-tools & iproute2 back Signed-off-by: Jingkai He <jingkai@hey.com> --------- Signed-off-by: Jingkai He <jingkai@hey.com>
- What I did
The Golang built-in gzip library is serialized, and fairly slow
at decompressing. It also only decompresses on demand, versus
pipelining decompression. This fixes those problems.
- How I did it
This change switches to trying to use parallel gzip for gzip decompression as opposed to the
golang one. If it isn't available, it'll fall back to the default. This code path can also be disabled by environment variable.
- How to verify it
There are existing tests for this codepath to ensure correctness of this implementation. I ran some manual benchmarks, and I found I was able to get about 50% better performance as opposed to the build in library. These performance gains were primarily seen on images with layers about 10MB.
- Description for the changelog
Make image (layer) downloads faster by using pgzip