
Support parallel decompression (pigz) #2640

Merged: 3 commits, Sep 19, 2018

Conversation

@mxpv (Member) commented Sep 13, 2018:

This PR speeds up image (layer) decompression by using parallel gzip (http://github.com/klauspost/pgzip).

In the first revision, decompression was done by running an external unpigz process (similar to how it's done in moby, moby/moby#35697), but benchmarking showed this to be surprisingly slow, likely due to process overhead and moving data back and forth via stdin/stdout.
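For reference, pgzip mirrors the stdlib compress/gzip API, so the swap amounts to an import change. A minimal sketch of the idea (the sample path and surrounding code are illustrative, not this PR's diff):

```go
package main

import (
	"io"
	"os"

	gzip "github.com/klauspost/pgzip" // drop-in for "compress/gzip"
)

func main() {
	f, err := os.Open("layer.tar.gz") // hypothetical compressed layer
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Same signature as compress/gzip's NewReader, but decompression
	// runs in background goroutines, reading ahead of the consumer.
	zr, err := gzip.NewReader(f)
	if err != nil {
		panic(err)
	}
	defer zr.Close()

	if _, err := io.Copy(os.Stdout, zr); err != nil {
		panic(err)
	}
}
```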

Here are benchmark results:

  1. exec unpigz vs go gzip (unpigz is slower due to overhead)
BenchmarkGzipDecompression128-4    	       5	 209942631 ns/op	 975.31 MB/s
BenchmarkGzipDecompression256-4    	       5	 382351460 ns/op	 702.06 MB/s
BenchmarkGzipDecompression512-4    	       2	 550383735 ns/op	 639.45 MB/s
BenchmarkGzipDecompression1024-4   	       1	2960740138 ns/op	 362.66 MB/s

BenchmarkPigzDecompression128-4    	       3	 406130786 ns/op	 330.48 MB/s
BenchmarkPigzDecompression256-4    	       2	 813984409 ns/op	 329.78 MB/s
BenchmarkPigzDecompression512-4    	       1	1625377583 ns/op	 330.31 MB/s
BenchmarkPigzDecompression1024-4   	       1	3258120313 ns/op	 329.56 MB/s
  2. exec unpigz vs exec gzip (this was the expected behavior: pigz is twice as fast, but both decompressors run as external processes)
BenchmarkGzipDecompression64-4     	       3	 463050022 ns/op	 144.93 MB/s
BenchmarkGzipDecompression128-4    	       1	1005415857 ns/op	 133.49 MB/s
BenchmarkGzipDecompression256-4    	       1	2197161346 ns/op	 122.17 MB/s
BenchmarkGzipDecompression512-4    	       1	4510266184 ns/op	 119.03 MB/s
BenchmarkGzipDecompression1024-4   	       1	7745037689 ns/op	 138.64 MB/s

BenchmarkPigzDecompression64-4     	       5	 204790158 ns/op	 327.70 MB/s
BenchmarkPigzDecompression128-4    	       3	 406545824 ns/op	 330.14 MB/s
BenchmarkPigzDecompression256-4    	       2	 811413161 ns/op	 330.82 MB/s
BenchmarkPigzDecompression512-4    	       1	1624756288 ns/op	 330.43 MB/s
BenchmarkPigzDecompression1024-4   	       1	4252662532 ns/op	 252.49 MB/s
  3. go gzip vs go pgzip
BenchmarkGzipDecompression64-4      	      20	  64101550 ns/op	1046.91 MB/s
BenchmarkGzipDecompression128-4     	      10	 147177273 ns/op	 911.95 MB/s
BenchmarkGzipDecompression256-4     	       5	 258219949 ns/op	 839.56 MB/s
BenchmarkGzipDecompression512-4     	       1	1119090817 ns/op	 479.74 MB/s
BenchmarkGzipDecompression1024-4    	       1	2742437092 ns/op	 391.53 MB/s

BenchmarkPgzipDecompression64-4     	      20	  59348248 ns/op	1130.76 MB/s
BenchmarkPgzipDecompression128-4    	      10	 119474318 ns/op	1123.40 MB/s
BenchmarkPgzipDecompression256-4    	       5	 253783084 ns/op	1057.74 MB/s
BenchmarkPgzipDecompression512-4    	       3	 482842690 ns/op	1111.90 MB/s
BenchmarkPgzipDecompression1024-4   	       1	1128599249 ns/op	 951.39 MB/s

CC @samuelkarp

Related #1896

@dmcgowan (Member) commented:

Is there any reason not to make this a compilation option and use pgzip by default?

benchmarkDecompression(b, sizeInMb)
}

func BenchmarkGzipDecompression64(b *testing.B) { benchmarkGzipDecompression(b, 64) }
@dmcgowan (Member) commented Sep 13, 2018 on the diff:

The benchmarks can just be BenchmarkDecompression* and run with two different environments (or compilation options). This also makes it easier to use with benchcmp.
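A sketch of that suggestion using build tags (file and function names are illustrative): each decompressor lives in its own tagged file, both exposing the same hook, so a single BenchmarkDecompression* suite compiles against either and the two runs can be diffed with benchcmp.

```go
// decompress_stdlib.go (illustrative name), selected with
// `go test -tags use_go_gzip_decompressor`.

// +build use_go_gzip_decompressor

package compression

import (
	"compress/gzip"
	"io"
)

func newGzipReader(r io.Reader) (io.ReadCloser, error) {
	return gzip.NewReader(r)
}
```

```go
// decompress_pgzip.go (illustrative name), the default build.

// +build !use_go_gzip_decompressor

package compression

import (
	"io"

	"github.com/klauspost/pgzip"
)

func newGzipReader(r io.Reader) (io.ReadCloser, error) {
	return pgzip.NewReader(r)
}
```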

@@ -37,6 +42,8 @@ const (
Gzip
)

const disablePgzipEnv = "CONTAINER_DISABLE_PGZIP"
Review comment (Member):

Can we avoid using an env var?

@samuelkarp (Member) left a comment:

Code LGTM, but it looks like b8aca28 still needs the Signed-off-by line.

@mxpv force-pushed the pgzip branch 2 times, most recently from 889f1bd to 806f6e9 (September 13, 2018 07:03)
@Random-Liu (Member) commented:

Thanks for contributing this!

http://github.com/klauspost/pgzip has no external dependencies, right? Just double-checking.

@mxpv (Member, Author) commented Sep 13, 2018:

@Random-Liu yes, there are no external dependencies.

@Random-Liu (Member) commented:

@mxpv Thanks!

b.Fatal(err)
}

_, err = ioutil.ReadAll(decompressor)
Review comment (Member):

Try _, err = io.CopyBuffer(ioutil.Discard, decompressor, buf), where buf is declared before the closure as buf := make([]byte, 32*1024). I saw a massive improvement locally with that.
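Applied to the benchmark body, the suggestion looks roughly like this (a sketch assuming the benchmark lives in the archive/compression package, so DecompressStream is in scope, and data holds the compressed input):

```go
func benchmarkDecompression(b *testing.B, data []byte) {
	buf := make([]byte, 32*1024) // allocated once, outside the hot loop
	b.SetBytes(int64(len(data)))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		decompressor, err := DecompressStream(bytes.NewReader(data))
		if err != nil {
			b.Fatal(err)
		}
		// io.CopyBuffer drains the stream through the reused buf;
		// ioutil.ReadAll would instead grow and reallocate a result
		// slice the benchmark never inspects.
		if _, err := io.CopyBuffer(ioutil.Discard, decompressor, buf); err != nil {
			b.Fatal(err)
		}
		decompressor.Close()
	}
}
```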

@dmcgowan (Member) commented:

I am seeing very different benchmark results. Are you running these with any other flags to see the benefits of parallelism?

$ go test -bench . -benchmem -tags use_go_gzip_decompressor ./archive/compression > /tmp/1
$ go test -bench . -benchmem ./archive/compression > /tmp/2
$ benchcmp /tmp/1 /tmp/2
benchmark                                      old ns/op     new ns/op     delta
BenchmarkDecompression/decompress-64mb-4       10129048      15112975      +49.20%
BenchmarkDecompression/decompress-128mb-4      20902408      28474168      +36.22%
BenchmarkDecompression/decompress-256mb-4      39889272      56016822      +40.43%
BenchmarkDecompression/decompress-512mb-4      84243279      145920711     +73.21%
BenchmarkDecompression/decompress-1024mb-4     163285345     253962614     +55.53%

benchmark                                      old MB/s     new MB/s     speedup
BenchmarkDecompression/decompress-64mb-4       6625.39      4440.48      0.67x
BenchmarkDecompression/decompress-128mb-4      6421.16      4713.67      0.73x
BenchmarkDecompression/decompress-256mb-4      6729.51      4792.05      0.71x
BenchmarkDecompression/decompress-512mb-4      6372.86      3679.20      0.58x
BenchmarkDecompression/decompress-1024mb-4     6575.86      4227.95      0.64x

benchmark                                      old allocs     new allocs     delta
BenchmarkDecompression/decompress-64mb-4       9              35             +288.89%
BenchmarkDecompression/decompress-128mb-4      9              34             +277.78%
BenchmarkDecompression/decompress-256mb-4      9              34             +277.78%
BenchmarkDecompression/decompress-512mb-4      9              34             +277.78%
BenchmarkDecompression/decompress-1024mb-4     9              34             +277.78%

benchmark                                      old bytes     new bytes     delta
BenchmarkDecompression/decompress-64mb-4       41704         4234139       +10052.84%
BenchmarkDecompression/decompress-128mb-4      41705         4233603       +10051.31%
BenchmarkDecompression/decompress-256mb-4      42687         4234991       +9821.03%
BenchmarkDecompression/decompress-512mb-4      43390         4236906       +9664.71%
BenchmarkDecompression/decompress-1024mb-4     45496         4236917       +9212.72%

@mxpv (Member, Author) commented Sep 13, 2018:

@dmcgowan that's interesting. It looks like ioutil.ReadAll was the bottleneck for the gzip reader. With io.CopyBuffer it produces much better results.

@dmcgowan (Member) commented:

@mxpv I am curious whether that was similarly the bottleneck for the external-process approach. This is a much cleaner approach, but I am wondering whether the earlier result was a problem with the implementation or with the way it was being tested (size of test data, parallelism, etc.).

@klauspost commented:

@dmcgowan 4440.48 MB/s is way over normal decompression speed. Either the content is far too compressible or far too incompressible to be a real-world test.

Typical speeds should be at least an order of magnitude slower.

Also, if you are just copying to ioutil.Discard you will not see any significant benefit. In pgzip, decompression simply runs in an async goroutine, reading ahead of the actual reader consuming it. So if you actually use the data for processing you will see a speedup; otherwise it will just match the built-in decompression.
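Put differently, the benchmark only surfaces the read-ahead win if the consuming side does real per-byte work. Hashing the output is one stand-in consumer (a sketch, not this PR's code; decompressor is the stream under test, and crypto/sha256, io, and testing are the assumed imports):

```go
func consume(b *testing.B, decompressor io.Reader) {
	// Hashing gives the consuming goroutine real per-byte work, so
	// pgzip's background decompression can run ahead of it, much as it
	// would during an actual image unpack. A copy to ioutil.Discard
	// does no such work, so the pipeline never gets ahead.
	h := sha256.New()
	if _, err := io.Copy(h, decompressor); err != nil {
		b.Fatal(err)
	}
	_ = h.Sum(nil) // keep the result live
}
```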

@mxpv (Member, Author) commented Sep 14, 2018:

@dmcgowan I've reworked the benchmark to make it a bit closer to a real-world scenario.
Here are my results:

$ go test -bench . -tags use_go_gzip_decompressor > /tmp/1
$ go test -bench .  > /tmp/2
$ benchcmp /tmp/1 /tmp/2
benchmark                                     old ns/op      new ns/op      delta
BenchmarkDecompression/decompress-16mb-4      102192219      83716965       -18.08%
BenchmarkDecompression/decompress-32mb-4      203506915      164458274      -19.19%
BenchmarkDecompression/decompress-64mb-4      406238786      333350524      -17.94%
BenchmarkDecompression/decompress-128mb-4     807736530      652771252      -19.19%
BenchmarkDecompression/decompress-256mb-4     1618033941     1324395349     -18.15%

benchmark                                     old MB/s     new MB/s     speedup
BenchmarkDecompression/decompress-16mb-4      186.82       228.05       1.22x
BenchmarkDecompression/decompress-32mb-4      187.63       232.18       1.24x
BenchmarkDecompression/decompress-64mb-4      187.99       229.09       1.22x
BenchmarkDecompression/decompress-128mb-4     189.09       233.98       1.24x
BenchmarkDecompression/decompress-256mb-4     188.79       230.65       1.22x

@mxpv (Member, Author) commented Sep 14, 2018:

> I am curious if that was similarly the bottleneck for the external process approach.

I ran pigz with the new benchmark and got awesome results, so I'm going to update this PR to use pigz instead.

$ go test -bench .
pkg: github.com/containerd/containerd/archive/compression
BenchmarkGzipDecompression/gzip-32mb-4         	       5	 205370698 ns/op	 185.92 MB/s
BenchmarkGzipDecompression/gzip-64mb-4         	       3	 404389060 ns/op	 188.84 MB/s
BenchmarkGzipDecompression/gzip-128mb-4        	       2	 802574271 ns/op	 190.30 MB/s
BenchmarkGzipDecompression/gzip-256mb-4        	       1	1637217618 ns/op	 186.58 MB/s

BenchmarkPigzDecompression/pigz-32mb-4         	      20	 101242086 ns/op	 377.15 MB/s
BenchmarkPigzDecompression/pigz-64mb-4         	      10	 202710945 ns/op	 376.73 MB/s
BenchmarkPigzDecompression/pigz-128mb-4        	       3	 386691558 ns/op	 394.98 MB/s
BenchmarkPigzDecompression/pigz-256mb-4        	       2	 771113624 ns/op	 396.14 MB/s
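For reference, the external-process approach streams compressed bytes into unpigz over stdin and hands the caller its stdout; a minimal sketch (names and error plumbing are illustrative, not the merged containerd code):

```go
package main

import (
	"context"
	"io"
	"os/exec"
)

// pigzDecompress pipes r through `unpigz -c` and returns the plaintext
// stream. An io.Pipe carries stdout so the process exit status can be
// propagated to the reader as a close error.
func pigzDecompress(ctx context.Context, r io.Reader) (io.ReadCloser, error) {
	cmd := exec.CommandContext(ctx, "unpigz", "-c")
	cmd.Stdin = r

	pr, pw := io.Pipe()
	cmd.Stdout = pw

	if err := cmd.Start(); err != nil {
		return nil, err
	}
	go func() {
		// Wait reaps the process; CloseWithError surfaces a non-zero
		// exit status to whoever is reading pr.
		pw.CloseWithError(cmd.Wait())
	}()
	return pr, nil
}
```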

@mxpv force-pushed the pgzip branch 4 times, most recently from 62bb3a9 to 8bb1073 (September 14, 2018 20:12)
@crosbymichael (Member) commented Sep 17, 2018:

Can you squash your commits? I'm not a fan of the merge commit; it's better to rebase on master than to merge. After a squash this LGTM.

@mxpv force-pushed the pgzip branch 2 times, most recently from 6c313ef to 6e131af (September 17, 2018 17:25)
@mxpv (Member, Author) commented Sep 17, 2018:

@crosbymichael done

@@ -0,0 +1,5902 @@
[
Review comment (Member):

Can we move the test data to a gist or somewhere else? (It could be downloaded from TestMain().)

@mxpv (Member, Author) replied:

Done

@mxpv force-pushed the pgzip branch 3 times, most recently from 182766a to 1d909d7 (September 17, 2018 19:50)
(commits signed off by Maksym Pavlenko <makpav@amazon.com>)
@dmcgowan (Member) commented:

I am going to add this to the 1.2 milestone; we should have this in the release. There are still a few changes I want to see, but they could be done after we merge this (see the sync.Once sketch after this list):

  • Integrate with our daemon configuration; a plugin might not make sense since this could be client side
  • Avoid init functions which do file reads; these could be replaced with a sync.Once
  • Put the entire benchmark in a gist or break the external dependency
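A sketch of the sync.Once idea, assuming a helper that locates unpigz lazily on first use (all names here, including the env var, are illustrative rather than the merged code; os, os/exec, and sync are the assumed imports):

```go
var (
	detectOnce sync.Once
	unpigzPath string // empty means: fall back to the in-process Go decompressor
)

// pigzPath resolves the unpigz binary at most once, on first use,
// instead of doing file reads from an init function.
func pigzPath() string {
	detectOnce.Do(func() {
		// Hypothetical kill switch, mirroring the CONTAINER_DISABLE_PGZIP
		// env var discussed earlier in this review.
		if os.Getenv("CONTAINER_DISABLE_PGZIP") != "" {
			return
		}
		if path, err := exec.LookPath("unpigz"); err == nil {
			unpigzPath = path
		}
	})
	return unpigzPath
}
```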

@dmcgowan dmcgowan added this to the 1.2 milestone Sep 17, 2018
@mxpv mxpv changed the title Support parallel decompression (pgzip) Support parallel decompression (pigz) Sep 17, 2018
"testing"
)

const benchmarkTestDataURL = "https://git.io/fADcl"
Review comment (Member):

I don't think it's a great idea to call out to an external URL every time the tests are run.

@mxpv (Member, Author) replied:

> Can we move test data to gist or somewhere else? (could be downloaded from TestMain())

@AkihiroSuda @stevvooe could you please decide which approach works for you?

@dmcgowan (Member) replied:

Sorry, I realize the previous suggestion may not have been clear. I believe the suggestion to move to a gist was to move the entire benchmark to a gist, not just the input file.

@mxpv (Member, Author) replied:

@dmcgowan oh, ok. I've moved the benchmark to https://git.io/fASMy and added sync.Once for initialization


(commit signed off by Maksym Pavlenko <makpav@amazon.com>)
@codecov-io commented:

Codecov Report

Merging #2640 into master will decrease coverage by 3.55%.
The diff coverage is 68.51%.


@@            Coverage Diff             @@
##           master    #2640      +/-   ##
==========================================
- Coverage   47.54%   43.98%   -3.56%     
==========================================
  Files          87       94       +7     
  Lines        8211    10362    +2151     
==========================================
+ Hits         3904     4558     +654     
- Misses       3590     5083    +1493     
- Partials      717      721       +4
Flag       Coverage Δ
#linux     47.63% <67.44%> (+0.09%) ⬆️
#windows   40.69% <68.51%> (?)
Impacted Files                       Coverage Δ
archive/compression/compression.go   53.33% <68.51%> (+2.42%) ⬆️
snapshots/native/native.go           43.3% <0%> (-10%) ⬇️
metadata/snapshot.go                 45.8% <0%> (-8.96%) ⬇️
archive/tar.go                       43.13% <0%> (-6.87%) ⬇️
metadata/containers.go               47.97% <0%> (-6.62%) ⬇️
content/local/writer.go              56.25% <0%> (-6.42%) ⬇️
content/local/store.go               48.76% <0%> (-4.77%) ⬇️
archive/tar_opts.go                  28.57% <0%> (-4.77%) ⬇️
metadata/images.go                   58.46% <0%> (-4.7%) ⬇️
... and 55 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3c2668d...e8fac24

@dmcgowan (Member) left a review:

LGTM

@crosbymichael (Member) commented:

LGTM

@crosbymichael merged commit b38b442 into containerd:master on Sep 19, 2018
@mxpv deleted the pgzip branch on September 19, 2018 at 19:37
kzys added a commit to kzys/containerd that referenced this pull request May 18, 2022

The benchmarks were deleted in containerd#2640 but we could use that to evaluate
zstd further.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
kiashok pushed a commit to kiashok/containerd-containerd that referenced this pull request Oct 23, 2024
The benchmarks were deleted in containerd#2640 but we could use that to evaluate
zstd further.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>