LZ4 v1.10.0 - Multicores edition

@Cyan4973 released this 22 Jul 05:36
ebb370c

LZ4 v1.10.0 is a major update integrating 600+ commits. Its headline feature is multithreading support, which uses modern multi-core processors to accelerate both compression and decompression. It's a good upgrade for users who need higher performance in high-throughput environments.

Multithreading support

The most visible upgrade of this version is likely multithreading support. While LZ4 has historically been recognized for its high-speed compression, the demand for even faster throughput has grown, particularly with the advent of NVMe storage technologies that allow for multi-GB/s throughput.
Multithreading is particularly beneficial for High Compression modes, which now perform dramatically faster. The following benchmark table showcases the performance improvements:

source       cpu       os     level  v1.9.4    v1.10.0   Improvement
silesia.tar  7840HS    Win11  12     13.4 sec  1.8 sec   x7.4
silesia.tar  M1 Pro    macos  12     16.6 sec  2.55 sec  x6.5
silesia.tar  i7-9700k  linux  12     16.2 sec  3.05 sec  x5.4
enwik9       7840HS    Win11  9      20.8 sec  2.6 sec   x8.0
enwik9       M1 Pro    macos  9      22.1 sec  2.95 sec  x7.4
enwik9       i7-9700k  linux  9      22.9 sec  4.05 sec  x5.7
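
The Improvement column is the ratio of the v1.9.4 time to the v1.10.0 time. As a quick sanity check, the sketch below recomputes that ratio from the timings in the table. The `speedup` helper and the `ROWS` layout are illustrative names, not part of LZ4. Because the published timings are rounded, the recomputed factors can differ from the published ones by up to about 0.1.

```python
# Timings copied from the benchmark table:
# (source, cpu, v1.9.4 seconds, v1.10.0 seconds, published improvement factor)
ROWS = [
    ("silesia.tar", "7840HS", 13.4, 1.8, 7.4),
    ("silesia.tar", "M1 Pro", 16.6, 2.55, 6.5),
    ("silesia.tar", "i7-9700k", 16.2, 3.05, 5.4),
    ("enwik9", "7840HS", 20.8, 2.6, 8.0),
    ("enwik9", "M1 Pro", 22.1, 2.95, 7.4),
    ("enwik9", "i7-9700k", 22.9, 4.05, 5.7),
]


def speedup(old_seconds, new_seconds):
    """Return how many times faster the new run is than the old one."""
    return old_seconds / new_seconds


if __name__ == "__main__":
    for source, cpu, old, new, published in ROWS:
        # Print the recomputed factor next to the published one.
        print(f"{source:12} {cpu:9} x{speedup(old, new):.2f} (published x{published})")
```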

Multithreading is less critical for decompression, as modern NVMe drives can still be saturated with a single decompression thread. Nonetheless, the new version improves performance by overlapping I/O operations with decompression.
Tested on an x64 Linux platform, decompressing a 5 GB text file locally takes 5 seconds with v1.9.4; this is reduced to 3 seconds in v1.10.0, a performance improvement of more than 60%.

Official support for dictionary compression (and decompression)

Starting from v1.10.0, dictionary compression, previously tagged as "experimental", now receives full support. This upgrade assures stability and ongoing support for the feature, enabling developers to reliably use this functionality in their applications.
The new symbols supported by liblz4 are:

  • LZ4_loadDictSlow(): minor variant of LZ4_loadDict(), which consumes more initialization time to better reference the dictionary, resulting in slightly improved compression ratios.
  • LZ4_attach_dictionary(): use, in place, an LZ4 state initialized with a dictionary to perform dictionary compression (LZ4 Block format) without the initialization costs. Very useful for small data, where dictionary initialization can become a bottleneck. The dictionary state can be used by multiple threads concurrently.
  • LZ4_attach_HC_dictionary(): same as LZ4_attach_dictionary(), but for LZ4HC dictionary compression.
  • LZ4F_compressBegin_usingDict(): initiate streaming compression to the LZ4Frame format, using a dictionary.
  • LZ4F_decompress_usingDict(): decompress an LZ4Frame that requires a dictionary.
  • LZ4F_createCDict(): create a materialized dictionary, ready to start compression without initialization cost. Can be shared across multiple threads.
  • LZ4F_compressFrame_usingCDict(): one-shot compression to the LZ4Frame format, using a materialized CDict.
  • LZ4F_compressBegin_usingCDict(): initiate streaming compression to the LZ4Frame format, using a materialized CDict.
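
To illustrate why dictionary compression matters for small data, here is a minimal sketch of the general idea. It does not call liblz4: it uses the preset-dictionary feature of zlib from Python's standard library as a stand-in, so the sizes it prints say nothing about LZ4's own ratios. The dictionary and message contents are made up for the example.

```python
import zlib

# Hypothetical dictionary: byte strings expected to recur in every message.
DICTIONARY = b'{"user_id": , "event": "login", "status": "ok", "timestamp": }'
# Hypothetical small message sharing most of its bytes with the dictionary.
MESSAGE = b'{"user_id": 42, "event": "login", "status": "ok", "timestamp": 1721626560}'


def compress(data, zdict=None):
    """Compress data with zlib, optionally primed with a preset dictionary."""
    if zdict is not None:
        comp = zlib.compressobj(zdict=zdict)
    else:
        comp = zlib.compressobj()
    return comp.compress(data) + comp.flush()


def decompress(blob, zdict=None):
    """Decompress data; the same dictionary must be supplied if one was used."""
    if zdict is not None:
        decomp = zlib.decompressobj(zdict=zdict)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    plain = compress(MESSAGE)
    primed = compress(MESSAGE, DICTIONARY)
    print("original bytes:        ", len(MESSAGE))
    print("without dictionary:    ", len(plain))
    print("with preset dictionary:", len(primed))
```

The same pattern applies to the liblz4 symbols listed above: the dictionary is prepared once, then reused for many small inputs, and the decompressor needs the same dictionary.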

New compression level 2

The new "Level 2" compression effectively fills the substantial gap between the standard "Fast Level 1" and the more intensive "High Compression Level 3." It provides a balanced option, optimizing performance and compression as evidenced in the benchmark results below (i7-9700k, linux):

file                 level 1   level 2   level 3
silesia.tar (speed)  685 MB/s  315 MB/s  110 MB/s
silesia.tar (ratio)  x2.101    x2.378    x2.606

Level 2 is ideal for applications requiring better compression than lz4 level 1, without the speed trade-offs associated with HC level 3.
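
To get a feel for the trade-off, the sketch below turns the benchmark figures into estimated compression time and output size for a given input. It assumes the data behaves like silesia.tar on the same i7-9700k machine, which will not hold in general. The 1000 MB input size and the helper names are hypothetical.

```python
# Figures copied from the level 1 / 2 / 3 benchmark table (silesia.tar, i7-9700k).
LEVELS = {
    1: {"speed_mb_s": 685.0, "ratio": 2.101},
    2: {"speed_mb_s": 315.0, "ratio": 2.378},
    3: {"speed_mb_s": 110.0, "ratio": 2.606},
}


def estimate(level, input_mb):
    """Estimate (seconds, compressed MB) for an input of input_mb megabytes."""
    row = LEVELS[level]
    return input_mb / row["speed_mb_s"], input_mb / row["ratio"]


def ratio_gap_closed():
    """Fraction of the level 1 to level 3 ratio gap that level 2 covers."""
    r1, r2, r3 = (LEVELS[n]["ratio"] for n in (1, 2, 3))
    return (r2 - r1) / (r3 - r1)


if __name__ == "__main__":
    for level in LEVELS:
        seconds, out_mb = estimate(level, 1000.0)  # hypothetical 1000 MB input
        print(f"level {level}: about {seconds:.2f} s, about {out_mb:.0f} MB")
    print(f"share of the ratio gap covered by level 2: {ratio_gap_closed():.0%}")
```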

Miscellaneous

  • The CLI now supports the environment variables LZ4_CLEVEL and LZ4_NBWORKERS, offering flexible control over its behavior in scenarios where direct commands are impractical, or when customized local defaults are necessary.
  • The licensing for the CLI and test programs has been clarified to GPL-2.0-or-later to distinguish it from GPL-2.0-only, enhancing transparency. The liblz4 library keeps its BSD 2-Clause license.
  • Various less common platforms have been validated (LoongArch, RISC-V, m68k, MIPS and SPARC), and are now continuously tested in CI to ensure portability.
  • Visual Studio solutions are now generated from the cmake recipe, in an effort to reduce manual maintenance of multiple solutions.
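
As an example of the environment variables in practice, the sketch below prepares an lz4 CLI call from a script, setting LZ4_CLEVEL and LZ4_NBWORKERS in the child environment instead of passing options. The helper name `build_lz4_invocation` and the file names are hypothetical. The command only runs if an `lz4` binary and the input file are present, and it assumes that binary is v1.10.0 or later.

```python
import os
import shutil
import subprocess


def build_lz4_invocation(src, dst, level=None, workers=None, base_env=None):
    """Return (argv, env) for an lz4 CLI call configured via environment variables."""
    # Start from the caller's environment unless an explicit one is given.
    env = dict(os.environ if base_env is None else base_env)
    if level is not None:
        env["LZ4_CLEVEL"] = str(level)       # default compression level
    if workers is not None:
        env["LZ4_NBWORKERS"] = str(workers)  # default number of worker threads
    # Explicit input and output file names, no command-line options.
    argv = ["lz4", src, dst]
    return argv, env


if __name__ == "__main__":
    argv, env = build_lz4_invocation("input.bin", "input.bin.lz4", level=9, workers=4)
    if shutil.which("lz4") and os.path.exists("input.bin"):
        subprocess.run(argv, env=env, check=True)
    else:
        print("would run:", " ".join(argv),
              "with LZ4_CLEVEL=" + env["LZ4_CLEVEL"],
              "and LZ4_NBWORKERS=" + env["LZ4_NBWORKERS"])
```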

One-liner updates

  • cli : multithreading compression support: speed improves roughly in proportion to the number of threads allocated
  • cli : overlap decompression with i/o, improving speed by >+60%
  • cli : support environment variables LZ4_CLEVEL and LZ4_NBWORKERS
  • cli : license of CLI more clearly labelled GPL-2.0-or-later
  • cli : fix: refuse to compress directories
  • cli : fix dictionary compression benchmark on multiple files
  • cli : change: no more implicit stdout (except when input is stdin)
  • lib : new level 2, offering mid-way performance (speed and compression)
  • lib : Improved lz4frame compression speed for small data (up to +160% at 1KB)
  • lib : Slightly faster (+5%) HC compression speed (levels 3-9), by @JunHe77
  • lib : dictionary compression support now in stable status
  • lib : lz4frame states can be safely reset and reused after a processing error (described by @QrczakMK)
  • lib : lz4file API improvements, by @vsolontsov-volant and @t-mat
  • lib : new experimental symbol LZ4_compress_destSize_extState()
  • build: cmake minimum version raised to 3.5
  • build: cmake improvements, by @foxeng, @Ohjurot, @LocalSpook, @teo-tsirpanis, @ur4t and @t-mat
  • build: meson scripts are now hosted in the build/ directory, by @eli-schwartz
  • build: meson improvements, by @tristan957
  • build: Visual Studio solutions generated by cmake via scripts
  • port : support for LoongArch, RISC-V, m68k, MIPS and SPARC architectures
  • port : improved Visual Studio compatibility, by @t-mat
  • port : freestanding support improvements, by @t-mat


Full Changelog: v1.9.4...v1.10.0

edit: the lz4-1.10.0.tar.gz artifact has been updated because the initial version embedded some macOS-specific files.

edit 2: the Windows binary packages have been updated to fix a bug affecting time measurement when compressing extremely large files.