Releases · lz4/lz4

@JunHe77

LZ4 v1.10.0 introduces major updates, integrating 600+ commits that significantly enhance its capabilities. This version brings multithreading support to the forefront, harnessing modern multi-core processors to accelerate both compression and decompression processing. It's a good upgrade for users looking to optimize performance in high-throughput environments.

Multithreading support

The most visible upgrade in this version is likely multithreading support. While LZ4 has historically been recognized for its high-speed compression, the demand for even faster throughput has grown, particularly with the advent of NVMe storage technologies that allow for multi-GB/s throughput.
Multithreading is particularly beneficial for High Compression modes, which now perform dramatically faster. The following benchmark table showcases the performance improvements:

source	cpu	os	level	v1.9.4	v1.10.0	Improvement
silesia.tar	7840HS	Win11	12	13.4 sec	1.8 sec	x7.4
silesia.tar	M1 Pro	macos	12	16.6 sec	2.55 sec	x6.5
silesia.tar	i7-9700k	linux	12	16.2 sec	3.05 sec	x5.4
enwik9	7840HS	Win11	9	20.8 sec	2.6 sec	x8.0
enwik9	M1 Pro	macos	9	22.1 sec	2.95 sec	x7.4
enwik9	i7-9700k	linux	9	22.9 sec	4.05 sec	x5.7

Multithreading is less critical for decompression, as modern NVMe drives can still be saturated with a single decompression thread. Nonetheless, the new version enhances performance by overlapping I/O operations with decompression processes.
Tested on an x64 Linux platform, decompressing a 5 GB text file locally takes 5 seconds with v1.9.4; this is reduced to 3 seconds in v1.10.0, corresponding to a performance improvement of more than 60%.
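
The improvement figure quoted above follows directly from the ratio of the two timings. As a small illustrative sketch (the helper names below are hypothetical and not part of lz4), the arithmetic can be checked in C:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not an lz4 function): throughput gain, in percent,
 * when the same job goes from old_seconds down to new_seconds. */
double throughput_gain_percent(double old_seconds, double new_seconds)
{
    assert(old_seconds > 0.0 && new_seconds > 0.0);
    return (old_seconds / new_seconds - 1.0) * 100.0;
}

/* Prints the gain for the 5 s -> 3 s decompression timings quoted above. */
void print_decompression_gain(void)
{
    printf("gain: +%.1f%%\n", throughput_gain_percent(5.0, 3.0));
}
```

Going from 5 seconds to 3 seconds means processing the same data about 1.67 times faster, which is where the "more than 60%" statement comes from.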

Official support for dictionary compression (and decompression)

Starting from v1.10.0, dictionary compression, previously tagged as "experimental", now receives full support. This upgrade assures stability and ongoing support for the feature, enabling developers to reliably use this functionality in their applications.
The new symbols supported by liblz4 are :

LZ4_loadDictSlow(): minor variant of LZ4_loadDict(), which consumes more initialization time to better reference the dictionary, resulting in slightly improved compression ratios.
LZ4_attach_dictionary(): use, in place, an LZ4 state initialized with a dictionary, to perform dictionary compression (LZ4 Block format) without the initialization costs. Very useful for small data, where dictionary initialization can become a bottleneck. The dictionary state can be used by multiple threads concurrently.
LZ4_attach_HC_dictionary(): same as LZ4_attach_dictionary(), but for LZ4HC dictionary compression.
LZ4F_compressBegin_usingDict(): initiate streaming compression to the LZ4Frame format, using a Dictionary.
LZ4F_decompress_usingDict(): decompress a LZ4Frame requiring a Dictionary
LZ4F_createCDict() : create a materialized dictionary, ready to start compression without initialization cost. Can be shared across multiple threads.
LZ4F_compressFrame_usingCDict(): one-shot compression to the LZ4Frame format, using materialized CDict
LZ4F_compressBegin_usingCDict(): initiate streaming compression to the LZ4Frame format, using materialized CDict

New compression level 2

The new "Level 2" compression effectively fills the substantial gap between the standard "Fast Level 1" and the more intensive "High Compression Level 3." It provides a balanced option, optimizing performance and compression as evidenced in the benchmark results below (i7-9700k, linux):

file	level 1	level 2	level 3
silesia.tar (speed)	685 MB/s	315 MB/s	110 MB/s
silesia.tar (ratio)	x2.101	x2.378	x2.606

Level 2 is ideal for applications requiring better compression than lz4 level 1, without the speed trade-offs associated with HC level 3.

Miscellaneous

The CLI now supports the environment variables LZ4_CLEVEL and LZ4_NBWORKERS, offering flexible control over its behavior in scenarios where direct commands are impractical, or when customized local defaults are necessary.
The licensing for the CLI and test programs has been clarified to GPL-2.0-or-later to distinguish it from GPL-2.0-only, enhancing transparency. The liblz4 library maintains its BSD 2-Clause license.
Various less common platforms have been validated (LoongArch, RISC-V, m68k, MIPS and SPARC), and are now continuously tested in CI to ensure portability.
Visual Studio solutions are now generated from the CMake recipe, in an effort to reduce manual maintenance of multiple solutions.

One-liner updates

cli : multithreaded compression support: speed improves roughly in proportion to the number of threads allocated
cli : overlap decompression with i/o, improving speed by >+60%
cli : support environment variables LZ4_CLEVEL and LZ4_NBWORKERS
cli : license of CLI more clearly labelled GPL-2.0-or-later
cli : fix: refuse to compress directories
cli : fix dictionary compression benchmark on multiple files
cli : change: no more implicit stdout (except when input is stdin)
lib : new level 2, offering mid-way performance (speed and compression)
lib : Improved lz4frame compression speed for small data (up to +160% at 1KB)
lib : Slightly faster (+5%) HC compression speed (levels 3-9), by @JunHe77
lib : dictionary compression support now in stable status
lib : lz4frame states can be safely reset and reused after a processing error (described by @QrczakMK)
lib : lz4file API improvements, by @vsolontsov-volant and @t-mat
lib : new experimental symbol LZ4_compress_destSize_extState()
build: cmake minimum version raised to 3.5
build: cmake improvements, by @foxeng, @Ohjurot, @LocalSpook, @teo-tsirpanis, @ur4t and @t-mat
build: meson scripts are now hosted into build/ directory, by @eli-schwartz
build: meson improvements, by @tristan957
build: Visual Studio solutions generated by cmake via scripts
port : support for loongArch, risc-v, m68k, mips and sparc architectures
port : improved Visual Studio compatibility, by @t-mat
port : freestanding support improvements, by @t-mat

Automated change log

Cancel in-progress CI if a new commit workflow supplants it by @tristan957 in #1142
allocation optimization for lz4frame compression by @Cyan4973 in #1158
fixed a few remaining ubsan warnings in lz4hc by @Cyan4973 in #1160
simplify getPosition by @Cyan4973 in #1161
build: Support BUILD_SHARED=no by in #1162
fix benchmark mode using Dictionary by @Cyan4973 in #1168
remove usages of base pointer by @Cyan4973 in #1163
fix rare ub by @Cyan4973 in #1169
LZ4 HC match finder and parsers use direct offset values by @Cyan4973 in #1173
very minor refactor of lz4.c by @Cyan4973 in #1174
Update Meson build to 1.9.4 by @tristan957 in #1139
Fixed const-ness of src data pointer in lz4file and install lz4file.h by @vsolontsov-volant in #1192
Add copying lz4file.h to make install by @vsolontsov-volant in #1191
Change the version of lib[x]gcc for clang-(11|12) -mx32 by @t-mat in #1197
Remove PATH=$(PATH) prefix from all shell script invocation in tests/Makefile by @t-mat in #1196
Add environment check for freestanding test : resolves #1186 by @t-mat in #1187
Declare read_long_length_no_check() static by @x4m in #1188
uncompressed-blocks: Allow uncompressed blocks for all modes by @alexmohr in #1178
fixed usan32 tests by @Cyan4973 in #1175
Meson updates by @tristan957 in #1184
Fix typo found by codespell by @DimitriPapadopoulos in #1204
Clean up generation of internal static library by @tristan957 in #1206
build: move meson files from contrib, to go alongside other build systems by @eli-schwartz in #1207
Improve LZ4F_decompress() docs by @embg in #1199
Add 64-bit detection for LoongArch by @zhaixiaojuan in #1209
refuse to compress directories by @Cyan4973 in #1212
Fix #1232 : lz4 command line utility sub-project for Visual Studio 2022 missing by @t-mat in #1233
Add security policy by @pnacht in #1238
fix #1246 by @Cyan4973 in #1247
fix: missing LZ4F_freeDecompressionContext by @t-mat in #1251
CI: updates (Add gcc-13 and clang-15. Fix msvc2022-x86-release) by @t-mat in #1245
Don't clobber default WINDRES in MinGW environments by @uckelman-sf in #1242
Reduce usage of variable cpy on decompression by @Nicoshev in #1226
lib/Makefile: Support building on legacy OS X by @sevan in #1220
Set CMake minimum requirement to 3.5 by @haampie in #1228
Adding XXH_NAMESPACE to CMake builds by @Ohjurot in #1258
Apply pyupgrade suggesti...

@yawqi

LZ4 v1.9.4 is a maintenance release, featuring a substantial amount (~350 commits) of minor fixes and improvements, making it a recommended upgrade. The stable portion of liblz4 API is unmodified, making this release a drop-in replacement for existing features.

Improved decompression speed

Performance wasn't a major focus of this release, but there are nonetheless a few improvements worth mentioning :

Decompression speed on high-end ARM64 platform is improved, by ~+20%. This is notably the case for recent M1 chips, featured in macbook laptops and nucs. Some server-class ARM64 cpus are also impacted, most notably when employing gcc as a compiler. Due to the diversity of aarch64 chips in service, it's still difficult to have a one-size-fits-all policy for this platform.
For the specific scenario of data compressed with -BD4 setting (small blocks, <= 64 KB, linked) decompressed block-by-block into a flush buffer, decompression speed is improved ~+70%. This is most visible in the lz4 CLI, which triggers this exact scenario, but since the improvement is achieved at library level, it may also apply to other scenarios.
Additionally, for compressed data employing the lz4frame format (native format of lz4 CLI), it's possible to ignore checksum validation during decompression, resulting in speed improvements of ~+40%. This capability is exposed at both CLI (see --no-crc) and library levels.

New experimental library capabilities

New liblz4 capabilities are provided in this version. They are considered experimental at this stage, and the most useful ones will become candidates for "stable" status in an upcoming release :

Ability to require lz4frame API to employ custom allocators for dynamic allocation.
Partial decompression of LZ4 blocks compressed with a dictionary, using LZ4_decompress_safe_partial_usingDict() by @yawqi
Create lz4frame blocks which are intentionally uncompressed, using LZ4F_uncompressedUpdate(), by @alexmohr
New API unit lz4file, abstracting File I/O operations for higher-level programs and libraries, by @anjiahao1
liblz4 can be built for freestanding environments, using the new build macro LZ4_FREESTANDING, by @t-mat. In that case, it does not link to any standard library, disables all dynamic allocations, and relies on user-provided memcpy() and memset() operations.

Miscellaneous

Fixed an annoying Makefile bug introduced in v1.9.3, in which CFLAGS was no longer respected when provided from an environment variable. The root cause was an obscure bug in make, which has been fixed upstream following this bug report. There is no need to update make to build liblz4 though: the Makefile has been modified to circumvent the issue and remains compatible with older versions of make.
The Makefile is compatible with -j parallel runs, including parallel tests (make -j test).
Documentation of the LZ4 Block format has been updated, notably featuring a paragraph "Implementation notes", underlining common pitfalls for new implementers of the format.

Changes list

Here is a more detailed list of updates introduced in v1.9.4 :

perf : faster decoding speed (~+20%) on Apple Silicon platforms, by @zeux
perf : faster decoding speed (~+70%) for -BD4 setting in CLI
api : new function LZ4_decompress_safe_partial_usingDict() by @yawqi
api : lz4frame: ability to provide custom allocators at state creation
api : can skip checksum validation for improved decoding speed
api : new experimental unit lz4file for file i/o API, by @anjiahao1
api : new experimental function LZ4F_uncompressedUpdate(), by @alexmohr
cli : --list works on stdin input, by @Low-power
cli : --no-crc does not produce (compression) nor check (decompression) checksums
cli : fix: --test and --list produce an error code when parsing invalid input
cli : fix: --test -m no longer creates decompressed file artifacts
cli : fix: support skippable frames when passed via stdin, reported by @davidmankin
build: fix: Makefile respects CFLAGS directives passed via environment variable
build: LZ4_FREESTANDING, new build macro for freestanding environments, by @t-mat
build: make and make test are compatible with -j parallel run
build: AS/400 compatibility, by @jonrumsey
build: Solaris 10 compatibility, by @pekdon
build: MSVC 2022 support, by @t-mat
build: improved meson script, by @eli-schwartz
doc : Updated LZ4 block format, provide an "implementation notes" section

New Contributors

@emaxerrno made their first contribution in #884
@servusdei2018 made their first contribution in #886
@aqrit made their first contribution in #898
@attilaolah made their first contribution in #919
@XVilka made their first contribution in #922
@hmaarrfk made their first contribution in #962
@ThomasWaldmann made their first contribution in #965
@sigiesec made their first contribution in #964
@klebertarcisio made their first contribution in #973
@jasperla made their first contribution in #972
@GabeNI made their first contribution in #1001
@ITotalJustice made their first contribution in #1005
@lifegpc made their first contribution in #1000
@eloj made their first contribution in #1011
@pekdon made their first contribution in #999
@fanzeyi made their first contribution in #1017
@a1346054 made their first contribution in #1024
@kmou424 made their first contribution in #1026
@kostasdizas made their first contribution in #1030
@fwessels made their first contribution in #1032
@zeux made their first contribution in #1040
@DimitriPapadopoulos made their first contribution in #1042
@mcfi made their first contribution in #1054
@eli-schwartz made their first contribution in #1049
@leonvictor made their first contribution in #1052
@tristan957 made their first contribution in #1064
@anjiahao1 made their first contribution in #1068
@danyeaw made their first contribution in #1075
@yawqi made their first contribution in #1093
@nathannaveen made their first contribution in #1088
@alexmohr made their first contribution in #1094
@yoniko made their first contribution in #1100
@jonrumsey made their first contribution in #1104
@dpelle made their first contribution in #1125
@SpaceIm made their first contribution in #1133

@wolfpld

LZ4 v1.9.3 is a maintenance release, offering more than 200 commits to fix multiple corner cases and build scenarios. Updating is recommended. The existing liblz4 API is not modified, so it should be a drop-in replacement.

Faster Windows binaries

On the build side, multiple rounds of improvements, thanks to contributors such as @wolfpld and @remittor, make this version generate faster binaries for Visual Studio. It is also expected to better support a broader range of VS variants.
Speed benefits can be substantial. For example, on my laptop, compared with v1.9.2, this version built with VS2019 compresses at 640 MB/s (from 420 MB/s), and decompression reaches 3.75 GB/s (from 3.3 GB/s). So this is definitely perceptible.

Other notable updates

Among the visible fixes, this version improves the _destSize() variant, an advanced API which reverses the logic by targeting an a-priori compressed size and trying to shove as much data as possible into the target budget. The high compression variant LZ4_compress_HC_destSize() would miss some important opportunities in highly compressible data, resulting in less than optimal compression (detected by @hsiangkao). This is fixed in this version. Even the "fast" variant receives some gains (albeit very small).
Also, the corresponding decompression function, LZ4_decompress_safe_partial(), officially supports a scenario where the input (compressed) size is unknown (but bounded), as long as the requested amount of data to regenerate is smaller than or equal to the block's content. This function used to require the exact compressed size, and would sometimes support the above scenario "by accident", but then could also break it by accident. This is now firmly controlled, documented and tested.

Finally, replacing memory functions (malloc(), calloc(), free()), typically for freestanding environments, is now a bit easier. It used to require a small direct modification of lz4.c source code, but can now be achieved by using the build macro LZ4_USER_MEMORY_FUNCTIONS at compilation time. In that case, liblz4 no longer includes <stdlib.h>, and requires instead that functions LZ4_malloc(), LZ4_calloc() and LZ4_free() are implemented somewhere in the project, and then available at link time.
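
To make the link-time requirement more concrete, here is a minimal sketch of the three functions a project could provide. The prototypes are assumed to mirror malloc(), calloc() and free(), so check them against your copy of lz4.c. For illustration only, the bodies forward to the C allocator and keep a counter; a real freestanding project would plug in its own memory pool instead. The counter and the `live_allocations()` accessor are hypothetical additions, not part of lz4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of blocks currently handed out; bookkeeping added for this sketch only. */
static size_t g_live_allocations = 0;

/* The three hooks liblz4 expects at link time when it is compiled with
 * LZ4_USER_MEMORY_FUNCTIONS. Here they simply wrap the C allocator. */
void* LZ4_malloc(size_t size)
{
    void* p = malloc(size);
    if (p != NULL) g_live_allocations++;
    return p;
}

void* LZ4_calloc(size_t count, size_t size)
{
    void* p = calloc(count, size);
    if (p != NULL) g_live_allocations++;
    return p;
}

void LZ4_free(void* p)
{
    if (p == NULL) return;
    assert(g_live_allocations > 0);
    g_live_allocations--;
    free(p);
}

/* Hypothetical accessor so callers can check for leaks. */
size_t live_allocations(void) { return g_live_allocations; }
```

This file would be compiled alongside an lz4.c built with the LZ4_USER_MEMORY_FUNCTIONS macro defined, so that the library's internal allocations resolve to these definitions.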

Changes list

Here is a more detailed list of updates introduced in v1.9.3 :

perf: highly improved speed in kernel space, by @terrelln
perf: faster speed with Visual Studio, thanks to @wolfpld and @remittor
perf: improved dictionary compression speed, by @felixhandte
perf: fixed LZ4_compress_HC_destSize() ratio, detected by @hsiangkao
perf: reduced stack usage in high compression mode, by @Yanpas
api : LZ4_decompress_safe_partial() supports unknown compressed size, requested by @jfkthame
api : improved LZ4F_compressBound() with automatic flushing, by Christopher Harvie
api : can (de)compress to/from NULL without UBs
api : fix alignment test on 32-bit systems (state initialization)
api : fix LZ4_saveDictHC() in corner case scenario, detected by @IgorKorkin
cli : compress multiple files using the legacy format, by Filipe Calasans
cli : benchmark mode supports dictionary, by @rkoradi
cli : fix --fast with large argument, detected by @picoHz
build: link to user-defined memory functions with LZ4_USER_MEMORY_FUNCTIONS
build: contrib/cmake_unofficial/ moved to build/cmake/
build: visual/* moved to build/
build: updated meson script, by @neheb
build: tinycc support, by Anton Kochkov
install: Haiku support, by Jerome Duval
doc : updated LZ4 frame format, clarify EndMark

Known issues :

Some people have reported a broken liblz4_static.lib file in the package lz4_win64_v1_9_3.zip. This is probably a mingw / msvc compatibility issue. If you have issues employing this file, the solution is to rebuild it locally from sources with your target compiler.
The standard Makefile in v1.9.3 doesn't honor CFLAGS when passed through an environment variable. This is fixed in more recent versions on the dev branch. See #958 for details.

@cmeister2

This is primarily a bugfix release, driven by the bugs found and fixed since LZ4's recent integration into Google's oss-fuzz, initiated by @cmeister2. The new capability was put to good use by @terrelln, dramatically expanding the number of scenarios covered by the profile-guided fuzzer. These scenarios were already covered by unguided fuzzers, but a few bugs require a large combination of factors that unguided fuzzers are unable to produce in a reasonable timeframe.

Due to these fixes, an upgrade of LZ4 to its latest version is recommended.

fix : out-of-bound read in exceptional circumstances when using decompress_partial(), by @terrelln
fix : slim opportunity for out-of-bound write with compress_fast() with a large enough input and when providing an output smaller than recommended (< LZ4_compressBound(inputSize)), by @terrelln
fix : rare data corruption bug with LZ4_compress_destSize(), by @terrelln
fix : data corruption bug when Streaming with an Attached Dict in HC Mode, by @felixhandte
perf: enable LZ4_FAST_DEC_LOOP on aarch64/GCC by default, by @prekageo
perf: improved lz4frame streaming API speed, by @dreambottle
perf: speed up lz4hc on slow patterns when using external dictionary, by @terrelln
api: better in-place decompression and compression support
cli : --list supports multi-frames files, by @gstedman
cli: --version outputs to stdout
cli : add option --best as an alias of -12 , by @Low-power
misc: Integration into oss-fuzz by @cmeister2, expanded list of scenarios by @terrelln

@gabrielstedman

This is a point release, whose main objective is to fix a read out-of-bound issue reported in the decoder of v1.9.0. Upgrading from v1.9.0 is recommended.

A few other improvements were also merged during this time frame (listed below).
A visible user-facing one is the introduction of a new command --list, started by @gabrielstedman, which makes it possible to peek at the internals of a .lz4 file. It will provide the block type, checksum information, compressed and decompressed sizes (if present). The command is limited to single-frame files for the time being.

Changes

fix : decompression functions were reading a few bytes beyond input size (introduced in v1.9.0, reported by @ppodolsky and @danlark1)
api : fix : lz4frame initializers compatibility with c++, reported by @degski
cli : added command --list, based on a patch by @gabrielstedman
build: improved Windows build, by @JPeterMugaas
build: AIX, by Norman Green

Note : this release has an issue when compiling liblz4 dynamic library on Mac OS-X. This issue is fixed in #696.

@djwatson

Warning : this version has a known bug in the decompression function which makes it read a few bytes beyond input limit. Upgrade to v1.9.1 is recommended.

LZ4 v1.9.0 is a performance focused release, also offering minor API updates.

Decompression speed improvements

Dave Watson (@djwatson) managed to carefully optimize the LZ4 decompression hot loop, offering substantial speed improvements on x86 and x64 platforms.

Here are some benchmarks run on a Core i7-9700K, source compiled using gcc v8.2.0 on Ubuntu 18.10 "Cosmic Cuttlefish" (Linux 4.18.0-17-generic) :

Version	v1.8.3	v1.9.0	Improvement
enwik8	4090 MB/s	4560 MB/s	+12%
calgary.tar	4320 MB/s	4860 MB/s	+13%
silesia.tar	4210 MB/s	4970 MB/s	+18%

Given that decompression speed has always been a strong point of lz4, the improvement is quite substantial.

The new decoding loop is automatically enabled on x64 and x86.
For other cpu types, since our testing capabilities are more limited, the new decoding loop is disabled by default. However, anyone can manually enable it, by using the build macro LZ4_FAST_DEC_LOOP, which accepts values 0 or 1. The outcome will vary depending on exact target and build chains. For example, in our limited tests with ARM platforms, we found that benefits vary strongly depending on cpu manufacturer, chip model, and compiler version, making it difficult to offer a "generic" statement. ARM situation may prove extreme though, due to the proliferation of variants available. Other cpu types may prove easier to assess.

API updates

`_destSize()`

The _destSize() compression variants have been promoted to stable status.
These variants reverse the logic, by trying to fit as much input data as possible into a fixed memory budget. This is used for example in WiredTiger and EroFS, which cram as much data as possible into the size of a physical sector, for improved storage density.

`reset*_fast()`

When compressing small inputs, the fixed cost of clearing the compression's internal data structures can become a significant fraction of the compression cost. In v1.8.2, new LZ4 entry points have been introduced to perform this initialization at effectively zero cost. LZ4_resetStream_fast() and LZ4_resetStreamHC_fast() are now promoted into stable.

They are supplemented by new entry points, LZ4_initStream() and its corresponding HC variant, which must be used on any uninitialized memory segment that will be converted into an LZ4 state. After that, only reset*_fast() is needed to start some new compression job re-using the same context. This proves especially effective when compressing a lot of small data.

deprecation

The decompress*_fast() variants have been moved into the deprecated section.
While they offer slightly faster decompression speed (~+5%), they are also unprotected against malicious inputs, resulting in security liability. There are some limited cases where this property could prove acceptable (perfectly controlled environment, same producer / consumer), but in most cases, the risk is not worth the benefit.
We want to discourage such usage as clearly as possible, by pushing the _fast() variant into deprecation area.
For the time being, they will not yet generate deprecation warnings when invoked, to give time to existing applications to move towards decompress*_safe(). But this is the next stage, and is likely to happen in a future release.

LZ4_resetStream() and LZ4_resetStreamHC() have also been moved into the deprecated section, to emphasize the preference towards LZ4_resetStream_fast(). Their real equivalents are actually LZ4_initStream() and LZ4_initStreamHC(), which are more generic (can accept any memory area to initialize) and safer (control size and alignment). Also, the naming makes it clearer when to use initStream() and when to use resetStream_fast().

Changes list

This release brings an assortment of small improvements and bug fixes, as detailed below :

perf: large decompression speed improvement on x86/x64 (up to +20%) by @djwatson
api : changed : _destSize() compression variants are promoted to stable API
api : new : LZ4_initStream(HC), replacing LZ4_resetStream(HC)
api : changed : LZ4_resetStream_fast() and LZ4_resetStreamHC_fast() are now the recommended reset functions, for better performance on small data
cli : support custom block sizes, by @blezsan
build: source code can be amalgamated, by Bing Xu
build: added meson build, by @lzutao
build: new build macros : LZ4_DISTANCE_MAX, LZ4_FAST_DEC_LOOP
install: MidnightBSD, by @laffer1
install: msys2 on Windows 10, by @vtorri

@Pashugan

This is a maintenance release, mainly triggered by issue #560.
#560 is a data corruption that can only occur in v1.8.2, at level 9 (only), for some "large enough" data blocks (> 64 KB), featuring a fairly specific data pattern, improbable enough that multiple CPUs running various fuzzers non-stop for several weeks were not able to find it. Big thanks to @Pashugan for finding and sharing a reproducible sample.

Due to this fix, v1.8.3 is a recommended update.

A few other minor features were already merged, and are therefore bundled in this release too.

Should lz4 prove too slow, it's now possible to invoke the --fast=# command, by @jennifermliu. This is equivalent to the acceleration parameter in the API, in which the user forfeits some compression ratio for the benefit of better speed.

The verbose CLI has been fixed, and now displays the real amount of time spent compressing (instead of cpu time). It also shows a new indicator, cpu load %, so that users can determine if the limiting factor was cpu or I/O bandwidth.

Finally, an existing function, LZ4_decompress_safe_partial(), has been enhanced to make it possible to decompress only the beginning of an LZ4 block, up to a specified number of bytes. Partial decoding can be useful to save CPU time and memory, when the objective is to extract a limited portion from a larger block.

@svpv

LZ4 v1.8.2 is a performance focused release, featuring important improvements for small inputs, especially when coupled with dictionary compression.

General speed improvements

LZ4 decompression speed has always been a strong point. In v1.8.2, this gets even better, as it improves decompression speed by about 10%, thanks in large part to suggestions from @svpv.

For example, on a Mac OS-X laptop with an Intel Core i7-5557U CPU @ 3.10GHz,
running lz4 -bsilesia.tar compiled with default compiler llvm v9.1.0:

Version	v1.8.1	v1.8.2	Improvement
Decompression speed	2490 MB/s	2770 MB/s	+11%

Compression speeds also receive a welcome boost, though the improvement is not evenly distributed, with higher levels benefiting quite a lot more.

Version	v1.8.1	v1.8.2	Improvement
lz4 -1	504 MB/s	516 MB/s	+2%
lz4 -9	23.2 MB/s	25.6 MB/s	+10%
lz4 -12	3.5 MB/s	9.5 MB/s	+170%

Should you aim for best possible decompression speed, it's possible to request LZ4 to actively favor decompression speed, even if it means sacrificing some compression ratio in the process. This can be requested in a variety of ways depending on interface, such as using command --favor-decSpeed on CLI. This option must be combined with ultra compression mode (levels 10+), as it needs careful weighting of multiple solutions, which only this mode can process.
The resulting compressed object always decompresses faster, but is also larger. Your mileage will vary, depending on file content. Speed improvement can be as low as 1%, and as high as 40%. It's matched by a corresponding file size increase, which tends to be proportional. The general expectation is 10-20% faster decompression speed for 1-2% bigger files.

Filename	decompression speed	`--favor-decSpeed`	Speed Improvement	Size change
silesia.tar	2870 MB/s	3070 MB/s	+7 %	+1.45%
dickens	2390 MB/s	2450 MB/s	+2 %	+0.21%
nci	3740 MB/s	4250 MB/s	+13 %	+1.93%
osdb	3140 MB/s	4020 MB/s	+28 %	+4.04%
xml	3770 MB/s	4380 MB/s	+16 %	+2.74%

Finally, variant LZ4_compress_destSize() also receives a ~10% speed boost, since it now internally redirects toward primary internal implementation of LZ4 fast mode, rather than relying on a separate custom implementation. This allows it to take advantage of all the optimization work that has gone into the main implementation.

Compressing small contents

When compressing small inputs, the fixed cost of clearing the compression's internal data structures can become a significant fraction of the compression cost. This release adds a new way, under certain conditions, to perform this initialization at effectively zero cost.

New, experimental LZ4 APIs have been introduced to take advantage of this functionality in block mode:

LZ4_resetStream_fast()
LZ4_compress_fast_extState_fastReset()
LZ4_resetStreamHC_fast()
LZ4_compress_HC_extStateHC_fastReset()

More detail about how and when to use these functions is provided in their respective headers.

LZ4 Frame mode has been modified to use this faster reset whenever possible. LZ4F_compressFrame_usingCDict() prototype has been modified to additionally take an LZ4F_CCtx* context, so it can use this speed-up.

Efficient Dictionary compression

Support for dictionaries has been improved in a similar way: they can now be used in-place, which avoids the expense of copying the context state from the dictionary into the working context. Users can expect to see a noticeable performance improvement for small data.

Experimental prototypes (LZ4_attach_dictionary() and LZ4_attach_HC_dictionary()) have been added to LZ4 block API using a loaded dictionary in-place. LZ4 Frame API users should benefit from this optimization transparently.

The previous two changes, when taken advantage of, can provide meaningful performance improvements when compressing small data. Both changes have no impact on the produced compressed data. The only observable difference is speed.

This is a representative graphic of the sort of speed boost to expect. The red lines are the speeds seen for an input blob of the specified size, using the previous LZ4 release (v1.8.1) at compression levels 1 and 9 (those being, fast mode and default HC level). The green lines are the equivalent observations for v1.8.2. This benchmark was performed on the Silesia Corpus. Results for the dickens text are shown, other texts and compression levels saw similar improvements. The benchmark was compiled with GCC 7.2.0 with -O3 -march=native -mtune=native -DNDEBUG under Linux 4.6 and run on an Intel Xeon CPU E5-2680 v4 @ 2.40GHz.

`lz4frame_static.h` Deprecation

The content of lz4frame_static.h has been folded into lz4frame.h, hidden by a macro guard "#ifdef LZ4F_STATIC_LINKING_ONLY". This means lz4frame.h now matches lz4.h and lz4hc.h. lz4frame_static.h is retained as a shell that simply sets the guard macro and includes lz4frame.h.

Changes list

This release also brings an assortment of small improvements and bug fixes, as detailed below :

perf: faster compression on small files, by @felixhandte
perf: improved decompression speed and binary size, by Alexey Tourbin (@svpv)
perf: faster HC compression, especially at max level
perf: very small compression ratio improvement
fix : compression compatible with low memory addresses (< 0xFFFF)
fix : decompression segfault when provided with NULL input, by @terrelln
cli : new command --favor-decSpeed
cli : benchmark mode more accurate for small inputs
fullbench : can bench _destSize() variants, by @felixhandte
doc : clarified block format parsing restrictions, by Alexey Tourbin (@svpv)

@felixhandte

The most visible new feature of LZ4 v1.8.1 is its support for Dictionary compression.
This was already somewhat possible, but in a complex way, requiring knowledge of the library's internal workings.
Support is now more formally added on the API side within lib/lz4frame_static.h. It's early days, and this new API is tagged "experimental" for the time being.

Support is also added in the command line utility lz4, using the new command -D, implemented by @felixhandte. The behavior of this command is identical to that of zstd, should you already be familiar with it.

lz4 doesn't specify how to build a dictionary. All it says is that it can be any file up to 64 KB.
This approach is compatible with the zstd dictionary builder, which can be instructed to create a 64 KB dictionary with this command :

zstd --train dirSamples/* -o dictName --maxdict=64KB
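
To illustrate why a dictionary matters most for small inputs, here is a minimal sketch. It does not use LZ4: the Python standard library has no LZ4 binding, so zlib's preset-dictionary feature (`zdict`) is swapped in as a stand-in for the same idea. The sample dictionary and message are hypothetical.

```python
import zlib

def compress_with_dict(data: bytes, dictionary: bytes) -> bytes:
    # zlib stand-in: the dictionary primes the match window before any input is seen.
    comp = zlib.compressobj(level=6, zdict=dictionary)
    return comp.compress(data) + comp.flush()

def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    # The decompressor must be given the very same dictionary.
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()

# Hypothetical data: the dictionary is a sample that resembles future messages.
dictionary = b'{"user_id": 0, "status": "active", "country": "FR", "plan": "free"}'
message = b'{"user_id": 4821, "status": "active", "country": "DE", "plan": "free"}'

plain = zlib.compress(message, 6)
with_dict = compress_with_dict(message, dictionary)

print(len(message), len(plain), len(with_dict))
```

A short message has almost no internal repetition, so compressing it alone gains little. With the dictionary, most of the message is encoded as references to already-known content.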

LZ4 v1.8.1 also offers improved performance at ultra settings (levels 10+).
These levels receive new code, called the optimal parser, available in lib/lz4_opt.h.
Compared with the previous version, the new parser uses less memory (from 384KB to 256KB), performs faster, compresses a little bit better (not much, as it was already close to the theoretical limit), and resists pathological patterns which could destroy performance (see #339).

For comparison, here are some quick benchmarks using LZ4 v1.8.0 on my laptop with silesia.tar :

./lz4 -b9e12 -v ~/dev/bench/silesia.tar
*** LZ4 command line interface 64-bits v1.8.0, by Yann Collet ***
Benchmarking levels from 9 to 12
 9#silesia.tar       : 211984896 ->  77897777 (2.721),  24.2 MB/s ,2401.8 MB/s
10#silesia.tar       : 211984896 ->  77852187 (2.723),  16.9 MB/s ,2413.7 MB/s
11#silesia.tar       : 211984896 ->  77435086 (2.738),   7.1 MB/s ,2425.7 MB/s
12#silesia.tar       : 211984896 ->  77274453 (2.743),   3.3 MB/s ,2390.0 MB/s

and now using LZ4 v1.8.1 :

./lz4 -b9e12 -v ~/dev/bench/silesia.tar
*** LZ4 command line interface 64-bits v1.8.1, by Yann Collet ***
Benchmarking levels from 9 to 12
 9#silesia.tar       : 211984896 ->  77890594 (2.722),  24.4 MB/s ,2405.2 MB/s
10#silesia.tar       : 211984896 ->  77859538 (2.723),  19.3 MB/s ,2476.0 MB/s
11#silesia.tar       : 211984896 ->  77369725 (2.740),  10.1 MB/s ,2478.4 MB/s
12#silesia.tar       : 211984896 ->  77270146 (2.743),   3.7 MB/s ,2508.3 MB/s
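
The two runs are easier to compare once the per-level compression speeds are divided. The following sketch parses the benchmark lines exactly as printed above and computes the v1.8.1 / v1.8.0 speed gain. The regular expression is an assumption fitted to this particular output, not a documented format.

```python
import re

# Benchmark lines copied verbatim from the two runs above.
V180 = """\
 9#silesia.tar       : 211984896 ->  77897777 (2.721),  24.2 MB/s ,2401.8 MB/s
10#silesia.tar       : 211984896 ->  77852187 (2.723),  16.9 MB/s ,2413.7 MB/s
11#silesia.tar       : 211984896 ->  77435086 (2.738),   7.1 MB/s ,2425.7 MB/s
12#silesia.tar       : 211984896 ->  77274453 (2.743),   3.3 MB/s ,2390.0 MB/s
"""
V181 = """\
 9#silesia.tar       : 211984896 ->  77890594 (2.722),  24.4 MB/s ,2405.2 MB/s
10#silesia.tar       : 211984896 ->  77859538 (2.723),  19.3 MB/s ,2476.0 MB/s
11#silesia.tar       : 211984896 ->  77369725 (2.740),  10.1 MB/s ,2478.4 MB/s
12#silesia.tar       : 211984896 ->  77270146 (2.743),   3.7 MB/s ,2508.3 MB/s
"""

# level#file : srcSize -> dstSize (ratio), compSpeed MB/s ,decompSpeed MB/s
LINE = re.compile(
    r"^\s*(\d+)#\S+\s*:\s*(\d+)\s*->\s*(\d+)\s*\(([\d.]+)\),"
    r"\s*([\d.]+) MB/s\s*,\s*([\d.]+) MB/s"
)

def parse(report):
    """Map compression level -> metrics for each benchmark line."""
    rows = {}
    for line in report.splitlines():
        m = LINE.match(line)
        if m:
            src, dst = int(m[2]), int(m[3])
            rows[int(m[1])] = {
                "ratio": src / dst,          # recomputed from the byte counts
                "reported": float(m[4]),     # ratio as printed by the tool
                "comp": float(m[5]),         # compression speed, MB/s
                "decomp": float(m[6]),       # decompression speed, MB/s
            }
    return rows

old, new = parse(V180), parse(V181)
gains = {level: new[level]["comp"] / old[level]["comp"] for level in old}

for level in sorted(gains):
    print(f"level {level}: x{gains[level]:.2f} compression speed")
```

On these numbers the largest gain is at level 11 (7.1 MB/s to 10.1 MB/s), while level 9, which does not use the optimal parser, is essentially unchanged.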

The new parser is also directly compatible with lower compression levels, which brings additional benefits :

Compatibility with the LZ4_*_destSize() variants, which reverse the logic by trying to fit as much data as possible into a predefined limited size buffer.
Compatibility with Dictionary compression, as it uses the same tables as regular HC mode

In the future, this compatibility will also allow dynamic on-the-fly changes of compression level, but this feature is not implemented at this stage.
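
The "reversed logic" of the _destSize() variants can be sketched from the outside: instead of asking how large the output will be for a given input, the caller fixes the output budget and asks how much input was consumed. The sketch below is not how liblz4 implements it. It emulates the contract with zlib (a stand-in, since the Python standard library has no LZ4 binding) and a binary search over prefix lengths; the function name is hypothetical.

```python
import zlib

def compress_dest_size(src: bytes, dst_capacity: int):
    """Return (consumed, blob), where blob is the compressed form of
    src[:consumed] and is guaranteed to fit in dst_capacity bytes.

    Compressed size is not strictly monotonic in input length, so the
    binary search finds a long prefix that fits, not a proven maximum.
    """
    best_len, best_blob = 0, b""
    lo, hi = 1, len(src)
    while lo <= hi:
        mid = (lo + hi) // 2
        blob = zlib.compress(src[:mid])
        if len(blob) <= dst_capacity:
            best_len, best_blob = mid, blob   # fits: try a longer prefix
            lo = mid + 1
        else:
            hi = mid - 1                      # too big: try a shorter prefix
    return best_len, best_blob

# Hypothetical input: the decimal digits of 0..4999 concatenated.
data = b"".join(str(i).encode() for i in range(5000))
consumed, blob = compress_dest_size(data, 256)
print(consumed, len(blob))
```

The caller then continues from `data[consumed:]` for the next fixed-size block, which is the typical use for this kind of interface.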

The release also provides a set of small bug fixes and improvements, listed below :

perf : faster and stronger ultra modes (levels 10+)
perf : slightly faster compression and decompression speed
perf : fix bad degenerative case, reported by @c-morgenstern
fix : decompression failed when using a combination of extDict + low memory address (#397), reported and fixed by Julian Scheid (@jscheid)
cli : support for dictionary compression (-D), by Felix Handte @felixhandte
cli : fix : lz4 -d --rm preserves timestamp (#441)
cli : fix : do not modify /dev/null permission as root, by @aliceatlas
api : new dictionary api in lib/lz4frame_static.h
api : _destSize() variant supported for all compression levels
build : make and make test compatible with parallel build -jX, reported by @mwgamera
build : can control LZ4LIB_VISIBILITY macro, by @mikir
install: fix man page directory (#387), reported by Stuart Cardall (@itoffshore)

Note : v1.8.1.2 is the same as v1.8.1, with the version number fixed in source code, as notified by Po-Chuan Hsieh (@sunpoet).

@sunpoet

Prefer using v1.8.1.2.
It's the same as v1.8.1, but the version number in source code has been fixed, thanks to @sunpoet.
The version number is used in the cli and documentation display, to create the full name of the dynamic library, and can be requested via LZ4_versionNumber().
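
As a small sketch of what the integer returned by LZ4_versionNumber() encodes, assuming the packing used by the LZ4_VERSION_NUMBER macro in lz4.h (major * 100 * 100 + minor * 100 + release). The helper names below are hypothetical.

```python
def encode_version(major: int, minor: int, release: int) -> int:
    # Same packing as the LZ4_VERSION_NUMBER macro (assumption stated above).
    return major * 100 * 100 + minor * 100 + release

def decode_version(number: int):
    # Inverse operation: split the packed integer back into its three fields.
    major, rest = divmod(number, 100 * 100)
    minor, release = divmod(rest, 100)
    return major, minor, release

print(encode_version(1, 8, 1))
print(decode_version(10801))
```

Note that this scheme has only three fields, so a four-part tag such as v1.8.1.2 cannot be distinguished from v1.8.1 through this number alone.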
