Add a compress-level parameter #418

dybvig · 2019-04-11T02:44:24Z

This patch adds a compress-level parameter that can be used to control the amount of effort expended compressing, whether with the default LZ4 compression or gzip compression. The parameter can take on one of the symbolic values low, medium, high, or maximum. The default is medium, which for LZ4 currently maps to LZ4HC_CLEVEL_MIN and is both more effective and slower (at compression, but not decompression) than the previous default. The patch also makes several changes to the LZ4-support code added in #410; in particular, it restructures the code a bit, puts the compress format directly in the thread context (which now also holds the compress level), and changes how the compressed bytevector compression format and length are stored in the header word. I have tested the changes on OSX and Linux but not Windows.
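As a rough illustration of the symbolic levels described above, here is a minimal C sketch of how a level name might be mapped to an LZ4 effort value. Only the medium to `LZ4HC_CLEVEL_MIN` mapping comes from the description. The enum, the function names, the other three mappings, and the numeric stand-ins for the `lz4hc.h` constants are assumptions made so the sketch compiles without the LZ4 headers. It is not the patch's actual code.

```c
#include <string.h>

/* Stand-ins for LZ4HC_CLEVEL_MIN / LZ4HC_CLEVEL_MAX from lz4hc.h.
   The numeric values are placeholders (assumption); real code would
   include lz4hc.h and use the library's own macros. */
#define SKETCH_LZ4HC_CLEVEL_MIN 3
#define SKETCH_LZ4HC_CLEVEL_MAX 12

/* Hypothetical representation of the four symbolic values. */
typedef enum { LEVEL_LOW, LEVEL_MEDIUM, LEVEL_HIGH, LEVEL_MAXIMUM } sketch_level;

/* Parse a symbolic level name; returns 0 on success, -1 if unknown. */
static int sketch_parse_level(const char *name, sketch_level *out) {
  static const char *names[] = { "low", "medium", "high", "maximum" };
  for (int i = 0; i < 4; i++) {
    if (strcmp(name, names[i]) == 0) {
      *out = (sketch_level)i;
      return 0;
    }
  }
  return -1;
}

/* Map a symbolic level to an LZ4 effort value.
   medium -> LZ4HC_CLEVEL_MIN is stated in the PR description;
   the low, high, and maximum mappings are guesses for illustration.
   A result of 0 stands for "use the plain, non-HC LZ4 compressor". */
static int sketch_lz4_effort(sketch_level level) {
  switch (level) {
    case LEVEL_LOW:     return 0;
    case LEVEL_MEDIUM:  return SKETCH_LZ4HC_CLEVEL_MIN;
    case LEVEL_HIGH:    return (SKETCH_LZ4HC_CLEVEL_MIN + SKETCH_LZ4HC_CLEVEL_MAX) / 2;
    case LEVEL_MAXIMUM: return SKETCH_LZ4HC_CLEVEL_MAX;
  }
  return -1; /* unexpected value */
}
```

Using a `switch` with an explicit fall-through error value mirrors the idea of rejecting unexpected levels rather than silently picking one.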

mflatt · 2019-04-11T14:05:10Z

These changes look good to me.

(Just to confirm: Racket-on-Chez builds fine with the patch, its boot file size goes down by about 17%, and load time stays about the same or is perhaps slightly faster.)

burgerrg

Thanks for the changes and for finding the missing documentation from the named condition and mutex change! I found a bug in the tag extraction of compressed bytevectors, a couple missing casts for the Windows C compiler, and a couple typos. I'm running the mats on Windows now. So far ta6nt passed o=0 and o=3.

c/externs.h

c/compress-io.c

c/new-io.c

csug/io.stex

s/bytevector.ss

c/new-io.c

burgerrg · 2019-04-11T23:55:44Z

All the Windows mats passed!

dybvig · 2019-04-12T00:52:26Z

Thanks for the corrections!

c/compress-io.c

c/thread.c

owaddell · 2019-04-18T10:36:24Z

Thanks for all the changes on this branch! Valgrind's memcheck tool is happy now.

Note typo on line 85 of compress-io.c: "lcoal" -> "local".

… other related improvements

- added compress-level parameter to select a compression level for file writing and changed the default for lz4 compression to do a better job compressing. finished splitting glz input routines apart from glz output routines and did a bit of other restructuring. removed gzxfile struct-as-bytevector wrapper and moved its fd into glzFile. moved DEACTIVATE to before glzdopen_input calls in S_new_open_input_fd and S_compress_input_fd, since glzdopen_input reads from the file and could block. the compress format and, now, the level are recorded directly in the thread context. replaced as-gz? flag bit in compressed bytevector header word with a small number of bits recording the compression format at the bottom of the header word. flushed a couple of bytevector compression mats that depended on the old representation. (these last few changes should make adding new compression formats easier.) added s-directory build options to choose whether to compress and, if so, the format and level.
  compress-io.h, compress-io.c, new-io.c, equates.h, system.h, scheme.c, gc.c, io.ss, cmacros.ss, back.ss, bytevector.ss, primdata.ss, s/Mf-base, io.ms, mat.ss, bytevector.ms, root-experr*, release_notes.stex, io.stex, system.stex, objects.stex
- improved the effectiveness of LZ4 boot-file compression to within 15% of gzip by increasing the lz4 output-port in_buffer size to 1<<18. With the previous size (1<<14) LZ4-compressed boot files were about 50% larger. set the lz4 input-port in_buffer and out_buffer sizes to 1<<12 and 1<<14. there's no clear win at present for larger input-port buffer sizes.
  compress-io.c
- To reduce the memory hit for the increased output-port in_buffer size and the corresponding increase in computed out_buffer size, one output-side out_buffer is now allocated (lazily) per thread and stored in the thread context. The other buffers are now directly a part of the lz4File_out and lz4File_in structures rather than allocated separately.
  compress-io.c, scheme.c, gc.c, cmacros.ss
- split out the buffer emit code from glzwrite_lz4 into a separate glzemit_lz4 helper that is now also used by gzclose so we can avoid dealing with a NULL buffer in glzwrite_lz4. glzwrite_lz4 also uses it to write large buffers directly and avoid the memcpy.
  compress-io.c
- replaced lz4File_out and lz4File_in mode enumeration with the compress format and inputp boolean. using switch to check and raising exceptions for unexpected values to further simplify adding new compression formats in the future.
  compress-io.c
- replaced the never-defined struct lz4File pointer in glzFile union with the more specific struct lz4File_in_r and Lz4File_out_r pointers.
  compress-io.h, compress-io.c
- added free of lz4 structures to gzclose. also changed file-close logic generally so that (1) port is marked closed before anything is freed to avoid dangling pointers in the case of an interrupt or error, and (2) structures are freed even in the case of a write or close error, before the error is reported. also now mallocing glz and lz4 structures after the possibility of errors has passed where possible and freeing them when not.
  compress-io.c, io.ss
- added return-value checks to malloc calls and to a couple of other C-library calls.
  compress-io.c
- corrected EINTR checks to look at errno rather than return codes.
  compress-io.c
- added S_ prefixes to the glz* exports
  externs.h, compress-io.c, new-io.c, scheme.c, fasl.c
- added entries for mutex-name and mutex-thread
  threads.stex
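The header-word change in the first entry (format bits at the bottom of the word, with the length above them) can be pictured with a small C sketch. The field width, the format codes, and all names below are hypothetical and chosen only for illustration. The actual layout is defined by the patch (see cmacros.ss and bytevector.ss) and may differ.

```c
#include <stdint.h>

/* Hypothetical layout: the low SKETCH_FORMAT_BITS bits of the header word
   hold the compression format; the remaining high bits hold the length.
   The width of 3 bits is an assumption, not the value used by the patch. */
#define SKETCH_FORMAT_BITS 3
#define SKETCH_FORMAT_MASK ((UINT64_C(1) << SKETCH_FORMAT_BITS) - 1)

/* Hypothetical format codes. */
enum sketch_format { SKETCH_FORMAT_GZIP = 0, SKETCH_FORMAT_LZ4 = 1 };

/* Build a header word from a length and a format code. */
static uint64_t sketch_pack_header(uint64_t length, unsigned format) {
  return (length << SKETCH_FORMAT_BITS) | ((uint64_t)format & SKETCH_FORMAT_MASK);
}

/* Extract the format: mask off everything above the low bits. */
static unsigned sketch_header_format(uint64_t header) {
  return (unsigned)(header & SKETCH_FORMAT_MASK);
}

/* Extract the length: shift the format bits away. */
static uint64_t sketch_header_length(uint64_t header) {
  return header >> SKETCH_FORMAT_BITS;
}
```

Compared with a single as-gz? flag bit, a small multi-bit field leaves room for additional format codes without changing how the length is recovered.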

burgerrg · 2019-04-18T15:09:05Z

It compiles cleanly in Microsoft Visual Studio, and so far the ta6nt mats are passing.

Add a compress-level parameter Original commit: 3ea6f8e

dybvig requested a review from burgerrg April 11, 2019 02:44

burgerrg requested changes Apr 11, 2019

dybvig force-pushed the compress-level branch from 393d660 to c678d1d Compare April 11, 2019 17:37

dybvig requested a review from burgerrg April 11, 2019 17:41

burgerrg requested changes Apr 11, 2019

dybvig force-pushed the compress-level branch from c678d1d to 151456e Compare April 12, 2019 00:42

burgerrg approved these changes Apr 12, 2019

dybvig self-assigned this Apr 12, 2019

dybvig requested a review from burgerrg April 17, 2019 00:25

burgerrg requested changes Apr 17, 2019

dybvig force-pushed the compress-level branch from 1499e4f to 6e116af Compare April 18, 2019 04:14

dybvig force-pushed the compress-level branch from 6e116af to 722ffab Compare April 18, 2019 12:48

burgerrg approved these changes Apr 18, 2019

dybvig merged commit 3ea6f8e into cisco:master Apr 18, 2019

dybvig deleted the compress-level branch April 18, 2019 17:06

mflatt pushed a commit to racket/ChezScheme that referenced this pull request Mar 24, 2021

Merge pull request cisco#418 from dybvig/compress-level

15aacee

Add a compress-level parameter

mflatt pushed a commit to mflatt/ChezScheme that referenced this pull request Oct 10, 2023

Merge pull request cisco#418 from dybvig/compress-level

2a2410d

Add a compress-level parameter Original commit: 3ea6f8e

mflatt pushed a commit to mflatt/ChezScheme that referenced this pull request Oct 10, 2023

Merge pull request cisco#418 from dybvig/compress-level

dc33dfe

Add a compress-level parameter Original commit: 3ea6f8e
