More robust detection of bzip2 magic number #6308

MichaelChirico · 2024-07-23T06:58:27Z

Also took the opportunity to factor out some helpers for easier maintenance/readability. That could be split to its own precursor PR if so desired.

We could extend is_bzip() to also look for "raw pi" (as.raw(c(0x31, 0x41, 0x59, 0x26, 0x53, 0x59))) in the 5th-10th bytes, but skip that for now. The example seems pretty pathological already.
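For illustration, here is a minimal sketch of what such a signature check might look like. It is not the code merged in this PR: the name is_bzip_sketch and the check_block_magic argument are hypothetical, and sig is assumed to be a raw vector holding the first bytes of the file.

```r
# Hypothetical helper, NOT the implementation merged in this PR.
# Assumes `sig` is a raw vector with the leading bytes of a file.
is_bzip_sketch <- function(sig, check_block_magic = FALSE) {
  if (length(sig) < 4L) return(FALSE)
  # Stream header: "BZ", then "h", then a block-size digit '1'..'9'
  header_ok <- identical(sig[1:3], charToRaw("BZh")) &&
    sig[4L] %in% charToRaw("123456789")
  if (!header_ok || !check_block_magic) return(header_ok)
  # Optional stricter check: the "raw pi" block magic in bytes 5-10
  length(sig) >= 10L &&
    identical(sig[5:10], as.raw(c(0x31, 0x41, 0x59, 0x26, 0x53, 0x59)))
}

is_bzip_sketch(charToRaw("BZh9"))      # header with a valid level digit
is_bzip_sketch(charToRaw("BZhx,1,2"))  # text that merely starts with "BZh"
```

One caveat with the stricter variant: a bzip2 stream with no data blocks has no block header after the first four bytes, so requiring "raw pi" there would presumably reject a compressed empty file.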

Also, I don't think that any of our test_R.utils tests have been checking the "infer from magic number" functionality from fread(), so this PR also adds some nice regression coverage for this behavior.
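As a rough picture of what "infer from magic number" coverage exercises, the sketch below uses only base R (no fread() and no R.utils) to compare a real bzip2 file with a plain-text file whose first bytes happen to be "BZh". The helper looks_like_bzip and both file contents are hypothetical; they are not taken from the OP's report or from the tests added here.

```r
# Hypothetical regression-style check; base R only, fread() is not involved.
looks_like_bzip <- function(path) {
  sig <- readBin(path, what = "raw", n = 4L)
  length(sig) == 4L &&
    identical(sig[1:3], charToRaw("BZh")) &&
    sig[4L] %in% charToRaw("123456789")
}

# A genuine bzip2 file written through a bzfile() connection.
bz_path <- tempfile(fileext = ".bz2")
con <- bzfile(bz_path, open = "wb")
writeLines(c("a,b", "1,2"), con)
close(con)

# A plain-text CSV whose header merely starts with the bytes "BZh".
txt_path <- tempfile(fileext = ".csv")
writeLines(c("BZha,b", "1,2"), txt_path)

bz_detected  <- looks_like_bzip(bz_path)   # should be detected as bzip2
txt_detected <- looks_like_bzip(txt_path)  # should not be detected
stopifnot(bz_detected, !txt_detected)
unlink(c(bz_path, txt_path))
```

A check on the first three bytes alone would flag both files, which is the kind of false positive the digit check is meant to rule out.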

github-actions · 2024-07-23T07:14:27Z

Generated via commit 2db0960

Download link for the artifact containing the test results: ↓ atime-results.zip

Time taken to finish the standard R installation steps: 11 minutes and 35 seconds

Time taken to run atime::atime_pkg on the tests: 3 minutes and 20 seconds

ben-schwen · 2024-07-23T08:25:33Z

Not sure we really need the fread(header=TRUE) test here, but I guess it can't hurt.

Besides that, LGTM

R/fread.R

MichaelChirico · 2024-07-23T13:44:59Z

> Not sure we really need the fread(header=TRUE) test here, but I guess it can't hurt.

Yeah, for the problem as diagnosed it's kinda silly, but I wanted to keep some examples close to OP's report. That way we're ironclad sure we've closed the issue and it'll stay closed.

MichaelChirico added 2 commits July 22, 2024 23:45

More robust to false positives checking for bzip signature

e077ed2

NEWS

03fb5e6

trailing ws

2db0960

ben-schwen reviewed Jul 23, 2024

ben-schwen merged commit 0c022e2 into master Jul 24, 2024
5 checks passed

ben-schwen deleted the fread-bz2-digit branch July 24, 2024 07:15
