releasing proper tarballs #1891

Closed
dimpase opened this issue Nov 29, 2018 · 22 comments

Comments

@dimpase

dimpase commented Nov 29, 2018

The release tarballs that GitHub currently generates automatically are not stable; they actually change (a bit) with the changes in the repo, so their md5 hashes change too, breaking various systems that use such hashes for consistency checks. However, it is always possible to upload a custom-made release tarball, which would also be properly named, so that fetching it with wget would not need subsequent renaming, and one can get better compression (bz2, xz, etc.).

It would be great to have them; it's only a minor hassle for the release manager (click on Edit release, upload the tarball).

@dimpase
Author

dimpase commented Nov 29, 2018

E.g. the current release tarball has md5 30e2f8d7317e84dde5a37152173848f1, while it used to be 7688cfbf657b348d510d9b56137ace40. Cf.
https://trac.sagemath.org/ticket/26052#comment:10
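
A consistency check of the kind that breaks when the tarball changes could look roughly like the sketch below. It is an illustration only, not the code of any particular build system; the names `md5_of_file` and `matches_pinned` are made up for the example.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned(path: str, pinned_md5: str) -> bool:
    """True if the file on disk still has the digest recorded at packaging time."""
    return md5_of_file(path) == pinned_md5.lower()
```

With 7688cfbf657b348d510d9b56137ace40 pinned, a fresh download that now hashes to 30e2f8d7317e84dde5a37152173848f1 makes `matches_pinned` return False, even if the unpacked sources are identical.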

@dimpase
Author

dimpase commented Nov 29, 2018

This is what the releases page of a project with custom tarballs looks like:
https://github.com/ivmai/bdwgc/releases

@martin-frbg
Collaborator

martin-frbg commented Nov 29, 2018

From my (admittedly limited) understanding, the hashes would only change when GitHub applies some git patch that affects hash generation (Homebrew/homebrew-core#18044). At least as far as I am aware, there is nothing in OpenBLAS that actively modifies file content at or after release time, e.g. to write a hash or date to one of the source files on the release branch. I am not aware that the OpenBLAS pages specify a particular hash for any version of the autogenerated tarballs either.
Having one custom and one automatic tarball with practically identical content but different hashes on the release page does not look that good to me either (nor does any extra hassle and possibility for mistakes at release time, frankly).
Perhaps what you could do is download from the sourceforge archive that automagically receives a copy of both zip and tarball through some scripting beyond my control (I do not own the project), which I expect will not re-run whenever github changes anything in the background.

@dimpase
Author

dimpase commented Nov 29, 2018 via email

@brada4
Contributor

brada4 commented Nov 29, 2018

wget does not show ads

@dimpase
Author

dimpase commented Nov 29, 2018 via email

@brada4
Contributor

brada4 commented Nov 29, 2018

They claim not since 2011
btw md5...40 file is still there.
I know they at one point put an ad-laden downloader in front of Windows .exe installers, but you verify checksums to guard against such an outcome.

@dimpase
Author

dimpase commented Nov 29, 2018

btw md5...40 file is still there.

Great, and how on Earth can one say it's not a hacked tarball, without actually untarring and diffing against your git repo, as it now has a different md5 from your supposedly canonical tarball? Do you see now why it's desirable to have a canonical tarball?
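
For reference, the "untar and diff" check mentioned here can be sketched as below. This is a rough illustration under several assumptions: the archive wraps everything in a single top-level directory, symlinks and git export attributes are ignored, and the function names are hypothetical.

```python
import hashlib
import os
import tarfile


def hashes_in_tarball(tar_path: str) -> dict:
    """Map each regular file in the archive to its SHA-256, dropping the top-level directory."""
    result = {}
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumption: everything sits under one directory such as "name-version/".
            _, _, rel = member.name.partition("/")
            if not rel:
                continue
            data = tar.extractfile(member).read()
            result[rel] = hashlib.sha256(data).hexdigest()
    return result


def hashes_in_tree(root: str) -> dict:
    """Map each file under root (skipping .git) to its SHA-256."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                result[rel] = hashlib.sha256(fh.read()).hexdigest()
    return result


def differing_paths(tar_path: str, root: str) -> list:
    """Return sorted paths that are missing on either side or differ in content."""
    a, b = hashes_in_tarball(tar_path), hashes_in_tree(root)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty list from `differing_paths` means the tarball and a checkout of the release tag agree file by file, which is exactly the work a stable canonical tarball with a published checksum would make unnecessary.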

@martin-frbg
Collaborator

If you are going to be paranoid about archive integrity, perhaps you should look into distributing a "verified" copy of OpenBLAS with your project? If you assume either the OpenBLAS project account or GitHub itself could get hacked, I do not see how any checksum posted with an archive there could be expected to remain trustworthy.

@dimpase
Author

dimpase commented Nov 30, 2018 via email

@brada4
Contributor

brada4 commented Nov 30, 2018

@dimpase you can use other checksum that is not susceptible to collision attack, which is not applicable to compressed files anyway; it is applicable to altering whitespace to compensate for changing some text. Indeed, you are free to diff against the source tree used.

It is not the first time GitHub has changed compression; since then, all download links go to sf.net to avoid scaring people.

If you have the chance, you can host the distribution file properly with multiple checksums, re-compressed with zopfli (-10%), bzip2 (-50%), xz (-70% size) or .pax.Z (+80%); we certainly do not modify those files after release.
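
Since re-compression changes the bytes of the file but not its contents, one way to compare differently compressed copies is to hash the decompressed stream instead of the file. The sketch below is only an illustration of that idea; `payload_sha256` is a made-up name, and it does not help if the tar stream itself is regenerated with different metadata.

```python
import bz2
import gzip
import hashlib
import lzma

# Magic bytes at the start of each compressed container format.
_OPENERS = (
    (b"\x1f\x8b", gzip.open),
    (b"BZh", bz2.open),
    (b"\xfd7zXZ\x00", lzma.open),
)


def payload_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of the decompressed stream, independent of compressor and settings."""
    with open(path, "rb") as fh:
        head = fh.read(6)
    # Fall back to the plain file if no known magic matches (e.g. an uncompressed .tar).
    opener = next((op for magic, op in _OPENERS if head.startswith(magic)), open)
    digest = hashlib.sha256()
    with opener(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The trade-off is that the file has to be decompressed before it has been verified, so this complements a checksum of the published file rather than replacing it.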

@martin-frbg
Collaborator

@brada4 if you are alluding to #504 I think that may have been a different problem (though there is no way of knowing for sure). Interesting that there apparently has been no problem with the checksum of the OpenBLAS release tarballs for JuliaLang since then. @ararslan ?

@dimpase
Author

dimpase commented Nov 30, 2018

you can use other checksum

We are using a combination of md5, sha1, and a custom crc32-based checksum to ensure the authenticity of tarballs (and we mirror them once a release is made). But we need to start somewhere, and a tarball that changes over time is a non-starter. I reckon that out of the dozens of GitHub-released packages we use, OpenBLAS is the only one that offers only the default, fluid GitHub tarballs as its releases.
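
Computing several checksums of one file in a single pass can be sketched as follows. The custom crc32-based checksum mentioned above is not specified in this thread, so plain CRC-32 from `zlib` stands in for it here; `multi_digest` is a hypothetical helper name.

```python
import hashlib
import zlib


def multi_digest(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute MD5, SHA-1 and CRC-32 of a file while reading it only once."""
    md5, sha1, crc = hashlib.md5(), hashlib.sha1(), 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
            # Plain CRC-32 as a stand-in; zlib.crc32 accepts the running value.
            crc = zlib.crc32(chunk, crc)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest(), "crc32": f"{crc:08x}"}
```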

download from the sourceforge archive that automagically receives a copy of both zip and tarball through some scripting beyond my control (I do not own the project)

You are sending me to a resource you don't control as the primary source for your releases. This looks quite fragile and potentially error-prone to me.

@brada4
Contributor

brada4 commented Nov 30, 2018

SourceForge does not change file contents behind the same names.
I would suggest
1/ detect release in github
2/ verify that it is same on sf.net and use that OR
3/ publish your re-compressed version thereof
If something fails in the current distribution arrangement, feel free to share. Yes, you are sent to sf.net for release files that are not nibbled at behind the scenes.
If you have solution at zero budget, feel free to share.

@dimpase
Author

dimpase commented Nov 30, 2018

If you have solution at zero budget, feel free to share

Uploading a custom tarball to GitHub, in addition to making a GitHub release, takes about 30 seconds of the release manager's time (and is free otherwise). It can also be scripted through a CI system, I gather.
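
If the tarball is produced by a script, it helps to make the output byte-for-byte reproducible, so that rebuilding it never changes the checksum. Below is a minimal sketch of one way to do that with the Python standard library. It is not the OpenBLAS release process; `make_release_tarball`, `_normalize` and the `prefix` argument are names invented for this example, and file permissions are still taken from the working tree.

```python
import gzip
import tarfile
from pathlib import Path


def _normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Strip metadata that varies between machines and runs."""
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    info.mtime = 0
    return info


def make_release_tarball(src_dir: str, prefix: str, out_path: str) -> None:
    """Write src_dir as a .tar.gz under prefix/ with stable ordering and metadata."""
    src = Path(src_dir)
    with open(out_path, "wb") as raw:
        # filename="" and mtime=0 keep local names and timestamps out of the gzip header.
        with gzip.GzipFile(filename="", mode="wb", fileobj=raw, mtime=0) as gz:
            with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
                # Sorted traversal so the member order does not depend on the filesystem.
                for path in sorted(src.rglob("*")):
                    arcname = f"{prefix}/{path.relative_to(src).as_posix()}"
                    tar.add(str(path), arcname=arcname, recursive=False, filter=_normalize)
```

The output path should live outside `src_dir`, otherwise the archive would try to include itself. The resulting file can then be attached to the release and its checksum published alongside it.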

@brada4
Contributor

brada4 commented Nov 30, 2018

You will have to change download link to get stable file either way.

@dimpase
Author

dimpase commented Nov 30, 2018

You will have to change download link to get stable file either way.

Sorry, I don't follow you here. Open the Releases tab, select a release, and click on Edit in the top right corner. You'll see a prompt saying "Attach binaries by dropping them there or selecting them". Upload the tarball you created, click "Update release", done. Now you have a sane URL for the tarball, wgettable etc., not messed up by GitHub.

@ararslan
Contributor

Interesting that there apparently has been no problem with the checksum of the OpenBLAS release tarballs for JuliaLang since then.

Yeah, we've gotten no reports of checksum mismatches for OpenBLAS. Though I should note that Julia source builds keep the OpenBLAS tarball in the build tree so it doesn't have to be re-downloaded and checksummed, so the only time someone could encounter a checksum mismatch is when doing a fully clean build, like from a fresh clone or after git clean -fdx. That said, I did that myself just yesterday and had no problems.

@dimpase
Author

dimpase commented Dec 2, 2018

@ararslan - could it be that Julia bumped to the new OpenBLAS at a moment when the hash change had already happened (it's not very frequent)?

@ararslan
Contributor

ararslan commented Dec 3, 2018

Apparently I raised the OpenBLAS version Julia uses on 30 October (JuliaLang/julia#29845). I don't know when the hash changed, but the MD5 for the version used by is 7688cfbf657b348d510d9b56137ace40, and I didn't encounter checksum mismatches when I last did a clean build. (I did another yesterday on a different system.)

@brada4
Contributor

brada4 commented Dec 3, 2018

@ararslan the problem is that GitHub sometimes changes compression, trimming a few bytes from the resulting files (still 10% bigger than with zopfli). That's why the release links go to sf.net (they are at the bottom of the notes), while the top 2 items above the notes will nibble behind the interface.

@martin-frbg
Collaborator

closing as I remembered to upload md5sum'd copies for the last three releases now
