-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup and improve reproducibility of builds #553
Cleanup and improve reproducibility of builds #553
Conversation
This reverts commit c331fe7
Just to confirm I understand: This doesn't actually cause any problems for programs using Distroless, this seems to mostly be some simplification/cleanup of the build tools themselves, right? I think this looks good to me, but I'm not an owner, just a sometime contributor and user :) To be honest I don't follow what exactly is happening with cacerts here. I do know that distroless at one point had somewhat "broken" public CA certificates, since it can't really run the Debian |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vaguely remember that when building the entire images, I saw output of multiple executions of the cacert extraction script. I just think if it is still the case with this new rule, it may increase the overall build time when we start to expand the usage in multiple places. Can it be that case? For example, if I use some extraction rule in java/BUILD
, will it unpack a deb and create the same tar for each of the dozens of the Java images?
package_manager/dpkg_extract.sh
Outdated
set -o errexit | ||
|
||
DEB=$1 | ||
shift |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get line 6: shift: shift count out of range
when I don't provide any argument to the script. (You may see different behaviors when using a sh
vs bash
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also have a python version of this script in another personal project that includes a basic form of filtering (excluding) files. That script is also using build_tar instead of the unix tooling.
It was developed after I opened this PR, so it might need some polish (an hour or so).
Might be better to switch to that one instead of fixing this shell script? Or do you have a preference for shell scripts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a shell script is fine. Not that a script is a good thing, but I don't think we want over-engineering. I mean, the goal is to not increase complexity and code lines to reduce maintenance burden. When I first came to this repo, everything was overwhelming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the excluding feature will probably be a nice thing that we may need anyway. But let's not do it here to keep the PR small. We can always do it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed
I don't think I saw any changes about this? Again, thanks for the awesome contribution! |
Given that they are now regular targets, it should work like any other dependency. So I would expect cacerts extraction to run twice (once for deb 9 and another time for deb 10). Same for cacerts_java, resulting in 4 targets in total. |
@evanj Correct. My worry relates more to people "highjacking" some of the public targets of this repo. In theory they should have pinned the repo by commit hash in their workspace, but they might run into issues when updating to a newer hash due to change of build tooling that should have been internal to distroless (imho). |
@chanseokoh package_manager/util.py#generate_os_release was changed for this. |
@chanseokoh let me know if you're happy with the code. Then I can do a final check to make sure the digests of the unchanged layers match the ones on gcr. |
Looks good to me! Thanks for checking them. Really appreciate it. |
Set default mode to 0644.
Had to do a small fix for file permissions on /etc/os-release (is back to 0644 as on gcr). I've got two diffs after local build:
Diff on the layer.tar entry list output < is upstream
> is new
13a14,17
> drwxr-xr-x 0/0 0 1970-01-01 01:00 ./usr/
> drwxr-xr-x 0/0 0 1970-01-01 01:00 ./usr/share/
> drwxr-xr-x 0/0 0 1970-01-01 01:00 ./usr/share/doc/
> drwxr-xr-x 0/0 0 1970-01-01 01:00 ./usr/share/doc/ca-certificates/
40d43
< drwxr-xr-x 0/0 0 1970-01-01 01:00 ./usr/
47d49
< drwxr-xr-x 0/0 0 1970-01-01 01:00 ./usr/share/
72d73
< drwxr-xr-x 0/0 0 1970-01-01 01:00 ./usr/share/doc/
0a1,2
> drwxr-xr-x 0/0 0 1970-01-01 01:00 ./
> drwxr-xr-x 0/0 0 1970-01-01 01:00 ./etc/
13,14c15,16
< -rw-r--r-- 0/0 62 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_NAME
< -rw-r--r-- 0/0 2498 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_TIME
---
> -rw-r--r-- 0/0 131 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_ADDRESS
> -rw-r--r-- 0/0 1515838 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_COLLATE
15a18,19
> -rw-r--r-- 0/0 243 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_IDENTIFICATION
> -rw-r--r-- 0/0 23 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_MEASUREMENT
18d21
< -rw-r--r-- 0/0 50 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_NUMERIC
20,24c23,24
< -rw-r--r-- 0/0 243 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_IDENTIFICATION
< -rw-r--r-- 0/0 47 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_TELEPHONE
< -rw-r--r-- 0/0 131 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_ADDRESS
< -rw-r--r-- 0/0 23 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_MEASUREMENT
< -rw-r--r-- 0/0 1515838 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_COLLATE
---
> -rw-r--r-- 0/0 62 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_NAME
> -rw-r--r-- 0/0 50 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_NUMERIC
26c26,27
< drwxr-xr-x 0/0 0 1970-01-01 01:00 ./
---
> -rw-r--r-- 0/0 47 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_TELEPHONE
> -rw-r--r-- 0/0 2498 1970-01-01 01:00 ./usr/lib/locale/C.UTF-8/LC_TIME
86d86
< drwxr-xr-x 0/0 0 1970-01-01 01:00 ./etc/ |
@chanseokoh looks good to me. I just had to fix 2 file permissions from 755 to 644 (see last 2 commits after your review). Two layers will change (one in base and one in java base) as described in the comment above for future reference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PeterMylemans thanks for thoroughly checking the difference! Just FYI, changes in the parent directories in a tar may sometimes have a serious impact (for example, #456 (comment)), I don't think we have any issue here.
Thanks for making Distroless more robust! Merging this now.
Package Manager package
/etc/os-release
when building locally by using the build_tar utility.cacerts
locale
Fixes #552
Fixes #458 with the exception of the python 3.8 issue