Tests wheel reproducibility against version control#243
Conversation
|
CI is not passing: https://circleci.com/gh/freedomofpress/securedrop-debian-packaging/8074 likely due to #231. Marking as draft until resolved. |
|
yeah, #231 is acting sus, however, i can't repro. I get the same reproducible wheel on different machines with different build roots :/ |
We were already checking that building a wheel twice yields the same checksum both times, but we weren't confirming that the checksum matched what's already committed to version control. Let's do that. Still leaving the 'reprotest' check enabled because it supports multiple methods for checking reproducible, which should help us shake out problems like #231.
1842e85 to
9a0cee5
Compare
|
Will take a closer look at CI now that #246 is merged. At a glance, the CI run times are a good 10m longer or so now. Perhaps we should skip the old |
Now we are using the Debian provided Python to create the virtualenv. Means the wheels will also be created using the system python, and will have the exact same sha256sums. By default the CircleCI images are based on the docker Python images and they install python from building the source.
|
While trying to identify the differences between the wheels generated via the CI and our wheels, If I look at the Dockerfile for Now that image shows that it installs Python executable build from source via the https://python.org But, all of our wheels are bulit via the system python provided by Debian Buster. I now added another commit, where we are using the Debian provided Python. |
kushaldas
left a comment
There was a problem hiding this comment.
At a glance, the CI run times are a good 10m longer or so now. Perhaps we should skip the old reprotest test for wheels and only use this new test, which effectively verifies reproducibility by comparing to vc.
Yes, I think removing the old reprowheels test is okay. We are verifying against the lfs checked in version here in this new code.
@conorsch after you mark this PR as ready for review, we can do the formal review and merge this.
tests/test_reproducible_wheels.py
Outdated
| repo_url = f"https://github.com/freedomofpress/{repo_name}" | ||
| build_cmd = f"./scripts/build-sync-wheels -p {repo_url} --clobber".split() | ||
| subprocess.check_call(build_cmd) | ||
| check_cmd = "git diff --exit-code".split() |
The version-control comparison effectively verifies that our wheels are byte-for-byte reproducible, and also ensures that we're tracking 100% of dependencies in version-control already, since the
|
Ready for final review. See the most recent commits, @kushaldas, which disable the |
kushaldas
left a comment
There was a problem hiding this comment.
All good. Approved.
In future we may want to have a night repro test for the wheels.
| subprocess.check_call("git diff --exit-code".split()) | ||
| # Check for new, untracked files. Test will fail if working copy is dirty | ||
| # in any way, so mostly useful in CI. | ||
| assert subprocess.check_output("git status --porcelain".split()) == b"" |
There was a problem hiding this comment.
I should keep this command in mind. Thank you @conorsch.
We were already checking that building a wheel twice yields the same
checksum both times, but we weren't confirming that the checksum matched
what's already committed to version control. Let's do that.
Still leaving the 'reprotest' check enabled because it supports multiple
methods for checking reproducible, which should help us shake out
problems like #231.
Closes #241.
Testing
pytest -k test_wheel_builds_match_version_control testslocally and confirm passingecho this-is-not-a-wheel > localwheels/MarkupSafe-1.1.1-cp37-cp37m-linux_x86_64.whl && gc -m 'TEMPORARY: broken wheel' localwheels/, then re-run step 2, and confirm the test fails.Step 3 ensures that if a wheel build emits a different checksum from what's already version controlled, CI will fail on it.