travis: Remove valgrind #18899

maflcko · 2020-05-06T20:15:44Z

When the valgrind run was added, it took 2 hours. Travis kindly raised the timeout limit to the maximum possible of 3 hours.

Today, a full build of Bitcoin Core with all tests takes more than three hours. Thus, it is impossible to run all tests on travis.

Moreover, the feedback loop for developers that create a pull request takes at least 2 hours, but in some cases (when the travis queue is full) until the next day. This is unacceptable.

Fix both issues by removing the build from travis.

Please note that the ci/test/ configurations are not removed. They will stay in the repo and can be executed anywhere (just not on travis).

jamesob · 2020-05-06T20:23:52Z

ACK fa082d0

Sad but true.

jnewbery · 2020-05-06T20:26:55Z

Concept ACK! I think it's very important for developer experience to be able to iterate quickly on pull requests. If someone suggests a change and the author pushes it, then the ci should run quickly so reviewers don't feel blocked.

I think it would make sense to have a valgrind run with just the unit tests, and perhaps a basic functional test, as long as that could be done in a reasonable time. I also think it'd be worthwhile doing a full valgrind test on nightly builds somewhere.

practicalswift · 2020-05-06T20:45:47Z

Oh, please don't remove Valgrind until we have an MSAN job in Travis. We really need automated detection of uninitialized memory reads enabled in Travis.

promag · 2020-05-06T21:59:29Z

Could run the job only in master (or non PR branch) or use travis cron?

jnewbery · 2020-05-06T22:07:10Z

@practicalswift I think it's a trade-off. valgrind tests are useful, but running all the functional tests in valgrind for every branch push in every PR is terrible for fast iteration and developer experience. I fully support doing valgrind testing, just not for every functional test in every PR.

Things like this:

(7 hours since I pushed and travis is still pending) mean that I can't make progress on PRs. That branch is failing because of a valgrind timeout in a test, and because iteration is so slow, that PR hasn't made any progress for two days.

DrahtBot · 2020-05-07T00:45:15Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

Multiprocess build support #18677 (Multiprocess build support by ryanofsky)
build: Add MemorySanitizer (MSan) in Travis to detect use of uninitialized memory #18288 (build: Add MemorySanitizer (MSan) in Travis to detect use of uninitialized memory by practicalswift)
Multiprocess bitcoin #10102 ([experimental] Multiprocess bitcoin by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

jnewbery · 2020-05-07T02:17:08Z

Travis completed in 1hr3mins. That's an improvement!

utACK fa082d0

practicalswift · 2020-05-07T06:08:14Z

Oh, please don't remove Valgrind until we have an MSAN job in Travis. We really need automated detection of uninitialized memory reads enabled in Travis.

@practicalswift I think it's a trade-off. valgrind tests are useful, but running all the functional tests in valgrind for every branch push in every PR is terrible for fast iteration and developer experience. I fully support doing valgrind testing, just not for every functional test in every PR.

I agree: I think we should remove Valgrind testing in Travis as long as MSAN testing is added in the same PR.

MSAN does not have this slowness problem :)

As long as we have some form of automated detection of uninitialized memory reads enabled in Travis (via Valgrind or MSAN) I'm happy :)

In summary:

ACK on removing Valgrind and adding MSAN.
NACK on removing Valgrind without adding MSAN.

sipa · 2020-05-07T07:17:10Z

We could also just run Valgrind on master but not all PRs, independently of when/how msan testing is enabled.

practicalswift · 2020-05-07T08:11:32Z

We could also just run Valgrind on master but not all PRs, independently of when/how msan testing is enabled.

Yes, that is a very good point.

As long as we continue guarding master on bitcoin/bitcoin from uninitialized memory reads by running the unit tests, functional tests and fuzz tests using Valgrind or MSAN I'm happy :)

laanwj · 2020-05-07T09:15:20Z

I think it would make sense to have a valgrind run with just the unit tests, and perhaps a basic functional test, as long as that could be done in a reasonable time.

I ike this idea. At least the initialization/shutdown sequence will be run at least once in valgrind in that case, which seems to be an effective use of CI resources.

(but removing it entirely in the mean time is fine with me)

practicalswift · 2020-05-07T11:11:42Z

I now noticed that this PR removes also the valgrind fuzz job too, is that intentional? Is running the fuzzing targets against our seed corpus under valgrind that slow? :(

FWIW: pulling in 7e14297 from #18288 to this PR would allow us to get both a.) good CI speed and b.) no reduction in uninitialized memory testing coverage. What are the best arguments for not doing that? :)

maflcko · 2020-05-07T11:56:19Z

I am going to merge this to unbreak travis. This has two ACKs and is needed as a first step for anything that is done later. If someone has ideas or implementations of how this can be improved in the future, they are more than welcome to submit pull requests and ask me for review. When those improvements are ready for merge, I am happy to take them.

Just to respond to some points:

We could run it on master

If this means run it on master in travis, then no, this is impossible as explained in the OP. The build might happen to pass right now because I reset the failing tests and the cache happens to be intact, but a fresh build without cache takes more than 3 hours. Builds that take more than 3 hours are impossible to run on travis.

If this means to run it on master somewhere else. Then: Hell yes! The ci system has been designed to be as flexible as possible. You can run it on your raspberry pi at home, on a virtual machine at home, or even inside of docker if you know how to (hint: grep for the env var that has DANGER in it), you can also run it on any cloud ci service that supports more builds longer than 3 hours. See for example the cirrus ci config in commit 2a9e983 (just an example, anything is possible)

We could run just the unit tests or "basic functional tests"

I am not aware of any memory bug that was caught by just the unit test valgrind run that was not caught by any other build configuration on travis, so I think this is useless. But as I said, improvements are welcome, and if people think this is worthwhile to peruse, a follow-up improvement should be submitted.

I am not sure how to classify "basic" functional tests and this task sounds like bikeshedding to me, where if too little tests are classified as "basic", we will have an equivalent result of this pull request, or where too much tests are classified as "basic", we will have an equivalent result of before this pull request.

We could use MSAN instead

Sure, Concept ACK. Though, the pull is currently blocked on build system changes, which I am not capable to review.

travis: Remove valgrind

fa082d0

DrahtBot added the Tests label May 6, 2020

This was referenced May 7, 2020

Multiprocess build support #18677

Merged

build: Add MemorySanitizer (MSan) in Travis to detect use of uninitialized memory #18288

Merged

DrahtBot mentioned this pull request May 7, 2020

Multiprocess bitcoin #10102

Draft

maflcko merged commit c1cd2b5 into bitcoin:master May 7, 2020

maflcko deleted the 2005-RemoveTravis branch May 7, 2020 11:58

bitcoin locked as resolved and limited conversation to collaborators May 7, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

travis: Remove valgrind #18899

travis: Remove valgrind #18899

maflcko commented May 6, 2020 •

edited

Loading

jamesob commented May 6, 2020

jnewbery commented May 6, 2020

practicalswift commented May 6, 2020 •

edited

Loading

promag commented May 6, 2020 •

edited

Loading

jnewbery commented May 6, 2020 •

edited

Loading

DrahtBot commented May 7, 2020

jnewbery commented May 7, 2020

practicalswift commented May 7, 2020 •

edited

Loading

sipa commented May 7, 2020

practicalswift commented May 7, 2020 •

edited

Loading

laanwj commented May 7, 2020 •

edited

Loading

practicalswift commented May 7, 2020 •

edited

Loading

maflcko commented May 7, 2020

travis: Remove valgrind #18899

travis: Remove valgrind #18899

Conversation

maflcko commented May 6, 2020 • edited Loading

jamesob commented May 6, 2020

jnewbery commented May 6, 2020

practicalswift commented May 6, 2020 • edited Loading

promag commented May 6, 2020 • edited Loading

jnewbery commented May 6, 2020 • edited Loading

DrahtBot commented May 7, 2020

Conflicts

jnewbery commented May 7, 2020

practicalswift commented May 7, 2020 • edited Loading

sipa commented May 7, 2020

practicalswift commented May 7, 2020 • edited Loading

laanwj commented May 7, 2020 • edited Loading

practicalswift commented May 7, 2020 • edited Loading

maflcko commented May 7, 2020

maflcko commented May 6, 2020 •

edited

Loading

practicalswift commented May 6, 2020 •

edited

Loading

promag commented May 6, 2020 •

edited

Loading

jnewbery commented May 6, 2020 •

edited

Loading

practicalswift commented May 7, 2020 •

edited

Loading

practicalswift commented May 7, 2020 •

edited

Loading

laanwj commented May 7, 2020 •

edited

Loading

practicalswift commented May 7, 2020 •

edited

Loading