-
Notifications
You must be signed in to change notification settings - Fork 950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve test compilation speed and space usage by linking tests against libopen_spiel.so
#1126
Conversation
38cc45b
to
5982505
Compare
libgames.a
libgames.so
It turns out a number of tests ( |
The failed wheels test is our fault from an update yesterday, we'll fix that. Apologies. |
Ok, fixed. You can pull from master to get the changes, and your next merge should at least fix the wheels test. |
d0dcf9d
to
58fa951
Compare
@lanctot: If you agree with the direction I've taken here, I think this is ready for another CI run. If you disagree and don't want a shared library here, please let me know! |
Hey @jthemphill sorry I've had a busy week and haven't had a chance to take a deep look yet. Was going to mention, too, that I'm about to go on vacation for 10 days. So I'll get back to you start of November. If this doesn't lead to any internal maintenance issues (we have some internal code built on top of OpenSpiel that is not open-sourced) nor affects the binary wheels that we distribute for Python users, or e.g. breaks our native Windows build, then I am hugely in favor of the change. I just need to be sure it doesn't disrupt anything. Thanks for doing this! We don't normally get contributions that take this form.. takes some rare skills to first notice it and even more so to fix it. |
BTW so far it's looking like that wouldn't be the case, so I'm not worried! But I just have to make sure before we import it. |
Thank you very much, and have an awesome vacation! |
open_spiel/games/CMakeLists.txt
Outdated
add_test(ultimate_tic_tac_toe_test ultimate_tic_tac_toe_test) | ||
target_link_libraries(games PRIVATE bridge_double_dummy_solver) | ||
|
||
macro(add_game_executable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much cleaner
such one line
so neat!
wow
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this PR lands, I have another speedup in the pipeline which precompiles spiel.h
and friends. This macro makes it easy to add target_precompile_headers("${_NAME}" REUSE_FROM open_spiel_core)
to each target😉
But the test passes locally, and stepping through the code, it's not clear why this would be a |
I think # Built with `-fsanitize=address -fsanitize=undefined`
$ games/nfg_game_test
=================================================================
==43819==ERROR: AddressSanitizer: odr-violation (0x55b4d98283e0):
[1] size=320 'vtable for open_spiel::matrix_game::MatrixState' /home/jhemphill/oss/open_spiel/open_spiel/matrix_game.cc
[2] size=320 'vtable for open_spiel::matrix_game::MatrixState' /home/jhemphill/oss/open_spiel/open_spiel/matrix_game.cc
These globals were registered at these points:
[1]:
#0 0x55b4d95957da in __asan_register_globals (/home/jhemphill/oss/open_spiel/build/games/nfg_game_test+0x1da7da) (BuildId: 6d7a0b8c5cad382e91a67328dba4709b33c28695)
#1 0x55b4d968af8b in asan.module_ctor matrix_game.cc
[2]:
#0 0x55b4d95957da in __asan_register_globals (/home/jhemphill/oss/open_spiel/build/games/nfg_game_test+0x1da7da) (BuildId: 6d7a0b8c5cad382e91a67328dba4709b33c28695)
#1 0x7fc0deeeedeb in asan.module_ctor matrix_game.cc
==43819==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
SUMMARY: AddressSanitizer: odr-violation: global 'vtable for open_spiel::matrix_game::MatrixState' at /home/jhemphill/oss/open_spiel/open_spiel/matrix_game.cc
==43819==ABORTING ...and I that's because open_spiel/open_spiel/CMakeLists.txt Line 208 in a95a9d6
From the docs:
I'll replace this |
58fa951
to
b000131
Compare
libgames.so
libopen_spiel.so
fa417b7
to
b85bd80
Compare
Can you rerun the failed action? The failure complains about a sha256 mismatch in https://github.com/google-deepmind/open_spiel/actions/runs/6686304408/job/18167290661?pr=1126 |
b85bd80
to
a550d6b
Compare
I started the test from the last commit but.. yeah, just as I suspected, they are now failing because of a JAX compatibility issue. I suspect what happened was that they finally got rid of Python 3.7 (or some other change) on the test machines just very recently because everything was working fine as of two days ago. The real fix is rather involved and will take me a week or two given my schedule, but I have a short-term fix for now that temporarily disables all the JAX tests (#1134) that I'll put through on Monday. But the other thing is.. this PR is seemingly getting quite complicated with the external library dependencies.. do you think it is still worth the effort? You don't anticipate it causing any issues down the road? |
I think the core idea of linking against a shared library works and works well. This PR's transformation is mechanical - every instance of I wish I had locally tested more by enabling all the different compilation flags and compiling with each of the libraries, instead of relying on CI to throw the PR back to me every time I missed some compilation flag combination. I think CI will pass this time if the jaxlib wheel issue is fixed, but I understand that the size and CI churn of this PR feels risky. I think deciding if you want to go forward with this is your call as the maintainer. |
Closed on purpose? (The issues with the jax tests were fixed last week.) Did I forget to press the button? Apologies, I've been busy fixing something else for an impending release in the next few days. I was planning to get back to this eventually, just got a bit busy with work and other maintenance stuff. |
Yeah, I was clearing my PR queue, and based on your last comment I just wasn't sure if you want to go forward with this PR in this form. I'm still willing to make this green if you agree with the direction. |
Ahh, yeah fair enough. But: don't read too much into it yet. I tend to adopt "if it ain't broke, don't fix it" as a default because I need to minimize time spent on maintenance because I have very little time to devote to it and just mostly can't afford the time to fix things when they break.
Yeah, apologies for the lack of response up to now. I've just had a few other things I had to tend to (vacation, sickness, work travel, impending release fixes). But I'll get to it soon, promise! |
Thank you for all the time you've spent on this PR, and I hope you and your family are well. I know how hard it is to maintain an OSS project within a large company, and I don't want you to feel pressured to accept a PR if you think doing so will make your job harder! |
a550d6b
to
cc42628
Compare
cc42628
to
c7324ad
Compare
c7324ad
to
38ec117
Compare
Hi @jthemphill, Thanks for the effort you put into this. Unfortunately with withering resources dedicated to maintenance, it is unrealistic that we will be able to import this, so I will close it for now. If you work on it more, please feel free to reopen. |
This is an alternative approach to #1124. Compared to the status quo, it cuts over 500MB off of the compiled build artifacts, cuts three minutes off the single-threaded full build time, and removes a lot of thinking about which libraries you need to link your game or test against.
Before:
After:
Previously, we conditionally generated
libopen_spiel.so
. Now, we always generate it. And when creating a new executable, you'll want to useadd_open_spiel_executable()
instead ofadd_executable(...${OPEN_SPIEL_OBJECTS})
.