Upgrade Apache Arrow C++ to 17.0.0 #2749
Merged
This PR upgrades the Apache Arrow C++ dependency to `17.0.0`.

Previously, Perspective used Arrow `12.0.0`, which required some vendoring/patching of this library's source to work around a statically-initialized threadpool executor which we failed to compile with Emscripten. In Arrow `17.0.0`, this threadpool can be disabled with the `ARROW_ENABLE_THREADING` cmake option. As a result, a large amount of duplicate code has been removed, duplicate code which had been the source of a recent infuriating symbol-resolution compile bug in our cross-platform support.

We must still do some cmake surgery to make this work on the platform trilogy + wasm, though nowhere near as gnarly as previously. Windows + Arrow + CMake + External Boost do not play nicely together; this combination triggers a path in Arrow's `CMakeLists.txt` that tries to `target_link_options` a Boost module built by `perspective`. We currently work around this on Windows only, by installing Boost after building Arrow and using Arrow's bundled dependency mode on this platform instead. This inflates the binary a bit on Windows.

As a result of this upgrade, Perspective WASM's binary size has increased ~100 kilobytes, and our benchmarks are relatively flat. Despite this negligible impact (or even slight bloat), the integration code is much simpler, and presumably less bug prone. Arrow ingestion, generation, type detection/support, etc. is already extensively tested in both the Python and JavaScript implementations (and though the Rust test suite only has a single test so far, it is coincidentally an Arrow test).
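
For readers unfamiliar with the Arrow build options mentioned above, here is a rough sketch of how the two settings might be expressed in CMake. It is not copied from this PR's build scripts: `ARROW_ENABLE_THREADING` is the option named above, while mapping "bundled dependency mode" to `ARROW_DEPENDENCY_SOURCE=BUNDLED`, the use of `WIN32` as the platform check, and the placement before Arrow is configured are all assumptions. Whether threading is disabled on every platform or only for wasm is not stated in this description.

```cmake
# Sketch only (assumed, not Perspective's actual build script).
# These cache entries would need to be set before Arrow's own
# CMakeLists.txt is configured.

# Build Arrow without its thread pool executor.
set(ARROW_ENABLE_THREADING OFF CACHE BOOL "Disable Arrow threading" FORCE)

# Assumption: on Windows, let Arrow build its own third-party
# dependencies rather than linking an externally provided Boost.
if(WIN32)
    set(ARROW_DEPENDENCY_SOURCE "BUNDLED" CACHE STRING "Arrow dependency mode" FORCE)
endif()
```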
Some ancillary tech debt addressed:
- Fixed the `test` task to work correctly when focused on the `perspective-js` crate.
- Removed `import .. with ..` syntax from the `perspective-python` build script, which caused issues with newer `node.js` versions which deprecate this syntax.
- Added the `3.0.0` series to the benchmark suite (checks regression against a recent version without this major dependency upgrade).