Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade Apache Arrow C++ to 17.0.0 #2749

Merged
merged 1 commit into from
Sep 15, 2024
Merged

Upgrade Apache Arrow C++ to 17.0.0 #2749

merged 1 commit into from
Sep 15, 2024

Conversation

texodus
Copy link
Member

@texodus texodus commented Sep 14, 2024

This PR upgrades the Apache Arrow C++ dependency to 17.0.0.

Previously, Perspective used Arrow 12.0.0, which required some vendoring/patching of this library's source to work around a statically-initialized threadpool executor which we failed to compile with Emscripten. In Arrow 17.0.0, this threadpool can be disabled with the ARROW_ENABLE_THREADING cmake option. As a result, a large amount of duplicate code has been removed, duplicate code which had been the source of a recent infuriating symbol-resolution compile bug in our cross platform support.

We must still do some cmake surgery to make this work on the platform trilogy + wasm, though nowhere near as gnarly as previously. Windows + Arrow + Cmake + External Boost do not play nicely together, this combination triggers a path in Arrow's CMakeLists.txt that tries to target_link_options a Boost module built by perspective. We currently work around this by selectively installing boost after building Arrow exclusively on Windows, instead using Arrow's bundled dependency mode on this platform. This inflates the binary a bit on Windows.

As a result of this upgrade, Perspective WASM's binary size has increased ~100 kilobytes, and our benchmarks are relatively flat. Despite this negligible impact (or even slight bloat), the integration code is much simpler, and presumably less bug prone. Arrow ingestion, generation, type detection/support, etc. is already extensively tested in both the Python and JavaScript implementations (and though the Rust test suite only yet has a single test, it is coincidently an Arrow test).

Some ancillary tech debt addressed:

  • Fixed test task to work correctly when focused on the perspective-js crate.
  • Removed import .. with .. syntax from perspective-python build script, which caused issues with newer node.js versions which deprecate this syntax.
  • Added 3.0.0 series to the benchmark suite (checks regression against a recent version without this major dependency upgrade).
  • Removed some cruft in formatters.

@texodus texodus added the internal Internal refactoring and code quality improvement label Sep 14, 2024
Signed-off-by: Andrew Stein <steinlink@gmail.com>
@texodus texodus merged commit 47f6cbe into master Sep 15, 2024
10 checks passed
@texodus texodus deleted the arrow-17 branch September 15, 2024 00:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal Internal refactoring and code quality improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant