Perspective Virtual API (JavaScript) #2615
Merged
Summary
This PR introduces a radical new design for Perspective's client/server API. This work lays the basis for the next version of Perspective UX to directly support streaming (where supported) and virtual access to non-Perspective databases such as DuckDB, SQLite and Postgres, without copying the entire dataset into either the browser or `perspective-python`, using an efficient Virtual API.
The ad-hoc, JSON-based wire format of the 2.x series has been rewritten as a set of Protobuf messages. This enables easier portability of this message protocol to new host languages, decreases message handling overhead on the server, and makes possible improved multi-thread utilization on platforms that support multi-threading. For example, in Python, where messages were previously dispatched with the GIL acquired, they can now be parsed and handled entirely on an internal thread pool.
In 2.x, Perspective's client, session manager, message processing loop, multiplexing, etc. were implemented in the domain language itself, and the native C++ API resembled a lower-level version of this public API. This resulted in a lot of duplicate (and subtly buggy) code and inconsistencies in implementation and performance, and it made new features difficult to add, as they had to be custom-embedded in both Python and JavaScript. We had exported over 100 symbols from the Emscripten+Embind JavaScript API in C++, and had over 1,000 LoC of C++ in Python for PyBind. It also limited concurrent throughput in languages like Python that have a GIL associated with interpreted evaluation.
In 3.0, the Server API is implemented entirely in C++ and exports only 2 methods, both of which take only `[uint8_t]` arguments (the serialized Protobuf engine messages), and subsumes session management, client IDs, multiplexing and the lot. The new Client API is written purely in Rust, and need only emit and consume this duplex binary message stream in order to communicate with a Perspective Server over any transport. As the Rust ecosystem has exceptional Python (PyO3) and JavaScript (wasm-bindgen) bindings, we can mostly get away with transparently wrapping this common Client library for these languages. This drastically decreases the volume of code we must write to expose Perspective's API to new languages, simplifying the maintenance of these language bindings.
The following new crates are introduced:
- `perspective-viewer` (wasm32) is the renamed 2.x `perspective` crate, and contains the `<perspective-viewer>` component.
- `perspective-js` (wasm32) contains the JavaScript client/server bindings.
- `perspective-client` (native + wasm32) implements the new Protobuf client, both as an abstract base for `perspective-js` and `perspective-python`, and as a native Rust client for `perspective`.
API Changes
There are a number of API changes. In JavaScript:
- The `perspective.table()` constructor no longer supports inference for `Date` and `Datetime` for the JSON column and row formats. These non-JSON-compatible types can still be coerced into Perspective (or rather, the browser will auto-coerce these to numeric types, which Perspective can coerce further), but a schema must be provided to the constructor to inform Perspective to do so. This change simplifies the API quite a bit and makes its behavior consistent between Python, JavaScript and Rust.
- `perspective.worker()` and `perspective.websocket()` are now asynchronous and must be `await`-ed.
- `View.on_update()`, `View.on_remove()` and `View.on_delete()` now return callback ID values that must be provided to their reciprocal `View.remove_update()` (etc., respectively).
- `perspective.memory_usage()` is renamed `perspective.system_info()`.
In Python:
- `sync` and `async` clients are provided, with `async` being the default.
- `pandas.DataFrame` is no longer directly supported by `perspective.table()`, but it may still be loaded internally if `pyarrow` is available in the environment. This load path is both dramatically less code and faster than 2.x, but `pyarrow` is much more stringent about type coercion/inference than Perspective 2.x is.
Performance
Linear performance is not the point of this change; nevertheless, the benchmarks track a ~15% improvement for the JavaScript (linear) suite.
Other project improvements
The documentation content, build process, format, and publication platform have also been updated. Previously, Perspective's docs were built to Markdown via an arcane amalgam of `sphinx` and `jsdoc` and then built as `docusaurus` artifacts. While much of this content has been preserved, it has been mostly moved into the Rustdoc annotations in the API code itself, which allows us to use `cargo doc` to build the docs site. While this eliminates the language-specific API docs in favor of one unified cross-language (but Rust-centric) doc, the new API is much more consistent between languages.
Perspective's benchmarks have been updated to take advantage of the new API modularity as well, and we can now run the same benchmark suite with the same client across multiple language and runtime implementations at once, giving us true apples-to-apples performance comparisons across feature, version, number of cores, size of dataset, and platform.
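As a rough illustration of the callback ID change listed under API Changes (`View.on_update()` returning an ID that is later passed to `View.remove_update()`), the pattern can be sketched as below. `MockView`, its `emit()` helper and the numeric ID scheme are hypothetical stand-ins written for this example. They are not Perspective's implementation, and the real methods are called through the client rather than synchronously as here.

```javascript
// Hypothetical stand-in for a View; this is NOT Perspective's implementation.
class MockView {
    constructor() {
        this.nextId = 0;
        this.updateCallbacks = new Map(); // callback ID -> callback
    }

    // Register a callback and return the ID needed to remove it later.
    on_update(callback) {
        const id = this.nextId++;
        this.updateCallbacks.set(id, callback);
        return id;
    }

    // Remove a callback by the ID that on_update() returned.
    // Returns true if a callback was actually removed.
    remove_update(id) {
        return this.updateCallbacks.delete(id);
    }

    // Hypothetical helper: deliver an update to every registered callback.
    emit(update) {
        for (const callback of this.updateCallbacks.values()) {
            callback(update);
        }
    }
}

const view = new MockView();
const seen = [];

// Keep the returned ID; the callback function itself is no longer the handle.
const id = view.on_update((update) => seen.push(update));

view.emit("first");     // delivered to the callback
view.remove_update(id); // unregister using the ID
view.emit("second");    // not delivered, the callback was removed

console.log(seen);
```

The design point is that the handle for removal is the returned ID rather than the function reference, so the caller has to hold on to the value returned at registration time.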