
Add benchmark suite for Python, Refactor module loading for environments where C++ cannot be built #859

Merged: 8 commits merged into master from py-bench on Jan 7, 2020

Conversation

@sc1f (Contributor) commented on Jan 6, 2020

This PR adds a primitive benchmark suite for perspective-python, following the convention set by the JavaScript benchmark suite: it benchmarks Table construction from different datasets, view construction with different pivot settings, and serialization of data in all formats from each of those views. Benchmark results are written to a perspective.Table, which is hosted via Tornado when the suite finishes (or when Ctrl-C is used to exit the script).
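For orientation, hosting the results Table might look roughly like the sketch below, based on perspective-python's Tornado integration at the time; the table name, sample row, and port are illustrative, not taken from the PR.

    import tornado.ioloop
    import tornado.web

    from perspective import PerspectiveManager, PerspectiveTornadoHandler, Table

    # `results` stands in for the benchmark rows collected by the suite.
    results = [{"name": "to_records", "group": "view", "time": 0.012}]

    manager = PerspectiveManager()
    manager.host_table("benchmark_results", Table(results))  # name is illustrative

    app = tornado.web.Application([
        (r"/websocket", PerspectiveTornadoHandler, {"manager": manager}),
    ])
    app.listen(8888)  # port is illustrative
    tornado.ioloop.IOLoop.current().start()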

Additionally, this PR refactors the module loading semantics for perspective-python: when the C++ runtime is not present, a reasonable error message is logged, and symbols that depend on the C++ binding are not exported from the module. PerspectiveWidget in client mode becomes the only usable symbol; Table, PerspectiveManager, PerspectiveTornadoHandler, and PerspectiveViewer are not exported at all. A new test module has been added to assert correct behavior in this case.
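Sketched loosely, the described pattern could look like the following; the module paths, flag name, and logger call are my assumptions, not the PR's actual code.

    # Hypothetical sketch of perspective/__init__.py; names are illustrative.
    import logging

    try:
        # These imports require the compiled C++ binding.
        from .table import Table
        from .manager import PerspectiveManager
        from .tornado_handler import PerspectiveTornadoHandler
        from .viewer import PerspectiveViewer
        _BINDING_LOADED = True
    except ImportError:
        logging.error(
            "Perspective's C++ binding could not be loaded; "
            "only `PerspectiveWidget` in client mode is available."
        )
        _BINDING_LOADED = False

    # PerspectiveWidget has no hard dependency on the binding in client mode,
    # so it is exported unconditionally.
    from .widget import PerspectiveWidget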

The error message in question:

[Screenshot: the error message logged when the C++ binding cannot be loaded]

Finally, minor refactors were made to to_format for performance.

@sc1f added the labels internal (Internal refactoring and code quality improvement) and Python on Jan 6, 2020
@texodus (Member) left a comment

Looks good! Thanks for the PR!

I made a few notes re: an ultimate plan to unify our WebAssembly/Python benchmarking into a single tool; these don't need follow-up.

    >>> func = Benchmark(lambda: self._view.to_records(), meta={
    ...     "name": "to_records",
    ...     "group": "view"
    ... })

@texodus (Member) commented:

Are these for use as a decorator?

    @Benchmark
    def _to_records_test():
        self._view.to_records()

If so, I think the signature of __init__ is wrong - it wouldn't take a func. I see Benchmark is not used as a decorator here or in the perspective.py benchmarks themselves, but it is referenced as a decorator in the docs.

@sc1f (Contributor, Author) replied:

The docs are a little out of sync; Benchmark wraps a parameterless lambda and a metadata dictionary, which is a simpler implementation I went with instead of decorators.
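For concreteness, that shape is roughly the following; this is a sketch inferred from the thread, and the attribute names are my own, not necessarily the PR's.

    class Benchmark(object):
        """Wraps a parameterless callable and a metadata dict, per the reply above."""

        def __init__(self, func, meta=None):
            self._func = func        # parameterless lambda to be timed
            self._meta = meta or {}  # e.g. {"name": "to_records", "group": "view"}

        def __call__(self):
            return self._func()

    # Mirrors the quoted snippet; `view` stands for a perspective View instance.
    func = Benchmark(lambda: view.to_records(), meta={
        "name": "to_records",
        "group": "view",
    })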

    with open(arrow_path, "wb") as file:
        arrow = self._table.view().to_arrow()
        file.write(arrow)
    self._WROTE_RESULTS = True

@texodus (Member) commented:

It would be nice if this method could incrementally write to an existing arrow so we can locally archive multiple versions, by checking to see if a benchmark.arrow file exists on load. It would also enable cross-language benchmarks in the same archive.
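One way that suggestion could look is sketched below using pyarrow, assuming to_arrow() returns Arrow IPC stream bytes and that old and new results share a schema; write_results and arrow_path are illustrative names, not part of the PR.

    import os

    import pyarrow as pa

    def write_results(arrow_path, new_arrow_bytes):
        # Parse the freshly serialized benchmark results.
        new_table = pa.ipc.open_stream(new_arrow_bytes).read_all()

        # If a previous archive exists on disk, append to it rather than overwrite.
        if os.path.exists(arrow_path):
            with pa.memory_map(arrow_path, "r") as source:
                old_table = pa.ipc.open_file(source).read_all()
            new_table = pa.concat_tables([old_table, new_table])

        # Rewrite the combined archive in Arrow IPC file format.
        with pa.OSFile(arrow_path, "wb") as sink:
            with pa.ipc.new_file(sink, new_table.schema) as writer:
                writer.write_table(new_table)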

@texodus texodus merged commit b13ac1e into master Jan 7, 2020
@texodus texodus deleted the py-bench branch January 7, 2020 03:48