feat: generated protobuf classes (#1)
1 parent ebc766d, commit 70a3cb8
Showing 26 changed files with 888 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
src/substrait/gen/** linguist-generated=true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
[submodule "third_party/substrait"] | ||
path = third_party/substrait | ||
url = https://github.com/substrait-io/substrait |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
# Getting Started | ||
## Get the repo | ||
Fork and clone the repo. | ||
``` | ||
git clone --recursive https://github.com/<your-fork>/substrait-python.git | ||
cd substrait-python | ||
``` | ||
## Update the substrait submodule locally | ||
This might be necessary if you are updating an existing checkout. | ||
``` | ||
git submodule sync --recursive | ||
git submodule update --init --recursive | ||
``` | ||
## Upgrade the substrait submodule | ||
You will need to regenerate protobuf classes if you do this (run `gen_proto.sh`). | ||
``` | ||
cd third_party/substrait | ||
git checkout <version> | ||
cd - | ||
git commit . -m "Use submodule <version>" | ||
``` | ||
|
||
|
||
# Setting up your environment | ||
## Conda env | ||
Create a conda environment with developer dependencies. | ||
``` | ||
conda env create -f environment.yml | ||
conda activate substrait-python-env | ||
``` | ||
|
||
# Build | ||
## Python package | ||
Editable installation. | ||
``` | ||
pip install -e . | ||
``` | ||
|
||
## Generate protocol buffers | ||
Generate the protobuf files manually. Requires protobuf `v3.20.1`. | ||
``` | ||
./gen_proto.sh | ||
``` | ||
|
||
# Test | ||
Run tests in the project's root dir. | ||
``` | ||
pytest | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
# Substrait | ||
|
||
A Python package for [Substrait](https://substrait.io), the cross-language specification for data compute operations. | ||
|
||
## Goals | ||
This project aims to provide a Python interface for the Substrait specification. It will allow users to construct and manipulate a Substrait Plan from Python for evaluation by a Substrait consumer, such as DataFusion or DuckDB. | ||
|
||
## Non-goals | ||
This project is not an execution engine for Substrait Plans. | ||
|
||
## Status | ||
This is an experimental package that is still under development. | ||
|
||
# Example | ||
At the moment, this project contains only generated Python classes for the Substrait protobuf messages. Let's use an existing Substrait producer, [Ibis](https://ibis-project.org), to provide an example using Python Substrait as the consumer. | ||
## Produce a Substrait Plan with Ibis | ||
``` | ||
In [1]: import ibis | ||
In [2]: movie_ratings = ibis.table( | ||
   ...:     [ | ||
   ...:         ("tconst", "str"), | ||
   ...:         ("averageRating", "str"), | ||
   ...:         ("numVotes", "str"), | ||
   ...:     ], | ||
   ...:     name="ratings", | ||
   ...: ) | ||
   ...: | ||
In [3]: query = movie_ratings.select( | ||
   ...:     movie_ratings.tconst, | ||
   ...:     avg_rating=movie_ratings.averageRating.cast("float"), | ||
   ...:     num_votes=movie_ratings.numVotes.cast("int"), | ||
   ...: ) | ||
In [4]: from ibis_substrait.compiler.core import SubstraitCompiler | ||
In [5]: compiler = SubstraitCompiler() | ||
In [6]: protobuf_msg = compiler.compile(query).SerializeToString() | ||
In [7]: type(protobuf_msg) | ||
Out[7]: bytes | ||
``` | ||
## Consume the Substrait Plan using Python Substrait | ||
``` | ||
In [8]: import substrait | ||
In [9]: from substrait.gen.proto.plan_pb2 import Plan | ||
In [10]: my_plan = Plan() | ||
In [11]: my_plan.ParseFromString(protobuf_msg) | ||
Out[11]: 186 | ||
In [12]: print(my_plan) | ||
relations { | ||
  root { | ||
    input { | ||
      project { | ||
        common { | ||
          emit { | ||
            output_mapping: 3 | ||
            output_mapping: 4 | ||
            output_mapping: 5 | ||
          } | ||
        } | ||
        input { | ||
          read { | ||
            common { | ||
              direct { | ||
              } | ||
            } | ||
            base_schema { | ||
              names: "tconst" | ||
              names: "averageRating" | ||
              names: "numVotes" | ||
              struct { | ||
                types { | ||
                  string { | ||
                    nullability: NULLABILITY_NULLABLE | ||
                  } | ||
                } | ||
                types { | ||
                  string { | ||
                    nullability: NULLABILITY_NULLABLE | ||
                  } | ||
                } | ||
                types { | ||
                  string { | ||
                    nullability: NULLABILITY_NULLABLE | ||
                  } | ||
                } | ||
                nullability: NULLABILITY_REQUIRED | ||
              } | ||
            } | ||
            named_table { | ||
              names: "ratings" | ||
            } | ||
          } | ||
        } | ||
        expressions { | ||
          selection { | ||
            direct_reference { | ||
              struct_field { | ||
              } | ||
            } | ||
            root_reference { | ||
            } | ||
          } | ||
        } | ||
        expressions { | ||
          cast { | ||
            type { | ||
              fp64 { | ||
                nullability: NULLABILITY_NULLABLE | ||
              } | ||
            } | ||
            input { | ||
              selection { | ||
                direct_reference { | ||
                  struct_field { | ||
                    field: 1 | ||
                  } | ||
                } | ||
                root_reference { | ||
                } | ||
              } | ||
            } | ||
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION | ||
          } | ||
        } | ||
        expressions { | ||
          cast { | ||
            type { | ||
              i64 { | ||
                nullability: NULLABILITY_NULLABLE | ||
              } | ||
            } | ||
            input { | ||
              selection { | ||
                direct_reference { | ||
                  struct_field { | ||
                    field: 2 | ||
                  } | ||
                } | ||
                root_reference { | ||
                } | ||
              } | ||
            } | ||
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION | ||
          } | ||
        } | ||
      } | ||
    } | ||
    names: "tconst" | ||
    names: "avg_rating" | ||
    names: "num_votes" | ||
  } | ||
} | ||
version { | ||
  minor_number: 24 | ||
  producer: "ibis-substrait" | ||
} | ||
``` |
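The printed plan above shows that the final column names live under `relations` → `root` → `names`. As a minimal sketch of reading them back programmatically, the helper below walks that path. To stay runnable without the generated classes installed, it is demonstrated on a `SimpleNamespace` stand-in that mirrors the printed structure; the helper name `output_column_names` is hypothetical and not part of this package.

```python
from types import SimpleNamespace


def output_column_names(plan):
    """Collect the root-level output names of every relation in a plan.

    Only relies on the attribute path seen in the printed plan:
    plan.relations[i].root.names
    """
    return [list(rel.root.names) for rel in plan.relations]


# Stand-in object shaped like the printed plan (not a real protobuf message).
fake_plan = SimpleNamespace(
    relations=[
        SimpleNamespace(
            root=SimpleNamespace(names=["tconst", "avg_rating", "num_votes"])
        )
    ]
)

print(output_column_names(fake_plan))
# prints [['tconst', 'avg_rating', 'num_votes']]
```

Assuming the generated `Plan` class exposes these fields as ordinary protobuf attributes, calling `output_column_names(my_plan)` in the session above should yield the same three names.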
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
plugins: | ||
  - name: python | ||
    out: src/substrait/gen | ||
version: v1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
version: v1 | ||
directories: | ||
  - buf_work_dir |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
version: v1 | ||
breaking: | ||
  use: | ||
    - FILE | ||
lint: | ||
  use: | ||
    - DEFAULT |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
name: substrait-python-env | ||
channels: | ||
- conda-forge | ||
dependencies: | ||
- buf | ||
- pip | ||
- protobuf = 3.20.1 # protobuf==3.20 C extensions aren't compatible with 3.19.4 | ||
- protoletariat >= 2.0.0 | ||
- pytest >= 7.0.0 | ||
- python >= 3.8.1 | ||
- setuptools >= 61.0.0 | ||
- setuptools_scm >= 6.2.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
#!/usr/bin/env bash | ||
|
||
set -eou pipefail | ||
|
||
namespace=proto | ||
submodule_dir=./third_party/substrait | ||
src_dir="$submodule_dir"/proto | ||
tmp_dir=./buf_work_dir | ||
dest_dir=./src/substrait/gen | ||
|
||
# Prefix the protobuf files with a unique configuration to prevent namespace conflicts | ||
# with other substrait packages. Save output to the work dir. | ||
python "$submodule_dir"/tools/proto_prefix.py "$tmp_dir" "$namespace" "$src_dir" | ||
|
||
# Remove the old python protobuf files | ||
rm -rf "$dest_dir" | ||
|
||
# Generate the new python protobuf files | ||
buf generate | ||
protol --in-place --create-package --python-out "$dest_dir" buf | ||
|
||
# Remove the temporary work dir | ||
rm -rf "$tmp_dir" |
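The script above delegates the namespace rewrite to `tools/proto_prefix.py` in the submodule, which is not part of this diff. The following is therefore only a guess at the kind of rewrite involved, inferred from the README's import path `substrait.gen.proto.plan_pb2`: it swaps the top-level `substrait` package segment for the chosen namespace in `package` and `import` statements. The function name `rewrite_namespace` and the exact rules are assumptions, and the real tool may do more or behave differently.

```python
import re


def rewrite_namespace(source: str, namespace: str) -> str:
    """Replace the leading `substrait` package segment with `namespace`.

    Hypothetical illustration only: handles `package` and `import` lines
    and leaves everything else (including third-party imports) untouched.
    """
    # package substrait.extensions;  ->  package <namespace>.extensions;
    source = re.sub(
        r"^(\s*package\s+)substrait\b",
        rf"\g<1>{namespace}",
        source,
        flags=re.MULTILINE,
    )
    # import "substrait/type.proto";  ->  import "<namespace>/type.proto";
    source = re.sub(
        r'^(\s*import\s+(?:public\s+)?")substrait/',
        rf"\g<1>{namespace}/",
        source,
        flags=re.MULTILINE,
    )
    return source


example = (
    'syntax = "proto3";\n'
    "package substrait;\n"
    'import "substrait/type.proto";\n'
    'import "google/protobuf/any.proto";\n'
)
print(rewrite_namespace(example, "proto"))
```

With `namespace=proto`, as set in the script, the generated modules then import from a `proto` package rather than a top-level `substrait` one, which is how the conflict with other Substrait packages mentioned in the script's comment would be avoided.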
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
[project] | ||
name = "substrait" | ||
description = "A Python package for Substrait." | ||
authors = [{name = "Substrait contributors", email = "substrait@googlegroups.com"}] | ||
license = {text = "Apache-2.0"} | ||
readme = "README.md" | ||
requires-python = ">=3.8.1" | ||
dependencies = ["protobuf >= 3.20"] | ||
dynamic = ["version"] | ||
|
||
[tool.setuptools_scm] | ||
write_to = "src/substrait/_version.py" | ||
|
||
[project.optional-dependencies] | ||
gen_proto = ["protobuf == 3.20.1", "protoletariat >= 2.0.0"] | ||
test = ["pytest >= 7.0.0"] | ||
|
||
[tool.pytest.ini_options] | ||
pythonpath = "src" | ||
|
||
[build-system] | ||
requires = ["setuptools>=61.0.0", "setuptools_scm[toml]>=6.2.0"] | ||
build-backend = "setuptools.build_meta" |
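The metadata above uses two different protobuf constraints: a lower bound for runtime (`protobuf >= 3.20`) and an exact pin for code generation (`protobuf == 3.20.1`). As a rough sketch of how the two differ, the snippet below checks a few version strings against both. The tuple-based parsing is a deliberate simplification that only handles plain dotted releases; real tooling follows PEP 440, and the helper names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted release string such as '3.20.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


RUNTIME_MINIMUM = parse_version("3.20")   # from: dependencies = ["protobuf >= 3.20"]
CODEGEN_EXACT = parse_version("3.20.1")   # from: gen_proto = ["protobuf == 3.20.1", ...]


def ok_for_runtime(installed):
    """True if the installed version meets the runtime lower bound."""
    return parse_version(installed) >= RUNTIME_MINIMUM


def ok_for_codegen(installed):
    """True only for the exact version pinned for regenerating classes."""
    return parse_version(installed) == CODEGEN_EXACT


for candidate in ("3.19.4", "3.20.1", "4.21.0"):
    print(candidate, ok_for_runtime(candidate), ok_for_codegen(candidate))
# prints:
# 3.19.4 False False
# 3.20.1 True True
# 4.21.0 True False
```

The sketch only evaluates the declared bounds; it says nothing about whether classes generated with 3.20.1 actually work on every newer protobuf release.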
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
try: | ||
    from ._version import __version__ | ||
except ImportError: | ||
    pass |
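Because the `ImportError` above is swallowed, the package may end up without a `__version__` attribute, for example when `_version.py` has not yet been written by `setuptools_scm`. Code that reads the version may therefore want a fallback. A minimal sketch, using stand-in module objects rather than the real package; the helper name `package_version` and the version string are hypothetical:

```python
import types


def package_version(module, default="unknown"):
    """Return module.__version__ if it exists, otherwise `default`."""
    return getattr(module, "__version__", default)


# Stand-ins for a package imported with and without a generated _version.py.
with_version = types.ModuleType("pkg_with_version")
with_version.__version__ = "0.1.0"  # hypothetical version string
without_version = types.ModuleType("pkg_without_version")

print(package_version(with_version))     # prints 0.1.0
print(package_version(without_version))  # prints unknown
```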