Implementations for the BI workload of the LDBC Social Network Benchmark. See our VLDB 2023 paper and its presentation for details on the design and implementation of the benchmark.
To get started with the LDBC SNB benchmarks, visit the ldbcouncil.org site.
📜 If you wish to cite the LDBC SNB, please refer to the documentation repository (bib snippet).
The repository contains the following implementations:
- `cypher`: an implementation using the Neo4j graph database management system, with queries expressed in the Cypher language
- `umbra`: an implementation using the Umbra JIT-compiled columnar relational database management system, with queries expressed in SQL (PostgreSQL dialect)
- `tigergraph`: an implementation using the TigerGraph graph database management system, with queries expressed in the GSQL language
All implementations use Docker containers for ease of setup and execution. However, the setups can be adjusted to use a non-containerized DBMS.
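Because the setup and benchmark scripts rely on Docker, it can be useful to confirm that the Docker CLI and daemon are available before starting. This is a hypothetical sanity check, not part of the repository's scripts:

```bash
# Hypothetical sanity check (not part of this repository):
# verify that the Docker CLI is installed and the daemon is reachable.
docker --version
docker info > /dev/null && echo "Docker daemon is reachable"
```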
Running an SNB BI experiment requires the following steps (see the sketch after this list for an end-to-end example):

- Pick a system, e.g. Umbra. Make sure you have the required binaries and licenses available.
- Generate the data sets using the SNB Datagen according to the format described in the system's README.
- Generate the substitution parameters using the `paramgen` tool.
- Load the data set: set the required environment variables and run the tool's `scripts/load-in-one-step.sh` script.
- Run the benchmark: set the required environment variables and run the tool's `scripts/benchmark.sh` script.
- Collect the results in the `output` directory of the tool.
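For illustration, here is a minimal end-to-end sketch for the Umbra implementation at scale factor 10. It assumes the data set and the substitution parameters have already been generated (steps 2 and 3) and that any additional system-specific environment variables described in the system's README are set.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: assumes the data set and the substitution
# parameters have already been generated and that any additional
# system-specific environment variables (beyond SF) are already set.
set -eu

export SF=10                    # scale factor of the generated data set

cd umbra                        # pick a system
scripts/load-in-one-step.sh     # load the data set
scripts/benchmark.sh            # run the benchmark
ls output/                      # results are collected in the output directory
cd ..
```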
To cross-validate two implementations, load the data set into both systems, then run the benchmark in validation mode in each. For example, to cross-validate the Cypher and Umbra results, run:
export SF=10
cd cypher
scripts/benchmark.sh --validate
cd ..
cd umbra
scripts/benchmark.sh --validate
cd ..
scripts/cross-validate.sh cypher umbra
See `.circleci/config.yml` for an up-to-date example of how to use the projects in this repository.
Pre-generated data sets and substitution parameters are available.
To run the scoring on a full benchmark run, use the `scripts/score-full.sh` script, passing the system name and the scale factor, e.g.:
scripts/score-full.sh umbra 100
The script prints its summary to the standard output and saves the detailed output tables in the `scoring` directory (as `.tex` files).