FuseQuery is a Cloud Distributed SQL Query Engine at scale.
Cloud-Native and Distributed ClickHouse from scratch in Rust.
Give thanks to ClickHouse and Arrow.
-
High Performance
- Everything is Parallelism
-
High Scalability
- Everything is Distributed
-
High Reliability
- True Separation of Storage and Compute
Crate | Description | Status |
---|---|---|
distributed | Distributed scheduler and executor for planner | WIP |
optimizers | Optimizer for Distributed&Local plan | WIP |
datablocks | Vectorized data processing unit | WIP |
datastreams | Async streaming iterators | WIP |
datasources | Interface to the datasource(system.numbers for performance/Fuse-Store) | WIP |
executors | Executor(EXPLAIN/SELECT) for the Pipeline | WIP |
functions | Scalar and Aggregation Functions | WIP |
processors | Dataflow Streaming Processor | WIP |
planners | Distributed&Local planners for building processor pipelines | WIP |
servers | Server handler(MySQL/HTTP) | MySQL |
transforms | Data Stream Transform(Source/Filter/Projection/AggregatorPartial/AggregatorFinal/Limit) | WIP |
- Projection
- Filter
- Limit
- Aggregate
- Functions
- Filter Push-Down
- Projection Push-Down (TODO)
- Distributed Query (WIP)
- Sorting (TODO)
- Joins (TODO)
- SubQueries (TODO)
- Memory SIMD-Vector processing performance only
- Dataset: 10,000,000,000 (10 Billion)
- Hardware: 8vCPUx16G Cloud Instance
- Rust: rustc 1.50.0-nightly (f76ecd066 2020-12-15)
- Build with Link-time Optimization and Using CPU Specific Instructions
Query | FuseQuery (v0.1) | ClickHouse (v19.17.6) |
---|---|---|
SELECT avg(number) FROM system.numbers_mt | (0.60 s.) | ×2.27 slow, (1.70 s.) 5.90 billion rows/s., 47.16 GB/s |
SELECT sum(number) FROM system.numbers_mt | (0.75 s.) | ×1.79 slow, (1.34 s.) 7.48 billion rows/s., 59.80 GB/s |
SELECT min(number) FROM system.numbers_mt | (0.67 s.) | ×1.75 slow, (1.57 s.) 6.36 billion rows/s., 50.89 GB/s |
SELECT max(number) FROM system.numbers_mt | (0.70 s.) | ×2.53 slow, (2.33 s.) 4.34 billion rows/s., 34.74 GB/s |
SELECT max(number+1) FROM system.numbers_mt | (2.92 s.) | ×1.13 slow, (3.29 s.) 3.04 billion rows/s., 24.31 GB/s |
SELECT count(number) FROM system.numbers_mt | (0.44 s.) | ×1.52 slow, (0.67 s.) 15.00 billion rows/s., 119.99 GB/s |
SELECT sum(number+number+number) FROM numbers_mt | (3.21 s.) | ×1.22 slow, (4.95 s.) 2.02 billion rows/s., 16.17 GB/s |
SELECT sum(number) / count(number) FROM system.numbers_mt | (0.67 s.) | ×1.54 slow, (1.28 s.) 7.84 billion rows/s., 62.73 GB/s |
SELECT sum(number) / count(number), max(number), min(number) FROM system.numbers_mt | (1.14 s.) | ×3.54 slow, (4.03 s.) 2.33 billion rows/s., 18.61 GB/s |
Note:
- ClickHouse system.numbers_mt is 8-way parallelism processing
- FuseQuery system.numbers_mt is 8-way parallelism processing
Run from source
$ make run
12:46:15 [ INFO] Options { log_level: "debug", num_cpus: 8, mysql_handler_port: 3307 }
12:46:15 [ INFO] Fuse-Query Cloud Compute Starts...
12:46:15 [ INFO] Usage: mysql -h127.0.0.1 -P3307
or Run with docker(Recommended):
$ docker pull datafusedev/fuse-query
...
$ docker run --init --rm -p 3307:3307 datafusedev/fuse-query
05:12:36 [ INFO] Options { log_level: "debug", num_cpus: 6, mysql_handler_port: 3307 }
05:12:36 [ INFO] Fuse-Query Cloud Compute Starts...
05:12:36 [ INFO] Usage: mysql -h127.0.0.1 -P3307
or Download the release binary here:
https://github.com/datafusedev/fuse-query/releases
$ mysql -h127.0.0.1 -P3307
mysql> explain select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Limit: 3
Projection: (number + 1) as c1:UInt64, (number / 2) as c2:UInt64
Filter: (((c1 + c2) + 1) < 100)
ReadDataSource: scan parts [8](Read from system.numbers_mt table, Read Rows:10000000, Read Bytes:80000000) |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
mysql> explain pipeline select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
└─ LimitTransform × 1 processor
└─ Merge (LimitTransform × 8 processors) to (MergeProcessor × 1)
└─ LimitTransform × 8 processors
└─ ProjectionTransform × 8 processors
└─ FilterTransform × 8 processors
└─ SourceTransform × 8 processors |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+------+------+
| c1 | c2 |
+------+------+
| 1 | 0 |
| 2 | 0 |
| 3 | 1 |
+------+------+
3 rows in set (0.06 sec)
$ make test
- 0.1 support aggregation select
- 0.2 support distributed query (WIP)
- 0.3 support group by, order by
- 0.4 support join
- 0.5 support sub queries
- 0.6 support TPC-H benchmark