This is a simple wrapper/enabler for running Apache Drill on Apache Mesos.
Dromedar (DRill On MEsos aDAptoR) is launched via Marathon. Whenever a query request comes in, it launches a number of Drillbits that depends on the size of the dataset being queried. The query-scale-factor (QSF) determines how many Drillbits are launched in relation to the dataset size and defaults to 1 Drillbit per 100 MB (1:100), or qsf=100 for short.
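To make the QSF arithmetic concrete, here is a minimal sketch of how a Drillbit count could be derived from a dataset size. The function name `drillbits_for`, the rounding up, and the minimum of one Drillbit are assumptions made for illustration; they are not taken from Dromedar's code.

```python
import math


def drillbits_for(dataset_mb, qsf=100):
    """Return how many Drillbits to launch for a dataset of `dataset_mb` megabytes.

    qsf is the query-scale-factor: megabytes of data per Drillbit (default 100).
    Rounding up and the minimum of one Drillbit are assumptions of this sketch.
    """
    if qsf <= 0:
        raise ValueError("qsf must be positive")
    return max(1, math.ceil(dataset_mb / qsf))


print(drillbits_for(250))          # 3 Drillbits at the default qsf=100
print(drillbits_for(250, qsf=50))  # 5 Drillbits: a smaller QSF scales out more
```

With this reading, lowering the QSF launches more Drillbits for the same dataset, and raising it launches fewer.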
Dromedar's architecture is as follows:
+----------------+ +----------------------------------------+
|    Marathon    | |           Mesos worker node            |
|                | |                                        |
|                | |       +-------------------------+      |
|                | |       |                         |      |
|                | |       |        Drillbit         <---------[3]-------> SQL client
|                | |       +------------+------------+      |
|                | |                   [2]                  |
|                | |       +------------+------------+      |
|                | |       |                         |      |
|                +----[2]-->    drillbit.sh start    |      |
|                | |       +-------------------------+      |
|                | |                                        |
|                | |                                        |
|                | |       +-------------------------+      |
|                | |       |                         |      |
|                | |HTTP API                         |      |
|                <----[2]--+         qsf.py          <---------[1]-------- [QSF]
|                | |       |                         |      |
|                | |       +-------------------------+      |
+----------------+ +----------------------------------------+
Dromedar's underlying long-running service is qsf.py, which itself is initially deployed through dromedar.py using Marathon. Once qsf.py is running as a Web service, it performs the following steps:
- As input, it takes a QSF via its HTTP interface on port 9876.
- It uses the Marathon HTTP API to trigger the on-demand creation of Drillbits using the drillbit.sh start command.
- The SQL client connects to (one of) the Drillbit(s) and executes the SQL query.
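The first two steps can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not Dromedar's implementation: the request format `/?qsf=50`, the app id `/dromedar/drillbit`, the Marathon address, and the cpus/mem values are all hypothetical. Only the port 9876, the `drillbit.sh start` command, and the use of Marathon's HTTP API (here its `POST /v2/apps` endpoint) come from the description above.

```python
import json
import urllib.request
from urllib.parse import parse_qs, urlparse

QSF_PORT = 9876                         # port named in the text
MARATHON_URL = "http://localhost:8080"  # assumption: Marathon's address
DRILLBIT_APP_ID = "/dromedar/drillbit"  # hypothetical Marathon app id


def parse_qsf(path, default=100):
    """Extract the QSF from a request path such as '/?qsf=50' (hypothetical format)."""
    values = parse_qs(urlparse(path).query).get("qsf")
    return int(values[0]) if values else default


def build_launch_request(instances, marathon_url=MARATHON_URL):
    """Build, but do not send, a Marathon 'create app' request for Drillbits."""
    app = {
        "id": DRILLBIT_APP_ID,
        "cmd": "drillbit.sh start",  # command named in the text
        "instances": instances,      # number of Drillbits to run
        "cpus": 1.0,                 # assumed resource sizing
        "mem": 1024,                 # assumed resource sizing, in MB
    }
    return urllib.request.Request(
        marathon_url + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_launch_request(instances=3)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it to a running Marathon instance. Building the request separately from sending it keeps the sketch testable without a cluster.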
Dromedar depends on the following:
- Apache Mesos 0.22.x
- Marathon 0.8.1
- Apache Drill 0.8.0
- marathon-python
Note that Dromedar itself installs Apache Drill and the marathon-python package. Only Mesos and Marathon are assumed to be available already.
$ ./launch.sh
Then, go to the Marathon UI, where the Dromedar application should be listed.
Roadmap:
- Bootstrap (install Drill, launch Dromedar via Marathon)
- Implement QSF HTTP API
- Implement Drillbit launch/teardown based on requests
- Clarify relation/communication between QSF and SQL client (out of band??)
- Strata implementation cross-check
- Cluster deployment and testing
- HAProxy deployment?
- Examples and video walkthrough