Cloudberry Middleware

Software Architecture

`Neo`

This module is the entry point for clients. It sets up the web server that wraps the middleware logic. The server is implemented with the Play Framework.

conf
This folder contains the following configuration files:
- routes : defines the HTTP entry points.
- logback.xml : defines the log level.
- application.conf : defines the application settings used in development.
- production.conf : the same as application.conf, but used on the production machines.
app
The server-related Scala code.
- controllers
  Defines the web server application. The HTTP entry-point functions declared in routes are implemented here.
- views
  Contains the *.scala.html template files that Play renders.
- actor
  [To be changed] Currently, it contains a NeoActor, which translates the JSON request from the web page into a Cloudberry JSON request. It mainly exists to simplify the front-end JS logic, since the author was more comfortable with a strongly typed language. The translation could be moved to the web page so that the JS sends the Cloudberry request directly.
- db
  Connects to AsterixDB and checks whether the berry.meta dataset exists. The dataset is created if it is not found, and the metadata is loaded into memory once.
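The check-then-load behavior of the db package can be sketched as follows. This is a minimal illustration, not the actual implementation: `MetaStore`, `Loader`, and their methods are hypothetical names, and the store is an abstract interface rather than a real AsterixDB connection.

```scala
object MetaLoaderSketch {
  // Hypothetical abstraction over the database connection (an assumption, not the real API).
  trait MetaStore {
    def exists(dataset: String): Boolean
    def create(dataset: String): Unit
    def load(dataset: String): Seq[String]
  }

  final class Loader(store: MetaStore, dataset: String = "berry.meta") {
    // `lazy val` runs the body at most once, so the metadata is loaded a single time
    // and then served from memory on every later access.
    lazy val metadata: Seq[String] = {
      if (!store.exists(dataset)) store.create(dataset) // create the dataset if it is missing
      store.load(dataset)
    }
  }
}
```

Reading `metadata` a second time returns the cached value without touching the store again.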
public
The frontend resource folder. Please read the frontend documentation.

`Zion`

Zion contains the kernel of the middleware work. It is composed of the following general components:

Request Parser
Parses the incoming request and forwards it to the Query Planner.
Query Planner
Rewrites queries based on the available view information. If an appropriate view exists, the original query is split into multiple queries that ask different datasets. After all results come back, the Query Planner merges them and returns the merged result to the client. Not every query can find an appropriate view, especially right after the system starts. In that case, instead of waiting for the entire query result, Cloudberry can return a series of partial results in a streaming fashion at a steady pace. It splits the query into a series of mini-queries. The selectivity of each mini-query is adapted to the observed query performance so that each mini-query is guaranteed to finish within a short time limit.
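The mini-query idea can be sketched as a loop that slices the query range and resizes the next slice according to how long the previous one took. This is only an illustration under stated assumptions: `MiniQuerySketch`, `Slice`, `nextWidth`, and `run` are hypothetical names, the range is modeled as a numeric interval (for example, a time range), and the proportional resizing rule is a simple stand-in rather than Cloudberry's actual adaptation strategy. It aims at a time budget but does not by itself guarantee one.

```scala
object MiniQuerySketch {
  // Hypothetical slice of the query range, covering [from, until).
  final case class Slice(from: Long, until: Long)

  // Scale the previous slice width by target/observed time, never going below 1.
  def nextWidth(prevWidth: Long, elapsedMs: Long, targetMs: Long): Long = {
    val scaled = (prevWidth.toDouble * targetMs / math.max(elapsedMs, 1L)).toLong
    math.max(1L, scaled)
  }

  // Runs mini-queries over [from, until) and emits each partial result as it arrives.
  // `exec` returns the partial result together with the elapsed time in milliseconds.
  def run[A](from: Long, until: Long, firstWidth: Long, targetMs: Long)(
      exec: Slice => (A, Long))(emit: A => Unit): Unit = {
    var cursor = from
    var width = math.max(1L, firstWidth)
    while (cursor < until) {
      val slice = Slice(cursor, math.min(until, cursor + width))
      val (partial, elapsedMs) = exec(slice)
      emit(partial)                                 // stream the partial result to the client
      cursor = slice.until
      width = nextWidth(width, elapsedMs, targetMs) // adapt the selectivity of the next mini-query
    }
  }
}
```

A slow mini-query shrinks the next slice and a fast one widens it, which keeps partial results arriving at a roughly steady pace.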
Data Manager
This component covers all data-related modules: the Metadata manager, the View manager, and the Datastore manager.

- Data Store
  The Datastore and Metadata managers mainly map data formats and register the data.
- View Manager
  To speed up query results, views are periodically updated and, when available, used to answer queries.

Database Adapter
Cloudberry maintains one adapter per database to handle the different query languages and connections.

Below are the detailed packages of `Zion` in the code base:

actor
It has several types of actors that handle the workflows.
- BerryClient
  Each web connection creates one BerryClient, which uses the JSONParser to produce a group of AQL queries for a DataSetAgent to execute.
- DataSetAgent
  Each AsterixDB dataset is served by exactly one DataSetAgent, which runs the AQL queries and updates on that dataset. This uniqueness guarantees read and update consistency.
- DataStoreManager
  A single global actor that syncs with the berry.meta AsterixDB dataset, which stores the view descriptions and relations.
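The one-agent-per-dataset rule can be illustrated without an actor library. The sketch below is an assumption-laden stand-in, not the real actors: `AgentRegistrySketch`, `Agent`, and `agentFor` are hypothetical names, and plain `synchronized` blocks take the place of an actor mailbox to serialize reads and updates.

```scala
import scala.collection.mutable

object AgentRegistrySketch {
  // Hypothetical stand-in for a DataSetAgent: every operation on one dataset
  // goes through the same lock, so reads never observe a half-applied update.
  final class Agent(val dataset: String) {
    private var records = Vector.empty[String]
    def update(record: String): Unit = synchronized { records = records :+ record }
    def query(): Vector[String] = synchronized { records }
  }

  private val agents = mutable.Map.empty[String, Agent]

  // Returns the single agent for a dataset, creating it on first use.
  def agentFor(dataset: String): Agent = synchronized {
    agents.getOrElseUpdate(dataset, new Agent(dataset))
  }
}
```

Because every caller receives the same `Agent` instance for a given dataset name, queries and updates on that dataset are applied one at a time, in a single order.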
common
Configuration file (Asterix URL, timeouts)
model
The object model code.
- datastore
  Defines the interface of the datastore-related query model.
- impl
  - AQLGenerator: Its generate function translates a Query into the corresponding AQL query, with some syntax validation. It uses the AQLFuncVisitor utility to handle functions such as relation functions (e.g., contains, in) and aggregation functions (e.g., count, min, max). Reference 3
  - JSONParser: It parses a given JSON record into a Query. In addition, because the query is part of the DataSetInfo, which must be serialized to and deserialized from AsterixDB, we also implement the write interface, which converts a Query back to a JSONRecord. Reference 1, 2
  - DataSetInfo: The dataset metadata, which contains the Schema, the CreateQuery (if it is a view), and statistical information (e.g., creation time, update time, cardinality). We implemented the corresponding JSON formatter to serialize/deserialize a DataSetInfo to/from a JSONRecord, so it can be stored in AsterixDB. Reference 1, 2
- schema
  Schema model: Query, Schema, and Functions types. Reference 2
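The translation step performed by a generator like AQLGenerator can be sketched with a tiny query model. Everything here is a simplified assumption: `AqlSketch`, its `Query` and `Filter` types, and the two supported filters are hypothetical and much smaller than the real schema model, and the output is an AQL-style string for illustration only.

```scala
object AqlSketch {
  // Hypothetical, simplified query model; the real Query and function types differ.
  sealed trait Filter
  final case class Contains(field: String, value: String) extends Filter
  final case class Equals(field: String, value: Long) extends Filter
  final case class Query(dataset: String, filters: Seq[Filter])

  private val v = "$t" // the variable bound to each record in the generated query

  // Quote a string literal, escaping backslashes and double quotes.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Visit one filter and render it as a predicate, in the spirit of a function visitor.
  private def visit(f: Filter): String = f match {
    case Contains(field, value) => s"contains($v.$field, ${quote(value)})"
    case Equals(field, value)   => s"$v.$field = $value"
  }

  // Validate the query minimally, then assemble the query string.
  def generate(q: Query): String = {
    require(q.dataset.nonEmpty, "dataset name must not be empty")
    val where =
      if (q.filters.isEmpty) "" else q.filters.map(visit).mkString(" where ", " and ", "")
    s"for $v in dataset ${q.dataset}$where return $v"
  }
}
```

Keeping the per-function rendering in `visit` means a new function type only needs one more `case`, while `generate` stays unchanged.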
