Derecho-Project/cascade

Overview

Cascade is a C++17 cloud application framework powered by optimized RDMA data paths. It provides a K/V API for manipulating data in distributed memory and persistent storage. Beyond the K/V API, Cascade allows logic to be injected on the data paths for low-latency applications. Cascade's feature highlights include the following.

  • High throughput and low latency from zero-copy RDMA and an NVMe storage layer.
  • Timestamp-indexed versioning allows reproducing the system state at any time in the past.
  • Users can specify the Key and Value types of the K/V API.
  • Users can configure the application layout using the group, subgroup, and shard concepts inherited from Derecho.
  • Cascade inherits its fault-tolerance model from Derecho.
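
To make the timestamp-indexed versioning idea concrete, here is a minimal, self-contained C++17 sketch. It is not Cascade's API: the class `VersionedStore` and its methods `put` and `get_by_time` are hypothetical names invented for illustration, and the sketch keeps everything in a single process's memory. It only shows the lookup rule such a feature implies, namely that a read "as of" time t returns the latest version written at or before t.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical illustration only; this is NOT Cascade's API.
// Each key maps to an ordered history of (timestamp, value) versions.
class VersionedStore {
public:
    // Record a new version of `key`, stamped with `ts_us` (microseconds).
    void put(const std::string& key, std::uint64_t ts_us, std::string value) {
        history_[key][ts_us] = std::move(value);
    }

    // Return the value `key` held at time `ts_us`: the latest version
    // whose timestamp is <= ts_us, or nullopt if none existed yet.
    std::optional<std::string> get_by_time(const std::string& key,
                                           std::uint64_t ts_us) const {
        auto key_it = history_.find(key);
        if (key_it == history_.end()) return std::nullopt;
        const auto& versions = key_it->second;
        // upper_bound finds the first version strictly newer than ts_us.
        auto it = versions.upper_bound(ts_us);
        if (it == versions.begin()) return std::nullopt;
        return std::prev(it)->second;
    }

private:
    std::map<std::string, std::map<std::uint64_t, std::string>> history_;
};
```

With versions written at timestamps 100 and 200, a call to `get_by_time` with timestamp 150 returns the version written at 100, which is the sense in which a past state can be reproduced.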

Using Cascade

  • Cascade can be used both as a service and as a software library.
    • Used as a service, the developer would work in a client/server model.
    • The use of Cascade as a library is primarily for our own purposes, in creating the Cascade service. However, this approach could be useful for creating other services that need to layer some other form of functionality over a K/V infrastructure.
  • Cascade's most direct and efficient APIs aim at applications coded in C++, which is the language used in the Cascade implementation.
    • Within C++, we have found it useful to combine Cascade with a language-integrated query library such as LINQ (we can support both cpplinq and boolinq).
    • Doing so permits the developer to treat collections of objects or object histories as sets of K/V tuples, describing "transformations" on the data much as we would in a database setting, and leaving the runtime to make scheduling and object placement decisions on our behalf.
    • LINQ is closely related to models widely used in ML, such as the Spark concept of an RDD, or the TensorFlow model for tensors and sets of tensors. Cascade currently provides a C++ LINQ API for data retrieval.
    • We do not plan to require use of LINQ, but we do think it lends itself to concise, elegant code. We have extended the API to also support use from C# via the .NET Core CLR. This allows for development of user-defined logic in C# as well.
  • Cascade also supports a variety of remoting options. Through them, Cascade's K/V API can be accessed from other popular high-level languages, notably Java and Python.
  • Cascade also offers a file system API that maps to its K/V API through libfuse.
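
The "transformations over sets of K/V tuples" idea mentioned above can be sketched without any LINQ library. The example below uses only the C++17 standard library; the type alias `KV` and the function `values_with_prefix` are hypothetical names chosen for illustration and are not part of Cascade, cpplinq, or boolinq. It performs a filter followed by a projection, which a LINQ-style library would typically express as a chained "where" and "select" over the same collection.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical K/V tuple type for illustration; Cascade lets users
// choose their own key and value types.
using KV = std::pair<std::string, int>;

// Roughly: "from kvs where key starts with prefix select value".
std::vector<int> values_with_prefix(const std::vector<KV>& kvs,
                                    const std::string& prefix) {
    std::vector<int> out;
    for (const auto& [key, value] : kvs) {
        // Filter step: keep tuples whose key begins with `prefix`.
        if (key.compare(0, prefix.size(), prefix) == 0) {
            // Projection step: keep only the value.
            out.push_back(value);
        }
    }
    return out;
}
```

In a declarative query layer the caller states only the filter and the projection, which is what leaves room for a runtime to decide where and when the work is done.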

Installation

Prerequisites

  • Linux (other operating systems don't currently support the RDMA features we use). We recommend Ubuntu 18 or Ubuntu 20; other distributions should also work.
  • A C++ compiler supporting C++17: GCC 8.3+
  • CMake 3.10 or newer
  • Lohmann's json parser v3.2.0 or newer
  • Readline library v7.0 or newer. On Ubuntu, use apt install libreadline-dev to install it.
  • RPC library rpclib. For convenience, install it with this script.
  • Intel's regular expression library Hyperscan. For convenience, install it with this script. You need to install the ragel compiler if you don't have it. On Ubuntu, use apt-get install ragel to install it.
  • libfuse v3.9.3 or newer (Optional for file system API)
  • boolinq or newer (Optional for LINQ API)
  • Python 3.5 or newer and pybind11 (Optional for Python API)
  • OpenJDK 11.06 or newer. On Ubuntu, use apt install openjdk-11-jdk to install it. (Optional for Java API)
  • .NET Framework 6x. Please follow Microsoft's installation instructions for your Linux distribution here. (Optional for C# API)
  • Derecho v2.2.2. Please follow this document to install Derecho. Note: this Cascade version relies on Derecho commit 3f24e06ed5ad572eb82206e8c1024935d03e903e on the master branch.

Build Cascade

  1. Download Cascade source code
# git clone https://github.com/Derecho-Project/cascade
  2. Build Cascade source code
# cd cascade
# mkdir build
# cd build
# cmake ..
# make -j

Please note that the cmake script checks whether a python3 environment (along with pybind11) is available. If no Python environment is detected, the build silently disables Python support. If pybind11 is not installed in a standard location, which is very common when pybind11 is installed by pip3, use the -Dpybind11_DIR= option to tell cmake where pybind11 is located, as follows:

# cmake -Dpybind11_DIR=`pybind11-config --cmakedir` ..
  3. Install Cascade
# make install

This will install the following cascade components:

  • headers to ${CMAKE_INSTALL_INCLUDEDIR}/include/cascade
  • libraries to ${CMAKE_INSTALL_LIBDIR}
  • binaries (cascade_client, cascade_server, cascade_fuse_client, interactive_test.py, perf_test.py) to ${CMAKE_INSTALL_BINDIR}

Usage

There are two ways to use Cascade in an application. You can use Cascade as a standalone service with predefined K/V types and a configurable layout, or you can use the Cascade storage templates (defined in Cascade) as building blocks to build the application with the Derecho group framework. Please refer to the Cascade service's README for using Cascade as a service, and to the cli_example README for using Cascade components to build your own binary with customized key and value types.

New Features to Come

  1. Resource management