Thank you for contributing to Marqo! Contributions from the open source community help make Marqo the tensor search engine you want.
See here for how to run unit tests.
- Have a running Marqo-OS instance available to use. You can spin up a local instance with the following command (if you are using an arm64 machine, replace `marqoai/marqo-os:0.0.3` with `marqoai/marqo-os:0.0.3-arm`):

```bash
docker run --name marqo-os -id -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" marqoai/marqo-os:0.0.3
```
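Once the container is up, you can optionally check that Marqo-OS is responding. Note that `admin:admin` below is the stock OpenSearch default credential pair and `-k` skips verification of the self-signed certificate; both are assumptions, so adjust them to match your setup:

```bash
# -k: the local instance uses a self-signed certificate
# admin:admin is the OpenSearch default (assumed here)
curl -ku admin:admin https://localhost:9200
```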
- Clone the GitHub repo:

```bash
git clone https://github.com/marqo-ai/marqo.git
```

- Install Marqo dependencies:

```bash
cd marqo
pip install -r requirements.txt
```
- Run the following command:

```bash
# if you are running Marqo-OS locally:
export OPENSEARCH_URL="https://localhost:9200"
export PYTHONPATH="${PYTHONPATH}:$(pwd)/src"
cd src/marqo/tensor_search
uvicorn api:app --host 0.0.0.0 --port 8882 --reload
```
Notes:
- This is for marqo-os (Marqo OpenSearch) running locally. You can alternatively set `OPENSEARCH_URL` to a remote Marqo OpenSearch cluster.
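A quick way to confirm that Marqo came up is to request the OpenAPI spec (the same endpoint described at the bottom of this page):

```bash
# should return a JSON document describing the Marqo API
curl http://localhost:8882/openapi.json
```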
Marqo uses redis to handle concurrency throttling. Redis is automatically set up when running Marqo in docker (Options B-D), but if you are running Marqo locally on your machine (Options A and E), you will have to set redis up yourself to enable throttling.
Note: This setup is optional. If you do not have redis set up properly, Marqo will still run as normal, but throttling will be disabled (you will see warnings containing `There is a problem with your redis connection...`). To suppress these warnings, disable throttling completely with:

```bash
export MARQO_ENABLE_THROTTLING='FALSE'
```
The redis-server version to install is redis 7.0.8. Install it using these commands for Ubuntu 22.04:

```bash
apt-get update
apt-get install redis-server -y
```
If you are using an older version of Ubuntu, this may install an older version of redis. To get the latest redis version, run these commands instead:
```bash
apt install lsb-release
curl -fsSL https://packages.redis.io/gpg | gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | tee /etc/apt/sources.list.d/redis.list
apt-get update
apt-get install redis-server -y
```
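You can confirm which version was installed (it should be 7.x) with:

```bash
redis-server --version
```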
To start up redis, simply run the command:

```bash
redis-server /etc/redis/redis.conf
```

The `/etc/redis/redis.conf` configuration file should have been automatically created during the redis installation step.
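To verify that redis is up and reachable, ping it; it should reply with `PONG`:

```bash
redis-cli ping
```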
Option B. Build and run Marqo as a Docker container that creates and manages its own internal Marqo-OS:
- `cd` into the marqo root directory
- Run the following command:

```bash
docker rm -f marqo &&
DOCKER_BUILDKIT=1 docker build . -t marqo_docker_0 &&
docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway marqo_docker_0
```
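The command above runs the container in the foreground. If you prefer to run it detached and inspect the output separately, a variant like the following (using the standard `-d` and `docker logs` flags) should work:

```bash
docker rm -f marqo &&
DOCKER_BUILDKIT=1 docker build . -t marqo_docker_0 &&
docker run -d --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway marqo_docker_0
docker logs -f marqo   # follow the startup logs; Ctrl-C stops following, not the container
```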
Option C. Build and run Marqo as a Docker container, connecting to Marqo-OS which is running on the host:
- Run the following command to run Marqo-OS (if you are using an arm64 machine, replace `marqoai/marqo-os:0.0.3` with `marqoai/marqo-os:0.0.3-arm`):

```bash
docker run --name marqo-os -id -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" marqoai/marqo-os:0.0.3
```
- `cd` into the marqo root directory
- Run the following command:

```bash
docker rm -f marqo &&
DOCKER_BUILDKIT=1 docker build . -t marqo_docker_0 &&
docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e "OPENSEARCH_URL=https://localhost:9200" marqo_docker_0
```
Notes:
- This is for marqo-os (Marqo OpenSearch) running locally. You can alternatively set `OPENSEARCH_URL` to a remote Marqo OpenSearch cluster.
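For example, pointing Marqo at a remote cluster is the same run command with a different URL; the hostname below is a placeholder, substitute your own cluster's address:

```bash
docker rm -f marqo &&
docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e "OPENSEARCH_URL=https://my-remote-opensearch.example.com:9200" marqo_docker_0  # placeholder hostname
```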
Option D. Run Marqo as a Docker container, using the image from Docker Hub:

```bash
docker rm -f marqo &&
docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway marqoai/marqo:latest
```
Option E. Run Marqo locally on an arm64 machine:

- Run marqo-os:

```bash
docker run --name marqo-os -id -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" marqoai/marqo-os:0.0.3-arm
```
- Clone the Marqo GitHub repo (if not already done):

```bash
git clone https://github.com/marqo-ai/marqo.git
```

- Change into the Marqo directory:

```bash
cd marqo
```
- Install some dependencies (requires Homebrew):

```bash
brew install cmake
brew install protobuf
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs/ | sh
```
- Set up redis (follow the instructions in Option A)
- Install Marqo dependencies:

```bash
pip install -r requirements.txt
```
- Change into the tensor search directory:

```bash
CWD=$(pwd)
cd src/marqo/tensor_search/
```
- Run Marqo:

```bash
export OPENSEARCH_URL="https://localhost:9200" &&
export PYTHONPATH="${PYTHONPATH}:${CWD}/src" &&
uvicorn api:app --host 0.0.0.0 --port 8882 --reload
```
Whether you need to do anything to use a GPU with Marqo depends on whether you are running Marqo within Docker (Options B, C, and D) or locally (Options A and E).
Marqo running outside Docker relies on the system setup to use the GPU. If you can use a GPU normally with pytorch, then you should be good to go. The usual caveats apply, though: the CUDA version of pytorch will need to match that of the GPU drivers (see below on how to check).
Currently, only CUDA-based (Nvidia) GPUs are supported. If you have a GPU on the host machine and want to use it with Marqo, there are two things to do:
- Add the `--gpus all` flag to the `docker run` command. This flag was excluded from the commands above, but it allows the GPUs to be used within Marqo. For example, in Options B, C, and D above, `--gpus all` should be added after the `docker run --name marqo` part of the command; e.g. Option C from above would become:
```bash
docker rm -f marqo &&
DOCKER_BUILDKIT=1 docker build . -t marqo_docker_0 &&
docker run --name marqo --gpus all --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e "OPENSEARCH_URL=https://localhost:9200" marqo_docker_0
```
Note that `--gpus all` has been added.
- Install nvidia-docker2, which is required for the GPU to work with Docker. The link provided has instructions for installing it, but it should consist of only a couple of steps (refer to the link for full details). The three steps below should install it for an Ubuntu-based machine:
```bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
    && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
    && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
        sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
        sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-docker2
```
Once this is installed, one of the previous Docker commands can be run (either Option B, C, or D).
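Before that, you can confirm Docker can see the GPU with NVIDIA's usual smoke test: running `nvidia-smi` inside a CUDA base container. The image tag below is just an example; any available `nvidia/cuda` tag will do:

```bash
# the image tag is an example; substitute an nvidia/cuda tag available on Docker Hub
sudo docker run --rm --gpus all nvidia/cuda:11.4.2-base-ubuntu20.04 nvidia-smi
```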
- Install Docker
  To install Docker (through the terminal), go to the Official Docker Website
- Set up SSH config (to stop timeouts)
  Edit the SSH config file with `nano ~/.ssh/config`, then insert the line: `ServerAliveInterval 50`
- Run marqo-os:

```bash
sudo docker rm -f marqo-os; sudo docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" marqoai/marqo-os:0.0.3-arm
```
- Run Marqo with `OPENSEARCH_URL` set:

```bash
sudo docker rm -f marqo; sudo docker run --name marqo -it --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway -e "OPENSEARCH_URL=https://localhost:9200" marqoai/marqo:latest
```
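At this point both containers should be up, which you can confirm with:

```bash
sudo docker ps   # should list both the marqo and marqo-os containers
```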
In order for the GPU to be used within Marqo, the underlying host needs to have NVIDIA drivers installed. The current driver can be checked by typing `nvidia-smi` in a terminal. If there is no output, then there may be something wrong with the GPU setup, and installing or updating the drivers may be necessary.
Aside from having the correct drivers installed, a matching version of CUDA is required. The Marqo Dockerfile comes set up to use CUDA 11.4.2 by default, and can be easily modified to support different versions of CUDA. Note that ONNX requires the system CUDA, while pytorch relies on its own bundled version of CUDA.
To see if a GPU is available when using pytorch, the following can be used to check (from python):

```python
import torch
torch.cuda.is_available()  # is a GPU available
torch.version.cuda         # get the CUDA version
torch.cuda.device_count()  # get the number of devices
```
To check your driver and the maximum CUDA version supported, type the following into the terminal:

```bash
nvidia-smi
```
Pytorch comes with its own bundled CUDA, which allows many different CUDA versions to be used. Follow the pytorch getting started guide to see how to install different versions of pytorch and CUDA.
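As a rough illustration, installing a pytorch build against a specific CUDA release generally looks like the following; treat the version in the index URL as a placeholder and copy the exact command from the getting started page:

```bash
# example only: install a pytorch build bundled with CUDA 11.8
pip3 install torch --index-url https://download.pytorch.org/whl/cu118
```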
To get just the JSON spec, run this command (if Marqo is running locally):

```bash
curl http://localhost:8882/openapi.json
```

To get the human-readable spec, visit http://localhost:8882/docs
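Since the raw JSON comes back as one long line, piping it through a formatter makes it easier to read, e.g. (assuming Python 3 is installed):

```bash
curl -s http://localhost:8882/openapi.json | python3 -m json.tool
```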