A simple guide to MLOps through ZenML and its various integrations. This repository is still a WIP. Please start with Chapter 0 in
Chapter 000 - Basics of ZenML
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and has interfaces/abstractions that are catered towards ML workflows.
Check out the ZenML repository and Docs for more details.
To run the entire demo, you need a few packages installed on your machine. Only some parts of the codebase require them, and you might get away with just Python and pip install -r requirements.txt for the rest, but we recommend installing all of them:
Currently, this will only run on UNIX systems.
package | macOS installation | Linux installation |
---|---|---|
docker | Docker Desktop for Mac | Docker Engine for Linux |
kubectl | kubectl for mac | kubectl for linux |
k3d | Brew Installation of k3d | k3d installation linux |
You might also need to install Anaconda to get the MLflow deployment to work.
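Before moving on, it can help to confirm that the command-line tools from the table above are actually reachable from your shell. The following is a minimal sketch using only the Python standard library; the helper name find_missing_tools is hypothetical and not part of ZenML. It only checks that each executable is on your PATH, not that its version is compatible.

```python
import shutil

# Tools listed in the requirements table; this checks PATH presence only,
# not versions or whether the Docker daemon is running.
REQUIRED_TOOLS = ("docker", "kubectl", "k3d")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If the script reports a missing tool, install it using the matching entry in the table and run the check again.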
Once you've got the system requirements figured out, let's jump into the Python packages you need. Within the Python environment of your choice, run:
git clone https://github.com/zenml-io/
cd "Chapter 000 - Basics of ZenML"
pip install -r requirements.txt
If you are running the run.py script, you will also need to install some integrations using zenml:
zenml integration install evidently -f
zenml integration install mlflow -f
zenml integration install kubeflow -f
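If you prefer to script the three installs above rather than type them one by one, something like the following could work. It is a sketch under assumptions: the helper names are hypothetical, the commands are copied verbatim from the lines above, and dry_run defaults to True so that nothing is executed unless you ask for it.

```python
import subprocess

# Integration names taken from the install commands above.
INTEGRATIONS = ("evidently", "mlflow", "kubeflow")


def integration_install_commands(integrations=INTEGRATIONS):
    """Build one `zenml integration install <name> -f` command per integration."""
    return [["zenml", "integration", "install", name, "-f"] for name in integrations]


def install_integrations(integrations=INTEGRATIONS, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in integration_install_commands(integrations):
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing install.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    install_integrations(dry_run=True)
```

Calling install_integrations(dry_run=False) runs the commands for real and requires the zenml CLI to be installed in the active environment.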
We're ready to go now. You have two options:
You can go through the step-by-step guide in the notebook:
jupyter notebook
You can also run the code directly using the run.py script:
zenml init
python run.py # Runs pipeline locally
Once you are done running all notebooks, you might want to stop all running processes. To do so, run the following commands. (This will tear down your k3d cluster and the local Docker registry.)
zenml stack set local_kubeflow_stack
zenml stack down -f
- macOS: When starting the container registry for Kubeflow, I get an error about port 5000 not being available.
OSError: [Errno 48] Address already in use
Solution: For Kubeflow to run, the Docker container registry currently needs to be on port 5000. macOS, however, uses port 5000 for the AirPlay Receiver. Here is a guide on how to fix this: Freeing up port 5000.
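To confirm that port 5000 is the culprit before changing any system settings, a quick check like the one below may help. It is a minimal sketch using only the Python standard library; the function name port_is_free is hypothetical and not part of ZenML or Kubeflow. It tries to bind the port, which fails with an OSError if another process already holds it.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if the given port can be bound on host, False otherwise."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # bind raises OSError (for example "Address already in use")
            # when another process is already holding the port.
            sock.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    if port_is_free(5000):
        print("Port 5000 is free.")
    else:
        print("Port 5000 is in use; free it before starting the registry.")
```

Run the check again after following the guide; the registry should only be started once the port is reported as free.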