ZenBytes is a series of short, practical MLOps lessons taught with ZenML and its various integrations. It is intended for people who want to learn about MLOps in general, as well as for ML practitioners who want to get started with ZenML.
- Define an MLOps stack tailored to your project requirements.
- Build transparent and reproducible data-centric ML pipelines with automated artifact versioning, tracking, caching, and more.
- Deploy ML pipelines with tooling and infrastructure of your choice (e.g. as a serverless microservice in the cloud).
- Monitor and address production issues like performance drift, data drift, and concept drift.
- Use some of the most popular MLOps tools like ZenML, Kubeflow, MLflow, Weights & Biases, Evidently, Seldon, Feast, and many more.
In the end, you will be able to take any of your ML models from experimentation to a customized, fully fleshed-out, production-grade MLOps setup in a matter of minutes!
- Chapter 1: ML Pipeline Basics
  - Lesson 1.1: ML Pipelines with ZenML
  - Lesson 1.2: Artifact Versioning, Tracking, and Caching
- Chapter 2: Training, Deployment, and Serving
  - Lesson 2.1: Experiment Tracking with MLflow / W&B
  - Lesson 2.2: Local Deployment with MLflow
  - Lesson 2.3: Inference Pipelines
- Chapter 3: Data Management
  - Lesson 3.1: Data Drift Detection with Evidently / Whylabs
  - (Lesson 3.2: Data Validation with DeepChecks / GreatExpectations)
  - (Lesson 3.3: Feature Stores with Feast?)
- Chapter 4: Advanced Deployment
  - (Lesson 4.1: Model Serving with Seldon / BentoML?)
  - (Lesson 4.2: Serverless Deployment with Seldon & Kubeflow)
  - Lesson 4.3: Serverless Cloud Deployment with Seldon & Kubeflow on AWS
- Chapter 5: Full Examples
  - (Lesson 5.1: Zero to Hero with ZenML - from Experimentation to Production-Grade MLOps)
  - (Lesson 5.2: More Examples - zenml example run and ZenFiles)
ZenML is an extensible, open-source MLOps framework for creating production-ready ML pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and offers interfaces and abstractions tailored to ML workflows.
If you enjoy these courses and want to learn more:
- Give the Main ZenML Repo a GitHub Star ⭐ to show your love!
- Join our Slack Community and become part of the ZenML family!
- Linux or MacOS
- Python 3.7 or 3.8
- Jupyter notebook and ZenML:
pip install zenml notebook
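Before installing, it can help to confirm that the active interpreter matches the versions listed above. The following is only a minimal sketch and not part of ZenBytes: the function name `is_supported_python` is hypothetical, and the check assumes the interpreter is available on your `PATH` as `python3`.

```shell
# Hypothetical helper (not part of ZenML or ZenBytes): succeeds only for
# Python 3.7.x or 3.8.x, the versions listed in the requirements above.
is_supported_python() {
    case "$1" in
        3.7|3.7.*|3.8|3.8.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the interpreter is on PATH as python3; adjust if yours differs.
if command -v python3 >/dev/null 2>&1; then
    version="$(python3 -c 'import platform; print(platform.python_version())')"
    if is_supported_python "$version"; then
        echo "Python $version is supported"
    else
        echo "Python $version is outside the 3.7/3.8 range listed above"
    fi
fi
```

If the check reports an unsupported version, creating a virtual environment with Python 3.7 or 3.8 before running `pip install` is one way to proceed.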
As you progress through the course, you will need to install additional packages for the various other MLOps tools you are going to use. You will find corresponding instructions in the respective notebooks, but we recommend you install all of the integrations ahead of time with the following commands:
zenml integration install sklearn -f
zenml integration install dash -f
zenml integration install wandb -f
zenml integration install evidently -f
zenml integration install mlflow -f
zenml integration install kubeflow -f
zenml integration install seldon -f
zenml integration install s3 -f
zenml integration install aws -f
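The commands above can also be run in a loop. This is a sketch rather than an official ZenML feature: `install_integrations` and the `DRY_RUN` variable are hypothetical names introduced here, and the loop assumes bash or a POSIX shell where unquoted variables are split on spaces.

```shell
# Integrations from the list above, in the same order.
INTEGRATIONS="sklearn dash wandb evidently mlflow kubeflow seldon s3 aws"

# Hypothetical helper (not part of the ZenML CLI).
# With DRY_RUN=1 it only prints each command; otherwise it runs the
# commands one by one and stops at the first failure.
install_integrations() {
    for name in $INTEGRATIONS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "zenml integration install $name -f"
        else
            zenml integration install "$name" -f || return 1
        fi
    done
}
```

Running `DRY_RUN=1 install_integrations` previews the commands without installing anything; calling `install_integrations` on its own performs the installation.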
For some of the advanced lessons you also need to have the following additional packages installed on your machine:
| Package | MacOS installation | Linux installation |
|---|---|---|
| docker | Docker Desktop for Mac | Docker Engine for Linux |
| kubectl | kubectl for Mac | kubectl for Linux |
| k3d | Brew installation of k3d | k3d installation for Linux |
If you haven't done so already, clone ZenBytes to your local machine. Then, simply use Jupyter Notebook to go through the course lesson-by-lesson, starting with 1-1_Pipelines.ipynb:
git clone https://github.com/zenml-io/zenbytes
cd zenbytes
jupyter notebook
1. (MacOS) When starting the container registry for Kubeflow, I get an error about port 5000 not being available.
OSError: [Errno 48] Address already in use
Solution: For Kubeflow to run, the Docker container registry currently needs to be on port 5000. MacOS, however, uses port 5000 for the AirPlay Receiver. Here is a guide on how to fix this: Freeing up port 5000.
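To see which process currently holds the port, something like the following may help. It is a sketch under assumptions: `port_in_use` is a hypothetical helper name, it relies on `lsof` being installed, and without elevated privileges `lsof` may not show processes owned by other users.

```shell
# Hypothetical helper: reports whether a process is listening on a TCP port.
# Returns 0 if a listener was found (and prints it), 1 if none was found,
# and 2 if lsof is not installed.
port_in_use() {
    command -v lsof >/dev/null 2>&1 || return 2
    # -n and -P skip host and port name lookups;
    # -sTCP:LISTEN restricts the output to listening sockets.
    if lsof -nP -iTCP:"$1" -sTCP:LISTEN; then
        return 0
    else
        return 1
    fi
}
```

Calling `port_in_use 5000` before starting the container registry lists any process that is listening on port 5000, which makes it easier to confirm that the port has been freed.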