This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

A simple guide to MLOps through ZenML and its various integrations.

ZenBytes

ZenBytes is a series of short, practical MLOps lessons taught through ZenML and its various integrations. It is intended for people who want to learn about MLOps in general, as well as for ML practitioners who want to get started with ZenML.

🚩 ZenML 0.20.0 chapters coming soon!

The release of ZenML 0.20.0 marks a big breaking change in ZenML history, and requires an equally big update to this repository. The ZenML team is working on this renovation as we speak, and will bring you a brand-new ZenBytes with the latest ZenML soon!

💡 What you will learn

  • Define an MLOps stack tailored to your project requirements.
  • Build transparent and reproducible data-centric ML pipelines with automated artifact versioning, tracking, caching, and more.
  • Deploy ML pipelines with tooling and infrastructure of your choice (e.g. as a serverless microservice in the cloud).
  • Monitor and address production issues like training-serving skew and data drift.
  • Use some of the most popular MLOps tools like ZenML, Kubeflow, MLflow, Weights & Biases, Evidently, Seldon, Feast, and many more.

By the end, you will be able to take any of your ML models from experimentation to a customized, fully fleshed-out, production-grade MLOps setup in a matter of minutes!

🧑‍🏫 Syllabus

The series is structured into four chapters with several lessons each. Click on any of the links below to open the respective lesson directly in Colab.

| 🍡 1. ML Pipelines | ♻️ 2. Training / Serving | 📁 3. Data Management | 🚀 4. Advanced Deployment |
|---|---|---|---|
| 1.1 ML Pipelines | 2.1 Experiment Tracking | 3.1 Data Skew | 4.1 Cloud Deployment |
| 1.2 Artifact Lifecycle | 2.2 Local Deployment | | |
| | 2.3 Inference Pipelines | | |

🙏 About ZenML

ZenML is an extensible, open-source MLOps framework for creating production-ready ML pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and provides interfaces and abstractions tailored to ML workflows.

If you enjoy these courses and want to learn more:

💻 Setup

System Requirements

  • Linux or MacOS
  • Python 3.7 or 3.8
  • Jupyter notebook and ZenML: pip install zenml notebook
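
The requirements above can be checked from Python before starting the lessons. The following is a minimal sketch, not part of ZenBytes itself: the function name `check_requirements` and the constants are hypothetical, and it relies on `platform.system()` reporting MacOS as `"Darwin"`.

```python
import platform
import sys

# Operating systems and Python versions listed under "System Requirements".
# platform.system() reports MacOS as "Darwin".
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}
SUPPORTED_PYTHONS = {(3, 7), (3, 8)}


def check_requirements(system=None, python_version=None):
    """Return a list of problems; an empty list means the requirements are met.

    Both arguments default to the values of the running interpreter and exist
    mainly so the check can be tried out with other values.
    """
    if system is None:
        system = platform.system()
    if python_version is None:
        python_version = sys.version_info
    major_minor = tuple(python_version[:2])

    problems = []
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"unsupported operating system: {system}")
    if major_minor not in SUPPORTED_PYTHONS:
        problems.append("unsupported Python version: {}.{}".format(*major_minor))
    return problems


problems = check_requirements()
print(problems if problems else "System requirements look fine")
```

The check only covers the operating system and the Python version; it does not verify that `zenml` and `notebook` are installed.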

Integrations

As you progress through the course, you will need to install additional packages for the various other MLOps tools we will use. You will find corresponding instructions in the respective notebooks, but we recommend you install all integrations ahead of time with the following command:

zenml integration install sklearn dash wandb evidently mlflow kubeflow seldon s3 aws -y

Additional Requirements

For the advanced deployment lessons in chapter 4, you will also need to have the following additional packages installed on your machine:

| package | MacOS installation | Linux installation |
|---|---|---|
| docker | Docker Desktop for Mac | Docker Engine for Linux |
| kubectl | kubectl for mac | kubectl for linux |
| k3d | Brew Installation of k3d | k3d installation linux |
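
To see which of these tools are still missing before starting chapter 4, a short check like the one below may help. It is a sketch under the assumption that each tool is installed under its usual executable name (`docker`, `kubectl`, `k3d`); the helper `missing_tools` is hypothetical and not part of ZenML.

```python
import shutil

# Executable names of the additional packages listed in the table above.
REQUIRED_TOOLS = ("docker", "kubectl", "k3d")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("docker, kubectl and k3d are all available")
```

Note that `shutil.which` only confirms that an executable is on the PATH. It does not check, for example, whether the Docker daemon is actually running.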

🚀 Getting Started

If you haven't done so already, clone ZenBytes to your local machine. Then, use Jupyter Notebook to go through the course lesson-by-lesson, starting with 1-1_Pipelines.ipynb:

git clone https://github.com/zenml-io/zenbytes
cd zenbytes
jupyter notebook

❓ FAQ

1. ZenML cannot find a component even though I have it in my stack

After you update or switch your ZenML stack, the change is sometimes not picked up immediately in Jupyter notebooks.

Solution: First, run zenml stack describe to make sure the correct component is installed and registered in your currently active stack. If the component is there, restart the kernel of your Jupyter notebook, which also reloads the stack.
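
The first step can also be run from inside a notebook cell. The sketch below simply shells out to the `zenml stack describe` command mentioned above; the wrapper `describe_active_stack` is a hypothetical helper, and it assumes the `zenml` executable is on the PATH of the notebook's environment.

```python
import shutil
import subprocess


def describe_active_stack(executable="zenml"):
    """Return the output of `zenml stack describe`, or None if the CLI is not on the PATH."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "stack", "describe"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error messages into the returned text
        universal_newlines=True,
        check=False,  # report failures in the output instead of raising
    )
    return result.stdout


output = describe_active_stack()
print(output if output is not None else "zenml executable not found on PATH")
```

If the component shows up in this output but ZenML still cannot find it, restarting the kernel remains the fix.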

2. (MacOS) When starting the container registry for Kubeflow, I get an error about port 5000 not being available

OSError: [Errno 48] Address already in use

Solution: For Kubeflow to run, the Docker container registry currently needs to be at port 5000. MacOS, however, uses port 5000 for the AirPlay Receiver. The guide "Freeing up port 5000" describes how to fix this.
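
To check whether port 5000 is still occupied after following the guide, you can try to bind to it yourself. This is a minimal sketch using only the Python standard library; the helper `is_port_free` is hypothetical, and binding on all interfaces (`0.0.0.0`) is an assumption about how the registry listens.

```python
import socket


def is_port_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers errors such as "[Errno 48] Address already in use".
            return False
    return True


print("port 5000 free:", is_port_free(5000))
```

The socket is closed again as soon as the check finishes, so the check itself does not keep the port occupied.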
