This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

A simple guide to MLOps through ZenML and its various integrations.

ZenBytes

ZenBytes is a series of short practical MLOps lessons through ZenML and its various integrations. It is intended for people looking to learn about MLOps generally, and also for ML practitioners who want to get started with ZenML.

💡 What you will learn

  • Define an MLOps stack tailored to your project requirements.
  • Build transparent and reproducible data-centric ML pipelines with automated artifact versioning, tracking, caching, and more.
  • Deploy ML pipelines with tooling and infrastructure of your choice (e.g., as a serverless microservice in the cloud).
  • Monitor and address production issues like performance drift, data drift, and concept drift.
  • Use some of the most popular MLOps tools like ZenML, Kubeflow, MLflow, Weights & Biases, Evidently, Seldon, Feast, and many more.

In the end, you will be able to take any of your ML models from experimentation to a customized, fully fleshed-out, production-grade MLOps setup in a matter of minutes!

🧑‍🏫 Syllabus

  • Chapter 1: ML Pipeline Basics
    • Lesson 1.1: ML Pipelines with ZenML
    • Lesson 1.2: Artifact Versioning, Tracking, and Caching
  • Chapter 2: Training, Deployment, and Serving
    • Lesson 2.1: Experiment Tracking with MLflow / W&B
    • Lesson 2.2: Local Deployment with MLflow
    • Lesson 2.3: Inference Pipelines
  • Chapter 3: Data Management
    • Lesson 3.1: Data Drift Detection with Evidently / Whylabs
    • (Lesson 3.2: Data Validation with DeepChecks / GreatExpectations)
    • (Lesson 3.3: Feature Stores with Feast?)
  • Chapter 4: Advanced Deployment
    • (Lesson 4.1: Model Serving with Seldon / BentoML?)
    • (Lesson 4.2: Serverless Deployment with Seldon & Kubeflow)
    • Lesson 4.3: Serverless Cloud Deployment with Seldon & Kubeflow on AWS
  • Chapter 5: Full Examples
    • (Lesson 5.1: Zero to Hero with ZenML - from Experimentation to Production-Grade MLOps)
    • (Lesson 5.2: More Examples - zenml example run and ZenFiles)

🙏 About ZenML

ZenML is an extensible, open-source MLOps framework for creating production-ready ML pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and provides interfaces and abstractions tailored to ML workflows.

If you enjoy these courses and want to learn more:

💻 Setup

System Requirements

  • Linux or MacOS
  • Python 3.7 or 3.8
  • Jupyter notebook and ZenML: pip install zenml notebook
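
The requirements above can be checked before installing anything. The following is a minimal sketch, not part of the course itself; the function name `find_setup_problems` and the wording of its messages are made up for illustration.

```python
import platform
import sys

# platform.system() reports "Darwin" on MacOS.
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}
# The course lists Python 3.7 or 3.8 as requirements.
SUPPORTED_PYTHONS = {(3, 7), (3, 8)}


def find_setup_problems(version_info=None, system=None):
    """Return a list of human-readable problems; an empty list means OK.

    Both arguments default to the running interpreter and operating system,
    and can be overridden to check a hypothetical environment.
    """
    if version_info is None:
        version_info = sys.version_info
    if system is None:
        system = platform.system()

    problems = []
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"unsupported operating system: {system}")
    major_minor = (version_info[0], version_info[1])
    if major_minor not in SUPPORTED_PYTHONS:
        problems.append(
            f"unsupported Python version: {major_minor[0]}.{major_minor[1]}"
        )
    return problems


# Report any problems with the current environment.
for problem in find_setup_problems():
    print(problem)
```

If the script prints nothing, the operating system and Python version match the list above. It does not check whether `zenml` or `notebook` are installed.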

Integrations

As you progress through the course, you will need to install additional packages for the various other MLOps tools you are going to use. You will find corresponding instructions in the respective notebooks, but we recommend you install all of the integrations ahead of time with the following commands:

zenml integration install sklearn -f
zenml integration install dash -f
zenml integration install wandb -f
zenml integration install evidently -f
zenml integration install mlflow -f
zenml integration install kubeflow -f
zenml integration install seldon -f
zenml integration install s3 -f
zenml integration install aws -f
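
If you prefer to script the installation, the same commands can be generated from a list. This is only a sketch: the helper names `build_install_commands` and `install_integrations` are hypothetical, and the script assumes the `zenml` CLI is already on your `PATH`. As written it only prints the commands; nothing is installed until you call `install_integrations` yourself.

```python
import subprocess

# The integrations listed in the commands above, in the same order.
INTEGRATIONS = [
    "sklearn", "dash", "wandb", "evidently", "mlflow",
    "kubeflow", "seldon", "s3", "aws",
]


def build_install_commands(integrations):
    """Build one `zenml integration install <name> -f` command per name."""
    return [["zenml", "integration", "install", name, "-f"]
            for name in integrations]


def install_integrations(integrations):
    """Run the install commands one by one, stopping at the first failure."""
    for command in build_install_commands(integrations):
        subprocess.run(command, check=True)


# Dry run: show what would be executed without installing anything.
for command in build_install_commands(INTEGRATIONS):
    print(" ".join(command))
```

To actually install everything, call `install_integrations(INTEGRATIONS)`. With `check=True`, a failing install raises `subprocess.CalledProcessError` instead of silently continuing.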

Additional Requirements

For some of the advanced lessons you also need to have the following additional packages installed on your machine:

package | MacOS installation       | Linux installation
docker  | Docker Desktop for Mac   | Docker Engine for Linux
kubectl | kubectl for mac          | kubectl for linux
k3d     | Brew Installation of k3d | k3d installation linux

🚀 Getting Started

If you haven't done so already, clone ZenBytes to your local machine. Then, simply use Jupyter Notebook to go through the course lesson-by-lesson, starting with 1-1_Pipelines.ipynb:

git clone https://github.com/zenml-io/zenbytes
cd zenbytes
jupyter notebook

❓ FAQ

1. (MacOS) When starting the container registry for Kubeflow, I get an error about port 5000 not being available.

OSError: [Errno 48] Address already in use

Solution: For Kubeflow to run, the Docker container registry currently needs to be at port 5000. MacOS, however, uses port 5000 for the AirPlay Receiver. See the guide "Freeing up port 5000" for how to fix this.
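
To find out whether port 5000 is already taken before starting the registry, you can try to bind to it. The snippet below is a rough sketch using only the Python standard library; `port_is_free` is a hypothetical helper, and it checks only the IPv4 loopback address by default, which is an assumption about where the conflict shows up.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "Address already in use" (errno 48 on MacOS).
            return False
    return True


if port_is_free(5000):
    print("Port 5000 is free.")
else:
    print("Port 5000 is already in use.")
```

A result of "free" only describes the moment of the check; another process can still claim the port afterwards.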
