In this example we are going to convert this generic notebook based on the Telco Customer Churn Prediction competition into a Kubeflow pipeline.
The objective of this task is to analyze customer behavior in the telecommunication sector and to predict their tendency to churn.
Environment:

| Name | Version |
| --- | --- |
| Kubeflow | v1.4 |
| kfp | 1.8.11 |
| kubeflow-kale | 0.6.0 |
| pip | 21.3.1 |
Vanilla KFP Pipeline: Kubeflow lightweight component method

To get started, visit the Kubeflow Pipelines documentation to get acquainted with what pipelines are, their components, pipeline metrics, and how to pass data between components in a pipeline. There are several ways to build out a pipeline component, as mentioned here. In the following example, we use lightweight Python function-based components to build our Kubeflow pipeline.
Kale KFP Pipeline

To get started, visit Kale's documentation to get acquainted with the Kale user interface (UI) from a Jupyter Notebook, notebook cell annotation, and how to create a machine learning pipeline using Kale. In the following example, we use the Kale JupyterLab extension to build our Kubeflow pipeline.
In the vanilla KFP method, a Python function is created to carry out each task, and the function is passed to the kfp component method `create_component_from_func`.
The different components used in this example are:
- Load data
- Transform data
- Feature Engineering
- Catboost Modeling
- Xgboost Modeling
- Lightgbm Modeling
- Ensembling
A Kubeflow pipeline connects all components together to create a directed acyclic graph (DAG). The kfp `dsl.pipeline` decorator was used to create a pipeline function, and the kfp component methods `InputPath` and `OutputPath` were used to pass data between components in the pipeline.
Finally, `create_run_from_pipeline_func` from the KFP SDK `Client` was used to submit the pipeline directly from the pipeline function.
To create a pipeline with vanilla KFP:

- Open your Kubeflow cluster, create a Notebook Server, and connect to it.
- Clone this repo and navigate to this directory.
- Open the telco-customer-churn-kfp notebook.
- Run the telco-customer-churn-kfp notebook from start to finish.
- View run details immediately after submitting the pipeline.
To create a pipeline using the Kale JupyterLab extension:

- Clone the GitHub repo and navigate to this directory.
- Launch the telco-customer-churn-kale notebook.
- Install the requirements.txt file. After installation, restart the kernel.
- Enable the Kale extension in JupyterLab.
- The notebook's cells are automatically annotated with Kale tags.
To fully understand the different Kale tags available, visit the Kale documentation.
The following Kale tags were used in this example:
- Imports
- Pipeline Step
- Skip Cell
With the use of Kale tags we define the following:
- Pipeline parameters are assigned using the "pipeline parameters" tag
- The necessary libraries that need to be used throughout the Pipeline are passed through the "imports" tag
- Notebook cells are assigned to specific Pipeline components (download data, load data, etc.) using the "pipeline step" tag
- Cell dependencies are defined between the different pipeline steps with the "depends on" flag
- Pipeline metrics are assigned using the "pipeline metrics" tag
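For illustration (a sketch assuming Kale's standard tag syntax, which stores annotations in each cell's metadata; the step names follow the pipeline steps used in this example), a cell assigned to the transform step that depends on the load step would carry metadata like:

```json
{
  "tags": [
    "block:transform_data",
    "prev:load_data"
  ]
}
```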
The pipeline steps created in this example:
- Load data
- Transform data
- Feature Engineering
- Catboost Modeling
- Xgboost Modeling
- Lightgbm Modeling
- Ensembling
- Compile and run the notebook by hitting "Compile & Run" in Kale's left panel.
- View the pipeline by clicking "View" in Kale's left panel.