Here we convert the code for the Kaggle Digit Recognizer competition (https://www.kaggle.com/competitions/digit-recognizer) to a Kubeflow pipeline. The objective of this task is to correctly identify digits from a dataset of tens of thousands of handwritten images.
Environment:
Name | Version |
---|---|
Kubeflow | v1.4 |
kfp | 1.8.11 |
kubeflow-kale | 0.6.0 |
pip | 21.3.1 |
The KFP version used for testing can be installed with `pip install kfp==1.8.11`.
Here, a Python function is created to carry out a specific task, and that function is passed to the kfp component method `create_component_from_func`, which turns it into a pipeline component.
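As a minimal sketch of this pattern (the function `scale_pixel`, the helper `build_scale_pixel_op` and the `python:3.8` base image are hypothetical and not taken from the notebook), a plain Python function can be written and checked on its own before it is wrapped as a component. The kfp import is deferred here only so that the plain function can be exercised without the SDK installed.

```python
def scale_pixel(value: int, max_value: int = 255) -> float:
    """Scale a raw pixel intensity to the range [0, 1]."""
    # Component functions must be self-contained: any imports they need
    # go inside the function body, because only the function source is
    # shipped to the container.
    return value / max_value


def build_scale_pixel_op(base_image: str = "python:3.8"):
    """Wrap scale_pixel as a KFP v1 component factory (requires kfp 1.8.x)."""
    # Deferred import (an assumption made for this sketch) so that
    # scale_pixel can be tested locally without kfp.
    from kfp.components import create_component_from_func

    return create_component_from_func(scale_pixel, base_image=base_image)


# Local check of the plain function, no cluster needed.
print(scale_pixel(255))
```

Calling `build_scale_pixel_op()` returns a factory that can then be invoked inside a pipeline function to create a task.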
A Kubeflow pipeline connects all components together to create a directed acyclic graph (DAG). The kfp `dsl.pipeline` decorator was used to create a pipeline function. The kfp component annotations `InputPath` and `OutputPath` were used to pass data between components.
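The sketch below illustrates how file-based data passing could look with the KFP v1 SDK. The two components, the pipeline name `sum-numbers` and the `python:3.8` base image are hypothetical examples, not the notebook's actual steps. The `ImportError` fallback is an assumption added only so that the functions can be run locally without kfp 1.8.x; on a real cluster the actual `InputPath` and `OutputPath` annotations are used.

```python
import os
import tempfile

try:
    # KFP v1 SDK, matching the kfp==1.8.11 pin used in this document.
    from kfp.components import InputPath, OutputPath
except ImportError:
    # Assumption for local experimentation only: without the v1 SDK the
    # annotations degrade to plain `str`, so the functions still run.
    def InputPath(_type=None):
        return str

    def OutputPath(_type=None):
        return str


def write_numbers(count: int, numbers_path: OutputPath("CSV")):
    """Write the integers 0..count-1, one per line, to the output file."""
    with open(numbers_path, "w") as f:
        for i in range(count):
            f.write(f"{i}\n")


def sum_numbers(numbers_path: InputPath("CSV"), total_path: OutputPath(str)):
    """Read the numbers file and write the sum to the output file."""
    with open(numbers_path) as f:
        total = sum(int(line) for line in f if line.strip())
    with open(total_path, "w") as f:
        f.write(str(total))


def build_pipeline():
    """Wire the two components into a pipeline function (requires kfp 1.8.x)."""
    from kfp import dsl
    from kfp.components import create_component_from_func

    write_op = create_component_from_func(write_numbers, base_image="python:3.8")
    sum_op = create_component_from_func(sum_numbers, base_image="python:3.8")

    @dsl.pipeline(name="sum-numbers")
    def sum_pipeline(count: int = 5):
        write_task = write_op(count=count)
        # The "_path" suffix is dropped from parameter names, so the
        # output and the matching input are both called "numbers".
        sum_op(numbers=write_task.outputs["numbers"])

    return sum_pipeline


def local_smoke_test(count: int = 5) -> int:
    """Run both functions against temporary files, without a cluster."""
    with tempfile.TemporaryDirectory() as tmp:
        numbers = os.path.join(tmp, "numbers.csv")
        total = os.path.join(tmp, "total.txt")
        write_numbers(count, numbers)
        sum_numbers(numbers, total)
        with open(total) as f:
            return int(f.read())
```

In a pipeline run, KFP supplies the file paths itself; the component only reads from or writes to the path it is given, which is why the same functions can be checked locally with temporary files.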
Finally, the `kfp.Client` method `create_run_from_pipeline_func` was used to submit the pipeline directly from the pipeline function.
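A hedged sketch of the submission step follows. The helper `submit_pipeline`, its optional `client` parameter and the experiment name `digit-recognizer` are hypothetical; only `kfp.Client` and `create_run_from_pipeline_func` come from the KFP v1 SDK. Accepting a client as an argument is a design choice made here so the helper can be tried with a stand-in object outside a cluster.

```python
def submit_pipeline(pipeline_func, arguments: dict, client=None,
                    experiment_name: str = "digit-recognizer"):
    """Submit a pipeline function as a run and return the run result."""
    if client is None:
        # Deferred import. Inside a Kubeflow notebook server, kfp.Client()
        # can usually pick up the in-cluster endpoint and credentials;
        # elsewhere a host would have to be supplied (assumption).
        import kfp
        client = kfp.Client()
    return client.create_run_from_pipeline_func(
        pipeline_func,
        arguments=arguments,
        experiment_name=experiment_name,
    )
```

From a notebook, a call would look like `submit_pipeline(my_pipeline, arguments={})`, where `my_pipeline` stands for a function decorated with `dsl.pipeline`.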
- Open your Kubeflow cluster, create a Notebook Server and connect to it.
- Clone this repo and navigate to this directory.
- Navigate to the `data` directory, download the compressed Kaggle data using this link, and store the `training.zip`, `test.zip` and `sample_submission.csv` files in the data folder.
- Run the digit-recognizer-kfp notebook from start to finish.
- View the run details immediately after submitting the pipeline.
To create the pipeline using the Kale JupyterLab extension:
- Clone the GitHub repo and navigate to this directory.
- Install the packages listed in the `requirements.txt` file.
- Launch the `digit-recognizer-kale.ipynb` notebook.
- Enable the Kale extension in JupyterLab.
- The notebook's cells are automatically annotated with Kale tags. With the use of Kale tags we define the following:
  - Pipeline parameters are assigned using the "pipeline parameters" tag.
  - The libraries needed throughout the pipeline are passed through the "imports" tag.
  - Notebook cells are assigned to specific pipeline components (download data, load data, etc.) using the "pipeline step" tag.
  - Cell dependencies between the different pipeline steps are defined with the "depends on" flag.
- Compile and run the notebook using Kale.
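To make the tagging scheme above more concrete, the following sketch inspects cell tags in a notebook's JSON structure and derives the step dependency graph. It assumes Kale stores its annotations as cell tags of the form `imports`, `pipeline-parameters`, `block:<step>` and `prev:<step>`; the exact strings should be checked against the notebook's own metadata. The function `step_dependencies` and the example notebook are hypothetical.

```python
def step_dependencies(notebook: dict) -> dict:
    """Map each pipeline step name to the list of steps it depends on."""
    deps = {}
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        # Assumed tag format: "block:<step>" names the step a cell belongs
        # to, and "prev:<step>" names a step it depends on.
        steps = [t.split(":", 1)[1] for t in tags if t.startswith("block:")]
        prevs = [t.split(":", 1)[1] for t in tags if t.startswith("prev:")]
        for step in steps:
            step_deps = deps.setdefault(step, [])
            for prev in prevs:
                if prev not in step_deps:
                    step_deps.append(prev)
    return deps


# Hypothetical notebook with the four kinds of annotation described above.
example_notebook = {
    "cells": [
        {"metadata": {"tags": ["imports"]}},
        {"metadata": {"tags": ["pipeline-parameters"]}},
        {"metadata": {"tags": ["block:download_data"]}},
        {"metadata": {"tags": ["block:load_data", "prev:download_data"]}},
    ]
}

print(step_dependencies(example_notebook))
```

A real notebook can be loaded with `json.load` on the `.ipynb` file and passed to the same function to review the step graph before compiling.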