Kaggle notebook to kfp pipeline (kubeflow#940)
* Create README.md

* kaggle to kfp

* Create README.md

* Update README.md

* Add files via upload

* Update README.md

* Update README.md

* Update README.md

* kaggle to kfp
josepholaide authored Apr 26, 2022
1 parent 68477e3 commit 639f84a
Showing 4 changed files with 765 additions and 0 deletions.
30 changes: 30 additions & 0 deletions digit_recognition/README.md
@@ -0,0 +1,30 @@
# Objective
Here we convert the code from https://www.kaggle.com/competitions/digit-recognizer into a kfp pipeline.
The objective of this task is to correctly identify digits from a dataset of tens of thousands of handwritten images.

# Testing environment
| Name | version |
| ------------- |:-------------:|
| Kubeflow | v1 |
| kfp | 1.8.11 |


The kfp version used for testing can be installed with `pip install kfp==1.8.11`.

# Components used

## Kubeflow lightweight component method
Here, a Python function is written to carry out a specific task and is then passed to the kfp component method `create_component_from_func`.
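
As a minimal sketch of the lightweight component method (not the notebook's actual code), the example below wraps a small function with `create_component_from_func`. The function `count_rows`, the helper `build_component`, and the `python:3.8` base image are hypothetical choices made for illustration. kfp is imported lazily so that the plain function can be tried without a kfp installation.

```python
def count_rows(csv_text: str) -> int:
    """Count the data rows in CSV text, excluding the header row."""
    # Lightweight components must import what they need inside the function
    # body, because only the function source is copied into the container.
    import csv
    import io

    rows = list(csv.reader(io.StringIO(csv_text)))
    return max(len(rows) - 1, 0)


def build_component():
    """Wrap count_rows as a KFP v1 component (assumes kfp==1.8.x is installed)."""
    from kfp.components import create_component_from_func

    # base_image is an assumption; any image with a Python interpreter would do.
    return create_component_from_func(count_rows, base_image="python:3.8")
```

Calling `count_rows("label,pixel0\n1,0\n7,0\n")` locally is a quick way to check the logic before it runs inside a pipeline step.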


## Kubeflow pipelines
Kubeflow Pipelines connects the components according to how their inputs and outputs are passed and creates a pipeline. The kfp `dsl.pipeline` decorator was used to create a pipeline function. The kfp component annotations `InputPath` and `OutputPath` were used to pass data between components.

Finally, `create_run_from_pipeline_func` was used to submit the pipeline directly from the pipeline function.
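
The pieces described above can be sketched as a toy two-step pipeline. This is an illustration under stated assumptions, not the digit recognizer pipeline itself: the component functions `write_numbers` and `sum_numbers`, the pipeline name, the helpers `build_pipeline` and `submit`, and the base image are all hypothetical. It targets the kfp 1.8 (v1) SDK and imports kfp lazily inside the functions.

```python
def build_pipeline():
    """Return a toy KFP v1 pipeline function (assumes kfp==1.8.x is installed)."""
    from kfp import dsl
    from kfp.components import InputPath, OutputPath, create_component_from_func

    def write_numbers(numbers_path: OutputPath(str)):
        # KFP chooses the file path; the component only writes to it.
        with open(numbers_path, "w") as f:
            f.write("1\n2\n3\n")

    def sum_numbers(numbers_path: InputPath(str)) -> int:
        # KFP hands over the path of the upstream component's output file.
        with open(numbers_path) as f:
            return sum(int(line) for line in f)

    write_op = create_component_from_func(write_numbers, base_image="python:3.8")
    sum_op = create_component_from_func(sum_numbers, base_image="python:3.8")

    @dsl.pipeline(name="toy-pipeline", description="Pass a file between two steps")
    def toy_pipeline():
        write_task = write_op()
        # The "_path" suffix is dropped from the input and output names.
        sum_op(numbers=write_task.outputs["numbers"])

    return toy_pipeline


def submit(host=None):
    """Submit the pipeline directly from the pipeline function."""
    import kfp

    # host=None relies on in-cluster configuration, e.g. a Kubeflow notebook server.
    client = kfp.Client(host=host)
    return client.create_run_from_pipeline_func(build_pipeline(), arguments={})
```

Passing `write_task.outputs["numbers"]` into `sum_op` is what tells Kubeflow Pipelines that the second step depends on the first.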

## To create the pipeline
1. Navigate to the `data` directory, download the compressed Kaggle data, and put your `training.zip` and `test.zip` files in the data folder.
2. Open your Kubeflow cluster, create a notebook server, and connect to it.
3. Clone this repo and navigate to this directory.
4. Run the kfp-digit-recognizer notebook from start to finish.
5. View the run details immediately after submitting the pipeline.
16 changes: 16 additions & 0 deletions digit_recognition/data/README.md
@@ -0,0 +1,16 @@
# Objective
Download the compressed data from https://www.kaggle.com/competitions/digit-recognizer/data and store it here.

Replace the download link with the link to the repo where the data is stored: https://github-repo/data-dir/{file}.csv.zip?raw=true
<p>
<img src="https://github.com/josepholaide/examples/blob/master/digit_recognition/data/img1.PNG?raw=true" alt="kubeflow pipeline" width="850" height="250"/>
</p>
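
As a rough sketch of how the link template above might be used, the helpers below fill in the `{file}` part and read a zipped CSV using only the standard library. `BASE_URL` is the placeholder from this README and must be replaced with a real repo location. The file name `train` and the helpers `data_url`, `read_zipped_csv`, and `download` are assumptions made for illustration.

```python
import csv
import io
import zipfile

# Placeholder from this README; replace with the repo that actually hosts the data.
BASE_URL = "https://github-repo/data-dir"


def data_url(name: str, base_url: str = BASE_URL) -> str:
    """Fill in the {file} part of the download link template."""
    return f"{base_url}/{name}.csv.zip?raw=true"


def read_zipped_csv(payload: bytes) -> list:
    """Return the rows of the first CSV file found in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        first_member = archive.namelist()[0]
        text = archive.read(first_member).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


def download(name: str) -> list:
    """Fetch and parse one zipped CSV (needs network access and a real BASE_URL)."""
    from urllib.request import urlopen

    with urlopen(data_url(name)) as response:
        return read_zipped_csv(response.read())
```

For example, `data_url("train")` builds the link for a hypothetical `train.csv.zip` stored in the data directory.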

# Testing environment
| Name | version |
| ------------- |:-------------:|
| Kubeflow | v1 |
| kfp | 1.8.11 |


The kfp version used for testing can be installed with `pip install kfp==1.8.11`.
Binary file added digit_recognition/data/img1.PNG