Code Search on Kubeflow

This demo implements End-to-End Code Search on Kubeflow.

Warning: Running this example can be very expensive

This example uses large amounts of computation and can cost several hundred dollars to run E2E on Cloud.


NOTE: If you are using the JupyterHub Spawner on a Kubeflow cluster, use the Docker image that has all the prerequisites baked in.

  • Kubeflow Latest: This notebook assumes a Kubeflow cluster is already deployed. See Getting Started with Kubeflow.

  • Python 2.7 (bundled with pip): For this demo, we will use Python 2.7. This restriction is due to Apache Beam, which does not support Python 3 yet (see BEAM-1251).

  • Google Cloud SDK: This example will use tools from the Google Cloud SDK. The SDK must be authenticated and authorized. See Authentication Overview.

  • Ksonnet 0.12: We use Ksonnet to write Kubernetes jobs in a declarative manner to be run on top of Kubeflow.
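
If you are not using the prebuilt Docker image, a quick local check of the prerequisites above can save a failed run later. The following is a minimal sketch, not part of the demo itself: it assumes the Google Cloud SDK is exposed as the `gcloud` command and Ksonnet as the `ks` command, and the helper names `find_on_path` and `check_prerequisites` are hypothetical. It only reports whether the tools are present; it does not verify the Ksonnet version, SDK authentication, or the Kubeflow deployment.

```python
import os
import sys

# Assumed CLI names: `gcloud` for the Google Cloud SDK and `ks` for Ksonnet.
# Adjust this list if your installation uses different executable names.
REQUIRED_TOOLS = ["gcloud", "ks"]


def find_on_path(name):
    """Return the full path of an executable called `name`, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_prerequisites():
    """Map each prerequisite to True (found) or False (missing)."""
    # The demo requires Python 2.7 because of the Apache Beam restriction.
    report = {"python 2.7": sys.version_info[:2] == (2, 7)}
    for tool in REQUIRED_TOOLS:
        report[tool] = find_on_path(tool) is not None
    return report


# Print one line per prerequisite, for example "gcloud       ok".
for name, ok in sorted(check_prerequisites().items()):
    print("{:<12} {}".format(name, "ok" if ok else "MISSING"))
```

The script deliberately avoids `shutil.which`, which is not available in Python 2.7, and searches `PATH` by hand so that it runs under the interpreter the demo requires.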

Getting Started

To get started, follow the instructions below.

NOTE: We will assume that the Kubeflow cluster is available at Make sure you replace this with the true FQDN of your Kubeflow cluster in any subsequent instructions.

  • Spawn a new JupyterLab instance inside the Kubeflow cluster by pointing your browser to and clicking "Start My Server".

  • In the Image text field, enter This image contains all the pre-requisites needed for the demo.

  • Once spawned, you should be redirected to the Jupyter Notebooks UI.

  • Spawn a new Terminal and run

    $ git clone --branch=master --depth=1

    This will create an examples folder. It is safe to close the terminal now.

  • Return to the Jupyter Notebooks UI and navigate to examples/code_search. Open the Jupyter notebook code-search.ipynb and follow along.


This project derives from hamelsmu/code_search.