Example of TF Serving with GPU #154

lluunn · 2018-06-27T08:36:36Z

python predict.py --url=XX should work.

The only files that need review are README.md and predict.py.
visualization_utils.py and standard_fields.py are copied from
https://github.com/tensorflow/models/blob/master/research/object_detection/utils/visualization_utils.py

Next PR should move the python logic into a webapp and deploy it.

lluunn · 2018-07-11T19:13:51Z

/retest

lluunn · 2018-07-11T19:16:37Z

/assign @jlewi
/assign @kunmingg

jlewi · 2018-07-13T13:25:31Z

gpu_serving/README.md

+## Deploy serving component
+
+```
+ks init ks-app


@texasmichelle
Should we just tell people to refer to the Getting Started guide to create the app, and then tell them to cd to the directory containing their application?

jlewi · 2018-07-13T13:27:42Z

gpu_serving/README.md

+```
+wget http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_coco_2018_01_28.tar.gz
+tar -xzf faster_rcnn_nas_coco_2018_01_28.tar.gz
+gsutil cp faster_rcnn_nas_coco_2018_01_28/saved_model/saved_model.pb gs://YOUR_BUCKET/YOUR_MODEL/1/


Should we publish the model to gs://kubeflow-examples-data/
so people can just serve it from there?
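
The trailing `1/` in the `gsutil cp` destination above is a model version directory: TensorFlow Serving looks for numbered version subdirectories under the model base path. A minimal sketch of that layout follows. The helper name `versioned_model_dir` and its parameters are hypothetical, and the bucket and model names are the placeholders from the diff.

```python
def versioned_model_dir(bucket, model, version):
    """Return the GCS directory that holds one version of a model.

    Hypothetical helper; it only mirrors the
    gs://YOUR_BUCKET/YOUR_MODEL/1/ destination quoted in the diff.
    """
    # The version is a plain integer directory name under the model path.
    return "gs://{}/{}/{}/".format(bucket, model, int(version))


# Placeholders taken from the README diff; replace them with real values.
print(versioned_model_dir("YOUR_BUCKET", "YOUR_MODEL", 1))
```

Publishing the model to a shared bucket, as suggested here, would keep the same layout and change only the bucket name.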

jlewi · 2018-07-13T14:09:40Z

gpu_serving/object-detection-app/components/params.libsonnet

+    model1: {
+      cloud: 'gcp',
+      deployHttpProxy: true,
+      gcpCredentialSecretName: 'kai-sa',


Why are you using the GCP secret kai-sa? I believe the GKE deploy secrets will create the secret user-sa. Does that work?

I was reading the model from my own bucket. user-sa should work.

Can you change the default parameters, here and below, to values that will work for users?

jlewi · 2018-07-13T14:11:46Z

gpu_serving/predict.py

@@ -0,0 +1,46 @@
+import argparse


This is a CLI for the example?

Yes, see README: python predict.py --url=YOUR_KF_HOST/models/coco
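
As an illustration of the kind of CLI being described, here is a minimal sketch. It is not the actual predict.py from this PR: only the `--url` flag and the example URL come from the thread, while the function names `build_parser` and `parse_url`, the help text, and the module docstring are assumptions.

```python
"""Hypothetical sketch of a prediction CLI; not the real predict.py.

Example, taken from the README line quoted in this thread:
  python predict.py --url=YOUR_KF_HOST/models/coco
"""
import argparse


def build_parser():
    """Build a parser with the single --url flag mentioned in the thread."""
    parser = argparse.ArgumentParser(
        description="Send a prediction request to a served model (sketch).")
    # --url is the only flag confirmed by the thread; the help text is assumed.
    parser.add_argument(
        "--url", required=True,
        help="Model endpoint, e.g. YOUR_KF_HOST/models/coco")
    return parser


def parse_url(argv):
    """Return the --url value from an explicit argument list."""
    return build_parser().parse_args(argv).url


# Parse an explicit argument list so the sketch does not depend on sys.argv.
print(parse_url(["--url=YOUR_KF_HOST/models/coco"]))
```

A real script would go on to read an image and send it to the endpoint; that part is omitted because the thread does not show it.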

jlewi · 2018-07-13T14:13:01Z

gpu_serving/object-detection-app/environments/kai4/globals.libsonnet

@@ -0,0 +1,2 @@
+{


Let's not check in your environment.
I'd suggest adding it to the .gitignore file so it doesn't get checked in.

Done.
But it's still in app.yaml. Not sure if there is a good way to avoid that.

jlewi · 2018-07-13T14:13:52Z

This is great.

@texasmichelle Any overall guidance about how we should be structuring the examples?

lluunn · 2018-07-13T18:47:08Z

@jlewi Thanks, PTAL

jlewi · 2018-07-16T13:08:57Z

It looks like someone started checking in an object detection model here
https://github.com/kubeflow/examples/tree/master/object_detection

Is this the same example? Should we move it into that directory?

lluunn · 2018-07-18T17:34:20Z

Filed #185

texasmichelle · 2018-07-19T00:13:30Z

See comments on #185

jlewi · 2018-07-19T19:49:45Z

The consensus on #185 seems to be to have a single object_detection model.

Lunkai, do you want to move these files into that directory? Maybe you could replace what's in object_detection/tf-serving and combine the READMEs?

@texasmichelle does that work for you?

lluunn · 2018-07-20T00:29:20Z

I think it's a little awkward to move this into object_detection now, as that example currently uses many YAML files and doesn't have a ks-app.
Should we leave this PR here (or check it in for now) and then combine the two after we fix the issues in object_detection/?

jlewi · 2018-07-20T15:09:58Z

Why can't you make it a subdirectory of object_detection?

e.g.

object_detection/tf_serving/ks-app

jlewi · 2018-07-20T15:11:29Z

Do we even need the YAML file that's currently in object_detection/tf-serving? What is that file doing that your example isn't?

lluunn · 2018-07-20T18:05:47Z

Moved. I put the ks-app in object_detection, as I think the other YAML files will be moved into the app too.

texasmichelle · 2018-07-20T18:16:00Z

object_detection/tf_serving_gpu.md

+
+## Setup
+
+If you followed previous steps to train the model, skip to deploy [section](#deploy-serving-component).


The setup steps feel out of place here, since the title is Serving an object detection model with GPU. Could you integrate them with the other markdown files in this example?

I think some users might only be interested in trying out serving. This provides the steps to follow if they skip the training part, and the first line tells them to skip this section if they've already done training.

Does it make sense?

texasmichelle · 2018-07-20T18:21:10Z

What is inputs.json for? It is the last remaining file in gpu_serving and needs to be filed somewhere in object_detection if it is used for the example.

lluunn · 2018-07-20T18:33:56Z

Removed inputs.json. It was missed when moving.

jlewi

Reviewable status: 0 of 17 files reviewed, 5 unresolved discussions (waiting on @jlewi, @lluunn, @texasmichelle, and @cwbeitel)

gpu_serving/predict.py, line 1 at r4 (raw file):

Previously, lluunn (Lun-Kai Hsu) wrote…

Yes, see README: python predict.py --url=YOUR_KF_HOST/models/coco

Can you add a docstring for the module explaining that?

gpu_serving/README.md, line 25 at r4 (raw file):

Previously, lluunn (Lun-Kai Hsu) wrote…

Done

Why don't they use the ksonnet app in the examples directory and just create a new environment?

jlewi · 2018-07-22T21:05:25Z

Took another look. I think the only significant comment I have is whether we should be telling users to use the ksonnet app provided in the example and just add an environment to it, rather than creating a new app.

My thinking is that based on other examples we will want to provide the app as an easy way of making a whole bunch of resources that are part of the example available to users.

lluunn

Reviewable status: 0 of 17 files reviewed, 5 unresolved discussions (waiting on @jlewi, @lluunn, @texasmichelle, and @cwbeitel)

gpu_serving/predict.py, line 1 at r4 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Can you add a docstring for the module explaining that?

done

gpu_serving/README.md, line 25 at r4 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Why don't they use the ksonnet app in the examples directory and just create a new environment?

Modified the instructions to cd into the app without running ks init.

lluunn · 2018-07-23T22:38:15Z

Modified the instructions to tell users to cd into that ks app instead of running ks init. PTAL, thanks.

jlewi

Reviewable status: 0 of 17 files reviewed, 2 unresolved discussions (waiting on @jlewi, @texasmichelle, and @cwbeitel)

gpu_serving/object-detection-app/components/params.libsonnet, line 9 at r4 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Can you change the default parameters, here and below, to values that will work for users?

The default secret is "user-gcp-sa"

jlewi · 2018-07-24T22:56:51Z

I had one minor comment about the default value for the GCP secret name, but other than that this LGTM.

lluunn

Reviewable status: 0 of 17 files reviewed, 2 unresolved discussions (waiting on @jlewi, @texasmichelle, and @cwbeitel)

gpu_serving/object-detection-app/components/params.libsonnet, line 9 at r4 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

The default secret is "user-gcp-sa"

Done.

lluunn · 2018-07-24T23:36:27Z

Done. Thanks

jlewi

Reviewable status: 0 of 17 files reviewed, 1 unresolved discussion (waiting on @texasmichelle and @cwbeitel)

jlewi · 2018-07-25T04:44:24Z

/lgtm
/approve
/hold for @texasmichelle

@texasmichelle I know you're busy with NEXT. Do you want @lluunn to wait until you can take another look?

k8s-ci-robot · 2018-07-25T04:44:26Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

Needs approval from an approver in each of these files:

~~OWNERS~~ [jlewi]

initial

ff8b921

k8s-ci-robot added the do-not-merge/work-in-progress label Jun 27, 2018

k8s-ci-robot requested review from cwbeitel and jlewi June 27, 2018 08:36

k8s-ci-robot added the size/S label Jun 27, 2018

wip

b2ddc34

k8s-ci-robot added size/XL and removed size/S labels Jun 28, 2018

working now

71527be

k8s-ci-robot added size/XXL and removed size/XL labels Jul 11, 2018

lluunn changed the title ~~wip don't review~~ Example of TF Serving with GPU Jul 11, 2018

k8s-ci-robot removed the do-not-merge/work-in-progress label Jul 11, 2018

fix

6e61dc9

k8s-ci-robot assigned jlewi and kunmingg Jul 11, 2018

lluunn added 3 commits July 11, 2018 12:24

fix lint

d784c7a

fix lint

c0c3ce7

fix lint

c9b5155

jlewi reviewed Jul 13, 2018

review

25242f3

lluunn added 2 commits July 20, 2018 10:46

Merge branch 'master' into gpu_ser

163f57c

move

f16a910

texasmichelle reviewed Jul 20, 2018

fix

a043d3b

jlewi suggested changes Jul 22, 2018

addressing comment

62d03fa

lluunn commented Jul 23, 2018

lint

3c76480

jlewi suggested changes Jul 24, 2018

fix

23c7306

lluunn commented Jul 24, 2018

jlewi approved these changes Jul 25, 2018

k8s-ci-robot added the lgtm label Jul 25, 2018

k8s-ci-robot added the approved label Jul 25, 2018

k8s-ci-robot merged commit 1746820 into kubeflow:master Jul 25, 2018

