Bump up supported Pytorch operator versions from v1alpha2/v1beta1 to v1beta1/v1beta2 to support Kubeflow 0.5

- Refactor training manifests from v1alpha2 to v1beta2
- Update documents
dsdinter committed May 1, 2019
1 parent d474d51 commit 1fa7f80
Showing 5 changed files with 8 additions and 8 deletions.
12 changes: 6 additions & 6 deletions pytorch_mnist/02_distributed_training.md
@@ -20,20 +20,20 @@ Since this is a strong scaling example, we should perform an average after the a

Deploy the PyTorchJob resource to start training the CPU & GPU models:

-### If running on Kubeflow 0.3.x:
+### If running on Kubeflow 0.4.x:
```bash
cd ks_app
ks env add ${KF_ENV}
-ks apply ${KF_ENV} -c train_model_CPU_v1alpha2
-ks apply ${KF_ENV} -c train_model_GPU_v1alpha2
+ks apply ${KF_ENV} -c train_model_CPU_v1beta1
+ks apply ${KF_ENV} -c train_model_GPU_v1beta1
```

-### If running on Kubeflow 0.4.x or newer:
+### If running on Kubeflow 0.5.x or newer:
```bash
cd ks_app
ks env add ${KF_ENV}
-ks apply ${KF_ENV} -c train_model_CPU_v1beta1
-ks apply ${KF_ENV} -c train_model_GPU_v1beta1
+ks apply ${KF_ENV} -c train_model_CPU_v1beta2
+ks apply ${KF_ENV} -c train_model_GPU_v1beta2
```
```

## What just happened?
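Reviewer note: the version-to-component mapping that the updated document describes (Kubeflow 0.4.x uses the `v1beta1` components, 0.5.x or newer uses `v1beta2`) can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function names `pytorchjob_api_version` and `ks_apply_commands` are hypothetical and not part of this repository, and the mapping reflects the document as of this commit only.

```python
def pytorchjob_api_version(kubeflow_version: str) -> str:
    """Return the PyTorchJob API version suffix for a Kubeflow version string.

    Hypothetical helper: the mapping mirrors the updated document only
    (0.4.x -> v1beta1, 0.5.x or newer -> v1beta2).
    """
    # Only major and minor are needed, so "0.5.x" and "0.5.1" both work.
    major, minor = (int(part) for part in kubeflow_version.split(".")[:2])
    if (major, minor) >= (0, 5):
        return "v1beta2"
    if (major, minor) == (0, 4):
        return "v1beta1"
    raise ValueError(
        f"Kubeflow {kubeflow_version} is not covered by the updated document"
    )


def ks_apply_commands(kubeflow_version: str, env: str = "${KF_ENV}") -> list:
    """Build the two `ks apply` command lines (CPU and GPU) as strings."""
    api = pytorchjob_api_version(kubeflow_version)
    return [
        f"ks apply {env} -c train_model_{device}_{api}"
        for device in ("CPU", "GPU")
    ]


if __name__ == "__main__":
    for command in ks_apply_commands("0.5.0"):
        print(command)
```

The helper only builds command strings; it does not invoke `ks`. Versions older than 0.4 raise an error because this commit drops the `v1alpha2` instructions.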

This file was deleted.

@@ -0,0 +1 @@
{"apiVersion":"kubeflow.org/v1beta2","kind":"PyTorchJob","metadata":{"name":"pytorch-mnist-ddp-cpu"},"spec":{"pytorchReplicaSpecs":{"Master":{"replicas":1,"restartPolicy":"OnFailure","template":{"spec":{"containers":[{"image":"gcr.io/kubeflow-examples/pytorch-mnist/traincpu","name":"pytorch","volumeMounts":[{"mountPath":"/mnt/kubeflow-gcfs","name":"kubeflow-gcfs"}]}],"volumes":[{"name":"kubeflow-gcfs","persistentVolumeClaim":{"claimName":"kubeflow-gcfs","readOnly":false}}]}}},"Worker":{"replicas":3,"restartPolicy":"OnFailure","template":{"spec":{"containers":[{"image":"gcr.io/kubeflow-examples/pytorch-mnist/traincpu","name":"pytorch","volumeMounts":[{"mountPath":"/mnt/kubeflow-gcfs","name":"kubeflow-gcfs"}]}],"volumes":[{"name":"kubeflow-gcfs","persistentVolumeClaim":{"claimName":"kubeflow-gcfs","readOnly":false}}]}}}}}}

This file was deleted.

@@ -0,0 +1 @@
{"apiVersion":"kubeflow.org/v1beta2","kind":"PyTorchJob","metadata":{"name":"pytorch-mnist-ddp-gpu"},"spec":{"pytorchReplicaSpecs":{"Master":{"replicas":1,"restartPolicy":"OnFailure","template":{"spec":{"containers":[{"image":"gcr.io/kubeflow-examples/pytorch-mnist/traingpu","name":"pytorch","resources":{"limits":{"nvidia.com/gpu":1}},"volumeMounts":[{"mountPath":"/mnt/kubeflow-gcfs","name":"kubeflow-gcfs"}]}],"volumes":[{"name":"kubeflow-gcfs","persistentVolumeClaim":{"claimName":"kubeflow-gcfs","readOnly":false}}]}}},"Worker":{"replicas":3,"restartPolicy":"OnFailure","template":{"spec":{"containers":[{"image":"gcr.io/kubeflow-examples/pytorch-mnist/traingpu","name":"pytorch","resources":{"limits":{"nvidia.com/gpu":1}},"volumeMounts":[{"mountPath":"/mnt/kubeflow-gcfs","name":"kubeflow-gcfs"}]}],"volumes":[{"name":"kubeflow-gcfs","persistentVolumeClaim":{"claimName":"kubeflow-gcfs","readOnly":false}}]}}}}}}
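Reviewer note: the two single-line manifests above are hard to read, so here is a minimal Python sketch that rebuilds the same structure: one `Master` replica and three `Worker` replicas sharing a container template, with the GPU variant adding an `nvidia.com/gpu` limit. The helper names `replica_spec` and `pytorch_job` are hypothetical; the field names and values are copied from the manifests in this commit.

```python
import json


def replica_spec(replicas: int, image: str, gpus: int = 0) -> dict:
    """Build one entry of pytorchReplicaSpecs (hypothetical helper)."""
    container = {
        "image": image,
        "name": "pytorch",
        "volumeMounts": [
            {"mountPath": "/mnt/kubeflow-gcfs", "name": "kubeflow-gcfs"}
        ],
    }
    if gpus:
        # Only the GPU manifest carries a resource limit.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpus}}
    return {
        "replicas": replicas,
        "restartPolicy": "OnFailure",
        "template": {
            "spec": {
                "containers": [container],
                "volumes": [
                    {
                        "name": "kubeflow-gcfs",
                        "persistentVolumeClaim": {
                            "claimName": "kubeflow-gcfs",
                            "readOnly": False,
                        },
                    }
                ],
            }
        },
    }


def pytorch_job(device: str, api_version: str = "v1beta2", workers: int = 3) -> dict:
    """Build a PyTorchJob manifest for device 'cpu' or 'gpu'."""
    gpus = 1 if device == "gpu" else 0
    image = f"gcr.io/kubeflow-examples/pytorch-mnist/train{device}"
    return {
        "apiVersion": f"kubeflow.org/{api_version}",
        "kind": "PyTorchJob",
        "metadata": {"name": f"pytorch-mnist-ddp-{device}"},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica_spec(1, image, gpus),
                "Worker": replica_spec(workers, image, gpus),
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(pytorch_job("gpu"), indent=2, sort_keys=True))
```

Printing with `indent=2` gives a readable view of the manifest; the files in the repository remain the source of truth.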
