pytorch

Installation & deployment tips

  1. You need to configure your node to use the GPU. This can be done the following way (a quick verification check follows this list):

    • Install nvidia-docker2
    • Connect to your master node and set nvidia as the default runtime in /etc/docker/daemon.json:
      {
          "default-runtime": "nvidia",
          "runtimes": {
              "nvidia": {
                  "path": "/usr/bin/nvidia-container-runtime",
                  "runtimeArgs": []
              }
          }
      }
      
    • After that, deploy the NVIDIA device plugin (a DaemonSet) to Kubernetes:
      kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.11/nvidia-device-plugin.yml
  2. NVIDIA GPUs can now be consumed via container-level resource requirements using the resource name nvidia.com/gpu (a complete Pod example follows this list):

    resources:
      limits:
          nvidia.com/gpu: 2 # requesting 2 GPUs
    
  3. Building an image. Each example has prebuilt images that are stored in Google Container Registry (GCR). If you want to create your own image, we recommend using Docker Hub. Each example has its own Dockerfile, which we strongly advise you to use. To build your custom image, follow the instructions on TechRepublic (a minimal Dockerfile sketch follows this list).

  4. To deploy your job, we recommend using the official Kubeflow documentation. Each example includes example YAML files for the two versions of the API. Feel free to modify them, e.g. the image or the number of GPUs (see the PyTorchJob sketch below).
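
Once the device plugin is running, the node should advertise its GPUs. A quick way to verify this from any machine with cluster access:

    kubectl describe nodes | grep nvidia.com/gpu

Each GPU node should list nvidia.com/gpu under both Capacity and Allocatable.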
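
For context, a minimal standalone Pod that consumes GPUs this way could look like the sketch below; the pod name and CUDA image are placeholders, not taken from the examples:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-test                   # hypothetical name
    spec:
      restartPolicy: Never
      containers:
      - name: cuda-container
        image: nvidia/cuda:10.0-base   # placeholder image; match it to your driver version
        command: ["nvidia-smi"]        # prints the GPUs visible to the container
        resources:
          limits:
            nvidia.com/gpu: 2          # requesting 2 GPUs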
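
If you do build your own image, the general shape is: start from a PyTorch base image (or the example's Dockerfile), add your training script, then build and push to Docker Hub. A minimal sketch, with the base image tag, script name, and repository all placeholders:

    FROM pytorch/pytorch:latest        # placeholder base image
    COPY mnist.py /opt/mnist.py        # hypothetical training script
    ENTRYPOINT ["python", "/opt/mnist.py"]

Then build and push it:

    docker build -t <your-dockerhub-user>/pytorch-example:v1 .
    docker push <your-dockerhub-user>/pytorch-example:v1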
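
For orientation, a PyTorchJob manifest for the v1 API has roughly the shape below; treat it as a sketch (the name and image are placeholders) and prefer the YAML files shipped with each example:

    apiVersion: kubeflow.org/v1
    kind: PyTorchJob
    metadata:
      name: pytorch-example            # hypothetical name
    spec:
      pytorchReplicaSpecs:
        Master:
          replicas: 1
          restartPolicy: OnFailure
          template:
            spec:
              containers:
              - name: pytorch          # the operator expects this container name
                image: <your-dockerhub-user>/pytorch-example:v1
                resources:
                  limits:
                    nvidia.com/gpu: 1
        Worker:
          replicas: 2
          restartPolicy: OnFailure
          template:
            spec:
              containers:
              - name: pytorch
                image: <your-dockerhub-user>/pytorch-example:v1
                resources:
                  limits:
                    nvidia.com/gpu: 1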

Note: PyTorchJob doesn't work in a user namespace by default because of Istio automatic sidecar injection. To get it running, add the annotation sidecar.istio.io/inject: "false" to disable injection for either the PyTorch pods or the namespace. For example, in the pod template:

template:
  metadata:
    annotations:
      sidecar.istio.io/inject: "false"
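
Alternatively, assuming a standard Istio installation where injection is controlled by the istio-injection namespace label, you can disable it for the whole namespace (the namespace name is a placeholder):

    kubectl label namespace <your-namespace> istio-injection=disabled --overwrite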