
Extend Tensorboard Controller to Support PVCs #5039

Closed
kandrio opened this issue May 25, 2020 · 10 comments · Fixed by #5218

Comments

@kandrio
Contributor

kandrio commented May 25, 2020

/kind feature

Currently, the Tensorboard controller does not fully support arbitrary PVCs being mounted to pods running the Tensorboard server. In fact, it can only support PVCs named tb-volume.

I would like to extend the Tensorboard controller so that it fully supports any PVC created by the user in the namespace of the Tensorboard object applied. This is actually a part of my Google Summer of Code project for building a Tensorboard UI that also supports PVCs.

My proposal is to extend the Tensorboard controller so that it obtains the name of the PVC from the spec.logspath field of the Tensorboard CR applied by the user. Specifically, if a user wants to mount a PVC to a pod running the Tensorboard server, then the value of spec.logspath should follow the format pvc://{pvc-name}/{local-path} (similar to what kfserving uses in its InferenceService CRDs). This way, I can extend the controller so that it creates a deployment that mounts the desired PVC to the pod running the Tensorboard server.
The extended controller will also be backwards compatible.
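To make the proposed format concrete, here is a minimal sketch of how a controller might parse a pvc://{pvc-name}/{local-path} logspath. This is not the actual controller code; the function name and error handling are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// parsePVCLogsPath splits a logspath of the form
// pvc://{pvc-name}/{local-path} into the PVC name and the
// in-volume sub-path. Any other value returns an error, so plain
// paths (the existing tb-volume behaviour) stay unchanged.
func parsePVCLogsPath(logspath string) (pvcName, subPath string, err error) {
	const scheme = "pvc://"
	if !strings.HasPrefix(logspath, scheme) {
		return "", "", fmt.Errorf("not a pvc:// path: %q", logspath)
	}
	rest := strings.TrimPrefix(logspath, scheme)
	parts := strings.SplitN(rest, "/", 2)
	if parts[0] == "" {
		return "", "", fmt.Errorf("missing PVC name in %q", logspath)
	}
	pvcName = parts[0]
	if len(parts) == 2 {
		subPath = parts[1]
	}
	return pvcName, subPath, nil
}

func main() {
	name, sub, err := parsePVCLogsPath("pvc://my-pvc/training/logs")
	fmt.Println(name, sub, err)
}
```

The PVC name would then go into the generated Deployment's volume definition, and the sub-path into the container's volume mount.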

cc @jlewi @kimwnasptd @elikatsis

@issue-label-bot

Issue Label Bot is not confident enough to auto-label this issue.
See dashboard for more details.


@kandrio
Contributor Author

kandrio commented Jun 24, 2020

I have extended the Tensorboard Controller so that it supports PVCs, as described above.

Also, there is another problem, related to this kubernetes issue. More specifically, if we use a ReadWriteOnce PVC as the logdir for the Tensorboard server and the PVC is already mounted by another running pod P, then the Tensorboard server pod will not always be scheduled on the same node as P. As a result, the Tensorboard server pod gets blocked, since multi-node access is prohibited on ReadWriteOnce volumes.

I think it is really important that PVCs with ReadWriteOnce access mode are fully supported, because ReadWriteOnce volumes are the most widely used volumes out there.

In order to fix the problem, I propose that nodeAffinity should be added to the spec.template.spec.affinity field of the Tensorboard Server deployment so that both pods are scheduled on the same node.
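As a rough sketch of the idea (not the actual implementation), the generated Deployment could carry something like the following, where node-1 stands for whichever node the controller finds currently mounting the PVC:

```yaml
# Hypothetical fragment of the Tensorboard server Deployment.
# "node-1" is a placeholder for the node that already mounts
# the ReadWriteOnce PVC.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                      - node-1
```

Using preferredDuringScheduling rather than requiredDuringScheduling would keep the pod schedulable even if the preferred node is full, at the cost of possibly still hitting the multi-attach error.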

@jlewi
Contributor

jlewi commented Jun 26, 2020

In order to fix the problem, I propose that nodeAffinity should be added to the spec.template.spec.affinity field of the Tensorboard Server deployment so that both pods are scheduled on the same node.

This feels like a brittle solution. My interpretation of ReadWriteOnce is that it should only be mounted on a single pod at a time, and trying to work around that seems fragile.

It's not clear to me why supporting ReadWriteOnce is important. In general, the recommendation is to write event files to an appropriate object store (GCS, S3, HDFS, etc.). That way you avoid issues with PVs altogether and can easily read from multiple pods.

If people really want to use PVs then it seems reasonable to require that either

  1. They use a ReadWriteMany PVC in order to be able to mount it simultaneously on multiple pods
  2. Live with the restrictions that a ReadWriteOnce volume imposes

@kandrio
Contributor Author

kandrio commented Jul 2, 2020

It's not clear to me why supporting ReadWriteOnce is important. In general, the recommendation is to write event files to an appropriate object store (GCS, S3, HDFS, etc.). That way you avoid issues with PVs altogether and can easily read from multiple pods.

Yes, supporting object stores is an option for storing/retrieving event files, but I think that PVCs provide an easier/more intuitive workflow for the end user. More specifically, users might prefer using PVCs instead of object stores because:

  1. Object stores add complexity since they require a service to be configured/available in the cluster.
  2. In terms of ease of use, it is a lot simpler for users to just store their logs on a directory than pushing/pulling them to/from a service by making API calls or using libraries in their code.

Let's also not forget that users are already using PVCs, and mostly ReadWriteOnce ones, for storing their data and working with Notebooks and Pipelines.

My interpretation of ReadWriteOnce is that it should only be mounted on a single pod at a time. Trying to work around that seems like a brittle solution.

I think there might be a misunderstanding here. ReadWriteOnce PVCs can be mounted by a single node at a time (see docs), not by a single pod at a time. This holds for GKE as well (see docs). That said, mounting the same ReadWriteOnce PVC on multiple pods on the same node is actually a supported (and necessary for quite a few use cases) feature.

Now, the problem arises when pods running on different nodes try to access the same ReadWriteOnce PVC. More specifically, if we use a ReadWriteOnce PVC as event-file storage for the Tensorboard server and the PVC is already mounted by another running pod (e.g. the Notebook server), then the Tensorboard server will not always be scheduled on the same node as that pod (kubernetes/kubernetes#26567). As a result, the Tensorboard server pod gets blocked. So, we need to express a preference for the scheduler to assign the Tensorboard server pod to the node that has mounted the ReadWriteOnce PVC, and my proposed solution is to use nodeAffinity for that task.

@jlewi
Contributor

jlewi commented Jul 4, 2020

Thanks @kandrio98, these are great points.

Based on your points I took a deeper look at the code to understand what's going on.

See my comment here: #3578 (comment)

Should we resolve that first before getting too deep into how we manage PVCs?

Some inline comments below.

I think there might be a misunderstanding here. ReadWriteOnce PVCs can be mounted by a single node at a time (see docs), not by a single pod at a time.

If you make this work by forcing a bunch of pods to be scheduled on the same node, will this lead to unexpected failures? For example, suppose I start a TB pod and it is the first pod to mount the PVC. TB doesn't really need GPUs, so it gets scheduled on a node with no GPUs. Now I schedule a Jupyter pod which requests the PVC and a GPU. You now have conflicting scheduling constraints: you want to schedule on a different node with a GPU, but also on the same node as the PVC.

Can we offload these type of scheduling decisions to the cluster?

Yes, supporting object stores is an option for storing/retrieving event files, but I think that PVCs provide an easier/more intuitive workflow for the end user.

Even for Cloud? In Cloud if you store your data on GCS or S3 you get the benefit of

  • Elasticity
  • Cost effectiveness
  • R/W from any pods

Isn't that easier than worrying about moving data to/from disks?

When would you recommend using PVCs over S3/GCS in Cloud?

Generally, I recommend PVCs when dealing with legacy applications that don't speak object store. TensorBoard is not such an application; TensorBoard and TensorFlow can read/write to S3 and GCS directly.

Jupyter is a legacy application. The notebook gets stored on PVC because Jupyter doesn't know how to read/write GCS. I think storing on object store would be much better.

I think that's why Netflix built Bookstore, to make it easy to save your notebooks on S3.

@kandrio
Contributor Author

kandrio commented Jul 15, 2020

Currently, the Tensorboard controller provides only basic PVC functionality. My proposal aimed to make the use of RWO PVCs a little easier for users. That said, the extra scheduling functionality that I proposed could become optional.

More specifically, we could add an environment variable, say RWO_PVC_SCHEDULING, to the Tensorboard CR controller. This variable would be set to false by default. If a user wants Tensorboard servers to be scheduled on the same node as an RWO PVC log storage, then the cluster admin could set RWO_PVC_SCHEDULING to true.

@jlewi what are your thoughts on that?

@jlewi
Contributor

jlewi commented Jul 16, 2020

@kandrio98 See my comment #3578 (comment)

Would that help us make progress toward the long-term goal of having a solid story for web apps? Or would it just end up being removed later?

If it's just going to be removed later in favor of something more general (e.g. a pod template spec), is it still worth doing?

issue-label-bot added the area/jupyter label Jul 16, 2020
@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
area/jupyter 0.99


@kandrio
Contributor Author

kandrio commented Jul 29, 2020

First of all, I really agree with your thoughts on creating a solid story for managing our stateful web apps in a cohesive way.

Would that help us in terms of making progress to a long term goal of having a solid story for web apps? Or would it end up just end up being removed later?

I believe this added scheduling functionality (even if optional) would be useful both now and in the future. It would make it easier for users to use the Tensorboard web app when their event files are stored in an RWO PVC, regardless of the Custom Resource that we use.

Also, since the implementation of this UX feature is part of my GSoC proposal, I have written the necessary code.

@jlewi what do you propose?
