UX Discussion: Nomenclature about notebooks / Jupyter #2521

Closed
vkoukis opened this issue Feb 21, 2019 · 8 comments
@vkoukis
Member

vkoukis commented Feb 21, 2019

Hello, following the discussion in the latest community meeting, I am opening an issue to solicit feedback on the best way to name notebook-related resources in the UI. It would be ideal if we could follow a uniform approach and have a unified experience throughout the Kubeflow UI, so I think this discussion is of interest to @avdaredevil and @jlewi.

Here are some starting thoughts, based on the discussion in the community meeting:

  • The New Notebook UI [issue #1995, "Jupyter notebook manager UI that uses the new CRD"; PR #2357, "Jupyter UI that manages Notebook CRs"; by @kimwnasptd and @ioandr] uses @lluunn's Go controller to create pods that run Jupyter inside them.
  • We don't want to be referring to "pods", and "namespaces" and other K8s entities in the UI. It's best to talk in Data Scientist-friendly terms.
  • There are two different things that we call "Notebooks":
    • The actual notebook file (*.ipynb), that the data scientist opens
    • The actual server/environment that the user needs to spawn and to which they connect, so they can open their notebooks. We run this as a pod, on K8s.
  • The new UI allows multiple notebook servers / pods to run at once.
  • Running multiple servers is important so users can have multiple things running in parallel, perhaps in different namespaces, so they can collaborate in different teams.

So, here is a step-by-step description, with two alternatives:

  1. The Notebook UI is accessible as Notebooks from the central dashboard. If I am a data scientist, and I want to find myself inside a notebook to start working, and I see a link called "Notebooks", I will follow it.
  2. I am shown a list of my notebook pods/servers. I have no notebook servers, so I click on ➕ to create my new notebook server. I can have multiple notebook servers. My notebook servers have a name, so I can actually tell them apart. [Note there is no reference to "K8s" or "pods" at this point]. My notebooks run inside my notebook servers. [There are at least two options for what to call the items in this list, more below].
  3. Once my server is ready [I can see it going through various transient states until it becomes a nice green Running], I can connect to it.
  4. Once connected, I find myself inside a familiar environment, and I can open my notebooks / *.ipynb files.
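To make the distinction between the two meanings of "Notebook" concrete, here is a minimal Python sketch of the steps above. It is only an illustration, not the actual Notebook CRD or UI code: the class names, the `PENDING` state, and the `rename`/`connect` methods are all hypothetical, and the only state the description actually names is `Running`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ServerStatus(Enum):
    # Hypothetical states: the description only names the final "Running" state.
    PENDING = "Pending"
    RUNNING = "Running"


@dataclass
class NotebookServer:
    """The environment a user spawns and connects to (a pod on K8s)."""
    name: str
    status: ServerStatus = ServerStatus.PENDING
    # The *.ipynb files that live inside this server.
    notebook_files: List[str] = field(default_factory=list)

    def rename(self, new_name: str) -> None:
        # Renaming the server does not rename the notebook files inside it.
        self.name = new_name

    def connect(self) -> List[str]:
        # Connecting is only possible once the server is Running;
        # the user then sees the notebook files it contains.
        if self.status is not ServerStatus.RUNNING:
            raise RuntimeError(f"server {self.name!r} is not Running yet")
        return list(self.notebook_files)
```

In this sketch one server holds many notebook files, which is the source of the naming tension: an entry in the list is a server, but what the user opens after connecting is a file.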

So, there are two alternatives for step (2), neither of which can be 100% precise, so we have to compromise:

  • Option A: The list says "Notebooks ➕ " at the top. The label at the top of the list is simpler, "Notebooks", but there is the risk of confusing the actual notebook files with the servers to which I connect to open said files. Naming/renaming things in this list of servers does not rename my actual notebook files. And if I connect to a "Notebook" from the list, why am I then shown multiple notebook files? I think this is the argument raised during the community meeting, please chime in.

  • Option B: The list says "Notebook Servers ➕ " at the top. The list is an explicit list of servers, not notebook files, the names refer to distinct servers, and I can connect to them, to see the notebook files that live inside them. But then, the drawback is that the left-hand side item I clicked on was "Notebooks", and I was actually shown a list of "Notebook Servers" when I selected it.

So, what do you think would cause the least confusion?

If we allow the slight discrepancy between what's shown on the navigation panel ("Notebooks") and what's shown as the title of the list ("Notebook Servers"), I prefer option B: it combines short, simple words in the left-hand side navigation panel that draw attention to the different functional parts of the platform (Pipelines, Notebooks, etc.), while still being explicit about what is actually shown inside the list ("Notebook Servers", not "Notebooks" themselves).

Looking forward to any feedback you may have!

@vkoukis
Member Author

vkoukis commented Feb 21, 2019

/assign vkoukis
/assign avdaredevil
/assign jlewi

@jlewi
Contributor

jlewi commented Feb 21, 2019

I don't have a strong opinion.

@jlewi
Contributor

jlewi commented Feb 25, 2019

@vkoukis What's the priority of this issue? Do we need to get it done in 0.5?

@jlewi added the area/jupyter and kind/discussion labels Feb 25, 2019
@pdmack
Member

pdmack commented Mar 5, 2019

"Notebook Servers" seems technically correct. I kind of feel that those who have used Kubeflow trade freely and comfortably between the concepts of notebooks and notebook pods.

@jtfogarty
Contributor

jtfogarty commented Mar 6, 2019

I'm comfortable with Notebook Servers

Will "namespace" be used, or will we change it? I'm in favor of using the term Project. Here is where we are heading:

We are building an on-prem cluster where each node will have 10TB of space. Rook/Ceph will be used to manage disk space. (Notes on Rook: block storage is not available for RWX; Rook/CephFS is available for RWX but does not support dynamic provisioning.)

Below are examples of groups that will use the system to build models.

  • Internal Fraud
  • Credit Card Fraud
  • Emerging Risk
  • Marketing

Emerging Risk will come and execute a Human Trafficking project. Marketing will execute a spring ad campaign, etc.

  • Each Project should have its own shared data storage.
  • No Project should have access to another Project's data storage.

  • Groups can have many Projects.
  • Projects can have many Notebook Servers.
  • A Notebook Server must have User storage and shared data storage.

I’m not advocating creating a Group entity, but having a Project Name that equates to a namespace would be helpful.

There will need to be an Onboarding process where the following are created:

  • Project Name (namespace)
  • Shared Data storage
  • User storage
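The relationships and onboarding steps described above could be sketched roughly as follows in Python. This is an illustration under assumptions, not an existing Kubeflow API: `Project`, `NotebookServer`, `onboard_project`, the `members` set, and the storage naming scheme are all hypothetical, and the membership check is just one possible way to enforce the isolation rule.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class NotebookServer:
    name: str
    user_storage: str    # per-user volume (hypothetical naming scheme)
    shared_storage: str  # the owning Project's shared data volume


@dataclass
class Project:
    name: str            # equates to a namespace in this sketch
    shared_storage: str
    members: Set[str] = field(default_factory=set)  # assumed access list
    servers: List[NotebookServer] = field(default_factory=list)

    def add_server(self, server_name: str, user: str) -> NotebookServer:
        # Isolation rule: only members of this Project get a server that
        # mounts this Project's shared data storage.
        if user not in self.members:
            raise PermissionError(f"{user!r} is not a member of {self.name!r}")
        server = NotebookServer(
            name=server_name,
            user_storage=f"{self.name}-{user}-home",
            shared_storage=self.shared_storage,
        )
        self.servers.append(server)
        return server


def onboard_project(name: str, members: Set[str]) -> Project:
    """Onboarding: create the Project (namespace) and its shared storage."""
    return Project(name=name, shared_storage=f"{name}-shared", members=set(members))
```

User storage is created lazily here, when a member starts a server; an onboarding process could just as well provision it up front.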

Project/Namespace isolation will also be beneficial when processing PCI / HIPAA data.

This is probably more information than necessary for this thread, but writing about it was helpful.

@jtfogarty
Contributor

Question: how do we control User / Project (namespace) access?

@chenglinzhang

For resource authorization to work, we will also need the ability to define and maintain user IDs and team structures in the UI. The problem is how to map user IDs and team structures to Kubernetes.

@stale

stale bot commented Jun 16, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot closed this as completed Jun 23, 2019