Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare Notebook Controller for the new Jupyter UI #2349

Closed

Conversation

kimwnasptd
Copy link
Member

@kimwnasptd kimwnasptd commented Jan 30, 2019

This PR adds various fixes and updates to the sync-notebook Webhook of the Notebook Controller that is responsible for creating child resources for each Notebook CR. The suggested changes are a prerequisite to ensure that the new, forthcoming Jupyter UI can directly manage Notebook CRs and preserve existing functionality.

@jlewi @kam We would really appreciate your perspective on this to make sure we are all on the same page. Hereaupon, we briefly describe the issues we diagnosed while working with the existing Notebook Controller and its Webhook:

  • The child Service does not use WebSockets and, thus, terminals and Python interpreters hang inside the JupyterLab UI. HTTP GET requests from inside the Notebook Pod result in 403 Forbidden responses.

  • The child Deployment always overrides the Notebook CR templateSpec with the podTemplateSpec that contains certain default values. This is not desirable when Users want to spawn a Notebook that overrides default configuration.

  • The child Deployment uses singled-quoted True/False values in two container spec args related to the Jupyter Notebook, so their values were treated as String and not as Boolean. This results in the Notebook Pod going to CrashLoopBackOff status with the following error in its logs:

    [C 15:56:44.180 LabApp] Bad config encountered during initialization:                                                                                                                                                                                                           
    [C 15:56:44.180 LabApp] The 'allow_remote_access' trait of a LabApp instance must be a boolean, but a value of 'True' <class 'str'> was specified.   
    

    Related sources:
    https://github.com/jupyter/notebook/blob/master/notebook/notebookapp.py#L845
    https://github.com/jupyter/notebook/blob/master/notebook/notebookapp.py#L633

    NOTE:
    According to this issue, root access should not be allowed in a JupyterLab environment. However, the Notebook Controller currently sets the allow_root argument to True. Which is the desired behavior?

  • The Notebook Pod does not use the latest stable Tensorflow Docker image, as the current JH Spawner UIs do

In addition, since the new Jupyter UI supports Volumes for Notebooks, we suggest the following additions to the Notebook Controller:

  • Set the “volumes” field in podTemplateSpec and set its default value to an empty list, as fallback
  • Explicitly set the securityContext (fsGroup) of the Notebook Pod to avoid permission errors when accessing attached volumes from inside the Notebooks
  • Remove the unused optional disks and pvcMount ksonnet parameters from the notebooks KF component. In the future, we might need to go over the rest of the defined parameters and cross-check which are currently needed.

This change is Reviewable

The declaration of the child Deployment should be aware of the entire
spec. Thus, `podTemplateSpec` and `templateSpec` should be merged by
preserving inner keys.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Ioannis Androulidakis <ioannis@arrikto.com>
The `fsGroup` field should be explicitly set to avoid permission
errors when accessing the volumes of the Notebook.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Ioannis Androulidakis <ioannis@arrikto.com>
Inside the podTemplateSpec declaration, the LabApp `allow_remote_access`
and `allow_root` args should have values of type Boolean, not String.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Ioannis Androulidakis <ioannis@arrikto.com>
Otherwise, HTTP GET requests from inside the Notebook Pod result in
403 error responses.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Ioannis Androulidakis <ioannis@arrikto.com>
* Remove the unused, optional `disks` and `pvcMount ksonnet parameters from the
  notebooks component
* Add a key for volumes in the PodTemplateSpec definition and set its
  default value to an empty list
* Update the default container image to "tensorflow-1.12.0-notebook-cpu:v0.4.0"

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Ioannis Androulidakis <ioannis@arrikto.com>
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: kkasravi

If they are not already assigned, you can assign the PR to them by writing /assign @kkasravi in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Copy link
Contributor

Hi @kimwnasptd. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ioandr
Copy link
Contributor

ioandr commented Jan 30, 2019

/ok-to-test

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Ioannis Androulidakis <ioannis@arrikto.com>
@ioandr ioandr force-pushed the feature-update-notebook-controller branch from fe0d986 to 36d55ff Compare January 30, 2019 14:09
@ioandr
Copy link
Contributor

ioandr commented Jan 30, 2019

/test kubeflow-presubmit

@ioandr
Copy link
Contributor

ioandr commented Jan 30, 2019

/assign @kkasravi

@ioandr
Copy link
Contributor

ioandr commented Jan 30, 2019

/retest

@jlewi
Copy link
Contributor

jlewi commented Feb 1, 2019

@kkasravi Any comment on this?

@kimwnasptd @ioandr @lluunn Has reimplemented the controller in go #2336 so I think you'll need to make the same changes there once its merged.

@kimwnasptd
Copy link
Member Author

Thanks for the heads up @jlewi. We will keep an eye on it and port these changes in a new PR once the new Notebook Controller gets merged.

@kimwnasptd
Copy link
Member Author

We thought it was good to give an early warning because these are issues that @lluunn may also bump into, when testing the new Go operator.

@lluunn
Copy link
Contributor

lluunn commented Feb 1, 2019

Thanks for the heads up

restartPolicy: "Always",
securityContext: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should 100 be a param like params.notebookGid?

@jlewi
Copy link
Contributor

jlewi commented Apr 25, 2019

Obsolete.
We are now using a go based controller.

@jlewi jlewi closed this Apr 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants