Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[discussion] Whether to create CRD in helm charts #353

Closed
gaocegege opened this issue Jan 30, 2018 · 6 comments
Closed

[discussion] Whether to create CRD in helm charts #353

gaocegege opened this issue Jan 30, 2018 · 6 comments

Comments

@gaocegege
Copy link
Member

gaocegege commented Jan 30, 2018

Ref #345 (comment)

CRD creation is in the set-up stage.

Charts is excepted to be deployed be helm automatically.
If you want to create CRD automatically along with TFJobs, maybe we need add some dependency here (not just put a yaml here):

  • create CRD first
  • deploy a TFJob later

\cc @jlewi @ScorpioCPH @DjangoPeng

@jlewi
Copy link
Contributor

jlewi commented Jan 30, 2018

Won't the TFJob controller just crash-loop until the CRD exists? We could have the controller just wait and retry until the CRD exists.

@gaocegege
Copy link
Member Author

gaocegege commented Jan 31, 2018

@jlewi It does wait and retry. The operator would work after the CRD is created:

Before CRD created:

E0131 12:44:30.111386   20851 reflector.go:205] github.com/tensorflow/k8s/pkg/client/informers/externalversions/factory.go:61: Failed to list *v1alpha1.TFJob: the server could not find the requested resource (get tfjobs.kubeflow.org)
I0131 12:44:31.111553   20851 reflector.go:240] Listing and watching *v1alpha1.TFJob from github.com/tensorflow/k8s/pkg/client/informers/externalversions/factory.go:61
E0131 12:44:31.112217   20851 reflector.go:205] github.com/tensorflow/k8s/pkg/client/informers/externalversions/factory.go:61: Failed to list *v1alpha1.TFJob: the server could not find the requested resource (get tfjobs.kubeflow.org)

After:

I0131 12:44:31.356222   20851 leaderelection.go:199] successfully renewed lease default/tf-operator
I0131 12:44:32.112320   20851 reflector.go:240] Listing and watching *v1alpha1.TFJob from github.com/tensorflow/k8s/pkg/client/informers/externalversions/factory.go:61
I0131 12:44:32.158485   20851 shared_informer.go:116] caches populated
I0131 12:44:32.158522   20851 controller.go:140] Starting %v workers1
I0131 12:44:32.158547   20851 controller.go:146] Started workers
I0131 12:44:34.363940   20851 leaderelection.go:199] successfully renewed lease default/tf-operator

And @ScorpioCPH 's idea is to move crd creation out of helm chart

@jlewi
Copy link
Contributor

jlewi commented Jan 31, 2018

If we move CRD creation out of the helm chart then the user has 2 things to deploy right?

In the context of Kubeflow, we use ksonnet to create both the CRD and deploy the operator. That seems equivalent to having the helm chart create the CRD and deploy the controller.

@gaocegege
Copy link
Member Author

@jlewi I think so, and BTW, I was wondering why do we use helm chart in this repo but ksonnet in kubeflow/kubeflow?

@jlewi
Copy link
Contributor

jlewi commented Feb 4, 2018

@gaocegege The use of helm predated our use of ksonnet. We should switch to using ksonnet and get rid of the helm chart #239

@gaocegege
Copy link
Member Author

LGTM and I am closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants