
StatefulSet supports all kinds of RestartPolicy #42892

Closed

Conversation

ohmystack
Contributor

@ohmystack commented Mar 10, 2017

What this PR does / why we need it:

Make StatefulSet support all kinds of PodSpec.RestartPolicy, instead of allowing only "Always".
This lets the Pods in a StatefulSet use "OnFailure" and "Never" so they can exit when their work is done, instead of being restarted over and over again.
Meanwhile, it still follows the definition of StatefulSet: ordered and stateful.
To implement this, we only need to change how we determine whether a Pod is alive under each RestartPolicy.
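
A minimal sketch of that idea, assuming a hypothetical helper (this is not the PR's actual diff; the helper name and exact semantics are assumptions for illustration):

```go
import corev1 "k8s.io/api/core/v1"

// isAlive sketches the proposed check: whether a pod still counts as
// "alive" for the StatefulSet controller depends on its RestartPolicy.
func isAlive(pod *corev1.Pod) bool {
	switch pod.Spec.RestartPolicy {
	case corev1.RestartPolicyAlways:
		// Today's behavior: anything not running must be replaced.
		return pod.Status.Phase == corev1.PodRunning
	case corev1.RestartPolicyOnFailure:
		// The kubelet restarts failed containers in place, so a pod
		// that reached Succeeded is finished, not dead.
		return pod.Status.Phase == corev1.PodRunning ||
			pod.Status.Phase == corev1.PodSucceeded
	case corev1.RestartPolicyNever:
		// Both terminal phases are final; the controller should not
		// replace the pod.
		return pod.Status.Phase == corev1.PodRunning ||
			pod.Status.Phase == corev1.PodSucceeded ||
			pod.Status.Phase == corev1.PodFailed
	}
	return false
}
```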

Special notes for your reviewer:

A simple use-case:

In machine learning, to train a model in a distributed way, we define a group of ordered, stateful container workers, e.g. "worker-0", "worker-1", ... "worker-n". (They use numbers as suffixes; StatefulSet manages these consecutive ordinals for us.)
When training begins, these containers talk to each other via their DNS records. (StatefulSet gives each of them a stable DNS record.)
When the training is done, all the workers save their results to their persistent volumes and exit 0. (StatefulSet manages their PVs.)
Then all the stateful Pods stay there in the Succeeded state, and we can still check their logs.

You can see this is a suitable scenario for StatefulSet, and it would be really helpful to allow the "OnFailure" and "Never" RestartPolicy in StatefulSet.
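
To make the use case concrete, here is roughly what such a worker set could look like, built with the Go API types (apps/v1beta1 was the StatefulSet API group at the time; the names and image are illustrative assumptions, and the OnFailure policy is exactly what validation rejects today):

```go
package main

import (
	appsv1beta1 "k8s.io/api/apps/v1beta1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// trainerSet builds a hypothetical StatefulSet of run-to-completion workers.
func trainerSet(replicas int32) *appsv1beta1.StatefulSet {
	return &appsv1beta1.StatefulSet{
		ObjectMeta: metav1.ObjectMeta{Name: "worker"},
		Spec: appsv1beta1.StatefulSetSpec{
			// A headless Service named "worker" gives the pods stable DNS
			// records such as worker-0.worker, worker-1.worker, ...
			ServiceName: "worker",
			Replicas:    &replicas,
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: map[string]string{"app": "worker"},
				},
				Spec: corev1.PodSpec{
					// What this PR proposes to allow; rejected by
					// StatefulSet validation today.
					RestartPolicy: corev1.RestartPolicyOnFailure,
					Containers: []corev1.Container{{
						Name:  "trainer",
						Image: "example.com/trainer:latest", // hypothetical image
					}},
				},
			},
		},
	}
}
```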

Release note:

StatefulSet supports pod restart policy: "OnFailure" and "Never"

@k8s-ci-robot
Contributor

Hi @ohmystack. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot added the cncf-cla: no (Indicates the PR's author has not signed the CNCF CLA.) label Mar 10, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

The following people have approved this PR: ohmystack

Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @smarterclayton
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot added the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) and release-note (Denotes a PR that will be considered when it comes time to generate release notes.) labels Mar 10, 2017
@k8s-reviewable

This change is Reviewable

@ohmystack
Contributor Author

I signed it!

@k8s-ci-robot added the cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) label and removed the cncf-cla: no (Indicates the PR's author has not signed the CNCF CLA.) label Mar 10, 2017
@ohmystack
Contributor Author

@bprashanth @enisoc @foxish @janetkuo Would you mind reviewing?

@0xmichalis
Contributor

@kubernetes/sig-apps-api-reviews

@ncdc removed their assignment Mar 13, 2017
@ncdc added the sig/apps (Categorizes an issue or PR as relevant to SIG Apps.) label Mar 13, 2017
@kow3ns
Member

kow3ns commented Mar 13, 2017

@ohmystack There are a few issues here.

  1. StatefulSet really wasn't intended for workflow management.
  2. None of the other controllers (DaemonSet, Deployment, ReplicaSet, ReplicationController) support the restart policy semantics you are proposing.
  3. Based on your use case, you don't really need the modification to StatefulSet. Instead of "when the training is done, all the workers save the result into their persistent volumes and exit 0", you can just have each worker signal completion in some other manner (e.g. an API callback, a TPR object, a write to a file, etc.; a minimal sketch follows this list).
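
For instance, a worker could record completion by patching an annotation onto its own Pod instead of exiting. A minimal client-go sketch of that suggestion, where the annotation key and the downward-API env vars are assumptions:

```go
package main

import (
	"context"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// markDone signals completion out of band by annotating the worker's own
// Pod, so the container can keep running (and keep its logs) afterwards.
func markDone(ctx context.Context) error {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return err
	}
	patch := []byte(`{"metadata":{"annotations":{"example.com/training-done":"true"}}}`)
	// POD_NAME and POD_NAMESPACE would be injected via the downward API.
	_, err = client.CoreV1().Pods(os.Getenv("POD_NAMESPACE")).Patch(
		ctx, os.Getenv("POD_NAME"), types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}
```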

@ohmystack
Contributor Author

ohmystack commented Mar 13, 2017

@kow3ns
Thanks for your reply. Please allow me to offer some arguments and thoughts here.
I know these controllers have historically allowed only the "Always" RestartPolicy. However, that doesn't mean the rule cannot be changed.

  • "DaemonSet" is a "daemon", which should not be stopped, its ok not having other RestartPolicy.
  • "ReplicaSet", "RC", "Deployment" and here "StatefulSet" are focused on the stable number of instances, which doesn't mean that the instances should not be a "successed" state if the user allows.

As far as I'm concerned, defaulting RestartPolicy to "Always" is reasonable and prevents mistakes.
However, if a user clearly knows what he/she wants the app to do when it exits 0 or exits 1, it is totally fine to hand that control to the user and let them explicitly choose another RestartPolicy.

On your 3rd point, the problem is that when the containers exit 0, they are restarted immediately and their logs are lost. In my scenario, the container images are provided by end users, so I cannot easily change their entrypoints or commands to redirect stdout to a file.
Without adding RestartPolicy support to StatefulSet, I have two workarounds:

  1. Build a logging system to collect container logs live.
  2. Create N Pods one by one programmatically and watch them.

However, I think a bare "Pod" is too lightweight for these long-running jobs. (And the other option, "batch/Job", is not a fit for scientific computing.) With bare Pods, if a node dies, the Pod can do nothing but wait for my service to notice the event and recreate it. At that point my service has effectively become a controller, much like "Deployment/StatefulSet + an allowed OnFailure RestartPolicy". The funny thing is that people would have to write a whole new controller alongside Kubernetes just to get the OnFailure/Never RestartPolicy for "Deployment/StatefulSet" (a rough sketch of such a controller follows).
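
To illustrate what that hand-rolled controller amounts to, a rough client-go sketch, where the label selector and the pod template are hypothetical placeholders:

```go
package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
)

// watchAndRecreate reimplements by hand the semantics requested above:
// keep worker Pods alive across node failures, but leave Succeeded pods
// alone (OnFailure-style behavior).
func watchAndRecreate(ctx context.Context, client kubernetes.Interface, ns string) error {
	w, err := client.CoreV1().Pods(ns).Watch(ctx, metav1.ListOptions{
		LabelSelector: "app=worker", // hypothetical label
	})
	if err != nil {
		return err
	}
	defer w.Stop()
	for ev := range w.ResultChan() {
		pod, ok := ev.Object.(*corev1.Pod)
		if !ok {
			continue
		}
		// Recreate lost or failed workers under the same ordinal name.
		if ev.Type == watch.Deleted || pod.Status.Phase == corev1.PodFailed {
			if _, err := client.CoreV1().Pods(ns).Create(ctx, newWorkerPod(pod.Name), metav1.CreateOptions{}); err != nil {
				return err
			}
		}
	}
	return nil
}

// newWorkerPod rebuilds a worker Pod; the spec here is a placeholder.
func newWorkerPod(name string) *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name, Labels: map[string]string{"app": "worker"}},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers:    []corev1.Container{{Name: "trainer", Image: "example.com/trainer:latest"}},
		},
	}
}
```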

@kow3ns
Member

kow3ns commented Mar 13, 2017

@ohmystack
ReplicaSets are meant to make sure a specific number of Pods are running at a particular time, DaemonSets are meant to make sure that all (or some) Nodes run a copy of a Pod, and StatefulSets are meant to make sure that a specific number of unique copies of a Pod are running.
The proposed changes differ from the design intentions of these controllers. They were never meant to support the life cycles of terminating processes, and my concern is that implementing those semantics now will hamper us from evolving these controllers to better suit their design intentions. Also, changing their semantics in the way you propose should be put forward as a design document and discussed in the open with the community before a PR implementing it is merged.
I get that what you really want is a hybrid of StatefulSet and Job to support your workload, but implementing a StatefulJob extension or a custom controller/operator to support your use case would, in my opinion, be the correct approach.

@ohmystack
Contributor Author

@kow3ns You got my point.
I like the "StatefulJob extension" idea.
I also agree with you about providing a design document to the community.
So, next,

  1. I plan to propose a design document for adding RestartPolicy support to StatefulSet in the Google Groups #kubernetes-sig-apps channel and discuss it there. (I found that someone was already talking about this when StatefulSet was first designed, so it may be worth a try.)
  2. Then, I can try to make a "StatefulJob" proposal.

I'll close this PR now. Thanks for the discussion above!
If you have any advice on how I can better discuss this design with the community, please let me know.
