kubectl run --restart=Never restarts (creates a Job) #24533
I can include the full command line and the reported pods:
I use this often:
I'd asked on the brainstorming doc that we handle this better; a bug probably makes more sense, but there is a version that does what I (both of us?) want.
To clarify, there IS actually an error in the command line: "exit" is not an executable, but a shell builtin.
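Since the container command has to be an executable in the image rather than a shell builtin, one way around that is to wrap the builtin in a shell. A minimal sketch (the pod name and image here are placeholders, not from the original report):

```sh
# "exit" only exists inside a shell, so start a shell and let it run the builtin.
kubectl run exit-test --image=busybox --restart=Never -- sh -c "exit 1"
```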
Now instead of
Note: the same happens when creating a job with the command
cc @mikedanese @erictune on
@thockin the job controller's role is to execute your assignment successfully with the given parameters. That
@thockin Does this mean that this behaviour is expected? If so, how do you create a job that is not restarted when it returns a non-zero exit status? The documentation seems to suggest specifying
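For reference, a rough sketch of the kind of manifest being discussed (the job name, image, and command are placeholders). The pod template's restartPolicy is Never, but, as described further down, the Job controller still creates replacement pods until it reaches the requested completions:

```sh
kubectl create -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: exit-once            # placeholder name
spec:
  template:
    spec:
      restartPolicy: Never   # the kubelet will not restart the failed container...
      containers:
      - name: main
        image: busybox
        command: ["sh", "-c", "exit 1"]
EOF
# ...but the Job controller still replaces the failed pod with a new one.
```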
Maciej, Yeah, I know that, but I'm arguing it's wrong. We already use --restart as
So, I think Tim would be happy with the following:
I would approve of that.
I think that is what I arrived at, too.
details details :)
Actually that depends on which cluster version you have; prior to 1.2 it does create an RC 😉 I'm totally OK with the proposed solution, but I'd like to hear from @bgrant0607 as well. I think (but it might be my memory failing me) that he wanted
Note that --replicas=N should be supported, so it would be N pods if we were to generate pods.
I would rather not change kubectl run behavior again. I have to agree the current run behavior is unintuitive, but apparently so is the current Job behavior.
The initial implementation of a job worked that way, but then it was pointed out to me that we should only pass that to the Kubelet and not look at it in the controller, which, by its design, will always try to reach the specified completions. The only noticeable difference between Never and OnFailure in a job is that the former shows the number of failures the job had, whereas the latter does not.
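A quick way to see that difference (a sketch only; the job name is a placeholder from the earlier example):

```sh
# With restartPolicy: OnFailure the failures show up as container restarts on one pod;
# with restartPolicy: Never each failure is a separate terminated pod, counted here:
kubectl get job exit-once -o jsonpath='{.status.failed}'
kubectl get pods -a    # -a / --show-all also lists the terminated pods
```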
Well, Job is v1, so I don't think we can change the behavior without adding another knob.
Given that restart=Never behavior for kubectl run is currently useless and confusing, I guess the least-bad option would be to produce N pods for now.
cc @kubernetes/kubectl
@bgrant0607 the other option is that we could do it conditionally, only for newer jobs (iow. from
@bgrant0607 restart=Never could be useful if you are running an image that for some reason expects a clean EmptyDir every time. Maybe it writes temporary files, and would be confused if it crashes and then finds the temporary files in place? Maybe? This is a stretch.
I have had people ask for Pods that run at most once, rather than at-least-once.
PDS, once specified, is always counting, and starts from the last time there was any progress (pods scaled up or down during a rollout).
Infant mortality detection: #18568
activeDeadlineSeconds was originally intended to eventually terminate failing Jobs. Unlike with Deployment, a Job seems less likely to start failing after making some progress. OK, let's suggest job-level activeDeadlineSeconds to users to ensure jobs don't crashloop forever. I propose we add infant-mortality detection to Job (eventually), with no knob, and backoff in the event of failures in the middle of Job execution (#2529). Unlike controllers for continuously running applications (Deployment, ReplicaSet, ReplicationController, DaemonSet, PetSet), Jobs don't have the expectation of existing forever. If a Job doesn't complete, fail-fast with notification may be better.
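As a concrete illustration of the job-level activeDeadlineSeconds suggestion (a sketch; the name, image, and deadline value are made up), the Job is terminated once it has been active for that many seconds, regardless of how many of its pods have failed:

```sh
kubectl create -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: bounded-job              # placeholder name
spec:
  activeDeadlineSeconds: 300     # give up after 5 minutes instead of crashlooping forever
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox
        command: ["sh", "-c", "exit 1"]
EOF
```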
having just one pod seems like a non-starter, unless we demand that
We should, by default, retry a job if the node was rebooted or crashed, the node OOM'ed, the Job was OOM-killed while still under its memory limit, or it was otherwise evicted by the system. If the job has a rare race condition and crashes with a segfault, it is worth restarting. If the job has an external dependency, like some remote service that it needs to talk to, and it crashes when it can't talk to that service, but that remote service will be repaired by someone else pretty soon, then it is worth retrying the job, maybe with exponential backoff. If the job hits a peak of resource usage after some time, and hits its specified memory limit, then it is debatable whether we should restart it.
Most of the use cases I can think of suggest retrying, at least without further information.
Flakes aside, CI (build/test) failures are typically going to be deterministic. Interactive pods should also only run once, in general.
While #25253 fixes the issue @thockin brought up here, it still feels strange that the pod of
Automatic merge from submit-queue

kubectl run --restart=Never creates pods

Fixes #24533. @bgrant0607 @janetkuo ptal
/fyi @thockin

```release-note
* kubectl run --restart=Never creates pods
```
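With that change in, something like the following should produce a bare pod (run-pod/v1 generator) rather than a Job, assuming a client new enough to include the fix; the name and image are illustrative only:

```sh
kubectl run one-shot --image=busybox --restart=Never -- sh -c "exit 1"
kubectl get pod one-shot    # ends up in a terminal state and is never recreated
```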
I think I would want the pod around in case of failure. Agreed we should remove it in case of a clean exit.
@stts but after doing
@soltysh it would be nice to have a way to use random pod names plus the
Answering my own question: https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/garbage-collection.md is the midterm answer (currently alpha). It will collect pods once 12500 terminated pods exist, which is a pretty high number, and it is not deterministic for a user when their pods are deleted. Hence, something like this without an explicit name would be great:
$ kubectl run -it --image=busybox --restart=Never -- /bin/true
I have started a pod with `kubectl run busybox --image=busybox --restart=Never --tty -i generator=run-pod/v1`. I tried to delete this pod, but it never gets deleted. How can I delete this pod?

$ kubectl delete pods busybox-na3tm
pod "busybox-na3tm" deleted

$ kubectl get pods
NAME READY STATUS RESTARTS AGE

$ kubectl delete pod busybox-vlzh3 --grace-period=0

$ kubectl get all -o name | wc -l
2599

$ kubectl delete pods --all
pod "busybox-131cq" deleted

$ kubectl get pods
NAME READY STATUS RESTARTS AGE

$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE

$ tail /var/log/messages
Nov 19 00:18:10 masterserver1 kube-controller-manager: E1119 00:18:10.013599 741 controller.go:409] Failed to update job busybox, requeuing. Error: jobs.extensions "busybox" cannot be updated: the object has been modified; please apply your changes to the latest version and try again

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"ec7364b6e3b155e78086018aa644057edbe196e5", GitTreeState:"clean"}
We also have a --rm flag ...
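Combining --rm with --restart=Never gives roughly the behaviour asked for above (a hedged sketch; the pod name is a placeholder): the attached pod is deleted again as soon as the command exits, so no terminated pods are left waiting for garbage collection.

```sh
kubectl run busybox-test -it --rm --image=busybox --restart=Never -- /bin/true
```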
From the name
Now, when executing it, pay attention to what the generator says it created; if it says replication controller, replica set, or deployment, you need to delete that, and that will remove the pod.
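For example (a sketch; "busybox" is the placeholder name from the commands above), list what was actually generated and delete the owning object, since deleting only the pod lets its controller recreate it:

```sh
kubectl get deployments,jobs,replicationcontrollers,pods
kubectl delete job busybox        # or: kubectl delete deployment busybox
```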
What do we think should happen here:
I would expect this to run once and then never retry. What happens is that we create a Job with a PodTemplate that says
restartPolicy: Never
and then the Job itself happily restarts the Pod (well, recreates it). So it appears we no longer have a way to express "really, just run once" through kubectl run?
That doesn't seem right - I feel like I am missing something... @erictune @janetkuo