
Using docker restart remote API to restart container? #629

Closed
dchen1107 opened this issue Jul 25, 2014 · 7 comments
Labels
area/docker area/kubelet priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@dchen1107
Member

This might not be an issue but rather intentional for a good reason; I just need clarification here. While working on #127, I noticed that we always start a new docker container instead of using the restart API. As a result, we leave tons of garbage containers on the node. Any reasons? The only reason I could come up with is that the Kubelet doesn't have checkpoints, so the number of restart attempts would be lost. #489 was filed for kubelet checkpointing. But before we have that ready, shouldn't we pull that information directly from docker?
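For context, a minimal sketch of the idea (helper names are mine, not kubelet code): the Docker remote API exposes `POST /containers/{id}/restart?t={seconds}` to restart a container in place, and `GET /containers/{id}/json` (inspect) returns a payload that, depending on Docker version, includes a `RestartCount` field the kubelet could read instead of keeping its own checkpoint.

```python
import json

def restart_path(container_id, timeout_s=10):
    """Build the remote-API path that restarts a container in place,
    giving it timeout_s seconds to stop before it is killed."""
    return "/containers/%s/restart?t=%d" % (container_id, timeout_s)

def restart_count(inspect_json):
    """Pull the restart attempt count out of an inspect response, so the
    kubelet would not need its own checkpoint just for this number.
    Falls back to 0 when the field is absent (older Docker versions)."""
    doc = json.loads(inspect_json)
    return doc.get("RestartCount", 0)
```

For example, `restart_path("abc123")` yields `/containers/abc123/restart?t=10`, which would be POSTed to the Docker daemon socket.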

@smarterclayton
Contributor

In general starting a new container guarantees a clean filesystem start state, whereas restart (unless this has changed recently) won't. I think we would want to use clean states as much as possible.

I think you are correct that without checkpoint state we must retain the docker containers. An intermediate solution would be to log the state to disk and then clean the containers.
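One way to read that intermediate solution (all names here are hypothetical, not actual kubelet code): persist the bits we care about, such as exit code and finish time, to a small on-disk record before the dead container is deleted.

```python
import json
import os
import tempfile

def checkpoint_exit_state(record_dir, container_id, exit_code, finished_at):
    """Write a per-container exit record to disk before the container is
    garbage-collected, as an interim stand-in for kubelet checkpoints."""
    os.makedirs(record_dir, exist_ok=True)
    path = os.path.join(record_dir, container_id + ".json")
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"id": container_id,
                   "exit_code": exit_code,
                   "finished_at": finished_at}, f)
    os.replace(tmp, path)  # rename so readers never observe a partial file
    return path
```

Once the record is durable, the container itself can be deleted without losing the restart history.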

@thockin
Member

thockin commented Jul 26, 2014

Using restart has the advantage that any temporary on-disk state would be
preserved across container crash-restarts. What about crashes that are due
to disk quota?

@smarterclayton
Contributor

Ideally, recording the exit condition and deleting the old container happens before the restart. If your crash is due to quota and you were the one writing, you probably can't restart until the original container is deleted anyway (is debuggability or availability more important?). I think you can come up with reasons why either is preferred, although you may want to get access to the old, crashed container as a separate operation.

@thockin
Member

thockin commented Jul 26, 2014

We have had a bad experience waiting for disk cleanup on a container
crash/restart. We go to great pains to make it possible to restart
immediately - O(tens to hundreds of milliseconds).

Obviously, if you are out of disk space, you are kind of stuck, but
otherwise it should be fast.


@smarterclayton
Contributor

So prioritize at least the possibility that some subset of processes (those that at least do cleanup on start) can come up when restarted? Makes sense.

@thockin
Member

thockin commented Jul 26, 2014

I'm not sure I am arguing one way or the other. Just exploring the idea.


@dchen1107
Member Author

Debuggability and availability should be equally important to the user. At different stages of the software/service lifecycle, one of them might be preferred. But as for debuggability, in most cases only the initial failure and the last failure are interesting to the user. Given that, we could have an async operation do disk cleanup for all crashed/restarted containers.
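A minimal sketch of that retention policy (function name is hypothetical): keep the initial and most recent failures for debugging, and hand everything in between to the async cleaner.

```python
def containers_to_clean(crashed_ids):
    """Given crashed container IDs ordered oldest-to-newest, return the
    ones an async cleanup pass may delete. The first and last failures
    are retained, since those are usually the interesting ones."""
    if len(crashed_ids) <= 2:
        return []  # nothing to clean; both retained slots are in use or empty
    return crashed_ids[1:-1]
```

With four crashed containers, the middle two would be cleaned while the first and last stay on disk for inspection.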

@bgrant0607 bgrant0607 added area/docker, area/kubelet, kind/enhancement and removed kind/support-question, kind/design labels Oct 4, 2014
@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence label Dec 3, 2014
@dchen1107 dchen1107 added the sig/node label Feb 4, 2015
@thockin thockin closed this as completed Jul 9, 2015
vishh pushed a commit to vishh/kubernetes that referenced this issue Apr 6, 2016
Preliminary integration tests for getting events
b3atlesfan pushed a commit to b3atlesfan/kubernetes that referenced this issue Feb 5, 2021
Backends: Remove Run() from interface as it's not used
sanchezl pushed a commit to sanchezl/kubernetes that referenced this issue Mar 29, 2021
UPSTREAM: <carry>: rate limit initial watch storm from kubelets on apiserver restart
linxiulei pushed a commit to linxiulei/kubernetes that referenced this issue Jan 18, 2024
images: use k8s-staging-test-infra/gcb-docker-gcloud