
Using docker restart remote API to restart container? #629

Closed
dchen1107 opened this issue Jul 25, 2014 · 7 comments
Labels
area/docker area/kubelet priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@dchen1107
Member

This might not be an issue but rather intentional for a good reason; I just need clarification here. While working on #127, I noticed that we always start a new docker container instead of using the restart API. As a result, we leave tons of garbage containers on the node. Any reasons? The only reason I could come up with is that the Kubelet doesn't have checkpoints, so the number of restart attempts would be lost. #489 was filed for kubelet checkpointing. But before we have that ready, shouldn't we pull that information directly from docker?
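For context, a minimal sketch of the idea (helper names are mine, not kubelet code): the Docker remote API exposes `POST /containers/{id}/restart?t={seconds}` to restart a container in place, and `GET /containers/{id}/json` (inspect) returns a payload that, depending on Docker version, includes a `RestartCount` field the kubelet could read instead of keeping its own checkpoint.

```python
import json

def restart_path(container_id, timeout_s=10):
    """Build the remote-API path that restarts a container in place,
    giving it timeout_s seconds to stop before it is killed."""
    return "/containers/%s/restart?t=%d" % (container_id, timeout_s)

def restart_count(inspect_json):
    """Pull the restart attempt count out of an inspect response, so the
    kubelet would not need its own checkpoint just for this number.
    Falls back to 0 when the field is absent (older Docker versions)."""
    doc = json.loads(inspect_json)
    return doc.get("RestartCount", 0)
```

For example, `restart_path("abc123")` yields `/containers/abc123/restart?t=10`, which would be POSTed to the Docker daemon socket.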

@smarterclayton
Contributor

In general starting a new container guarantees a clean filesystem start state, whereas restart (unless this has changed recently) won't. I think we would want to use clean states as much as possible.

I think you are correct that without checkpoint state we must retain the docker containers. An intermediate solution would be to log the state to disk and then clean the containers.
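One way to read that intermediate solution (all names here are hypothetical, not actual kubelet code): persist the bits we care about, such as exit code and finish time, to a small on-disk record before the dead container is deleted.

```python
import json
import os
import tempfile

def checkpoint_exit_state(record_dir, container_id, exit_code, finished_at):
    """Write a per-container exit record to disk before the container is
    garbage-collected, as an interim stand-in for kubelet checkpoints."""
    os.makedirs(record_dir, exist_ok=True)
    path = os.path.join(record_dir, container_id + ".json")
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"id": container_id,
                   "exit_code": exit_code,
                   "finished_at": finished_at}, f)
    os.replace(tmp, path)  # rename so readers never observe a partial file
    return path
```

Once the record is durable, the container itself can be deleted without losing the restart history.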

@thockin
Member

thockin commented Jul 26, 2014

Using restart has the advantage that any temporary on-disk state would be
preserved across container crash-restarts. What about crashes that are due
to disk quota?

@smarterclayton
Contributor

Ideally, recording the exit condition and deleting the old container happens before the restart. If your crash is due to quota and you were the one writing, you probably can't restart until the original container is deleted anyway (is debuggability or availability more important?). I think you can come up with reasons why either is preferred, although you may want to get access to the old, crashed container as a separate operation.

@thockin
Member

thockin commented Jul 26, 2014

We have had a bad experience waiting for disk cleanup on a container
crash/restart. We go to great pains to make it possible to restart
immediately - O(tens to hundreds of milliseconds).

Obviously, if you are out of disk space, you are kind of stuck, but
otherwise it should be fast.


@smarterclayton
Contributor

So prioritize at least the possibility that some subset of processes (those that at least do cleanup on start) can come up when restarted? Makes sense.

@thockin
Member

thockin commented Jul 26, 2014

I'm not sure I am arguing one way or the other. Just exploring the idea.


@dchen1107
Member Author

Debuggability and availability should be equally important to the user. At different stages of the software/service lifecycle, one of them might be preferred. But as for debuggability, in most cases only the initial failure and the last failure are interesting to the user. Given that, we could have an async operation do disk cleanup for all crashed/restarted containers.
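A minimal sketch of that retention policy (function name is hypothetical): keep the initial and most recent failures for debugging, and hand everything in between to the async cleaner.

```python
def containers_to_clean(crashed_ids):
    """Given crashed container IDs ordered oldest-to-newest, return the
    ones an async cleanup pass may delete. The first and last failures
    are retained, since those are usually the interesting ones."""
    if len(crashed_ids) <= 2:
        return []  # nothing to clean; both retained slots are in use or empty
    return crashed_ids[1:-1]
```

With four crashed containers, the middle two would be cleaned while the first and last stay on disk for inspection.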

@bgrant0607 bgrant0607 added area/docker, area/kubelet, kind/enhancement and removed kind/support-question, kind/design labels Oct 4, 2014
@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence label Dec 3, 2014
@dchen1107 dchen1107 added the sig/node label Feb 4, 2015
@thockin thockin closed this as completed Jul 9, 2015
vishh pushed a commit to vishh/kubernetes that referenced this issue Apr 6, 2016
Preliminary integration tests for getting events
b3atlesfan pushed a commit to b3atlesfan/kubernetes that referenced this issue Feb 5, 2021
Backends: Remove Run() from interface as it's not used
sanchezl pushed a commit to sanchezl/kubernetes that referenced this issue Mar 29, 2021
UPSTREAM: <carry>: rate limit initial watch storm from kubelets on apiserver restart
linxiulei pushed a commit to linxiulei/kubernetes that referenced this issue Jan 18, 2024
images: use k8s-staging-test-infra/gcb-docker-gcloud