This repository has been archived by the owner on Feb 24, 2020. It is now read-only.

[rfc] rkt kill #1496

Open
vcaputo opened this issue Sep 28, 2015 · 16 comments


@vcaputo
Contributor

vcaputo commented Sep 28, 2015

There's no way to "stop" a rkt container today. A general rkt kill $SIGNUM, analogous to the standard Unix kill(1) command, would go a long way towards providing one.

The rkt kill command could deliver the specified signal or the default SIGTERM to the top-level process representing the named pod.

It's a bit awkward with systemd in stage1 though, because systemd handles SIGTERM with the following:

   SIGTERM
      Upon receiving this signal the systemd system manager serializes
      its state, reexecutes itself and deserializes the saved state
      again. This is mostly equivalent to systemctl daemon-reexec.

which isn't what we'd expect from a bare rkt kill $pod delivering SIGTERM as the default.

This has been brought up previously: #532 (comment)

Does it make sense to have a general kill which delivers a signal to the pod's top-level process? Or should we make a less general rkt stop which employs some stage1-specific stop entrypoint to Do The Right Thing?
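
For reference, a general kill of the kind proposed above amounts to very little code. The following is a minimal sketch in Python, not rkt code: `kill_pod` is a hypothetical helper, and `pid` is assumed to be the pod's top-level process, however that would be looked up.

```python
import os
import signal


def kill_pod(pid, signum=signal.SIGTERM):
    """Deliver signum to the process `pid`, defaulting to SIGTERM as kill(1) does.

    Hypothetical helper: `pid` is assumed to be the pod's top-level process.
    """
    os.kill(pid, signum)
```

With systemd as the top-level process, this default would trigger the reexec behaviour quoted above rather than a shutdown, which is exactly the awkwardness being raised.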

@blixtra
Collaborator

blixtra commented Sep 28, 2015

IMHO, from a semantic point of view we should only have a stop if we have a start, analogous to the freeze/thaw in the issues you pointed to.

@alban
Member

alban commented Sep 28, 2015

I think something specific would make it more user-friendly. For systemd-stage1, kill -37 $SYSTEMD_PID and kill -47 $SYSTEMD_PID work fine and would be the equivalent of rkt stop and rkt stop --force:

       SIGRTMIN+3 (kill -37)
           Halts the machine, starts the halt.target unit. This is mostly
           equivalent to systemctl start halt.target.
       SIGRTMIN+13 (kill -47)
           Immediately halts the machine.

I don't know if it's worth an entrypoint with a binary just to send a signal.
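
The two signal numbers can be derived rather than hard-coded. This sketch assumes Linux, where Python exposes `signal.SIGRTMIN`; `systemd_stop_signal` is a hypothetical name, not part of rkt or systemd. SIGRTMIN is commonly 34 with glibc, which gives the 37 and 47 quoted above, but the value depends on the libc in use.

```python
import signal


def systemd_stop_signal(force=False):
    """Return the signal asking a systemd PID 1 to halt.

    SIGRTMIN+3 starts halt.target; SIGRTMIN+13 halts immediately.
    Hypothetical helper, assuming a Linux host.
    """
    offset = 13 if force else 3
    return signal.SIGRTMIN + offset
```

Computing the offset from `SIGRTMIN` avoids baking in a number that is only correct for one libc.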

@vcaputo
Contributor Author

vcaputo commented Sep 28, 2015

well, what's appropriate for stopping lkvm? that's where the entrypoint comes in.

@jonboulle
Contributor

naive driveby, can't we kill systemd-nspawn/lkvm process and it'll Do The Right Thing?

On Mon, Sep 28, 2015 at 3:35 PM, Vito Caputo notifications@github.com wrote:

well, what's appropriate for stopping lkvm? that's where the entrypoint comes in.



@iaguis
Member

iaguis commented Sep 29, 2015

By default, if --boot is used and you send SIGTERM to systemd-nspawn it will send SIGRTMIN+3 to PID 1 in the container.

I don't see any kind of similar signal handling in lkvm. There's lkvm stop though.

@jonboulle
Contributor

/cc @jellonek @ppalucki

@vcaputo
Contributor Author

vcaputo commented Sep 29, 2015

Maybe we just send SIGTERM to the outer rkt process of the pod then, like any process manager would do. The only reasons I can see for introducing a stop stage1 entrypoint are:

  1. symmetry (we have a run entrypoint)
  2. if a stage1 implementation benefits from a more graceful/involved exit path, it could be put in the stop entrypoint.

Maybe it makes sense to have a stop entrypoint, invoke it when set, and afterwards send SIGTERM all the same to the outer rkt process of the pod.
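
The sequence described here (run the stop entrypoint when one is set, then send SIGTERM regardless) could look roughly like the sketch below. It is an illustration only: `stop_pod` and its `stop_entrypoint` argument are hypothetical, and `pid` is assumed to be the outer rkt process of the pod.

```python
import os
import signal
import subprocess


def stop_pod(pid, stop_entrypoint=None):
    """Run the optional stage1 stop entrypoint, then SIGTERM the outer process.

    Hypothetical helper: `stop_entrypoint` is an argv list, or None when the
    stage1 does not define one.
    """
    if stop_entrypoint:
        # check=False: a failing entrypoint must not prevent the SIGTERM below.
        subprocess.run(stop_entrypoint, check=False)
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        # The entrypoint already brought the pod down; nothing left to signal.
        pass
```

Sending SIGTERM unconditionally keeps the behaviour identical to what a process manager would do when no entrypoint exists.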

I don't like the stop entrypoint for what it implies: the wrong thing will happen when a plain SIGTERM is delivered, as a process manager would do. So maybe it's a flawed idea, symmetry be damned.

@jonboulle jonboulle modified the milestones: v1.0.0, v0.10.0 Oct 1, 2015
@jonboulle jonboulle modified the milestones: v0.11.0, v0.10.0 Oct 20, 2015
@markuskobler

So I think I have a related issue that would benefit from a generic rkt kill --signal=HUP command.

I'm currently trying to port a haproxy docker container that uses haproxy-systemd-wrapper to gracefully reload its configuration after a HUP signal. Currently I can't see a good way to signal the master process started by rkt.
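
A `--signal=HUP` style option needs to accept signal names as well as numbers. The sketch below shows one way to resolve either form; `parse_signal` is a hypothetical helper, and the `--signal` flag itself is only the proposal made in this comment, not an existing rkt option.

```python
import signal


def parse_signal(spec):
    """Turn "HUP", "SIGHUP", "sighup" or "1" into a signal number.

    Hypothetical helper. Numeric input is passed through unchanged so that
    real-time signals without a symbolic name can still be given by number.
    """
    if spec.isdigit():
        return int(spec)
    name = spec.upper()
    if not name.startswith("SIG"):
        name = "SIG" + name
    try:
        return int(signal.Signals[name])
    except KeyError:
        raise ValueError(f"unknown signal: {spec!r}") from None
```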

@yifan-gu
Contributor

Maybe we should return the pid of the stage1 now? In an out-of-band discussion @vcaputo told me that the pid currently returned by rkt status is the pid of PID 1 in the nspawn container (the systemd), because we need that to enter the namespace. But now that rkt enter takes the uuid of the pod, I think we don't have to show the pid of the systemd anymore.

@markuskobler

Ah that would really help.

@iaguis
Member

iaguis commented Oct 30, 2015

#1699

@alban
Member

alban commented Nov 25, 2015

#1699 has been merged, so it returns the pid of stage1 now.

@philips
Contributor

philips commented Nov 25, 2015

My hunch here is to do a rkt stop and then make sure we have the right behavior wired up for each stage1. Sending a kill with an arbitrary signal won't work everywhere right now. If we do kill we would need to try and figure out what the different "rkt signals" we could send would be.

@jonboulle jonboulle modified the milestones: v0.13.0, v0.12.0 Nov 25, 2015
@iaguis
Member

iaguis commented Jan 11, 2016

> My hunch here is to do a rkt stop and then make sure we have the right behavior wired up for each stage1. Sending a kill with an arbitrary signal won't work everywhere right now. If we do kill we would need to try and figure out what the different "rkt signals" we could send would be.

Started working on this: https://github.com/kinvolk/rkt/commits/iaguis/rkt-stop

It sends SIGTERM to the pid of stage1 for the nspawn and fly flavors and it calls lkvm stop ${VM} for the lkvm flavor.

It seems to work but I want to do some refactorings to get rid of flavor code messiness (e.g. all the if flavor == "kvm").
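
One way to avoid scattered `if flavor == "kvm"` checks is a lookup table from flavor to stop action. The sketch below is an illustration in Python, not the code on that branch: the flavor keys, the `pod` dictionary fields and the function names are all hypothetical, and the `lkvm stop` argument form is copied from the description above rather than checked against lkvm.

```python
import signal


def _signal_stage1(pod):
    # nspawn and fly: SIGTERM goes to the pid of stage1.
    return ("signal", pod["stage1_pid"], signal.SIGTERM)


def _lkvm_stop(pod):
    # lkvm: delegate to its own stop command (argument form is an assumption).
    return ("exec", ["lkvm", "stop", pod["vm_name"]])


# Hypothetical flavor names mapped to the action that stops a pod.
STOP_ACTIONS = {
    "nspawn": _signal_stage1,
    "fly": _signal_stage1,
    "kvm": _lkvm_stop,
}


def stop_action(flavor, pod):
    """Return a description of how to stop `pod` for the given flavor."""
    try:
        handler = STOP_ACTIONS[flavor]
    except KeyError:
        raise ValueError(f"unsupported flavor: {flavor!r}") from None
    return handler(pod)
```

Returning a description of the action instead of performing it keeps the per-flavor logic easy to test; supporting a new flavor means adding one table entry.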

@iaguis iaguis mentioned this issue Jan 12, 2016
1 task
@iaguis iaguis modified the milestones: v1.0.0, v0.16.0 Jan 20, 2016
@jonboulle jonboulle modified the milestones: v1+, v1.0.0 Jan 27, 2016
@sjpotter
Contributor

For cadvisor testing this would be nice, i.e. I'm systemd-run'ing rkt to run-prepared the containers in the background and would just like to stop a pod by its uuid.

@ghost

ghost commented Feb 6, 2017

I also need to signal a process running inside a rkt container. In my case that would be coreos.com/dnsmasq, which reloads its hosts file upon receiving SIGHUP.


9 participants