Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.7 backport] Fix bug where init exits were being dropped #10675

Merged

Conversation

laurazard
Copy link
Member

Backports: #10651

laurazard and others added 3 commits September 6, 2024 11:01
It's not true that `s.mu` needs to be held when calling
`handleProcessExit`, and indeed hasn't been the case for a
while – see 892dc54.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
(cherry picked from commit 7f3bf99)
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
The runc task state machine prevents execs from being created after the
init process has exited, but there are no guards against starting a
created exec after the init process has exited. That leaves a small
window for starting an exec to race our handling of the init process
exiting. Normally this is not an issue in practice: the kernel will
atomically kill all processes in a PID namespace when its "init" process
terminates, and will not allow new processes to fork(2) into the PID
namespace afterwards. Therefore the racing exec is guaranteed by the
kernel to not be running after the init process terminates. On the other
hand, when the container does not have a private PID namespace (i.e. the
container's init process is not the "init" process of the container's
PID namespace), the kernel does not automatically kill other container
processes on init exit and will happily allow runc to start an exec
process at any time. It is the runc shim's responsibility to clean up
the container when the init process exits in this situation by killing
all the container's remaining processes. Block execs from being started
after the container's init process has exited to prevent the processes
from leaking, and to avoid violating the task service's assumption that
an exec can be running iff the init process is also running.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit e735791)
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
This commit rewrites and simplifies a lot of this logic to reduce it's
complexity, and also handle the case where the container doesn't have
it's own pid-namespace, which means that we're not guaranteed to receive
the init exit last.

This is achieved by replacing `s.pendingExecs` with `s.runningExecs`,
for which both (previously) pending and de facto running execs are
considered.

The new exit handling logic can be summed up by:
- when we receive an init exit, stash it it in `s.containerInitExit`,
  and if a container's init process has exited, refuse new execs.
- (if the container does not have it's own pidns) kill all running
  processes (if the container has a private pid-namespace, then all
  processes will be dead already).
- wait for the container's running exec count (which includes execs
  which have been started but might still early exit) to get to 0.
- publish the stashed away init exit.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
(cherry picked from commit 421a4b5)
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
@laurazard laurazard self-assigned this Sep 6, 2024
@dosubot dosubot bot added the area/runtime Runtime label Sep 6, 2024
@laurazard laurazard changed the title runc-shim: remove misleading comment [1.7 backport] runc-shim: prevent init exits from being dropped Sep 6, 2024
@laurazard
Copy link
Member Author

laurazard commented Sep 6, 2024

There are a couple tests failing that I'm not sure are flakes, I'll TAL.

Copy link
Member

@dims dims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dims dims added the kind/bug label Sep 6, 2024
@samuelkarp samuelkarp changed the title [1.7 backport] runc-shim: prevent init exits from being dropped [1.7 backport] Fix bug where init exits were being dropped Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants