Update reaper for multiple subscribers #1452

Merged: 1 commit merged into containerd:master from crosbymichael:reaper2 on Aug 31, 2017

Conversation

crosbymichael (Member)

Depends on containerd/go-runc#24

There is currently a race with the reaper where you could miss some exit
events from processes.

The reason the reaper was so complex before is that a process could fork,
get a pid, and then fail on an execve before we had time to register it
with the reaper. As a way to reduce that race, pids were allowed to
accumulate in a map.

This change makes the reaper handle multiple subscribers so that each
caller can do its own locking, for when it wants to wait for a specific
pid, without affecting other callers using the reaper code.

Exit events are broadcast to multiple subscribers, in this case the runc
commands and the container pids that we get from a pid-file. Locking while
the entire container starts no longer affects runc commands where you want
to call `runc create` and wait until that has completed.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
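
To give a feel for the broadcast pattern the description refers to, here is a minimal sketch of a reaper-style monitor that fans exit events out to multiple subscribers. The names (`Exit`, `Monitor`, `Subscribe`, `notify`) and the channel-based shape are assumptions for illustration only, not the actual containerd/go-runc reaper API:

```go
// Minimal sketch of an exit-event broadcaster with multiple subscribers.
// Hypothetical types and names; not the real reaper implementation.
package reaper

import "sync"

// Exit describes a reaped child process (illustrative shape only).
type Exit struct {
	Pid    int
	Status int
}

// Monitor fans exit events out to every subscriber so that one caller
// waiting on a specific pid does not block the others.
type Monitor struct {
	mu          sync.Mutex
	subscribers map[chan Exit]struct{}
}

func NewMonitor() *Monitor {
	return &Monitor{subscribers: make(map[chan Exit]struct{})}
}

// Subscribe registers a new exit-event channel for a caller.
func (m *Monitor) Subscribe() chan Exit {
	c := make(chan Exit, 32)
	m.mu.Lock()
	m.subscribers[c] = struct{}{}
	m.mu.Unlock()
	return c
}

// Unsubscribe removes and closes a previously registered channel.
func (m *Monitor) Unsubscribe(c chan Exit) {
	m.mu.Lock()
	delete(m.subscribers, c)
	close(c)
	m.mu.Unlock()
}

// notify broadcasts a single exit event to all current subscribers.
func (m *Monitor) notify(e Exit) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for c := range m.subscribers {
		c <- e
	}
}
```

With a design like this, a caller that wants to wait for one pid simply subscribes, filters the events it receives, and unsubscribes when done; other subscribers keep receiving events independently.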

@codecov-io

codecov-io commented Aug 31, 2017

Codecov Report

Merging #1452 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1452   +/-   ##
======================================
  Coverage    40.8%   40.8%           
======================================
  Files          23      23           
  Lines        2924    2924           
======================================
  Hits         1193    1193           
  Misses       1453    1453           
  Partials      278     278


defer s.p.mu.Unlock()

return s.p.kill(ctx, sig, all)
return errdefs.ToGRPCf(errdefs.ErrNotFound, "process %s not found", s.p.id)
Contributor

Maybe use FailedPrecondition here? Although I guess NotFound makes it easier; you only need to check against one type of error.

Member Author

It's this way because we map the "no such process" syscall error to NotFound.
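
To make that reply concrete, here is a hedged sketch of the mapping being discussed. The `checkKillError` helper and the `errNotFound` sentinel are hypothetical stand-ins (the real code goes through errdefs and gRPC status conversion); only the idea of translating ESRCH / "no such process" into a single not-found error comes from the comment above:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"syscall"
)

// errNotFound stands in for errdefs.ErrNotFound (hypothetical sentinel
// for this sketch).
var errNotFound = errors.New("not found")

// checkKillError maps a kill(2) failure with ESRCH ("no such process")
// to a NotFound error, so callers only check against one error type.
func checkKillError(err error, id string) error {
	if err == nil {
		return nil
	}
	if errors.Is(err, syscall.ESRCH) || strings.Contains(err.Error(), "no such process") {
		return fmt.Errorf("process %s not found: %w", id, errNotFound)
	}
	return err
}

func main() {
	err := checkKillError(syscall.ESRCH, "abc123")
	fmt.Println(errors.Is(err, errNotFound)) // true
}
```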


workDir string
platform platform
}

func (s *Service) Create(ctx context.Context, r *shimapi.CreateTaskRequest) (*shimapi.CreateTaskResponse, error) {
s.mu.Lock()
Contributor
Can we use defer s.mu.Unlock() now?

Otherwise, move it back after newInitProcess or unlock it in the error path.

Member Author

done
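
For readers following along, here is a minimal, self-contained sketch of the pattern the reviewer is asking for, using a hypothetical stripped-down service type rather than the real shim Service; the point is only that a deferred Unlock covers every error return:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// service is a stripped-down stand-in for the shim service
// (hypothetical fields, not the real type).
type service struct {
	mu        sync.Mutex
	processes map[string]int // id -> pid, for illustration only
}

// create defers the unlock so every early return in the error path
// releases the mutex; no manual Unlock is needed before each return.
func (s *service) create(id string, newInit func() (int, error)) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock() // runs on every return path, including errors

	pid, err := newInit()
	if err != nil {
		return 0, err // mutex is still released by the deferred Unlock
	}
	s.processes[id] = pid
	return pid, nil
}

func main() {
	s := &service{processes: map[string]int{}}
	if _, err := s.create("c1", func() (int, error) { return 0, errors.New("execve failed") }); err != nil {
		fmt.Println("create failed:", err) // the mutex was already released by the defer
	}
}
```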

Contributor

@mlaventure left a comment

LGTM once go-runc sha has been updated.

@crosbymichael (Member Author)

@mlaventure done

@crosbymichael (Member Author)

@mlaventure included the systemd fix here in this version bump

Contributor

@mlaventure left a comment

LGTM

@mlaventure mlaventure merged commit 22df20b into containerd:master Aug 31, 2017
@crosbymichael crosbymichael deleted the reaper2 branch August 31, 2017 18:52
cyyzero added a commit to cyyzero/containerd that referenced this pull request Oct 1, 2023
After PR containerd#8617, the create handler of containerd-shim-runc-v2 will
call handleStarted() to record the init process and handle its exit.
The init process wouldn't quit that early under normal circumstances, but
if this scenario occurs, handleStarted() will call handleProcessExit(),
which causes a deadlock: create() has already acquired s.mu, and
handleProcessExit() will try to lock it again.

For historical reasons, the use of s.mu is a bit confusing: sometimes it
protects s.containers only, and sometimes it serves as a mutex around
entire functions.

According to the analysis by @corhere in containerd#9103, we know that:

Locking s.mu in the Create() handler was introduced in containerd#1452, which was a
solution for the missed early-exits problem: Create() holds the mutex to
block checkProcesses() from handling exit events until after Create()
has added the container process to the s.processes map. This locking
logic was copied into the v2 shim implementation. containerd#8617 solves the same
problem using a strategy that does not rely on mutual exclusion between
container create and exit-event handling.

As for Shutdown(), the implementation was added in containerd#3004. In that
initial implementation, the mutex is locked to safely access s.containers;
it is not unlocked because nothing matters after os.Exit(0). Then ae87730
changed Shutdown() to return instead of exiting, followed by containerd#4988
to unlock upon return. If the intention is to block containers from being
created while the shim's task service is shutting down, locking a mutex is
a poor solution, since it makes create requests hang instead of returning
an error, and it is ineffective after Shutdown() returns because the mutex
is unlocked.

If Create() or handleProcessExit() runs again after Shutdown() has unlocked
s.mu and returned, the shim service will panic by sending on a closed
channel.

So I remove the unnecessary lock guards in the service and rename mu to
containersMu to make it clear that it is used to protect s.containers only.

Fix: containerd#9103
Signed-off-by: Chen Yiyang <cyyzero@qq.com>
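
The deadlock described in that commit message comes down to Go's sync.Mutex not being reentrant. A minimal sketch (hypothetical stripped-down types, not the real shim service) reproduces the pattern:

```go
package main

import "sync"

// service is a stripped-down stand-in for the shim task service; only
// the mutex matters for this illustration.
type service struct {
	mu sync.Mutex
}

// handleProcessExit stands in for the real exit handler: it takes s.mu
// before touching shared state.
func (s *service) handleProcessExit() {
	s.mu.Lock() // blocks forever: the same goroutine already holds s.mu
	defer s.mu.Unlock()
	// ... clean up the exited process ...
}

// create mimics the problematic path: it holds s.mu and, if the init
// process has already exited, ends up calling handleProcessExit while
// the lock is still held.
func (s *service) create() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.handleProcessExit() // sync.Mutex is not reentrant, so this deadlocks
}

func main() {
	(&service{}).create() // runtime reports: all goroutines are asleep - deadlock!
}
```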