Update reaper for multiple subscribers #1452
Conversation
Codecov Report

@@ Coverage Diff @@
##           master   #1452   +/-   ##
======================================
  Coverage    40.8%   40.8%
======================================
  Files          23      23
  Lines        2924    2924
======================================
  Hits         1193    1193
  Misses       1453    1453
  Partials      278     278
======================================

Continue to review full report at Codecov.
defer s.p.mu.Unlock()

return s.p.kill(ctx, sig, all)
return errdefs.ToGRPCf(errdefs.ErrNotFound, "process %s not found", s.p.id)
maybe use failed precondition here? Although I guess not found makes it easier, since callers only need to check against one type of error
It's this way because we map the "no such process" syscall error to NotFound
	workDir  string
	platform platform
}

func (s *Service) Create(ctx context.Context, r *shimapi.CreateTaskRequest) (*shimapi.CreateTaskResponse, error) {
	s.mu.Lock()
can we use defer s.mu.Unlock() now? Otherwise, move the lock back after newInitProcess, or unlock it in the error path.
done
LGTM once the go-runc sha has been updated.
@mlaventure done
Depends on containerd/go-runc#24

There is currently a race with the reaper where you could miss some exit events from processes.

The problem before, and why the reaper was so complex, was that processes could fork, getting a pid, and then fail on an execve before we would have time to register the process with the reaper. This could cause pids to fill up in a map that was used as a way to reduce the race.

This change makes the reaper handle multiple subscribers so that each caller can handle its own locking, for when it wants to wait for a specific pid, without affecting other callers using the reaper code. Exit events are broadcast to multiple subscribers, in this case the runc commands and the container pids that we get from a pid-file. Locking while the entire container starts no longer affects runc commands where you want to call `runc create` and wait until that has been completed.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
@mlaventure included the systemd fix here in this version bump
LGTM
After PR containerd#8617, the create handler of containerd-shim-runc-v2 calls handleStarted() to record the init process and handle its exit. The init process wouldn't exit so early under normal circumstances, but if this scenario occurs, handleStarted() calls handleProcessExit(), which causes a deadlock: create() has already acquired s.mu, and handleProcessExit() tries to lock it again.

For historical reasons, the use of s.mu is a bit confusing: sometimes it protects s.containers only, and sometimes it serves as a mutex around whole functions. According to the analysis by @corhere in containerd#9103:

Locking s.mu in the Create() handler was introduced in containerd#1452 as a solution to the missed early-exits problem: Create() holds the mutex to block checkProcesses() from handling exit events until after Create() has added the container process to the s.processes map. This locking logic was copied into the v2 shim implementation. containerd#8617 solves the same problem using a strategy that does not rely on mutual exclusion between container create and exit-event handling.

As for Shutdown(), the implementation was added in containerd#3004. In that initial implementation, the mutex is locked to safely access s.containers; it is never unlocked, because nothing matters after os.Exit(0). Then ae87730 changed Shutdown() to return instead of exiting, followed by containerd#4988 to unlock upon return. If the intention is to block containers from being created while the shim's task service is shutting down, locking a mutex is a poor solution: it makes create requests hang instead of returning an error, and it is ineffective after Shutdown() returns because the mutex is unlocked. If Create() or handleProcessExit() runs again after Shutdown() has unlocked s.mu and returned, the shim service will panic by sending to a closed channel.

So I remove the unnecessary lock guards in the service, and rename mu to containersMu to make it clear that it protects s.containers only.

Fix: containerd#9103

Signed-off-by: Chen Yiyang <cyyzero@qq.com>