Update reaper for multiple subscribers #1452
Conversation
Codecov Report

@@ Coverage Diff @@
##           master   #1452   +/-   ##
======================================
  Coverage    40.8%   40.8%
======================================
  Files          23      23
  Lines        2924    2924
======================================
  Hits         1193    1193
  Misses       1453    1453
  Partials      278     278
======================================

Continue to review full report at Codecov.
defer s.p.mu.Unlock()

return s.p.kill(ctx, sig, all)
return errdefs.ToGRPCf(errdefs.ErrNotFound, "process %s not found", s.p.id)
maybe use failed precondition here? Although I guess not found makes it easier, since callers only need to check against one type of error
It's this way because we map the "no such process" syscall error to NotFound
	workDir  string
	platform platform
}

func (s *Service) Create(ctx context.Context, r *shimapi.CreateTaskRequest) (*shimapi.CreateTaskResponse, error) {
	s.mu.Lock()
can we use defer s.mu.Unlock() now? Otherwise, move the lock back after newInitProcess, or unlock it in the error path.
done
LGTM once the go-runc sha has been updated.
@mlaventure done
Depends on containerd/go-runc#24

There is currently a race with the reaper where you could miss some exit events from processes.

The problem before, and why the reaper was so complex, was that processes could fork, getting a pid, and then fail on an execve before we would have time to register the process with the reaper. This could cause pids to fill up in a map that was used as a way to reduce the race.

This change makes the reaper handle multiple subscribers so that each caller can handle its own locking, for when it wants to wait for a specific pid, without affecting other callers using the reaper code. Exit events are broadcast to multiple subscribers, in this case the runc commands and the container pids that we get from a pid-file. Locking while the entire container starts no longer affects runc commands where you want to call `runc create` and wait until that has been completed.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
@mlaventure included the systemd fix here in this version bump
LGTM
After PR containerd#8617, the create handler of containerd-shim-runc-v2 calls handleStarted() to record the init process and handle its exit. The init process wouldn't exit so early under normal circumstances, but if this scenario occurs, handleStarted() calls handleProcessExit(), which causes a deadlock: create() has already acquired s.mu, and handleProcessExit() tries to lock it again.

For historical reasons, the use of s.mu is a bit confusing: sometimes it protects s.containers only, and sometimes it serves as a mutex around whole functions. According to the analysis by @corhere in containerd#9103:

Locking s.mu in the Create() handler was introduced in containerd#1452 as a solution to the missed early-exits problem: Create() holds the mutex to block checkProcesses() from handling exit events until after Create() has added the container process to the s.processes map. This locking logic was copied into the v2 shim implementation. containerd#8617 solves the same problem using a strategy that does not rely on mutual exclusion between container create and exit-event handling.

As for Shutdown(), the implementation was added in containerd#3004. In that initial implementation, the mutex is locked to safely access s.containers; it is never unlocked, because nothing matters after os.Exit(0). Then ae87730 changed Shutdown() to return instead of exiting, followed by containerd#4988 to unlock upon return. If the intention is to block containers from being created while the shim's task service is shutting down, locking a mutex is a poor solution: it makes create requests hang instead of returning an error, and it is ineffective after Shutdown() returns because the mutex is unlocked. If Create() or handleProcessExit() runs again after Shutdown() has unlocked s.mu and returned, the shim service will panic by sending to a closed channel.

So I remove the unnecessary lock guards in the service, and rename mu to containersMu to make it clear that it protects s.containers only.

Fix: containerd#9103

Signed-off-by: Chen Yiyang <cyyzero@qq.com>