Rerun init containers when the pod needs to be restarted #47599
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: yujuhong
No associated issue. Update pull-request body to add a reference to an issue, or get approval with /approve no-issue. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS files. You can indicate your approval by writing /approve in a comment.
Force-pushed from 6a9cc3f to c683ca0.
/cc @yifan-gu
/cc @smarterclayton
/cc @dchen1107
	mutatePodFn    func(*v1.Pod)
	mutateStatusFn func(*kubecontainer.PodStatus)
	actions        podActions
}{
@smarterclayton @dchen1107, please read the description of the test cases and make sure they meet your expectation. Thanks!
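For readers following along, here is a rough, self-contained sketch of what one entry in a table like the one above could look like. The stand-in types, field values, and case name are illustrative only and are not taken from the actual test file.

```go
package main

import "fmt"

// Stand-ins for *v1.Pod, *kubecontainer.PodStatus, and podActions; shapes are
// simplified for illustration.
type fakePod struct{ RestartPolicy string }
type fakePodStatus struct{ SandboxReady bool }
type podActions struct {
	CreateSandbox     bool
	ContainersToStart []int
}

func main() {
	cases := map[string]struct {
		mutatePodFn    func(*fakePod)
		mutateStatusFn func(*fakePodStatus)
		actions        podActions
	}{
		"restart everything when the sandbox is gone": {
			mutatePodFn:    func(p *fakePod) { p.RestartPolicy = "Always" },
			mutateStatusFn: func(s *fakePodStatus) { s.SandboxReady = false },
			actions:        podActions{CreateSandbox: true, ContainersToStart: []int{0}},
		},
	}
	for name, tc := range cases {
		// Each case mutates a base pod/status and checks the computed actions.
		pod, status := &fakePod{}, &fakePodStatus{SandboxReady: true}
		tc.mutatePodFn(pod)
		tc.mutateStatusFn(status)
		fmt.Printf("%s: want %+v\n", name, tc.actions)
	}
}
```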
/retest
LGTM
		initContainerNames.Insert(container.Name)
	}
	for name := range initContainerNames {
		count := 0
From what I can see here, this count variable never increments? It's printed below on L673, and in the previous implementation it would be incremented on each iteration of the podStatus.ContainerStatuses loop.
I'm not particularly familiar with this bit of code, so I could be wrong. I'm currently attempting to port this patch to the 1.6 dockertools implementation.
Yes, the count should be incremented. Will do this later, after getting more comments. The variable is only used in the log message, so it does not affect the functionality of the code.
Done
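For context, here is a minimal stand-alone sketch of the pruning loop with count incremented once per matching instance, as discussed above. The types, the ordering assumption, and the remove callback are illustrative stand-ins, not the real kubelet code.

```go
package main

import "fmt"

// Stand-in status type; the real loop iterates podStatus.ContainerStatuses.
type containerStatus struct {
	Name string
	ID   string
}

// pruneInitContainers keeps the most recent instance of each init container
// and removes the rest, incrementing count per matching instance.
// Statuses are assumed to be ordered newest first.
func pruneInitContainers(initContainerNames []string, statuses []containerStatus, remove func(id string)) {
	for _, name := range initContainerNames {
		count := 0
		for _, status := range statuses {
			if status.Name != name {
				continue
			}
			count++
			if count == 1 {
				// Keep the most recent instance for this name.
				continue
			}
			fmt.Printf("Removing init container %q instance %q %d\n", status.Name, status.ID, count)
			remove(status.ID)
		}
	}
}

func main() {
	statuses := []containerStatus{
		{Name: "init", ID: "attempt-2"},
		{Name: "init", ID: "attempt-1"},
		{Name: "init", ID: "attempt-0"},
	}
	pruneInitContainers([]string{"init"}, statuses, func(id string) {})
}
```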
Tests look good to me. This is generally LGTM, although I can do a deeper pass if necessary.
Thanks for the review! That's exactly what I needed from you and @dchen1107. @feiskyer already took a pass and LGTM'd the PR. I'll keep the PR open for a couple more days because @Random-Liu mentioned that he may have time to look at it.
	defer r.Unlock()

	for id, c := range r.Containers {
		if c.Metadata.Name == name && c.Metadata.Attempt == attempt {
Why not compare sandboxID?
Good catch. Added.
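A rough sketch of what the fake-runtime lookup with the added sandbox-ID comparison could look like; the type and method names below are stand-ins for illustration, not the actual fake runtime service.

```go
package main

import (
	"fmt"
	"sync"
)

// Minimal stand-ins for the fake runtime's container bookkeeping; the real
// fake service stores richer CRI structures.
type containerMetadata struct {
	Name    string
	Attempt uint32
}

type fakeContainer struct {
	SandboxID string
	Metadata  containerMetadata
}

type fakeRuntime struct {
	sync.Mutex
	Containers map[string]*fakeContainer
}

// findContainerID matches on the sandbox ID in addition to the container name
// and attempt, as suggested in the review above.
func (r *fakeRuntime) findContainerID(sandboxID, name string, attempt uint32) (string, error) {
	r.Lock()
	defer r.Unlock()
	for id, c := range r.Containers {
		if c.SandboxID == sandboxID && c.Metadata.Name == name && c.Metadata.Attempt == attempt {
			return id, nil
		}
	}
	return "", fmt.Errorf("container (name=%q, attempt=%d) not found in sandbox %q", name, attempt, sandboxID)
}

func main() {
	r := &fakeRuntime{Containers: map[string]*fakeContainer{
		"c1": {SandboxID: "s1", Metadata: containerMetadata{Name: "init", Attempt: 0}},
	}}
	fmt.Println(r.findContainerID("s1", "init", 0))
}
```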
@@ -620,7 +620,7 @@ func (m *kubeGenericRuntimeManager) killContainersWithSyncResult(pod *v1.Pod, ru
// pruneInitContainers ensures that before we begin creating init containers, we have reduced the number
s/pruneInitContainers/pruneInitContainersBeforeStart
Done.
	if _, ok := initContainersToKeep[status.ID]; ok {
	// prune all other init containers that match this container name
	glog.V(4).Infof("Removing init container %q instance %q %d", status.Name, status.ID.ID, count)
	if err := m.DeleteContainer(status.ID); err != nil {
Why use DeleteContainer here but use runtimeService.RemoveContainer below?
You're right. Switching both to m.removeContainer(status.ID.ID) to make them consistent. I think we do need to clean up the log here since we don't do it anywhere else.
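As an illustration of the consistency point above, here is a sketch of a single removal helper that both call sites could share, deleting the container log before asking the runtime to remove the container. The interface, method names, and log-path layout are assumptions for the example, not the real kubelet code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// runtimeService is a stand-in for the CRI runtime client used by the manager.
type runtimeService interface {
	RemoveContainer(id string) error
}

type manager struct {
	runtime runtimeService
	logDir  string
}

// removeContainer deletes the container log and then asks the runtime to
// remove the container, so every call site cleans up the same way.
func (m *manager) removeContainer(containerID string) error {
	logPath := filepath.Join(m.logDir, containerID+".log")
	if err := os.Remove(logPath); err != nil && !os.IsNotExist(err) {
		return err
	}
	return m.runtime.RemoveContainer(containerID)
}

type fakeRuntime struct{}

func (fakeRuntime) RemoveContainer(id string) error {
	fmt.Println("removed container", id)
	return nil
}

func main() {
	m := &manager{runtime: fakeRuntime{}, logDir: os.TempDir()}
	if err := m.removeContainer("abc123"); err != nil {
		fmt.Println("error:", err)
	}
}
```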
@@ -620,7 +620,7 @@ func (m *kubeGenericRuntimeManager) killContainersWithSyncResult(pod *v1.Pod, ru
// pruneInitContainers ensures that before we begin creating init containers, we have reduced the number
// of outstanding init containers still present. This reduces load on the container garbage collector
// by only preserving the most recent terminated init container.
func (m *kubeGenericRuntimeManager) pruneInitContainersBeforeStart(pod *v1.Pod, podStatus *kubecontainer.PodStatus, initContainersToKeep map[kubecontainer.ContainerID]int) {
func (m *kubeGenericRuntimeManager) pruneInitContainersBeforeStart(pod *v1.Pod, podStatus *kubecontainer.PodStatus) {
Maybe just add one argument, e.g. force bool, or something. If it's force, delete all; if not, keep the latest exited one? The two functions are almost the same. :)
I disagree. I think it's clear what each function does with the current naming. Adding a boolean would make the function more complicated and harder to test.
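To make the trade-off concrete, here is a hedged sketch of keeping two clearly named entry points that share a small helper internally; the types and bodies are illustrative only, not the actual pruneInitContainersBeforeStart/purgeInitContainers implementations.

```go
package main

import "fmt"

// containerStatus is an illustrative stand-in for an exited init container
// instance; statuses are assumed to be ordered newest first.
type containerStatus struct {
	Name   string
	ID     string
	Exited bool
}

// removeAllMatching is the shared helper; each exported entry point keeps its
// own descriptive name instead of exposing a bare boolean to callers.
func removeAllMatching(statuses []containerStatus, keepMostRecent bool) []string {
	removed := []string{}
	kept := false
	for _, s := range statuses {
		if !s.Exited {
			continue
		}
		if keepMostRecent && !kept {
			kept = true
			continue
		}
		removed = append(removed, s.ID)
	}
	return removed
}

// pruneInitContainersBeforeStart keeps the most recent exited instance so its
// termination status stays visible.
func pruneInitContainersBeforeStart(statuses []containerStatus) []string {
	return removeAllMatching(statuses, true)
}

// purgeInitContainers removes every exited instance.
func purgeInitContainers(statuses []containerStatus) []string {
	return removeAllMatching(statuses, false)
}

func main() {
	statuses := []containerStatus{
		{Name: "init", ID: "attempt-2", Exited: true},
		{Name: "init", ID: "attempt-1", Exited: true},
		{Name: "init", ID: "attempt-0", Exited: true},
	}
	fmt.Println("prune:", pruneInitContainersBeforeStart(statuses))
	fmt.Println("purge:", purgeInitContainers(statuses))
}
```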
	containerStatus := podStatus.FindContainerStatusByName(container.Name)
	// If container does not exist, or is not running, check whehter we
s/whehter/whether
Fixed.
		reason = "Container failed liveness probe."
	} else {
		// Keep the container.
		keepCount += 1
nit: ++
I think both are perfectly fine unless this is documented as a convention(?)
	m.pruneInitContainersBeforeStart(pod, podStatus, podContainerChanges.InitContainersToKeep)
	// This is an optimization because container removals are typically handled
	// by container garbage collector.
	m.pruneInitContainersBeforeStart(pod, podStatus)
We both purge and prune; adding an option seems cleaner.
I still think it's clearer to keep both functions.
@Random-Liu comments addressed. PTAL again, thanks!
Pushed a new commit to address a bug @Random-Liu found (good catch!). Also added a new unit test for it.
@yujuhong LGTM without full confidence. I could understand the current behavior, and it looks much cleaner than before! I think we could only rely on the test to catch it for us. :p
Rebased and squashed the commits.
@yujuhong Add a release-note for this?
Automatic merge from submit-queue (batch tested with PRs 46317, 48922, 50651, 50230, 47599)
Automatic merge from submit-queue (batch tested with PRs 16889, 16865). UPSTREAM: 53857: kubelet sync pod throws more detailed events

Also includes the following upstream dependent PRs:
UPSTREAM: 50350: Wait for container cleanup before deletion
UPSTREAM: 48970: Recreate pod sandbox when the sandbox does not have an IP address.
UPSTREAM: 48589: When faild create pod sandbox record event.
UPSTREAM: 48584: Move event type
UPSTREAM: 47599: Rerun init containers when the pod needs to be restarted

xrefs:
kubernetes/kubernetes#53857
kubernetes/kubernetes#50350
kubernetes/kubernetes#48970
kubernetes/kubernetes#48589
kubernetes/kubernetes#48584
kubernetes/kubernetes#47599
Whenever the pod sandbox needs to be recreated, all containers associated
with it will be killed by kubelet. This change ensures that the init
containers will be rerun in such cases.
The change also refactors the compute logic so that the control flow of
init containers is more aligned with that of the regular containers. Unit
tests are added to verify the logic.
This fixes #36485
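A simplified, stand-alone sketch of the behavior the commit message describes: when the sandbox has to be recreated, every container is killed and the pod restarts from its first init container rather than its regular containers. The types and field names below are illustrative only, not the actual computePodActions implementation.

```go
package main

import "fmt"

// Illustrative stand-ins; the real code works with *v1.Pod,
// kubecontainer.PodStatus, and a richer podActions struct.
type pod struct {
	InitContainers []string
	Containers     []string
}

type podActions struct {
	KillPod                  bool
	CreateSandbox            bool
	NextInitContainerToStart string
	ContainersToStart        []string
}

// computePodActions sketches the rule from the commit message: recreating the
// sandbox kills everything and restarts the pod from its first init container.
func computePodActions(p pod, sandboxChanged, initDone bool) podActions {
	actions := podActions{}
	if sandboxChanged {
		actions.KillPod = true
		actions.CreateSandbox = true
		if len(p.InitContainers) > 0 {
			// Rerun init containers from the beginning.
			actions.NextInitContainerToStart = p.InitContainers[0]
			return actions
		}
		actions.ContainersToStart = p.Containers
		return actions
	}
	if !initDone && len(p.InitContainers) > 0 {
		// Initialization still in progress; regular containers wait.
		actions.NextInitContainerToStart = p.InitContainers[0]
		return actions
	}
	actions.ContainersToStart = p.Containers
	return actions
}

func main() {
	p := pod{InitContainers: []string{"init-a"}, Containers: []string{"app"}}
	fmt.Printf("sandbox recreated: %+v\n", computePodActions(p, true, true))
	fmt.Printf("steady state:      %+v\n", computePodActions(p, false, true))
}
```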