
kubelet: Move KillContainer to container runtime. #6068

Closed
wants to merge 1 commit

Conversation

yifan-gu
Contributor

@vmarmol vmarmol self-assigned this Mar 27, 2015

// KillContainer kills a docker container with the given containerID.
// TODO(yifan): To use strong type for the containerID.
func KillContainer(client DockerInterface, containerID string,
Contributor

Hmmm, I'm not sure I like how this is looking. It seems like all the readiness and event recording should be outside of the runtime; it is conceptually work the Kubelet does regardless of which container runtime it uses. Do you have a plan for how readiness, ref, and recorder will be used in the runtime? It seems like the Kubelet can call KillContainer (which kills) and can do the recording and readiness itself from the status.

Contributor Author

@vmarmol I got your point. The reason here is that if we want to let the container runtime hide the pod infra container behind the interface, we will need to kill and restart the whole pod inside the runtime, so we need to be able to manage readiness and report events in the runtime.

There could be other approaches that leave the readiness, ref, etc. for the kubelet to manage.
One that came to mind is to add something like preparePod to the kubelet. After entering syncPod, preparePod would be invoked first:

switch container runtime choice:
    case docker: check and compare the pod infra container, kill everything if necessary
    case rocket: no-op
    case other: ...
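
For concreteness, here is a minimal, self-contained Go sketch of what such a preparePod hook could look like. All names here (Pod, Kubelet.containerRuntime, preparePod, infraNeedsRecreate) are illustrative stand-ins, not code from this PR:

package main

import "fmt"

// Hypothetical stand-ins for kubelet types; none of these names are
// defined in this PR.
type Pod struct{ Name string }

type Kubelet struct {
    containerRuntime string // e.g. "docker" or "rocket"
}

// preparePod sketches the proposed pre-sync hook: for docker it kills the
// pod so the infra container can be recreated; for rocket it is a no-op
// because rocket manages the whole pod natively.
func (kl *Kubelet) preparePod(pod Pod, infraNeedsRecreate bool) error {
    switch kl.containerRuntime {
    case "docker":
        if infraNeedsRecreate {
            fmt.Printf("killing pod %q so the infra container can be recreated\n", pod.Name)
        }
    case "rocket":
        // no-op: the pod is launched and torn down as a single unit
    }
    return nil
}

func main() {
    kl := &Kubelet{containerRuntime: "docker"}
    _ = kl.preparePod(Pod{Name: "nginx"}, true)
}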

Contributor Author

@vmarmol OK, I think the container runtime interface should not hide the implicit killing/restart. It's probably better to add some code to the kubelet to handle the different container runtimes.

Contributor Author

Let me just hold this and take some time to rethink. I will try the preparePod approach and see how that looks :)

Contributor

If the runtime does do it implicitly, I agree that it should set the readiness itself (or provide some information about the restarted containers). I don't want the Kubelet to do something different depending on the runtime, so the approach you have today is preferred.

I will wait for your rethinking though :)

Contributor Author

@vmarmol Thank you. I am thinking that maybe we don't need to let the container runtime manage the restart of the whole pod. Maybe we can figure out a way to let the containers restart during the reconciliation process.
For example:

docker.RunContainer() {
    if needRestartPodInfraContainer {
        restartPodInfraContainer()
        invalidateRunningPod()
    }
}

kubelet.syncPod() {
    for each container in runningPod {
        restart := hashChanged || unhealthy || invalid
        if restart {
            restartContainer()
        }
    }
}

This way we can pull the ref, readiness, etc. out :)
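
For illustration, a self-contained Go sketch of that marking-and-reconciling idea. All names here (Container, invalidHash, invalidateRunningPod, syncPod) are hypothetical stand-ins, not code from this PR:

package main

import "fmt"

// Container is a hypothetical stand-in for the runtime's view of a
// running container.
type Container struct {
    Name    string
    Hash    uint64
    Healthy bool
}

const invalidHash uint64 = 0

// invalidateRunningPod is what the docker runtime could do after
// recreating the pod infra container: mark every running container so
// the kubelet restarts it on the next sync.
func invalidateRunningPod(running []Container) {
    for i := range running {
        running[i].Hash = invalidHash
    }
}

// syncPod sketches the kubelet side: restart anything whose hash changed,
// that is unhealthy, or that was invalidated by the runtime.
func syncPod(desired map[string]uint64, running []Container) {
    for _, c := range running {
        if c.Hash != desired[c.Name] || !c.Healthy || c.Hash == invalidHash {
            fmt.Printf("restarting container %q\n", c.Name)
        }
    }
}

func main() {
    running := []Container{{Name: "app", Hash: 42, Healthy: true}}
    invalidateRunningPod(running)
    syncPod(map[string]uint64{"app": 42}, running)
}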

Contributor Author

@vmarmol I have laid out some thoughts here: #6100 (comment)

I think we can just go that way: still let the container runtime handle refs and readiness for now, but move them out of RunContainer() and KillContainer(). We can come back and revisit this later. What do you think?

Contributor

I liked your suggestion to have the runtime do KillContainer() and handle KillPod() in the Kubelet for now. This will let us set readiness and record events. Is this what you had in mind?
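
A rough, self-contained Go sketch of that split, assuming hypothetical names (ContainerRuntime, fakeDocker, killPod); the real interfaces in the tree look different:

package main

import "fmt"

// ContainerRuntime is a hypothetical minimal interface: the runtime only
// knows how to kill a single container.
type ContainerRuntime interface {
    KillContainer(containerID string) error
}

// fakeDocker stands in for the docker runtime implementation.
type fakeDocker struct{}

func (fakeDocker) KillContainer(containerID string) error {
    fmt.Printf("killed container %s\n", containerID)
    return nil
}

// Kubelet keeps the pod-level concerns: readiness and event recording
// stay out of the runtime.
type Kubelet struct {
    runtime ContainerRuntime
    ready   map[string]bool
}

// killPod lives in the kubelet: it kills each container via the runtime,
// then clears readiness and records the event itself.
func (kl *Kubelet) killPod(containerIDs []string) error {
    for _, id := range containerIDs {
        if err := kl.runtime.KillContainer(id); err != nil {
            return err
        }
        kl.ready[id] = false
        fmt.Printf("event: killed %s\n", id) // stand-in for the event recorder
    }
    return nil
}

func main() {
    kl := &Kubelet{runtime: fakeDocker{}, ready: map[string]bool{"abc123": true}}
    _ = kl.killPod([]string{"abc123"})
}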

Contributor Author

Let me put those options together, just a second :)

@yifan-gu
Contributor Author

Options to handle pod infra container:

    1. Let the docker container runtime recreate the pod infra container and then restart the whole pod in RunContainer(). Something like:
func dockertools.RunContainer() {
    if needToRecreatePodInfraContainer {
        killAndRecreatePodInfraContainer()
        for each container in running pod {
            killContainer()
        }
    }
    runContainer()
}

Pros: No additional logic for the kubelet; it just needs to compare the running containers with their desired states, and the pod infra container will be recreated on the first run of RunContainer() if necessary.
Cons: The container runtime needs to manage the refs and readiness.

    2. Similar to 1), but only "mark" the running containers as invalid, so the kubelet loop can kill and restart them:
func dockertools.RunContainer() {
    if needToRecreatePodInfraContainer {
        killAndRecreatePodInfraContainer()
        for each container in running pod {
            container.Hash = SomeInvalidHash
        }
    }
    runContainer()
}

Pros: No additional logic for the kubelet, and no ref/readiness management for the container runtime.
Cons: RunContainer() needs the running pod as an input so we can mark the containers.

    3. Let the kubelet handle the pod infra container in a separate function, which does something special if the container runtime is docker:
switch container runtime choice {
    case docker:
        if needToRecreatePodInfraContainer {
            killAndRecreatePodInfraContainer()
            re-list the running pod
        }
    case rocket: no-op
    case other: ...
}

Pros: The container runtime interface can be simpler, and it does not need to deal with ref, readiness, etc.
Cons: Adds an additional code path in the kubelet to deal with different container runtimes.

I think 2) or 3) is better than 1).
@vmarmol

@pmorie
Member

pmorie commented Mar 28, 2015

I'm late to the party, but I have a question @vmarmol @yifan-gu: does a pod running on rocket not require an infra container?

@vmarmol
Contributor

vmarmol commented Mar 28, 2015

@pmorie AFAIK Rocket creates a top-level "container" with namespaces and then runs other containers within that container. We'd configure the top-level as we do today for the pod infra and run the others similar to what we do today in Docker.

@yifan-gu
Contributor Author

@pmorie @vmarmol
So in rocket, all the containers (a.k.a. "apps" in rocket terms) are launched together to form a pod. These apps are all managed by a PID 1 process within the same namespaces. Each app can have its own dedicated resource requirements, and the whole pod can have a global resource requirement as well.
No additional pod infra container is necessary.

/cc @jonboulle

@jonboulle
Contributor

Not sure why I'm being CCed (I think Yifan is explaining things pretty well - pods are the fundamental execution unit in Rocket so it should be able to manage them natively without futzing around with infra containers), but let me know if there's anything I can clear up :-).

@vmarmol
Contributor

vmarmol commented Apr 7, 2015

Is this PR still relevant given our other discussions?

@yifan-gu
Contributor Author

yifan-gu commented Apr 7, 2015

@vmarmol So if we decide to take the preparePod approach for now, then we can put off this PR.

@vmarmol
Contributor

vmarmol commented Apr 20, 2015

Closing since we're moving elsewhere. Feel free to re-open or re-send when we get here.

@vmarmol vmarmol closed this Apr 20, 2015