
Add container runtime #5572

Merged: 2 commits into kubernetes:master from the rkt_support branch, Mar 20, 2015
Conversation

@yifan-gu (Contributor)

Split out from PR #5518.

This is an attempt to introduce a container runtime interface.

Some pieces are missing (the image pulling part, which I will try to fill in over the next couple of days). The runtime cache is just copy-pasted from the docker cache; I think we can refactor that piece later.

The interface fits rocket's model well, though for now we cannot handle RunContainerInPod / KillContainerInPod efficiently. But I think there is a plan to get to that point soon.

For the docker part, I am now trying to hide docker.GetKubeletDockerContainers behind ListPods, and will see whether we can hide other docker code in the kubelet behind the interface.

Feedback welcome! Thank you!

@dchen1107
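
For orientation, here is a rough sketch of the interface shape being proposed. The method names come from the description above, but the exact signatures are illustrative guesses, not the committed code:

type Runtime interface {
	// ListPods returns the pods known to the runtime. With all == true it
	// also includes exited and dead pods (used for garbage collection).
	ListPods(all bool) ([]*api.Pod, error)
	// RunContainerInPod starts a single container inside an existing pod.
	RunContainerInPod(container api.Container, pod *api.Pod) error
	// KillContainerInPod stops a single container inside a pod.
	KillContainerInPod(container api.Container, pod *api.Pod) error
}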

@j3ffml (Contributor) commented Mar 17, 2015

cc @vmarmol

@dchen1107 (Member)

The container runtime interface looks good to me, and I am OK with iterating later on the image handling operations.

There are compile errors in the second commit.

@yifan-gu (Contributor, Author)

Oh sorry, I updated ListPods to take a boolean this afternoon but forgot to update the fake container runtime... just a sec.

Also rebased on master, replacing BoundPod with Pod.

@yifan-gu force-pushed the rkt_support branch 4 times, most recently from 1dd6453 to 26f061d, on March 18, 2015 17:04
@dchen1107 mentioned this pull request on Mar 18, 2015
// ListPods returns a list of pods. The boolean parameter specifies whether
// the runtime should return all pods, including those that have already
// exited or are dead (used for garbage collection).
ListPods(all bool) ([]*api.Pod, error)
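
As a hypothetical illustration of the all flag (the caller below is invented for this write-up, not part of the PR), a garbage collector would pass true so that exited pods show up in the listing:

// Hypothetical GC-side caller, assuming some value implementing the
// Runtime interface sketched above; the eviction policy itself is elided.
func collectDeadContainers(runtime Runtime) error {
	pods, err := runtime.ListPods(true) // true: include exited and dead pods
	if err != nil {
		return err
	}
	for _, pod := range pods {
		_ = pod // decide which of this pod's dead containers to remove...
	}
	return nil
}
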
Member

I couldn't figure out how you are planning to use ListPods in container garbage collection. Consider a scenario: a given pod with restart policy Always has two containers, A and B. A is in a crash loop, and the per-container cap is 5. How are you going to enforce that limit at the higher level?

Contributor Author

@dchen1107 Good question! In the rocket world, I think we won't need to garbage collect each container, because that is taken care of by PID 1 in the pod, so we only need to GC at the level of pods. But in the docker scenario, I think maybe we can add an ID field to api.Container, so that we can get all of A1, A2, A3 in the pod.

Contributor Author

@dchen1107 Besides the ID, there is another thing we need to store -- the hash. Currently I did a hack which stores it in pod.Annotations.
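
A loose sketch of what that hack might look like (the helper name and the annotation key below are invented for illustration):

import "strconv"

// storeContainerHash stashes a container's spec hash in the pod's
// annotations so it can be recovered later. Purely illustrative.
func storeContainerHash(pod *api.Pod, containerName string, hash uint64) {
	if pod.Annotations == nil {
		pod.Annotations = map[string]string{}
	}
	pod.Annotations["container.hash/"+containerName] = strconv.FormatUint(hash, 10)
}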

Contributor

We also usually need to keep a dead container around for its logs and to find out how it ended (I don't know how Rocket handles that).

Member

That makes sense for enforcing the per-container limit, and I like the idea of pushing that logic down into the runtime library, including docker.

But how do we enforce the per-node limit? The node might end up with tons of pods in the Running phase but with a lot of crashing containers. When the per-node limit is exceeded, the kubelet has to recycle containers more aggressively, without respecting the per-container limit.

Also, I don't think I understand your suggestion of an ID field in api.Container. How is it going to help?

Contributor Author

Wait, I think I can just use the container statuses in the pod, i.e. api.Pod.Status.Info?

Member

api.Pod.Status.Info includes detailed per-container status, but only one entry per container name. I plan to extend it to include the last termination information for each container, but it still won't carry information about the rest of the terminated containers.

I do think we might need a container type in the runtime abstraction, but I am open to other suggestions.

Contributor Author

I think defining container and pod types in the container runtime here would be clearer. And actually there are a lot of things in api.Container that we don't use here.

Member

How bad would it be to have ListContainersInPod(containerName string, podUID string, ...), which returns all containers, running or dead, grouped by pod?

Contributor Author

That's not bad at all. But what would be the type of the returned containers? Since we want the container id (for killing) and the hash (for comparison) in syncPod, we will need to store them somewhere.

At the least we need to return map[id]api.Container, but we still need the hash, so maybe map[fullName]api.Container. But embedding the hash in the name is not a good idea IMO. So I am still thinking about creating a Container type in the runtime, which can have any fields we want: id, hash...

Any suggestions?
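
One concrete reading of that suggestion (an illustrative sketch, not the merged code) is a small runtime-level Container type that carries exactly the fields syncPod needs, so neither the id nor the hash has to be smuggled through names or annotations. types.UID is assumed here to be the Kubernetes types package; the field set is a guess:

// Illustrative runtime-level Container type; fields are guesses based on
// the needs named above (ID for killing, Hash for comparison).
type Container struct {
	ID    types.UID // runtime-assigned identifier, used to kill the container
	Name  string    // name from the container spec
	Image string
	Hash  uint64    // hash of the api.Container spec, used to detect changes
}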

// running containers, or mixed with dead ones (when ListPods(true)).
Containers []*Container
// The status of the pod.
Status api.PodStatus
Contributor

The only thing that worries me about having Status in Pod is that for Docker we'll need to inspect every container after we list, which will be a bit expensive. Let's go with this and see how it goes. The abstraction does make it much nicer to use.

Contributor Author

@vmarmol Yes, I have thought about that. I think it might make more sense to let each syncPod() routine inspect the containers in its pod. Let me add a TODO to revisit this here, thank you!

@vmarmol (Contributor) commented Mar 19, 2015

Overall I really like the simplicity of the API :) Today's API is a mess of stuff all put together.

@yifan-gu force-pushed the rkt_support branch 3 times, most recently from 8e4a20b to 8d6c144, on March 19, 2015 17:47
// ListPods returns a list of pods. The boolean parameter specifies whether
// the runtime should return all pods, including those that have already
// exited or are dead (used for garbage collection).
ListPods(all bool) ([]*Pod, error)
Contributor

nit: Sorry to be a pain, but can we rename this to GetPods()? We seem to be standardizing on Get instead of List for the functions below.
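
For reference, the requested rename would leave the semantics untouched (illustrative one-liner):

GetPods(all bool) ([]*Pod, error)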

Member

+1

@yifan-gu force-pushed the rkt_support branch 5 times, most recently from 3dcb27b to 8c9a875, on March 19, 2015 20:49
@dchen1107 (Member)

LGTM overall. There are a couple of things missing, but we can iterate on them later. Meanwhile I will also take a look at how to refactor the existing dockertool package to support this API.

@vmarmol Any more comments?

I will merge it on green.

@dchen1107 added the sig/node label on Mar 19, 2015
@yifan-gu (Contributor, Author)

@dchen1107 Sure, we might need to modify the interfaces while refactoring the dockertool package. I have tried doing a GetPods(), and I think that is OK. The tricky part will be hiding the pod infra container in RunContainerInPod() efficiently.

@dchen1107 (Member)

I just had a PR merged which hides the pod infra container completely in dockertool now. That package needs more refactoring / cleanup.

@yifan-gu (Contributor, Author)

@dchen1107 Fantastic!

Ugh, the Shippable build is failing, but I cannot get any information out of it. What script is being run on Shippable?

@yifan-gu (Contributor, Author)

Never mind, I just found a way to run my branch on Shippable.

@dchen1107 (Member)

I kicked out Shippable a while back. Travis takes a long time.

@vmarmol (Contributor) commented Mar 19, 2015

LGTM :)

@yifan-gu (Contributor, Author)

@dchen1107 @vmarmol I think GetPodStatus might need to take pod.Spec as well, so it can know the terminationMessagePath. But we can change that in the refactoring PR.
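
A sketch of that suggestion (the signature below is hypothetical, not from this PR): passing the whole pod gives the runtime access to the spec, including each container's TerminationMessagePath:

// Hypothetical revised signature; the current version is assumed to take
// only identifying fields rather than the full pod.
GetPodStatus(pod *api.Pod) (*api.PodStatus, error)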

@dchen1107 (Member)

I think you need a rebase again.

@yifan-gu force-pushed the rkt_support branch 4 times, most recently from 1e03e32 to 7888e94, on March 19, 2015 23:40
@yifan-gu force-pushed the rkt_support branch 2 times, most recently from a014b0a to 13033d4, on March 20, 2015 17:04
@yifan-gu (Contributor, Author)

Shippable keeps failing on me... Trying a rebase again.

@dchen1107 (Member)

I just kicked out Shippable for this PR. :-)

@dchen1107 added the lgtm label on Mar 20, 2015
@dchen1107 (Member)

Thanks for the PR. We can start another round of refactoring and cleanup.

dchen1107 added a commit that referenced this pull request on Mar 20, 2015
@dchen1107 merged commit fbd362d into kubernetes:master on Mar 20, 2015