Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for implementing init containers #23666

Merged
merged 1 commit into from
Jun 18, 2016

Conversation

smarterclayton
Copy link
Contributor

@smarterclayton smarterclayton commented Mar 31, 2016

Addresses #1589. Implemented in #23567. Docs in kubernetes/website#679

Init containers enable pod authors to perform tasks before their normal containers start. Each init container is started in order, and failing containers will prevent the application from starting.

@smarterclayton smarterclayton added the kind/design Categorizes issue or PR as related to design. label Mar 31, 2016
@smarterclayton
Copy link
Contributor Author

@kubernetes/sig-node @kubernetes/kube-api

@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 31, 2016
to start and that configuration is environment dependent, the image must be
altered to add the necessary templating or retrieval.

This proposal introduces the concept of an **init container**, one or more
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a base case of DAG/workflow + pre&post-hook mechanics, except this proposal is missing the post bit.

@bgrant0607, @davidopp - Instead of speculating, I'll just ask ;-) Do you have a long term DAG/workflow model & hooks that you are planning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to do DAG, then we'd probably have data preconditions on
containers (the volume type container as proposed), and we'd need control
conditions. I may have missed that in Brian's comment about doing volume
type container (which form the edges) as the implication of the change.

If we implement the DAG, init containers are technically still useful and
easier to reason about total resources for. I believe in a future DAG
model you could convert init containers to a DAG defined solely in the
containers slice transparently, so one does not block the other.

On Thu, Mar 31, 2016 at 9:55 AM, Timothy St. Clair <notifications@github.com

wrote:

In docs/proposals/container-init.md
#23666 (comment)
:

+March 2016
+
+## Proposal and Motivation
+
+Within a pod there is a need to initialize local data or adapt to the current
+cluster environment that is not easily achieved in the current container model.
+Containers start in parallel after volumes are mounted, leaving no opportunity
+for coordination between containers without specialization of the image. If
+two containers need to share common initialization data, both images must
+be altered to cooperate using filesystem or network semantics, which introduces
+coupling between images. Likewise, if an image requires configuration in order
+to start and that configuration is environment dependent, the image must be
+altered to add the necessary templating or retrieval.
+
+This proposal introduces the concept of an init container, one or more

This is a base case of DAG/workflow + pre&post-hook mechanics, except this
proposal is missing the post bit.

@bgrant0607 https://github.com/bgrant0607, @davidopp
https://github.com/davidopp - Instead of speculating, I'll just ask ;-)
Do you have a long term DAG/workflow model & hooks that you are planning?


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
https://github.com/kubernetes/kubernetes/pull/23666/files/7efdc3d36197f1ca452339d49abbc87163fb0e75#r58056775

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really looks more like ~job/task/container-hooks despite the larger question. In other systems they are tied to life-cycle events:

  • fetch (which we may or may not care about)
  • pre/init (this proposal/setup)
  • update (which we don't really do, but useful in batching/restart)
  • evict (sometimes treated separately, billing and what not)
  • exit (we probably want/tear-down)

dunno if you want to address here or explicitly defer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am strongly opposed to DAG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other proposed hooks include preStart and postStop, at the container level: #140

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "PreStart" container hooks the same as a pre-start pod hook?

A pre start container hook would just be some bash that executes a binary in the container right? If I need to insert the binary in my container, I might as well invoke it in an entrypoint script. How much value does it add?

Instead of stacking init containers at the pod level, we could just attach them to the specific container they're initializaing and call them PreStart container hooks. This feels less useful in a pod context, I'd want to wait till all containers in my pod are initialized before starting any part of the app.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bprashanth responding to vish i was addressing renaming. You seem to be addressing functional gap? I thought we seemed to be ok on init container as pre app-container start, merely naming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. Pardon my confusion, I was just making sure we weren't doing a pre start container hook instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton So what is the final decision? Is it going to stay as init containers or be renamed to pre-start containers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@derekwaynecarr
Copy link
Member

Ensuring I do not forget to come back and read this in more detail when not debugging node issues. (cc @derekwaynecarr)

@yifan-gu
Copy link
Contributor

@jonboulle @philips We need to think about how this can be implemented in rkt without introducing container level intefaces to rkt for now. e.g. Introducing dependencies between the apps and let systemd of the pod to handle it?

cc @dchen1107 @vishh @sjpotter

@smarterclayton
Copy link
Contributor Author

I'll incorporate whatever plan you guys have - I had assumed systemd
dependencies would be a good angle but wanted to let you guys weigh in
before going any further.

On Thu, Mar 31, 2016 at 2:45 PM, Yifan Gu notifications@github.com wrote:

@jonboulle https://github.com/jonboulle @philips
https://github.com/philips We need to think about how this can be
implemented in rkt without introducing container level intefaces to rkt for
now. e.g. Introducing dependencies between the apps and let systemd of the
pod to handle it?

cc @dchen1107 https://github.com/dchen1107 @vishh
https://github.com/vishh @sjpotter https://github.com/sjpotter


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#23666 (comment)

block startup of containers
* Running a "pre-pod" would defeat the purpose of the pod being an atomic
unit of scheduling.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding a StartCondition? The start condition could be things like, Container A is ready, Container B is successful, or Container C is exited. Pros: (maybe) fewer new concepts; can represent arbitrary dags. Cons: adds lifecycle complexity, specifically complicates the restart policy; startup order less predictable.
Is this what @timothysc proposed above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example issue: We support image updates. What should happen if we update the image of a container that other containers depend on? For that matter, if a container fails and restarts, what should happen to dependent containers?

## Design Points

* Init containers should be able to:
* Perform initialization of shared volumes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think #831 is preferable for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll amend that to we would have a start / readiness precondition DAG if a fuse container was modeled with container volume defining the dependency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If init containers execute after all volumes have been mounted and initialized, it shouldn't preclude implementation of #831, no.

@yifan-gu
Copy link
Contributor

yifan-gu commented Apr 6, 2016

Other proposed hooks include preStart and postStop, at the container level: #140

@bgrant0607 @smarterclayton I also feel this is something can be implemented by preStart, even you have multiple containers that depends on the init container.

In this case the preStart should run in the container's namespace, which is something like what's defined in appc's pre-start.

One concern might be the app container's image doesn't have the binaries available for the hooks, but I don't really bother about this, as if users wants to run the hooks, they need to have the binary available anyway, either in the app container or in the init container. To me it seems have the binaries available in the app's image reduces the overhead then using a particular image for the init container.

So I am curious what values init containers add besides preStart hooks? Of course, one obvious reason could be that docker doesn't have preStart hooks today...

cc @philips @jonboulle

@smarterclayton
Copy link
Contributor Author

A big design point for init containers is to not have to have the binaries
in the app image. It's a composition pattern. Also, init containers may
need to run under different security privileges or have access to different
secrets than regular containers.

On Wed, Apr 6, 2016 at 2:19 AM, Yifan Gu notifications@github.com wrote:

Other proposed hooks include preStart and postStop, at the container
level: #140 #140

@bgrant0607 https://github.com/bgrant0607 @smarterclayton
https://github.com/smarterclayton I also feel this is something can be
implemented by preStart, even you have multiple containers that depends on
the init container.

In this case the preStart should run in the container's namespace, which
is something like what's defined in appc's pre-start
https://github.com/appc/spec/blob/master/spec/aci.md.

One concern might be the app container's image doesn't have the binaries
available for the hooks, but I don't really bother about this, as if users
wants to run the hooks, they need to have the binary available anyway,
either in the app container or in the init container. To me it seems have
the binaries available in the app's image reduces the overhead then using a
particular image for the init container.

So I am curious what values init containers add besides preStart hooks? Of
course, one obvious reason could be that docker doesn't have preStart hooks
today...

cc @philips https://github.com/philips @jonboulle
https://github.com/jonboulle


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#23666 (comment)

* Download binaries that will be used in app containers as execution targets
* Inject configuration or extension capability to generic images at startup
* Perform complex templating of information avaliable in the local environment
* Register the pod with other components of the system
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are (at least) 3 categories of "init" actions:

  1. Fetch data into volumes. Better served by Image volumes and container volumes #831 IMO, but if this is the only hammer we're building, it could also pound that screw.
  2. Perform context-specific transformations of volume data, likely out-of-place, and other one-time config generation (perhaps using downward-API data). I thik this is usually container-specific rather than volume-specific, and perhaps could be done with a preStart hook (maybe leveraging a script mounted from a volume), if we had preStart hooks. Again, could be done by an init container in lieu of preStart hooks, though the separation of the init container and consuming container in the spec could be awkward.
  3. Perform "read-only" pre-start actions, like registration and delaying startup. Ideal for a pod-level init container.

@dchen1107
Copy link
Member

What are the open issues left here for closing this proposal? I raised QoS issues above, and thought we reached an agreement already and the proposal from @vishh already covered that based our discussion and merged. Also QoS issue shouldn't be blocker here.

@smarterclayton
Copy link
Contributor Author

I don't think there is anything left except to document the current
discussion in the PR.

On Jun 3, 2016, at 2:52 PM, Dawn Chen notifications@github.com wrote:

What are the open issues left here for closing this proposal? I raised QoS
issues above, and thought we reached an agreement already and the proposal
from @vishh https://github.com/vishh already covered that based our
discussion and merged. Also QoS issue shouldn't be blocker here.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#23666 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABG_pycH-5R_5Bu1ULo_7MHpDKci3OFSks5qIHfvgaJpZM4H8XbD
.

* Using a volume container that does not populate a volume to delay pod start
(in the absence of init containers) would be an abuse of the goal of volume
containers.
* Container pre-start hooks are not sufficient for all initialization cases:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since pre-start hooks are not implemented yet, link to the issue: #140

@smarterclayton smarterclayton force-pushed the initctrproposal branch 2 times, most recently from 0ce2d72 to f46a2f7 Compare June 5, 2016 23:31
@smarterclayton
Copy link
Contributor Author

Updated with all feedback to date (noted that pod QoS is higher of init container QoS or regular container QoS, which was part of the discussion above).

@pwittrock
Copy link
Member

pwittrock commented Jun 17, 2016

@smarterclayton
@bprashanth
@dchen1107

Would you provide an update on the status for the documentation for this feature as well as add any PRs as they are created?

Not Started / In Progress / In Review / Done

Docs Instructions for v1.3

Thanks
@pwittrock

@bprashanth
Copy link
Contributor

I totally forgot about this proposal since we got the implementation in.Giving it a final read through, I think it's good to go in though.

@bprashanth bprashanth added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2016
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2016
@pwittrock
Copy link
Member

Docs status?

@smarterclayton
Copy link
Contributor Author

Docs are waiting for review in kubernetes/website#679

@smarterclayton smarterclayton added release-note-experimental lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed release-note-label-needed labels Jun 17, 2016
@smarterclayton
Copy link
Contributor Author

Added release note.

@bprashanth bprashanth added this to the v1.3 milestone Jun 17, 2016
@k8s-bot
Copy link

k8s-bot commented Jun 17, 2016

GCE e2e build/test passed for commit bdde25c.

@k8s-bot
Copy link

k8s-bot commented Jun 18, 2016

GCE e2e build/test passed for commit bdde25c.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit d03a5fa into kubernetes:master Jun 18, 2016
@mqliang
Copy link
Contributor

mqliang commented Oct 8, 2016

@smarterclayton @bgrant0607

Why not add general Pod level life cycle hook? AFAIK, we have container level life cycle hook, Pod level life cycle hook is also very helpful.

IMO, InitContainers can implement as a kind of Pod level PreStart hook. And we have some case need Pod level PreStop hook support, it's ugly to add a PostContainers. A general Pod level life cycle hook sounds better.

@smarterclayton
Copy link
Contributor Author

Hooks run in containers. For there to be a post stop hook, there must be a
post stop container.

Can you describe a usecase for a post-stop pod hook so we can discuss
something more concrete?

On Oct 8, 2016, at 12:41 AM, Liang Mingqiang notifications@github.com
wrote:

@smarterclayton https://github.com/smarterclayton @bgrant0607
https://github.com/bgrant0607

Why not add general Pod level life cycle hook? AFAIK, we have container
level life cycle hook, Pod level life cycle hook is also very helpful.

IMO, InitContainer can implement as a kind of Pod level PreStart hook. And
we have some case need Pod level PreStop hook support, it's ugly to add a
PostContainer. A general Pod level life cycle hook sounds better.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#23666 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p5cABtQDwoI_0K1NNOsU7WweKwpFks5qxzsQgaJpZM4H8XbD
.

@mqliang
Copy link
Contributor

mqliang commented Oct 8, 2016

@smarterclayton

Can you describe a usecase for a post-stop pod hook so we can discuss
something more concrete?

For example, I have a Pod P, which running a container C. I want to commit container C and push it into registry when Pod exit. Then, I need a Pod level pre/post stop hook. In the pre/post stop hook, I run a new container which could access docker daemon(by mount /var/run/docker.sock into container) to commit and push container C in registry

@smarterclayton
Copy link
Contributor Author

I'd recommend you do this via a privileged container that observes
termination of the "container to commit" then either commits or aborts.
That's our current working plan for openshift containers.

After all, you need to know success or failure.

On Oct 8, 2016, at 8:39 AM, Liang Mingqiang notifications@github.com
wrote:

@smarterclayton https://github.com/smarterclayton

Can you describe a usecase for a post-stop pod hook so we can discuss
something more concrete?

For example, I have a Pod P, which running a container C. I want to commit
container C and push it into registry when Pod exit. Then, I need a Pod
level pre/post stop hook. In the pre/post stop hook, I run a new container
which could access docker daemon(by mount /var/run/docker.sock into
container) to commit and push container C in registry


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#23666 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p_1bcv0bIwNo7m9fyrfvz7NpsIOWks5qx6sogaJpZM4H8XbD
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubelet kind/design Categorizes issue or PR as related to design. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.