spec: introduce pods #150

jonboulle · 2015-01-28T09:57:59Z

This is a first cut at pods, introducing a Pod Manifest and tweaking some of
the other concepts slightly.

A Pod Manifest is simply a grouping of application images, and a convenient
deployable unit which can be provided as input to an executor.

However, the application image references in a pod manifest are not
necessarily deterministic: they may use the Dependency Matching mechanism (as
in the dependencies section of Image Manifest) to resolve applications. For
example, a pod manifest with an application with a "version=latest" label
might resolve to a different particular application image at different points
in time.

In contrast, the application references in a Container Runtime Manifest MUST
be deterministic (i.e. be image IDs).
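
To make the distinction concrete, here is a minimal sketch of how a label-based reference could resolve to different image IDs over time, while an explicit image ID never changes. The field names (`apps`, `name`, `labels`, `imageID`), the `Image` class, the in-memory "store" and the placeholder IDs are all assumptions made for illustration; none of them is taken from the spec text in this PR.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical stand-in for one entry in an image store."""
    image_id: str
    name: str
    labels: dict


def resolve_app(ref, store):
    """Return a deterministic image ID for one app reference.

    A reference that already carries an image ID is returned unchanged.
    Otherwise the first stored image with the same name whose labels
    include every requested label is chosen.
    """
    if "imageID" in ref:
        return ref["imageID"]
    wanted = ref.get("labels", {})
    for image in store:
        if image.name == ref["name"] and all(
            image.labels.get(key) == value for key, value in wanted.items()
        ):
            return image.image_id
    raise LookupError(f"no image matches {ref['name']} with labels {wanted}")


def render_runtime_manifest(pod_manifest, store):
    """Resolve every app reference so the result contains image IDs only."""
    return {
        "apps": [
            {"name": ref["name"], "imageID": resolve_app(ref, store)}
            for ref in pod_manifest["apps"]
        ]
    }


# The same pod manifest, resolved against the store at two points in time.
pod = {"apps": [{"name": "example.com/redis", "labels": {"version": "latest"}}]}
store_monday = [Image("sha512-aaaa", "example.com/redis", {"version": "latest"})]
store_friday = [Image("sha512-bbbb", "example.com/redis", {"version": "latest"})]

print(render_runtime_manifest(pod, store_monday))
print(render_runtime_manifest(pod, store_friday))
```

The two calls produce different image IDs from an identical pod manifest, which is exactly why the runtime manifest has to record the resolved IDs.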

jonboulle · 2015-01-28T10:00:29Z

This is an attempt to capture the conversation @cdaylward and I had last night, and would conceivably solve #83.
Probably needs a bit of work, but consider this a starting point for discussion.
/cc @philips @mpasternacki

eyakubovich · 2015-01-28T18:36:55Z

+1. I don't have a suggestion for a different name, but since "Pod" is already taken by the k8s project, I think we should use something else (even though there are lots of similarities).

Should the Pod manifest be retrievable via metadata svc?

jonboulle · 2015-01-28T18:41:20Z

@eyakubovich aaagh, English has run out of nouns 😛. Honestly, I think pod is a decent name; if we can come up with a way to align/agree on usage with the k8s project more concretely, so much the better.

> Should the Pod manifest be retrievable via metadata svc?

My (implicit) argument in this incarnation of the diff is that the pod is really only useful as a deployable unit, but in itself it contains no information that is not available in the CRM/IMs. In fact, it contains less information, since its app references have not yet been resolved.

Open to arguments otherwise, though.

mpasternacki · 2015-01-28T18:45:16Z

+1 on not reusing a similar-but-not-entirely-same term from a neighbouring project; maybe appliance would be a good word?

If I understand correctly, a pod would be a container manifest stripped of everything except the RuntimeApp list. I don't really understand the use case that wouldn't be solved by incomplete container manifests that could be used as a template for actual containers. Maybe relaxing restrictions on valid container manifests, or explicitly distinguishing between a container manifest template and a "real" container manifest, would solve the problem without introducing a new category?

jonboulle · 2015-01-28T18:52:01Z

> I don't really understand the use case that wouldn't be solved by incomplete container manifests that could be used as a template for actual containers.

The problem with doing this without introducing a new type is that a container manifest becomes a terribly loose concept. See e.g. #83 (comment). My hope with this kind of change is to better codify the CRM as an "execution manifest of record".

mpasternacki · 2015-01-28T19:09:28Z

I see the issue here. IMO, to keep the spec DRY, the container runtime manifest should refer to a pod rather than include the same information, if only to avoid schema divergence between the two.

If the role of the pod is to define a complete deployable setup (say, a Sentry installation: HTTP application server, background task worker, Redis, and Memcached), then at least ports and volumes would need to be included in the pod as an interface description (maybe not all volumes, but definitely the ones that need to be externally visible, say a data directory for backups/persistence).

If we take this concept a step further and say that a pod is a distributable description, it might need some metadata: its own name & version (or name & labels), and maybe a longer description field or annotations (meant as notes/instructions for people).

Side note (may need a separate ticket): This reminds me of a problem that I have with Docker containers: deep integration with the host. While I want my application to be containerized, I prefer to keep some services on the host (say: SQL database, MTA, frontend httpd + SSL terminator; actually, for the sake of this discussion, they may as well each live in its own separate container). I somehow need to provide the MTA/SQL ip+port to the container, and the container's ip+port to the frontend httpd; in a perfect world, I could also configure the frontend httpd to directly serve the application's static files and only forward dynamic requests to the container. This requires a lot of problematic plumbing, and is something I'd love to be able to handle easily: things like rendering an nginx virtual host config from a template based on the container's metadata. I have no ideas on how to approach this yet, but it feels like something to keep in mind here.
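
As a rough illustration of the "render an nginx virtual host config from a template" idea, here is a minimal sketch using only the Python standard library. The metadata keys (`server_name`, `ip`, `port`, `static_dir`) and the example values are hypothetical; nothing in the spec defines such fields, and how the metadata would be obtained from a running container is left open.

```python
from string import Template

# nginx virtual host template; ${...} placeholders are filled from metadata.
VHOST_TEMPLATE = Template("""\
server {
    listen 80;
    server_name ${server_name};

    # Serve static files directly from the host filesystem.
    location /static/ {
        alias ${static_dir}/;
    }

    # Forward all other requests to the containerized application.
    location / {
        proxy_pass http://${ip}:${port};
    }
}
""")


def render_vhost(metadata):
    """Render the vhost config; raises KeyError if a placeholder is missing."""
    return VHOST_TEMPLATE.substitute(metadata)


# Hypothetical metadata for one running container.
example_metadata = {
    "server_name": "sentry.example.com",
    "ip": "10.0.0.5",
    "port": 9000,
    "static_dir": "/srv/sentry/static",
}

print(render_vhost(example_metadata))
```

`string.Template` is used here because nginx configs are full of braces, which would need escaping with `str.format`; a literal `$` in the config (for nginx variables) would have to be written as `$$`.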

cdaylward · 2015-02-02T21:57:40Z

SPEC.md

@@ -118,6 +118,13 @@ Image Format TODO
 * Define security requirements for a container. In particular is any isolation of users required between containers? What user does each application run under and can this be root (i.e. "real" root in the host).
 * Define how apps are supposed to communicate; can they/do they 'see' each other (a section in the apps perspective would help)?

+### Pod Manifest


Propose not using "pod". A group of colocated apps in one container is not the same model as the kubernetes "pod" (a group of containers). I don't like the idea of using a borrowed word from something that is actually a bit different.

jonboulle · 2015-02-03

For argument's sake, let's look at what would happen if we just imported the Kubernetes definition wholesale (I skipped the "Uses of pods" and "Alternatives considered" sections because those are expository rationale for the whole concept and so not directly pertinent to this part of our discussion):

> Pods
>
> In Kubernetes, rather than individual containers, pods are the smallest deployable units that can be created, scheduled, and managed.

The only thing fuzzy here is "container", but that's well-known to be a weasel word and I'll touch on that more below. Perhaps "self-contained application environment" or so would be better - I would actually expect the language to change here in future to clarify this (sadly they don't define "container" in their glossary!). Otherwise, this captures exactly what we're trying to propose.

> What is a pod?
>
> A pod (as in a pod of whales or pea pod) correspond to a colocated group of Docker containers with shared volumes.

I will come back to this as "point number 1".

> A pod models an application-specific "logical host" in a containerized environment. It may contain one or more containers which are relatively tightly coupled -- in a pre-container world, they would have executed on the same physical or virtual host.

Again, "containers" is weaselly, but the rest is completely accurate.

> Like running containers, pods are considered to be relatively ephemeral rather than durable entities. As discussed in life of a pod, pods are scheduled to nodes and remain there until termination (according to restart policy) or deletion. When a node dies, the pods scheduled to that node are deleted. Specific pods are never rescheduled to new nodes; instead, they must be replaced (see replication controller for more details). (In the future, a higher-level API may support pod migration.)

Given the current (unfortunate) appc nomenclature, if you s#pod#appc container# in this paragraph it all checks out perfectly.

> Motivation for pods
>
> Resource sharing and communication
>
> Pods facilitate data sharing and communication among their constituents.

Ah, finally they omitted the word container! Constituents = instances of application images.

> The containers in the pod all use the same network namespace/IP and port space, and can find and communicate with each other using localhost. Each pod has an IP address in a flat shared networking namespace that has full communication with other physical computers and containers across the network. The hostname is set to the pod's Name for the containers within the pod. More details on networking.

Again, s#container#instance-of-ACI# and we're fine

> In addition to defining the containers that run in the pod, the pod specifies a set of shared storage volumes. Volumes enable data to survive container restarts and to be shared among the containers within the pod.

Yup

> In the future, pods will share IPC namespaces, CPU, and memory (LPC2013).

So here we are actually ahead of the game since we already encapsulate these ideas as first-class to the spec.

> Management
>
> Pods also simplify application deployment and management by providing a higher-level abstraction than the raw, low-level container interface. Pods serve as units of deployment and horizontal scaling/replication.

This is exactly what I'm trying to clarify/implement with this PR

> Co-location, fate sharing, coordinated replication, resource sharing, and dependency management are handled automatically.

Yup, with the pod being the first-class deployable unit then all of this is implicit.

Now, coming back to point numero one. This paragraph very specifically refers to Docker containers as the singular unit of which the pod is a collection - and then subsequently throughout the document, "container" is presumably shorthand for specifically this. This is unfortunate because I am not even sure how well a "Docker container" is defined. But let's cheat and look at the Docker website:

> Docker containers are similar to a directory. A Docker container holds everything that is needed for an application to run. Each container is created from a Docker image. Docker containers can be run, started, stopped, moved, and deleted. Each container is an isolated and secure application platform. Docker containers are the run component of Docker.

OK, so this sounds... exactly like an ACI. There is obviously a bunch of stuff missing here (like embedded annotations describing how the application can run, isolation that applies to the level of the application itself, etc), but all of this exists very much in the appc world.

After Kubernetes gets around to implementing the sharing of other namespaces they allude to above (IPC, etc), they will be even closer to the definition that is codified in the appc spec today.

cdaylward · 2015-02-13

> > Pods
> >
> > In Kubernetes, rather than individual containers, pods are the smallest deployable units that can be created, scheduled, and managed.
>
> The only thing fuzzy here is "container", but that's well-known to be a weasel word and I'll touch on that more below. Perhaps "self-contained application environment" or so would be better - I would actually expect the language to change here in future to clarify this (sadly they don't define "container" in their glossary!). Otherwise, this captures exactly what we're trying to propose.

I'm not so sure. This language feels cherry-picked because the entire Pods section of the Kubernetes documentation is predicated upon the model being used (manipulating multiple containers simultaneously).

> > A pod models an application-specific "logical host" in a containerized environment. It may contain one or more containers which are relatively tightly coupled -- in a pre-container world, they would have executed on the same physical or virtual host.
>
> Again, "containers" is weaselly, but the rest is completely accurate.

"It may contain one or more containers" ... an example of how this model is just not the same. As App Container is already unique, I would rather approach this from the other direction. Decompose and define the pieces (create/extend the model) first, then incorporate shared language where appropriate. Either appc ends up with its own well-defined terms or it might end up matching an existing model (in which case we should use shared language). Hunting for existing language to use on a different model seems like a recipe for confusion. It might also lead to needing additional language in documentation to contrast the App Container model with the model the language is being borrowed from (as "pods" already exists in the space, I feel this is likely).

jonboulle · 2015-02-03T10:47:39Z

> I see the issue here. IMO, to keep the spec DRY, the container runtime manifest should refer to a pod rather than include the same information, if only to avoid schema divergence between the two.

Well, the idea proposed in this ticket is that the pod is a template and that the CRM is a specific rendering of the template at a particular point in time. Does that make sense? So I am not sure how much duplication there necessarily is at this point. Open to digging into this further though.

> If the role of the pod is to define a complete deployable setup (say, a Sentry installation: HTTP application server, background task worker, Redis, and Memcached), then at least ports and volumes would need to be included in the pod as an interface description (maybe not all volumes, but definitely the ones that need to be externally visible, say a data directory for backups/persistence).
>
> If we take this concept a step further and say that a pod is a distributable description, it might need some metadata: its own name & version (or name & labels), and maybe a longer description field or annotations (meant as notes/instructions for people).

Yeah, makes sense.

> Side note (may need a separate ticket): This reminds me of a problem that I have with Docker containers: deep integration with the host. While I want my application to be containerized, I prefer to keep some services on the host (say: SQL database, MTA, frontend httpd + SSL terminator; actually, for the sake of this discussion, they may as well each live in its own separate container). I somehow need to provide the MTA/SQL ip+port to the container, and the container's ip+port to the frontend httpd; in a perfect world, I could also configure the frontend httpd to directly serve the application's static files and only forward dynamic requests to the container. This requires a lot of problematic plumbing, and is something I'd love to be able to handle easily: things like rendering an nginx virtual host config from a template based on the container's metadata. I have no ideas on how to approach this yet, but it feels like something to keep in mind here.

Let's discuss this in a separate ticket, but I think I understand what you're getting at and it's a use case that we have very much been kicking around too. My latest thinking on this is that we should not define it as part of the spec per se but essentially describe it (in an FAQ or best practices guide or just as an aside); basically that people can use the ACI format to run things on a host that might, strictly speaking, not conform to the spec (namely the executor stuff).

cdaylward · 2015-02-13T07:50:24Z

SPEC.md

@@ -371,6 +380,16 @@ An AC Name Type cannot be an empty string.
 The AC Name Type is used as the primary key for a number of fields in the schemas below.
 The schema validator will ensure that the keys conform to these constraints.

+## Image ID Type
+
+An Image ID Type encapsulates the cryptographic hash of an image.


As it's just a string, s#encapsulates#represents# ?

cdaylward · 2015-02-13T08:07:57Z

> If we take this concept a step further and say that a pod is a distributable description, it might need some metadata: its own name & version (or name & labels), and maybe a longer description field or annotations (meant as notes/instructions for people).

+1 Any manifest that constitutes an "ingestible" artifact should support versioning. Any reason to not then also afford the manifest the same discoverability as ACIs (AC Name/DNS namespace)?

jonboulle · 2015-02-16T19:17:31Z

Closing this, to be followed up with another proposal.

Following on from appc#150, this is the next iteration of the attempt to introduce the concept of pods. Following the most recent discussion on the topic, this patch attempts to simplify things greatly by removing the ContainerRuntimeManifest entirely and settling instead on a single PodManifest.

To achieve this, we give the PodManifest a dual purpose: it serves both as a _deployable template_ (which can be used to stamp out instances of pods) and as the final/reified version into which that template must be resolved before execution. The allowed reified versions are a _subset_ of the template versions; for example, app references in the reified version of a Pod Manifest MUST be deterministic and contain image IDs (i.e. they cannot rely on labels and dependency matching).

An important distinction from the previous CRM model is that a reified PodManifest no longer _necessarily_ includes runtime information specific to the executor (i.e., the UUID). It is hence possible that a fully reified pod could be passed around as a deployable unit to different executors.

This patch also goes to some length to eliminate the (frustratingly ambiguous) word "container" from the spec, except in qualified contexts (e.g. "App Container Image").
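
The "reified versions are a subset of the template versions" rule can be sketched as a simple check: any manifest is acceptable as a template, but it only counts as reified once every app reference carries an image ID. The field names (`apps`, `name`, `labels`, `imageID`) and the placeholder ID are assumptions for illustration, not the schema proposed in #207.

```python
def unresolved_apps(manifest):
    """Names of apps that still lack an image ID and so rely on matching."""
    return [
        app.get("name", "<unnamed>")
        for app in manifest.get("apps", [])
        if not app.get("imageID")
    ]


def is_reified(manifest):
    """True when every app reference is deterministic (has an image ID)."""
    return not unresolved_apps(manifest)


# A template that still relies on label matching, and a reified counterpart.
template = {"apps": [{"name": "example.com/worker", "labels": {"version": "latest"}}]}
reified = {"apps": [{"name": "example.com/worker", "imageID": "sha512-aaaa"}]}

print(is_reified(template), unresolved_apps(template))
print(is_reified(reified), unresolved_apps(reified))
```

An executor following this model would presumably refuse to run a manifest for which `is_reified` is false, and resolve it first instead.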

jonboulle · 2015-02-18T01:54:39Z

@mpasternacki

> Maybe relaxing restrictions on valid container manifests, or explicitly distinguishing between a container manifest template and a "real" container manifest, would solve the problem without introducing a new category?

good idea ;-).

After thrashing this out in person a bit with @philips and @cdaylward, we arrived at something like #207 (still slightly WIP).

SPEC: update pod manifest example to make it clearer that image IDs are not necessary

97d84c3

alban mentioned this pull request Feb 16, 2015

rkt: add the stop command rkt/rkt#532

Closed

jonboulle closed this Feb 16, 2015

jonboulle mentioned this pull request Feb 18, 2015

spec: introduce Pods, remove CRMs #207

Merged
