Pods need to pre-declare service links iff they want the environment variables created #1768

Closed
bgrant0607 opened this issue Oct 14, 2014 · 59 comments
Labels
area/api Indicates an issue on api area. area/downward-api kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/service-catalog Categorizes an issue or PR as relevant to SIG Service Catalog.

Comments

@bgrant0607
Member

Forked from #1107 and #386.

Now seems like a good time to decide whether we want to require/encourage/allow pods to declare services they depend upon. Internally, we've often wished we had such a mechanism.

Not only would pre-declaration reduce accidental/lazy coupling, but it would also improve scalability by reducing the number of iptables rules that must be created. Pre-declaration would also be compatible with Docker's approach to links. If we supported service aliasing in these declarations, that would facilitate dependency injection for tests and a wide variety of deployment adaptation scenarios, which seems like a compelling alternative to custom environment variables, command-line flags, dynamic configuration services, and so on.

However, it at least needs to be possible to opt out of static declaration and/or enforcement of service dependencies, such as when dependent services are registered dynamically -- think of a proxy, load balancer, web browser, monitoring service, or naming/discovery service running in a container.

We'd also definitely need to support more flavors of services (cardinal services, headless services, master election, sharding, ...) in order for most clients to be able to utilize the pre-declaration mechanism.

Something else to consider is how to address dependencies pulled in by client libraries, though perhaps it's not unreasonable to require client libraries to be transparent regarding which services they access.

/cc @thockin @smarterclayton

@bgrant0607 bgrant0607 added kind/design Categorizes issue or PR as related to design. sig/network Categorizes an issue or PR as relevant to SIG Network. area/api Indicates an issue on api area. area/kube-proxy area/downward-api labels Oct 14, 2014
@erictune
Member

How will services v2 interact with namespaces and authorization?

If namespaces are used to separate different companies or different organizations within a large company, then probably most of the time namespace owners will:

  • not normally allow cross-namespace listing of services objects
  • want to partition network traffic as a coarse form of access control (either the only, or as part of defense in depth)
  • not want automatically created DNS records to be by default visible to all, as that might leak information about their application structure, scope, etc.

Even if a cluster's users are confined to a few cooperative organizational units, they might want:

  • to not expose the internal architecture of a cluster of microservices which together form a mesoservice.
  • to namespace the automatically created DNS records to prevent collisions when same-named services are created in different clusters.

On the other hand, having to declare all dependencies seems like a bad user experience. There is probably a reason why we have wished for this but not implemented it for so many years.

One compromise might be to default to fully connected within a namespace but require explicit connection across namespaces.

@smarterclayton
Contributor

Generally, predeclaration seems valuable for making arbitrary software work, and automatic injection seems useful for Kube-designed software. Predeclaration works well for controlling dependencies, and automatic injection works well for forcing tolerance of missing components (degrading components?).


@smarterclayton
Contributor

On Oct 15, 2014, at 7:47 PM, Eric Tune notifications@github.com wrote:

One compromise might be to default to fully connected within a namespace but require explicit connection across namespaces.

I think this is an excellent rule of thumb. Connection across namespaces might require acks on both ends (to prevent me from injecting my variables into your space). That explicit connection might be modeled as a service on one end that talks to pods, but on the other end might point to a service in a namespace. Both are required for traffic to flow across a namespace boundary.

@smarterclayton
Contributor

What would an explicit dependency from a pod to a service look like (a rough sketch follows this list):

  1. A name of a service
  2. A namespace (if outside the current)
  3. The environment variable name to use for the service host
  4. Additional environment variables?
  5. An internal address and/or port to use (including 127.0.0.1 or hardcoded)
  6. The nature of the usage of the service?
  7. A DNS name for the service within this pod?
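
A rough Go sketch of what such a declaration could carry, simply mirroring the list above (all field names here are hypothetical, not a proposed API):

type ServiceDependency struct {
  // 1. Name of the service the pod depends on.
  Service string
  // 2. Namespace, if the service lives outside the pod's own namespace.
  Namespace string
  // 3./4. Environment variable names to publish for the service host/port,
  //       allowing aliasing (e.g. DB_HOST instead of FOO_SERVICE_HOST).
  HostEnvVar string
  PortEnvVar string
  // 5. Optional fixed address/port to expose inside the pod (e.g. 127.0.0.1:3306).
  LocalAddress string
  // 6. The nature of the usage of the service (read-only, admin, ...).
  Usage string
  // 7. Optional DNS alias for the service within this pod.
  DNSAlias string
}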

@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Dec 4, 2014
@ghost

ghost commented Dec 16, 2014

+100

@bgrant0607
Member Author

@kelonye Could you please describe your use case in more detail? What are you trying to achieve? Do you actually want firewalls (#2880), or is this security through obscurity, or something else?

@ghost

ghost commented Dec 17, 2014

@bgrant0607 A client of mine wants to run mini user applications using untrusted user images as linked services, so security is reason 1. Reason 2 is having different services with the same name, so that, say, users A and B can each have a REDIS and a WEB service.

@bgrant0607
Member Author

@kelonye Namespaces were intended to address the multi-user issue. As for trusted vs. not, that's the firewall issue (#2880) -- which clients are permitted to see a service.

If we continue to support the environment variables, we are going to need to make them on request only. Creating so many variables for all services in a cluster, even just within a namespace, is a scalability problem. That has nothing to do with accessibility, though. It would just save the user from having to write a pre-start hook to resolve the service's DNS name and dump it into a file to be read by the container.
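
For illustration only, the pre-start hook described here could be little more than a DNS lookup written to a file; the service DNS name and output path below are assumptions, not anything this issue specifies:

package main

import (
  "fmt"
  "net"
  "os"
)

func main() {
  // Resolve the service's cluster DNS name (hypothetical name/namespace).
  addrs, err := net.LookupHost("redis.default.svc.cluster.local")
  if err != nil || len(addrs) == 0 {
    fmt.Fprintln(os.Stderr, "service not resolvable:", err)
    os.Exit(1)
  }
  // Dump the address where the main container expects to read it.
  if err := os.WriteFile("/etc/podinfo/redis-host", []byte(addrs[0]), 0644); err != nil {
    fmt.Fprintln(os.Stderr, err)
    os.Exit(1)
  }
}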

A separate issue is start-up dependencies. Though they're fragile, a number of applications do make (bad) assumptions about startup order, so we'll need to support them in some form in our deployment workflow mechanism(s) (#1704).

@pmorie
Member

pmorie commented Jan 9, 2015

The following use-case requires predeclaration:

  1. As a user I want to be able to customize how a service is consumed by allowing the name and
    form of environment variables to be transformed prior to use by a pod so that images which are
    not designed to work with a specific named service can be used with kubernetes

Predeclaration would also play a role in the cross-namespace use-case:

  1. As a user I want a pod to be able to consume the metadata of a service in a different
    namespace (example: namespace A contains a database service which I want to use from namespace B)

I'll also note that there is a use-case where you want to opt-out of predeclaration and get info / firewall rules for every service in a pod's namespace. This adds a wrinkle because it divides pods into two types, those that follow the normal rules and those that don't, which would definitely impact scheduling in order to keep an opted-out pod from wrecking port allocation on a host where all other pods predeclare. I think @erictune's suggestion of defaulting to fully connected within a namespace and requiring predeclaration of cross namespace dependencies is a good middle ground.

@bgrant0607 @erictune @smarterclayton As a next step, how about a PR to explore the predeclaration mechanism in a vacuum? I would suggest that PodSpec be changed as follows:

type PodSpec struct {
  // other fields omitted
  ServiceLinks []ServiceLink
}

type ServiceLink struct {
  TypeMeta
  ObjectMeta
  Name      string
  Namespace string
  // Better perhaps as an ObjectReference?
}

The first iteration could change the BoundPodFactory or move the service env functionality to the Kubelet, the latter of which seems like the direction the tide is going.

@pmorie
Member

pmorie commented Jan 9, 2015

@lavalamp ^

@lavalamp
Member

lavalamp commented Jan 9, 2015

Moving BoundPodFactory stuff to kubelet is good and necessary, but @erictune may be working on that already.

If we switch to DNS and deprecate env vars completely, does that eliminate the need for predeclaration? That sounds much easier...

@pmorie
Member

pmorie commented Jan 9, 2015

@lavalamp I don't think switching to DNS and deprecating env vars fully eliminates the need for predeclaration. There's still the problem of iptables rules on the node.

@smarterclayton
Contributor

You still have to know what DNS name you're looking for, and software that runs in different namespaces or clusters won't have the same DNS name.

You need something injected into the container that lets legacy software react to the cluster topology. Predeclaration for adaptation is a key thing; there's lots of software out there that doesn't know anything about X_SERVICE_HOST at all.

Also, that same software has to work outside of a cluster as well - on a local dev box how would you point your app to your db (except by using env or mutating a file on disk)?


@smarterclayton
Contributor

On Jan 9, 2015, at 12:46 PM, Paul Morie notifications@github.com wrote:

As a next step, how about a PR to explore the predeclaration mechanism in a vacuum? I would suggest that PodSpec be changed as follows:

type PodSpec struct {
  // other fields omitted
  ServiceLinks ServiceLinkList
}

type ServiceLinkList struct {
  TypeMeta
  ListMeta
  Items []ServiceLink
}

type ServiceLink struct {
  TypeMeta
  ObjectMeta
  Name      string
  Namespace string
  // Better perhaps as an ObjectReference?
}

I had been envisioning this to do adaptation, so mutating how the service shows up in the pod. I think that would make the use case a bit more concrete and practical.


@pmorie
Member

pmorie commented Jan 9, 2015

@smarterclayton We can roll adaptation into the POC; I will think through a design and propose a model here.

@erictune
Member

@pmorie can you give a more concrete example of a system with legacy software that needs service links?

@pmorie
Member

pmorie commented Jan 12, 2015

@erictune As a more detailed example, say I have an image that depends on a specially formatted environment variable with a URL for a service. As an example format, take:

DOCKER_URL=http://$DOCKER_HOST/$DOCKER_PORT

This use case is to be able to adapt to these special requirements an image may have without changing the image.

It's definitely the case that there's a pretty sizable subset of these cases that can be addressed by performing the translation via the shell and either setting containers' commands to set the variables or wrapping an image with another image containing a script.

It's arguable that for those cases, the adaptation mechanism isn't necessary. However, it is necessary for images that do not contain a shell, or that use the ENTRYPOINT feature of docker (in which case the environment cannot be overridden from the container's command without specifically overriding the entrypoint). Personally, I also think it's arguable that the experience will be better to adapt services in this manner even when images have a shell and can perform the substitution themselves.
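
To make the adaptation concrete: without such a mechanism, the translation ends up in a wrapper like this hypothetical Go shim, which builds DOCKER_URL from the env vars Kubernetes generates for a service named "docker" (all names and the entrypoint path are assumptions):

package main

import (
  "fmt"
  "os"
  "os/exec"
)

func main() {
  // Compose the image-specific variable from the generated service env vars,
  // in the format the image expects (per the example above).
  url := fmt.Sprintf("http://%s/%s",
    os.Getenv("DOCKER_SERVICE_HOST"), os.Getenv("DOCKER_SERVICE_PORT"))
  os.Setenv("DOCKER_URL", url)

  // Hand off to the image's real entrypoint with the adapted environment.
  cmd := exec.Command("/entrypoint.sh")
  cmd.Env = os.Environ()
  cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
  if err := cmd.Run(); err != nil {
    os.Exit(1)
  }
}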

@smarterclayton
Contributor

Or even the mysql client (http://dev.mysql.com/doc/refman/5.0/en/environment-variables.html)

MYSQL_HOST
MYSQL_TCP_PORT

Neither of those matches our existing env.


@pmorie
Member

pmorie commented Jan 12, 2015

@smarterclayton and I discussed this offline and think the adaptation use-case doesn't depend on predeclaration of services nor does it need to be coupled to services at all.

Here's another use-case that might require pre-declaration:

  1. As a user I want to express that a pod should not be started unless services it depends on have had IP/Ports allocated

Consider the following to address that use-case:

type PodSpec struct {
  // other fields omitted
  ServiceLinks []ServiceLink
}

type ServiceLink struct {
  TypeMeta
  ObjectMeta

  Target    ObjectReference
  NeedReady bool
}

NeedReady would be a precondition that states that the service must have at least one endpoint.
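
A sketch of how the NeedReady precondition could be evaluated, written against today's client-go (which postdates this discussion); the helper name is hypothetical:

import (
  "context"

  metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
  "k8s.io/client-go/kubernetes"
)

// linkReady reports whether the linked service has at least one ready endpoint.
func linkReady(c kubernetes.Interface, namespace, name string) (bool, error) {
  ep, err := c.CoreV1().Endpoints(namespace).Get(context.TODO(), name, metav1.GetOptions{})
  if err != nil {
    return false, err
  }
  for _, subset := range ep.Subsets {
    if len(subset.Addresses) > 0 {
      return true, nil
    }
  }
  return false, nil
}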

Any thoughts? @smarterclayton @erictune @bgrant0607 @thockin @lavalamp

@pmorie
Member

pmorie commented Jan 12, 2015

Also, in the context of the above, it would be good to produce an event after some time if a pod which is scheduled cannot start, but that's perhaps another issue.

@bgrant0607
Member Author

Previous discussion of this last type of dependency was in #2385.

It has been my hope that DNS will eliminate the creation order problem between services and their clients. Creation-order dependencies are problematic since containers can go down at any time, and they have unclear meaning when updating or replacing objects (such as with rolling updates).

That said, sometimes there are unavoidable turn-up dependencies, such as initializing stateful services like databases or message brokers. I envision handling such dependencies in deployment automation: #1704.

@bgrant0607
Member Author

I just experienced this, as additional confirmation: The new directory-reading feature of kubectl is not so useful for containers using service environment variables. Someone tried to do:

cluster/kubectl.sh create -f examples/guestbook-go/

@zq-david-wang

@smarterclayton I notice a huge delay when spawning bash via "docker exec" if there are thousands of services (I tested with 3000) in the namespace. It would be great to have an option to disable the service env when creating the docker process.

The test I ran is as follows (kubelet 1.6):
I built a docker image with bash installed on an alpine base image.
When running "docker exec -it [docker-id] bash",
it took about 20s if there were 3000 services within the namespace;
it took less than 1s if I disabled the service env by modifying the code.

@bgrant0607
Member Author

cc @kubernetes/sig-network-feature-requests @kubernetes/sig-node-feature-requests

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 6, 2018
@bgrant0607
Member Author

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 23, 2018
@thockin thockin added the triage/unresolved Indicates an issue that can not or will not be resolved. label Mar 8, 2019
@freehan freehan removed the triage/unresolved Indicates an issue that can not or will not be resolved. label May 16, 2019
@endocrimes
Member

Now that service links are optional (dealing with some of the performance and collision issues they used to cause), we're mostly waiting on a way to declare a subset of them for an application here. Since no headway seems to have been made on that recently, I'm going to bump this down to important-longterm.
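
For reference, the opt-out mentioned above is the pod-level enableServiceLinks field; a minimal sketch using the current core/v1 Go types (the pod name and image are placeholders):

import (
  corev1 "k8s.io/api/core/v1"
  metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func podWithoutServiceLinks() *corev1.Pod {
  enable := false
  return &corev1.Pod{
    ObjectMeta: metav1.ObjectMeta{Name: "app"},
    Spec: corev1.PodSpec{
      // When false, the kubelet does not inject the {SVCNAME}_SERVICE_HOST/_PORT
      // env vars for the other services in the pod's namespace.
      EnableServiceLinks: &enable,
      Containers:         []corev1.Container{{Name: "app", Image: "example/app"}},
    },
  }
}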

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jun 24, 2021
@endocrimes
Member

/remove-priority important-soon

@k8s-ci-robot k8s-ci-robot removed the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jun 24, 2021
@MadhavJivrajani
Contributor

/remove-kind design
/kind feature

kind/design will soon be removed from k/k in favor of kind/feature. Relevant discussion can be found here: kubernetes/community#5641

@k8s-ci-robot k8s-ci-robot removed the kind/design Categorizes issue or PR as related to design. label Jun 29, 2021
@thockin
Member

thockin commented Jan 16, 2023

Realistically, no. Service links are not something we're going to do more to support.

@thockin thockin closed this as completed Jan 16, 2023