
Proposal: Internal Dynamic DNS service #1261

Closed
markllama opened this issue Sep 10, 2014 · 71 comments
Assignees
Labels
area/downward-api kind/design Categorizes issue or PR as related to design. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@markllama
Contributor

There are instances where DNS is the best way to publish the presence of network resources.

Within a Kubernetes cluster, minions, containers, and services all have distinct IP addresses, but in many cases these objects may need to migrate in a way that changes the IP. Clients that are consuming these services can find the new IP if they use DNS names rather than IP addresses as the service handle. The DNS records may change, but the names will remain stable.

One specific case has raised this issue.

On each minion, when a new container is created, the container is provided an environment variable SERVICE_MASTER. This variable indicates the location of the Kubernetes service proxy to which the processes within the container should connect. The value of this variable is a hostname (FQDN) which should resolve, from within the container, to the public IP address of the minion on which the container resides. It is not possible for Kubernetes, through Docker, to inject a value into /etc/hosts inside the container. It is possible for Docker to inject a nameserver IP address into /etc/resolv.conf. If the indicated DNS service correctly resolves the value of SERVICE_MASTER to the IP address of the hosting minion, then the processes within the container will connect properly to the minion proxy process.

Since Kubernetes is a dynamic service, the DNS service must also be dynamic. New minions may be added (as in this particular case). It may also be desirable to publish services with SRV records. As new IPs are allocated and names assigned, the Kubernetes service must be able to update the DNS name/address maps.

DNS service relies on several factors:

  • DNS servers
    • Must have stable IP address(es)
    • Must have allocated zones for dynamic update
    • Must respond to update queries
    • Should require a signature on update queries (see the sketch below)
  • DNS clients
    • Must be provided the IP address of DNS servers
    • Must be provided the FQDN of names to be resolved (A, SRV, TXT, PTR records)
    • May be provided the zone names
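
A signed dynamic update against such a server might look like the following. This is only a sketch: the key file, zone, record name, TTL, and addresses are illustrative placeholders, not part of any existing Kubernetes configuration.

nsupdate -k /etc/dns/k8s-update.key <<'EOF'
server 10.245.1.2
zone kube.local.
update delete minion-3.kube.local. A
update add minion-3.kube.local. 60 A 10.245.1.3
send
EOF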

Docker provides a means to inject the nameserver values for /etc/resolv.conf in new containers.
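
For example, the per-container injection can be done with Docker's resolver flags (the DNS server IP and search domain below are placeholders):

docker run --dns=10.245.1.2 --dns-search=kube.local busybox cat /etc/resolv.conf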

  1. What values should be stored in DNS?
  2. What components are aware of new IP allocation and hence able to initiate DNS updates?
  3. How does one notify/configure the Kubernetes master/cluster about the presence of the DNS
    service?
  4. How does one coordinate the sharing of update keys between the DNS service and Kubernetes?
  5. When if ever is it acceptable to override the default nameserver injection into a new container?
  6. What is the optimal set of DNS master/slave servers, and how do we initialize and maintain them?

Solution proposals to follow.

@jbeda
Contributor

jbeda commented Sep 10, 2014

This relates to Services v2 -- #1107.

@thockin -- can you integrate this thinking with what we've been talking about wrt Services v2.

/cc @smarterclayton

Personally, I'm scared of leaning on DNS too much. While there is nothing stopping DNS from being very dynamic, most clients very aggressively cache DNS results past the TTL. I'm in favor of having a great inward facing DNS solution but I think that we should make sure that the bulk of IPs returned are stable from the point of view of the client. (Which requires some network magic or proxying, etc)

@markllama
Contributor Author

I agree that this would be purely for internal consumption, not for publication (a la OpenShift). One aspect is to select proper TTL values and carefully choose which objects are published by DNS. It should be used where it is the right solution, and other mechanisms should be used where it is not.

I'm interested in the comment "aggressively cache DNS results". Caching servers do, and tools like nscd and possibly sssd may, but the resolver itself does not. Use cases for DNS must be tolerant of the TTL propagation delay for updates.

In the example above the hostname/IP mapping for minions and their clients would be stable for the lifetime of the minion. Additional cases must be evaluated on their merits.

@bgrant0607 bgrant0607 added kind/design Categorizes issue or PR as related to design. sig/network Categorizes an issue or PR as relevant to SIG Network. labels Sep 10, 2014
@markllama
Contributor Author

This may have been half-baked. I'll still come up with the use cases, but I was reacting to a misunderstanding on my part about the scope of the problem from the vagrant cluster.

This may be moot.

@thockin
Member

thockin commented Sep 11, 2014

I think there is an important role to be played by DNS, but that role is
around services (the thing k8s calls Services, anyway).

With Services v2 I think we can present the appearance of stable IPs for
Services. If we can do that, then DNS is a natural fit. We have open
questions about the globalness of DNS vs namespaces. We have questions
about which implementation of DNS we can use. We have open questions about
the IPs of DNS servers (aka bootstrapping).

This is certainly an issue with room for people to help out :)


@derekwaynecarr
Member

Just an FYI, but it looks like as of docker 1.2.0 you can change the /etc/hosts file of a running container.

see moby/moby#5129

@thockin
Member

thockin commented Sep 11, 2014

This solution is a great big hack. It's not done atomically (indeed the
only way I can think of doing it atomically is to create a new file and
bind mount it over, which requires traversing namespaces).

We don't need it to be dynamic, I think. The static PR is much simpler.


@bketelsen
Contributor

Disclaimer: I'm a co-author of Skydns (https://github.com/skynetservices/skydns)
I think SkyDNS would solve nearly all of these requirements as-is. It supports DNSSEC, self-reported TTL, it's already built to store to etcd, and is very configurable. The only thing missing is a bridge from kubernetes' registry to SkyDNS registry. Most of SkyDNS's users are using it as the dns resolver on the docker daemon's bridge for name resolution inside docker containers.
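
As a rough illustration of what such a bridge would produce (the domain, key path, service name, and addresses below are made up, and this assumes SkyDNS's etcd-backed record layout with SkyDNS answering locally on port 53), the bridge would essentially write records like:

etcdctl set /skydns/local/cluster/default/my-service '{"host":"10.0.0.42","port":8080}'
# a client pointed at SkyDNS could then resolve the name:
dig @127.0.0.1 my-service.default.cluster.local A +short
dig @127.0.0.1 my-service.default.cluster.local SRV +short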

I'd be happy to change the license if needed, and would love to submit a PR for the SkyDNS <-> Kubernetes API bridge service.

@thockin
Member

thockin commented Sep 14, 2014

I have said elsewhere that the obvious DNS tie-in is for Service names.
But I would be open to ideas around more general usage, if we can position
it all in a way that users won't get tricked into thinking it is something
it is not.

Just brainstorming. Make a flag per pod asking for DNS resolution. Make a
dyndns service intra cluster. Make a flag per replication controller
asking for group DNS.

Do we think DNS is a case of a general naming service plugin interface, or
is it so fundamental it should be special-cased?

Tim

@smarterclayton
Contributor

Being able to define special-purpose DNS addresses ad hoc is always nice, but I don't think the service -> dns focus is wrong. It just may be that when you define your service, instead of tying the dns entry to the service endpoint, you make it resolve to the individual endpoints (the selected pods).

That feels more "service-y" than adding dns to a repl controller, which then couples a template to a DNS label.

I like flag per pod, but what happens when you want two pods to share a DNS entry? Back to services!

You get into a bit of a "service-all-the-things" mentality, but some of the concerns with that (too many ports, names for things that are very similar) could be addressed by other flexibility on or around services, and are what you want to encourage.

I think services are pets and pods are cattle in this vein - name your pets, count your cattle. Most of the things you could use names on pods for are things you want to discourage or handle at a lower level (if you want magic HA pod behavior for hosts, do it under the pod at the infra level so no one knows about it)


@smarterclayton
Contributor

Also, SkyDNS was what I had in my head as were talking about this - glad to see there's interest on your end as well.

@thockin
Member

thockin commented Sep 14, 2014

I don't think that "use a service" is a bad answer for anything multi-pod.
The problem with exposing a service DNS as the set of pods is that we KNOW
people cache DNS, and that set just is not stable. I might be OK exposing
that AND the service portal. I just don't want to lay traps for users.
DNS is attractive and easy to use. It's also easy to use wrongly.

@smarterclayton
Contributor

Part of the portal / service design is you have to make an explicit choice to hardcode the dns name in your code or env - if the default path is easier (auto env var, or exposed on localhost via ambassador container or infra magic), it seems like folks won't be incentivized to hardcode dns. Or if you make dns exposure an option (explicitly declared) then you take the responsibility to do the right thing.

Taken with what we've said that DNS is primarily an internal, inter-pod concept and that external exposure is typically an explicit choice for users or admins, that makes me think that there's less of a requirement to dictate general DNS internally, but more of a requirement to give operators the flexibility to do their own opinionated setups with delegation or mapping.


@thockin
Member

thockin commented Sep 16, 2014

On /etc/hosts: moby/moby#8019

@bgrant0607
Member

The discussion in #146 is relevant.

I mostly agree with @smarterclayton.

I am strongly in favor of publishing DNS for services. Service v2 (#1107) IP addresses should be reasonably stable -- as stable as the services. Among other things, this would make #1331 moot, would solve the problem of needing to resolve minion hostnames from within containers, and would eliminate our reliance on environment variables as a naming system.

We should not publish DNS for pods nor for replication controllers. Both are ephemeral (cattle). For example, I want to be able to create new replication controllers to do rolling updates (#1353). So, even the names wouldn't be stable, much less the IP addresses.

I'm hesitant to directly publish groups of service pod IPs to DNS, due to the aforementioned caching problems, as well as latency and scalability challenges (e.g., not good to need to reload 10000 IPs when one changes). Load balancers and load-balancing clients should use a more efficient mechanism to subscribe to membership changes in the presence of auto-scaling, rolling updates, rescheduling, etc. I'd be happy if a standard group membership watching API emerged, in which case we could adopt it and expose it to cloud-aware applications.

It should also be possible for people to use their own discovery mechanism, such as etcd, Eureka, or Consul, by registering/unregistering ephemeral pod IP addresses in lifecycle hooks or in their cloud-aware applications, but, by definition, their clients are equipped to handle the dynamic bindings.

@bgrant0607
Member

@bketelsen We'd like to enable DNS in Kubernetes O(soon). Would you like to work together on this?

@bketelsen
Contributor

As a matter of fact, I would, @bgrant0607. Where do we begin? I'll likely drag @erikstmartin along too, since he works with me and we wrote SkyDNS together.

@thockin
Member

thockin commented Sep 27, 2014

I think we want to run a DNS server by default with every kubernetes
install (as part of the master suite), possibly split-horizon (see the
namespaces work RH is pushing). The service IP work is almost done, and
DNS is an obvious fit there.


@thockin
Member

thockin commented Sep 29, 2014

FWIW, I have a branch with ip-per-service working.

https://github.com/thockin/kubernetes/tree/services_v2

If we wanted to start playing with DNS, this would be a starting place.
I'm working on docs and tests now, but I have run it through a lot of
manual tests and it does seem to work.


@bketelsen
Contributor

I looked at your branch today, @thockin. Should I start sketching up my thoughts on how to integrate DNS? Also, what are your preferences on where the DNS Server should live? Is it another /cmd/, or should it be integrated into something directly, like apiserver or proxy/kubelet?

@stp-ip
Member

stp-ip commented Oct 8, 2014

So just to clarify: my idea was that, instead of resolving DNS in each container and coding/configuring the application to use DNS, it would be much simpler (as most applications use env vars for other things anyway) to use env vars from the application, let SkyDNS handle the naming, assignment, and resolving, and let Kubernetes handle the injection of these supplied names into the containers via env vars.
That way it's much easier to run a single container without needing to configure DNS in a similar fashion.
Additionally, we get the benefit of more stable access points for services, which are automatically used by applications (when they use env vars), instead of having these access points be IP addresses only.

I hope I made my ideas and concerns a bit clearer. If not, I'll have to rethink my argument :)

@bketelsen
Contributor

While I'm working through refactoring my proof-of-concept code, how is the integration between skydns and kubernetes going to work as far as documentation, packaging, and deployment go?

@smarterclayton
Contributor

On Oct 8, 2014, at 2:35 PM, Michael Grosser notifications@github.com wrote:

So just to clarify: my idea was that, instead of resolving DNS in each container and coding/configuring the application to use DNS, it would be much simpler (as most applications use env vars for other things anyway) to use env vars from the application, let SkyDNS handle the naming, assignment, and resolving, and let Kubernetes handle the injection of these supplied names into the containers via env vars.
That way it's much easier to run a single container without needing to configure DNS in a similar fashion.
Additionally, we get the benefit of more stable access points for services, which are automatically used by applications (when they use env vars), instead of having these access points be IP addresses only.

I think that's what we intended, so no worries.
I hope I made my ideas and concerns a bit clearer. If not, I'll have to rethink my argument :)



@thockin
Member

thockin commented Oct 16, 2014

Brian,

Assuming all your changes are mainlined, I expect we would treat it like we
do etcd - as a standalone project of which we pull specific versions when
installing a cluster. I'm not a salt expert, but this is my understanding
of how we manage etcd. :)

What's the status on these changes?


@bketelsen
Contributor

Preliminary Kubernetes support is pending a merge into SkyDNS's repository. When it's merged, we'll create a new binary release and I'll update this thread.

@pweil-
Contributor

pweil- commented Oct 16, 2014

+1 to the question of packaging and deployment.

If I understand how service access works, when the environment variables for a container are created, Kubernetes will need to know whether it is running in a DNS-enabled environment so it can put in either an IP or a name to resolve as the service host. From there it enters the portal/ambassador infrastructure for routing.

So I guess my questions are

  1. will kubernetes run in a non-dns mode after this is implemented?
  2. if no, how will the DNS implementation be shipped with kubernetes and will it be pluggable?

@bketelsen
Contributor

https://github.com/skynetservices/skydns/releases/tag/2.0.1a Merged in SkyDNS, tagged at 2.0.1a

@thockin
Member

thockin commented Oct 16, 2014

Paul:

We can still put IPs in the environment variables. Thus, for a service
named "foo-bar", you would have FOO_BAR_SERVICE_HOST=10.0.0.1 and DNS for
foo-bar == 10.0.0.1
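
In other words (sketch only; the service name, IP, and cluster search domain are hypothetical), a process inside a pod could use either mechanism:

echo $FOO_BAR_SERVICE_HOST   # -> 10.0.0.1, injected at container start
nslookup foo-bar             # -> 10.0.0.1, resolved at lookup time, so it also
                             #    covers services created after the pod started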


@bketelsen
Contributor

Added an official skynetservices/skydns docker image as an automated build.

https://registry.hub.docker.com/u/skynetservices/skydns/

@thockin
Member

thockin commented Oct 18, 2014

How do I tell skydns where etcd servers are?

On Wed, Oct 8, 2014 at 7:16 AM, Brian Ketelsen notifications@github.com
wrote:

@thockin pull master from my repo, then call it like this:
sudo ./skydns -kubernetes=true -master=http://127.0.0.1:8080

It syncs and adds records, removing is broken and the loop logic is
inefficient. I'll work on refactoring after some meetings today.

Here's sample output from dig:

https://gist.github.com/bketelsen/b9bae2a910eff666e448


@bketelsen
Contributor

You can either export ETCD_MACHINES='http://192.168.0.1:4001,http://192.168.0.2:4001' or set the machines flag to the same value.
https://github.com/skynetservices/skydns/blob/master/main.go#L55

@thockin
Member

thockin commented Oct 18, 2014

All I see is:

E1018 19:30:36.626006 00001 etcd.go:167] Failed to contact etcd for key
registry/services: 501: All the given peers are not reachable (Tried to
connect to each peer twice and failed) [0]

ps auxw | grep sky

root 14914 0.0 0.2 277524 5044 ? Ssl 19:26 0:00 skydns
-kubernetes=true -master=http://kubernetes-master:8080 -machines=
http://kubernetes-master:4001


@bgrant0607
Member

@thockin Try using IP addresses rather than DNS?

@thockin
Member

thockin commented Oct 18, 2014

I did that too


@bketelsen
Contributor

Are you running skydns in a container? Is etcd set to listen on all interfaces? That hit me yesterday.


@thockin
Member

thockin commented Oct 18, 2014

Winner. Etcd is on localhost only. Sigh. I'll have to think harder.


@bgrant0607
Member

Aren't we just running this on the master node for now? Why not use localhost?

We're going to have the same issue when we dockerize the other components.

@thockin
Member

thockin commented Oct 19, 2014

I was trying to run it under kubelet. If I add net=host support, which I
planned to anyway, then it will work. Will do tomorrow or Monday.

@thockin
Member

thockin commented Oct 23, 2014

@bketelsen What's the status on reading from kubernetes API rather than
etcd?


@bketelsen
Contributor

I'll finish it tonight. SkyDNS is broken anyway since the ID field was moved out of the thing that used to be JSONBase. So I'll kill two birds with one stone.

@derekwaynecarr
Member

#1937

Just a heads up, but the definition of the source API constructor will change slightly after this PR, and I think you used it in your work.


@thockin
Member

thockin commented Oct 26, 2014

Pieces are falling into place for DNS. This is a collection of thoughts and open questions. I'm going to @ some people, but please feel free to chime in on any issue. Sorry this got a bit long.

Currently: We have IP-per-service. We have a read-only, rate-limited master interface. We will soon have a virtual service for the k8s master, so pods can find it. We have a PR pending to allow static portal IP assignment.

The rough design looks like this: At cluster turnup, choose a static IP for DNS. Create a service using that IP selecting on "infrastructure=kubernetes-dns" (or similar). Launch DNS pods with that label.

Now for open questions.

  1. The assumption is skydns for serving DNS. But it doesn't have to be. As a thought experiment: We already expose EVERY service known at pod start time as env vars. We could also jam those into /etc/hosts (as of docker 1.3). No need for DNS. DNS is nicer because it allows services defined after the fact. How important is that? If/when we get to declared portals, /etc/hosts seems simple.

  2. Let's assume skydns. SkyDNS requires etcd. Since we're trying to run DNS as a plain old pod, it can not use the same etcd. We could spin up an etcd cluster just for this.

  • How many etcd replicas?
  • How to do etcd ACLs so nobody mucks with it?
  • Since skydns is pulling data from kubernetes ONLY, the etcd is essentially a private cache - no need to actually do RAFT. Could we run without etcd? Or maybe just run etcd in the same pod as skydns? That would leave a small window for inconsistent responses when something has changed very recently (if using replicated DNS). @bketelsen - thoughts?
  3. Running DNS as a plain old POD means the core infra can not rely on it. Probably not a huge deal, but a vector for bootstrapping bugs. If we instead ran DNS on master nodes (e.g. from config files), we could actually rely on it. But then we could not scale it independently. Not sure that matters. Or we could fix it later to also have regular pods AND master-bound pods in the same service.

  4. How do we propagate DNS to pods?

  • It seems that containers, by default, get the resolv.conf of the node. We could configure nodes to use the k8s DNS service IP (recall, it is static) themselves. This means that resolving DNS names works inside or outside pods, and all containers (even those run manually) get the DNS resolution.
  • We could use docker's --dns and --dns-search flags, which is a bit more dynamic, but we don't actually need the dynamic-ness.

Any reason to go with docker flags instead of minion configs? @brendandburns had some argument that I have misplaced.

  5. Next, we need a way to configure the cluster domain. Not complicated, just more ugly flags.

  6. The intersection with namespaces. We could keep track of all pods in a namespace and check the source IP before returning DNS results (split-horizon). Problem: kube-proxy obscures source IP. Instead, we could give each namespace a sub-domain. Pods in namespace "default" get --dns-search="default.kubernetes.local", while Pods in namespace "foobar" get "foobar.kubernetes.local". Is this level of lockdown sufficient, O namespaces people? @derekwaynecarr

We still have an open issue for "global" services (like the master!), which maybe get DNSed as kubernetes.local, which every pod gets as a secondary --dns-search. This argues for using docker flags rather than minion configs.

  7. @bketelsen What limits does skyDNS impose on the hostnames? Can they have dots (e.g. imply sub-domains)? Do they have to be 24 characters or less, or can they be up to 63? Do you respect the 255 total length of a name?

I feel like I have forgotten some questions. Feel free to add more, folks.
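
To make question 4 (the docker-flags option) and question 6 concrete, here is a rough sketch of what the kubelet could pass through to Docker for a pod in the "default" namespace. The DNS service IP and cluster domain are placeholders, not settled names:

docker run \
  --dns=10.0.0.10 \
  --dns-search=default.kubernetes.local \
  --dns-search=kubernetes.local \
  busybox nslookup foo-bar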

@smarterclayton
Contributor

On Oct 26, 2014, at 12:08 AM, Tim Hockin notifications@github.com wrote:

Pieces are falling into place for DNS. This is a collection of thoughts and open questions. I'm going to @ some people, but please feel free to chime in on any issue. Sorry this got a bit long.

Currently: We have IP-per-service. We have a read-only, rate-limited master interface. We will soon have a virtual service for the k8s master, so pods can find it. We have a PR pending to allow static portal IP assignment.

The rough design looks like this: At cluster turnup, choose a static IP for DNS. Create a service using that IP selecting on "infrastructure=kubernetes-dns" (or similar). Launch DNS pods with that label.

Now for open questions.

  1. The assumption is skydns for serving DNS. But it doesn't have to be. As a thought experiment: We already expose EVERY service known at pod start time as env vars. We could also jam those into /etc/hosts (as of docker 1.3). No need for DNS. DNS is nicer because it allows services defined after the fact. How important is that? If/when we get to declared portals, /etc/hosts seems simple.

  2. Let's assume skydns. SkyDNS requires etcd. Since we're trying to run DNS as a plain old pod, it can not use the same etcd. We could spin up an etcd cluster just for this

At medium and small scales I don't think it's unreasonable to share the same etcd. It just means we need a bit more nuance about where the pod runs, and over time more ability to isolate network sections or protect core infra slightly differently. At the largest scales it should have its own etcd.

How many etcd replicas?
How to do etcd ACLs so nobody mucks with it?
Since skydns is pulling data from kubernetes ONLY, the etcd is essentially a private cache - no need to actually do RAFT. Could we run without etcd? Or maybe just run etcd in the same pod as skydns? That would leave a small window for inconsistent responses when something has change very recently (if using replicated DNS). @bketelsen - thoughts?
3) Running DNS as a plain old POD means the core infra can not rely on it. Probably not a huge deal, but a vector for bootstrapping bugs. If we instead ran DNS on master nodes (e.g. from config files), we could actually rely on it. But then we could not scale it independently. Not sure that matters. Or we could fix it later to also have regular pods AND master-bound pods in the same service.

Why does the core infra need to depend on it? I don't know that I've heard of a feature in the masters that was DNS predicated.

  4. How do we propagate DNS to pods?

It seems that containers, by default, get the resolv.conf of the node. We could configure nodes to use the k8s DNS service IP (recall, it is static) themselves. This means that resolving DNS names works inside or outside pods, and all containers (even those run manually) get the DNS resolution.
We could use docker's --dns and --dns-search flags, which is a bit more dynamic, but we don't actually need the dynamic-ness.
Any reason to go with docker flags instead of minions? @brendandburns had some argument that I have misplaced.

I had assumed that in most cases DNS is an infra decision (at all but the largest scales) and so the pods would respect the global configuration. Resolv.conf pointing to the service is an easy way to handle that, although it unfortunately is one more thing to be preconfigured.
5) Next, we need a way to configure the cluster domain. Not complicated, just more ugly flags

  6. The intersection with namespaces. We could keep track of all pods in a namespace and check the source IP before returning DNS results (split-horizon). Problem: kube-proxy obscures source IP. Instead, we could give each namespace a sub-domain. Pods in namespace "default" get --dns-search="default.kubernetes.local", while Pods in namespace "foobar" get "foobar.kubernetes.local". Is this level of lockdown sufficient, O namespaces people? @derekwaynecarr

Namespaces almost certainly should subset the cluster domain in my opinion, otherwise you have to invent a collision management solution for names.
We still have an open issue for "global" services (like the master!), which maybe get DNSed as kubernetes.local, which every pod gets as a secondary --dns-search. This argues for using docker flags rather than minion configs.

We need to talk about whether there are global services, special namespaces for exposing services, or a list of services that are also global.

@bketelsen
Contributor

  2. Let's assume skydns. SkyDNS requires etcd. Since we're trying to run DNS as a plain old pod, it can not use the same etcd. We could spin up an etcd cluster just for this.

How many etcd replicas?
How to do etcd ACLs so nobody mucks with it?
Since skydns is pulling data from kubernetes ONLY, the etcd is essentially a private cache - no need to actually do RAFT. Could we run without etcd? Or maybe just run etcd in the same pod as skydns? That would leave a small window for inconsistent responses when something has change very recently (if using replicated DNS). @bketelsen https://github.com/bketelsen - thoughts?
There are 2 versions of SkyDNS. SkyDNS1 runs its own Raft implementation, and can run stand-alone or in a cluster of other SkyDNS1 servers. I/we could merge the k8s support from the newer version of SkyDNS (etcd backed) into the original SkyDNS and we wouldn’t need etcd at all. There are some other changes between the two versions that would need to be worked out, but if etcd is a pain point, it’s not required. The advantage would be that we could run stand-alone in smaller clusters, but add more skydns instances in a consensus configuration for larger clusters.

  3. Running DNS as a plain old POD means the core infra can not rely on it. Probably not a huge deal, but a vector for bootstrapping bugs. If we instead ran DNS on master nodes (e.g. from config files), we could actually rely on it. But then we could not scale it independently. Not sure that matters. Or we could fix it later to also have regular pods AND master-bound pods in the same service.

  4. How do we propagate DNS to pods?

It seems that containers, by default, get the resolv.conf of the node. We could configure nodes to use the k8s DNS service IP (recall, it is static) themselves. This means that resolving DNS names works inside or outside pods, and all containers (even those run manually) get the DNS resolution.
We could use docker's --dns and --dns-search flags, which is a bit more dynamic, but we don't actually need the dynamic-ness.
Any reason to go with docker flags instead of minions? @brendandburns https://github.com/brendandburns had some argument that I have misplaced.

  5. Next, we need a way to configure the cluster domain. Not complicated, just more ugly flags.

  6. The intersection with namespaces. We could keep track of all pods in a namespace and check the source IP before returning DNS results (split-horizon). Problem: kube-proxy obscures source IP. Instead, we could give each namespace a sub-domain. Pods in namespace "default" get --dns-search="default.kubernetes.local", while Pods in namespace "foobar" get "foobar.kubernetes.local". Is this level of lockdown sufficient, O namespaces people? @derekwaynecarr

This is supported in both versions of SkyDNS, but would require a one-line change of code in the current k8s implementation.

We still have an open issue for "global" services (like the master!), which maybe get DNSed as kubernetes.local, which every pod gets as a secondary --dns-search. This argues for using docker flags rather than minion configs.

  7. @bketelsen What limits does skyDNS impose on the hostnames? Can they have dots (e.g. imply sub-domains)? Do they have to be 24 characters or less, or can they be up to 63? Do you respect the 255 total length of a name?

SkyDNS 1 (internal raft version) has a very rigid naming convention that would need to be modified for k8s. It looks like this: ......skydns.local
SkyDNS 2 (etcd version) relies on the key structure of etcd.

It seems like it makes more sense to use SkyDNS1, with modifications to make it K8s aware and remove the heavily regimented DNS structure enforcement.

Both versions of skydns support up to 63 character host names, and 255 character total length.


@smarterclayton
Contributor

An additional note - currently it's impossible to add a local name for DNS resolution in Docker. We can set the hostname, but we can't add an additional segment that makes $hostname.local resolve. There are examples of software that require a locally resolvable DNS entry even to run as a singleton, where $hostname alone won't work (i.e. the name needs two segments).

Should we enforce / encourage a default where every pod has a deterministic address that resolves at least locally? Nominal services might be able to improve this (or mutate that value), but at least some LDAP servers will fail to start without a two-segment local DNS entry.
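
For concreteness, the kind of entry such software needs would look like the line below (the IP and names are placeholders); the point above is that Docker gives us no flag to create it:

echo "10.244.1.5  ldap-0.local ldap-0" >> /etc/hosts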

@bgrant0607 bgrant0607 added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Dec 3, 2014
@thockin
Member

thockin commented Jan 6, 2015

Closing for now. There's more to do, but I think the primary goal here has been achieved, no?

@thockin thockin closed this as completed Jan 6, 2015
deads2k added a commit to deads2k/kubernetes that referenced this issue Jun 2, 2022
Bug 2086092: UPSTREAM: 108284: fix: exclude non-ready nodes and deleted nodes from azure load balancers