This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

No domain entry in /etc/resolv.conf on Azure #267

Closed
errordeveloper opened this issue Feb 13, 2015 · 14 comments

Comments

@errordeveloper

Starting dhcpcd.service remedies the issue. At a glance, it looks just like #220, but I know that one has been fixed.

@errordeveloper
Author

I can see there is a bug report in systemd, pretty old actually and still open: https://bugs.freedesktop.org/show_bug.cgi?id=85397

errordeveloper added a commit to errordeveloper/weave-demos that referenced this issue Feb 13, 2015
@crawford
Contributor

It looks like UseDomains just needs to be set.

@errordeveloper
Author

Which isn't unset, as far as I can see.

@crawford
Contributor

Looking through systemd-resolved, I see that it only writes "nameserver" and "search". Domains and search suffixes supplied through DHCP are written into /etc/resolv.conf as "search". Is there a reason you need "domains" in particular?

@errordeveloper
Author

What I see is that there is only a nameserver entry, and neither search nor domain is present. I don't think there is a big difference between search and domain; either should have a very similar effect. The point is that with the previous stable release short hostnames worked most of the time, and only sometimes they wouldn't, in which case resolv.conf contained only a nameserver entry; with the latest stable release, I haven't seen it working at all. And, as I said, I was able to resolve the issue by switching to dhcpcd (although that broke network-online.target and I had to work around that too, but never mind).

@crawford
Contributor

Ah, so you only have a single "nameserver" entry in your /etc/resolv.conf? Without "search", short hostnames won't work.

Which version of CoreOS is this? Can you show the contents of /var/log/waagent.conf?
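
To illustrate why a missing "search" line breaks short hostnames, here is a minimal sketch of how a stub resolver builds the list of names it tries. It is a simplified model of the behaviour described in resolv.conf(5), not the actual glibc code; the function name `candidate_names` is made up for this example, and the default `ndots=1` is the documented resolver default.

```python
def candidate_names(name, search_domains, ndots=1):
    """Return the fully qualified names a stub resolver would try, in order.

    Simplified model of resolv.conf(5) search-list handling (an assumption,
    not the real resolver implementation).
    """
    if name.endswith("."):
        # A trailing dot marks the name as already fully qualified.
        return [name]
    expanded = [f"{name}.{domain}." for domain in search_domains]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + expanded
    # Too few dots: try the search list first, then the bare name.
    return expanded + [absolute]


# With no search list, the short name is only tried as-is, which the
# Azure DNS server has no reason to know about.
print(candidate_names("kube-01", []))

# With a search list, the short name is first expanded with each suffix.
print(candidate_names("kube-01", ["srv-ca2b85bca8b21e.a3.internal.cloudapp.net"]))
```

With an empty search list the only candidate is the bare name, which matches the symptom reported here.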

@crawford crawford self-assigned this Feb 26, 2015
@marineam

Note: last I looked, networkd did not yet support the DNS search DHCP option, which is a newer standard for providing multiple domains. For the most part we get by, since many networks also set the older single-domain option.
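
For reference, the two options mentioned above are DHCP option 15 (Domain Name, RFC 2132) and option 119 (Domain Search, RFC 3397). The following is a rough sketch of walking the options field of a DHCP reply to see which of the two a server sent. The helper name `parse_dhcp_options` and the sample payload are invented for illustration; this is not how networkd parses leases.

```python
DOMAIN_NAME = 15     # RFC 2132: a single domain name, plain ASCII
DOMAIN_SEARCH = 119  # RFC 3397: a list of search domains, DNS-encoded


def parse_dhcp_options(payload: bytes) -> dict:
    """Split a raw DHCP options field into {option code: raw value bytes}."""
    options = {}
    i = 0
    while i < len(payload):
        code = payload[i]
        if code == 0:    # pad option: a single byte with no length field
            i += 1
            continue
        if code == 255:  # end option: stop parsing
            break
        if i + 1 >= len(payload):
            break        # truncated option header, give up
        length = payload[i + 1]
        options[code] = payload[i + 2 : i + 2 + length]
        i += 2 + length
    return options


# Hypothetical payload: option 15 (domain name), option 6 (DNS server), end.
domain = b"example.internal"
payload = bytes([DOMAIN_NAME, len(domain)]) + domain + bytes([6, 4, 168, 63, 129, 16, 255])

opts = parse_dhcp_options(payload)
print("domain name option:", opts.get(DOMAIN_NAME))
print("domain search option present:", DOMAIN_SEARCH in opts)
```

A client that only understands option 15 still ends up with one usable suffix, which is the "we get by" case described above.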

@errordeveloper
Author

Which version of CoreOS is this?

Image ID is 2b171e93f07c4903bcad35bda10acf22__CoreOS-Stable-557.2.0.

Can you show the contents of /var/log/waagent.conf?

2015/03/09 13:05:06 Windows Azure Linux Agent Version: WALinuxAgent-2.0.11
2015/03/09 13:05:06 Linux Distribution Detected      : CoreOS
2015/03/09 13:05:06 Module /lib/modules/3.18.1/kernel/drivers/ata/ata_piix.ko driver for ATAPI CD-ROM is already present.
2015/03/09 13:05:07 mount: /dev/sr0 is write-protected, mounting read-only
2015/03/09 13:05:07 mount: /dev/sr0 mounted on /mnt/cdrom/secure.
2015/03/09 13:05:07 mount succeeded on attempt #1
2015/03/09 13:05:07 VMM Init script not found.  Provisioning for Azure
2015/03/09 13:05:07 IPv4 address: 172.18.0.13
2015/03/09 13:05:07 MAC  address: 00:0D:3A:20:76:7C
2015/03/09 13:05:07 Probing for Windows Azure environment.
2015/03/09 13:05:07 DoDhcpWork: Setting socket.timeout=10, entering recv
2015/03/09 13:05:07 Discovered Windows Azure endpoint: 168.63.129.16
2015/03/09 13:05:07 Fabric preferred wire protocol version: 2012-11-30
2015/03/09 13:05:07 Negotiated wire protocol version: 2012-11-30
2015/03/09 13:05:07 SetBlockDeviceTimeout: Update the device sda with timeout 300
2015/03/09 13:05:08 SetBlockDeviceTimeout: Update the device sdb with timeout 300
2015/03/09 13:05:08 Retrieved GoalState from Windows Azure Fabric.
2015/03/09 13:05:08 ExpectedState: Started
2015/03/09 13:05:08 ContainerId: a9aa0e3a-d6d7-42de-8d6f-5be282ecc6a9
2015/03/09 13:05:08 RoleInstanceId: e248071cd7cf4e0796fd3ed7bc1ba951.kube-01
2015/03/09 13:05:08 Public cert with thumbprint: ACE6E33E8B719A063EE6159752D02ED9A41552E2 was retrieved.
2015/03/09 13:05:08 Provisioning image started.
2015/03/09 13:05:08 Detect GPT...
2015/03/09 13:05:08 mount: /dev/sr0 is write-protected, mounting read-only
2015/03/09 13:05:08 mount succeeded on attempt #1
2015/03/09 13:05:08 Provisioning image using OVF settings in the DVD.
2015/03/09 13:05:08 Wrote /var/lib/waagent/CustomData
2015/03/09 13:05:08 Disabled SSH password-based authentication methods.
2015/03/09 13:05:08 CreateAccount: core already exists. Will update password.
2015/03/09 13:05:08 Created user account: core
2015/03/09 13:05:08 Deploy public key:ACE6E33E8B719A063EE6159752D02ED9A41552E2
2015/03/09 13:05:13 Resource disk (/dev/sdb1) is mounted at /mnt/resource with fstype ext4
2015/03/09 13:05:13 EnvMonitor: Detected host name change: localhost -> kube-01
2015/03/09 13:05:13 Setting host name: kube-01
2015/03/09 13:05:14 Ovf XML process finished
2015/03/09 13:05:14 Posted Role Properties. CertificateThumbprint=6b869ffa3f9a7f5cc91301ff980fca7f
2015/03/09 13:05:14 Provisioning image completed.
2015/03/09 13:05:14 Posted Role Properties. CertificateThumbprint=6b869ffa3f9a7f5cc91301ff980fca7f
2015/03/09 13:05:18 EnvMonitor: Detected dhcp client restart. Restoring routing table.
2015/03/09 13:06:30 Retrieved GoalState from Windows Azure Fabric.
2015/03/09 13:06:30 ExpectedState: Started
2015/03/09 13:06:30 ContainerId: a9aa0e3a-d6d7-42de-8d6f-5be282ecc6a9
2015/03/09 13:06:30 RoleInstanceId: e248071cd7cf4e0796fd3ed7bc1ba951.kube-01
2015/03/09 13:06:30 Public cert with thumbprint: ACE6E33E8B719A063EE6159752D02ED9A41552E2 was retrieved.
2015/03/09 13:06:31 Posted Role Properties. CertificateThumbprint=6b869ffa3f9a7f5cc91301ff980fca7f

@errordeveloper
Author

With dhcpcd I get:

# Generated by dhcpcd from eth0
# /etc/resolv.conf.head can replace this line
domain kubernetes-service-cluster-edb47372fd57ab.a4.internal.cloudapp.net
nameserver 168.63.129.16
# /etc/resolv.conf.tail can replace this line

And with systemd-resolved:

# This file is managed by systemd-resolved(8). Do not edit.
#
# Third party programs must not access this file directly, but
# only through the symlink at /etc/resolv.conf. To manage
# resolv.conf(5) in a different way, replace the symlink by a
# static file or a different symlink.

nameserver 168.63.129.16
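
The difference between the two files can be checked mechanically. This is a small sketch that parses resolv.conf text and reports whether it contains anything usable for expanding short hostnames; the function names are made up for this example, and the sample strings are trimmed copies of the two outputs above.

```python
def parse_resolv_conf(text):
    """Parse resolv.conf text into {keyword: [values, ...]}, skipping comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        keyword, *values = line.split()
        entries.setdefault(keyword, []).extend(values)
    return entries


def can_expand_short_names(entries):
    """True if the file has a 'search' or 'domain' entry with a value."""
    return bool(entries.get("search") or entries.get("domain"))


DHCPCD_OUTPUT = """\
# Generated by dhcpcd from eth0
domain kubernetes-service-cluster-edb47372fd57ab.a4.internal.cloudapp.net
nameserver 168.63.129.16
"""

RESOLVED_OUTPUT = """\
# This file is managed by systemd-resolved(8). Do not edit.

nameserver 168.63.129.16
"""

for label, text in (("dhcpcd", DHCPCD_OUTPUT), ("systemd-resolved", RESOLVED_OUTPUT)):
    entries = parse_resolv_conf(text)
    print(label, sorted(entries), can_expand_short_names(entries))
```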

@crawford
Contributor

crawford commented Mar 9, 2015

That long domain name looks suspicious. I bet networkd/resolved are getting hung up on that. Can you try with a shorter hostname (just to verify)?

@errordeveloper
Author

@crawford well spotted, it works with a shorter domain.

# This file is managed by systemd-resolved(8). Do not edit.
#
# Third party programs must not access this file directly, but
# only through the symlink at /etc/resolv.conf. To manage
# resolv.conf(5) in a different way, replace the symlink by a
# static file or a different symlink.

nameserver 168.63.129.16
search srv-ca2b85bca8b21e.a3.internal.cloudapp.net

@crawford
Contributor

crawford commented Mar 9, 2015

OK, thanks for confirming. I believe the upper limit is 64 characters right now. Let me dig into that a bit. Silently failing like this is a problem.

errordeveloper added a commit to errordeveloper/kubernetes that referenced this issue Mar 10, 2015
it turned out the issue was due to the domain length, so let's just
keep it short.
@mischief

mischief commented Apr 1, 2015

this is a bug in systemd. when your dhcp server replies with a domain longer than HOST_NAME_MAX, systemd-networkd decides to silently drop it.
https://github.com/coreos/systemd/blob/master/src/libsystemd-network/sd-dhcp-lease.c#L507
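
A quick way to see that the two domains from this thread fall on opposite sides of that limit is sketched below. The value 64 is HOST_NAME_MAX on Linux and matches the limit mentioned earlier in the thread; the exact comparison systemd-networkd performed is an assumption here, and `would_be_dropped` is a made-up name, not a systemd function.

```python
HOST_NAME_MAX = 64  # Linux value; the limit discussed earlier in the thread


def would_be_dropped(domain, limit=HOST_NAME_MAX):
    """Approximate the length check: domains longer than the limit are dropped.

    Assumption: a strict 'longer than' comparison. The real check lives in
    sd-dhcp-lease.c and may differ in detail.
    """
    return len(domain) > limit


long_domain = "kubernetes-service-cluster-edb47372fd57ab.a4.internal.cloudapp.net"
short_domain = "srv-ca2b85bca8b21e.a3.internal.cloudapp.net"

for domain in (long_domain, short_domain):
    print(len(domain), would_be_dropped(domain), domain)
```

The first domain is slightly over the limit and the second is well under it, which is consistent with only the shorter one showing up as a "search" entry.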

@crawford
Contributor

crawford commented Apr 2, 2015

Fixed via https://github.com/coreos/systemd/pull/3.

@crawford crawford closed this as completed Apr 2, 2015
akram pushed a commit to akram/kubernetes that referenced this issue Apr 7, 2015
it turned out the issue was due to the domain length, so let's just
keep it short.
errordeveloper added a commit to weaveworks-guides/weave-kubernetes-coreos-azure that referenced this issue Nov 3, 2015
it turned out the issue was due to the domain length, so let's just
keep it short.

4 participants