
MAC address in minion-1 vagrant image is set and differs from actual MAC #10906

Closed
jameskyle opened this issue Jul 8, 2015 · 16 comments
Labels
area/platform/vagrant, priority/backlog, sig/cluster-lifecycle, sig/testing

Comments

@jameskyle
Contributor

The /etc/sysconfig/network-scripts/ifcfg-ens33 configuration file declares a MAC address that does not match the address the device actually receives.
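
For reference, the mismatch can be confirmed from inside the guest by comparing the MAC recorded in the interface script with the one the NIC actually has; a quick check along these lines, assuming the affected interface is ens33 as in the error below:

    # MAC recorded in the interface script at image-build time
    grep '^HWADDR' /etc/sysconfig/network-scripts/ifcfg-ens33
    # MAC the cloned VM's NIC actually received
    ip link show dev ens33 | awk '/link\/ether/ {print $2}'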

    $ ./cluster/kube-up.sh
    Starting cluster using provider: vagrant
    ... calling verify-prereqs
    ... calling kube-up
    Using credentials: vagrant:vagrant
    Bringing machine 'master' up with 'vmware_fusion' provider...
    Bringing machine 'minion-1' up with 'vmware_fusion' provider...
    ==> master: Machine is already running.
    ==> minion-1: Cloning VMware VM: 'kube-fedora20'. This can take some time...
    ==> minion-1: Verifying vmnet devices are healthy...
    ==> minion-1: Preparing network adapters...
    ==> minion-1: Fixed port collision for 22 => 2222. Now on port 2200.
    ==> minion-1: Starting the VMware VM...
    ==> minion-1: Waiting for machine to boot. This may take a few minutes...
        minion-1: SSH address: 172.16.156.131:22
        minion-1: SSH username: vagrant
        minion-1: SSH auth method: private key
        minion-1:
        minion-1: Vagrant insecure key detected. Vagrant will automatically replace
        minion-1: this with a newly generated keypair for better security.
        minion-1:
        minion-1: Inserting generated public key within guest...
        minion-1: Removing insecure key from the guest if its present...
        minion-1: Key inserted! Disconnecting and reconnecting using new SSH key...
    ==> minion-1: Machine booted and ready!
    ==> minion-1: Forwarding ports...
        minion-1: -- 22 => 2200
    ==> minion-1: Configuring network adapters within the VM...
    The following SSH command responded with a non-zero exit status.
    Vagrant assumes that this means the command failed!

    /sbin/ifdown ens33

    Stdout from the command:

    ERROR    : [/etc/sysconfig/network-scripts/ifdown-eth] Device ens33 has MAC address 00:0C:29:96:3F:2C, instead of configured address 00:0C:29:16:5F:2B. Ignoring.


    Stderr from the command:
@derekwaynecarr
Member

I am unable to test or maintain issues specific to the VMware Fusion provider.

@posita was the original author in this space and may be able to assist getting issues specific to that provider up to speed with latest project changes.

I suspect the issue is rooted in the image not having been moved up to Fedora 21. Staying on Fedora 20 also means running older versions of Docker, since Fedora 20 no longer packages Docker updates. If you want to debug this further, I would move up to a Fedora 21 box, similar to what was done for the VirtualBox provider here:

https://github.com/GoogleCloudPlatform/kubernetes/blob/master/Vagrantfile#L66

If @posita is unable to assist, and you are not able to contribute a satisfactory fix, I am inclined to remove support for VMware Fusion as part of the Kubernetes 1.0 release.

@posita
Contributor

posita commented Jul 9, 2015

If @posita is unable to assist, and you are not able to contribute a satisfactory fix, I am inclined to remove support for VMware Fusion as part of the Kubernetes 1.0 release.

To be clear, I basically guessed at many VMware implementation details when I hijacked #2741 from @jameskyle [1]. @jameskyle, I'm willing to assist with a fix for this, but I would need your guidance. Do you have any idea as to the cause?

[1] This was because I don't have licenses for either Fusion or Workstation (see this and this).

@vmarmol vmarmol added the priority/backlog, area/platform/vagrant, and team/community labels Jul 10, 2015
@fredjean
Contributor

I have a license for Fusion and the associated Vagrant plugin. I was able to reproduce the original issue. Upgrading the box to Fedora 21 led to a different error when provisioning the master:

ARPCHECK=no /sbin/ifup eth1 2> /dev/null

Stdout from the command:

ERROR    : [/etc/sysconfig/network-scripts/ifup-eth] Device eth1 does not seem to be present, delaying initialization.

I'm willing to help troubleshoot and put together a PR to address it.

@fredjean
Contributor

Further investigation shows that the error I'm getting is a Vagrant issue, not a Kubernetes issue. I'll see if I can find a solution.

@fredjean
Contributor

I am making progress in getting this running. The master and node(s) are now getting to the provisioning step.

I had to upgrade Vagrant to 1.7.3 to get past a networking-related issue.

@marun
Contributor

marun commented Jul 17, 2015

This is a known issue [1] with VMware images that can be fixed by ensuring that HWADDR entries in ifcfg-* are deleted during cleanup at image-build time [2]. Ideally the default VMware image should be fixed, but if that's not possible, the provision scripts could conditionally update the network device script with the device's actual MAC:

# discover the interface's real MAC address
ifname=eth0
mac_address=$(ip addr show dev ${ifname} | grep ether | awk '{print $2}')
# write it back as the configured HWADDR, then restart networking
sudo sed -i -e "s+^\(HWADDR=\).*+\1${mac_address}+" /etc/sysconfig/network-scripts/ifcfg-${ifname}
sudo systemctl restart network.service

1: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2002767
2: https://groups.google.com/forum/#!topic/packer-tool/UZ7rSpKKgyo

@marun
Contributor

marun commented Jul 18, 2015

Scratch that, removing the HWADDR entry did not work. Removing the interface scripts entirely at the end of image construction did resolve the issue:

rm /etc/sysconfig/network-scripts/ifcfg-[ep]*

@jameskyle
Contributor Author

@marun Hm, removing the HWADDR is one of a few post-flight cleanup tasks for Vagrant boxes.

For a complete overview of the networking cleanup tasks, my Packer build scripts are fairly well tested:

https://github.com/jameskyle/packer-vagrant/blob/master/scripts/network.sh

function configs() {
    rm -f /etc/udev/rules.d/70-persistent-net.rules
    rm -f /etc/sysconfig/network-scripts/ifcfg-eno*
    ln -sf /dev/null /etc/udev/rules.d/80-net-name-slot.rules
    sed -i '/^HWADDR/d' /etc/sysconfig/network-scripts/ifcfg-*
    sed -i "/^UUID/d" /etc/sysconfig/network-scripts/ifcfg-*
    cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-eth0
# Generated by packer image builder
DEVICE="eth0"
ONBOOT=yes
NETBOOT=yes
IPV6INIT=yes
BOOTPROTO=dhcp
TYPE=Ethernet
NAME="eth0"
EOF
}

@fredjean
Contributor

@marun, @jameskyle

The boxes that we are currently using were built through Bento. They do not appear to have the same cleanup strategy and leave behind an /etc/sysconfig/network-scripts/ifcfg-ens33 file and its associated interface.

I'm open to working on a custom Fedora box that incorporates the appropriate cleanup steps and posting it somewhere it will be available.
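
For what it's worth, a minimal sketch of the extra cleanup such a box build could run before packaging, combining the steps already discussed in this thread (the interface name ens33 and the file paths are carried over from the comments above):

# remove the stale interface script and persistent-net udev rule baked into the box
rm -f /etc/sysconfig/network-scripts/ifcfg-ens33
rm -f /etc/udev/rules.d/70-persistent-net.rules
# strip hardware-specific identifiers from any remaining interface scripts
sed -i '/^HWADDR/d;/^UUID/d' /etc/sysconfig/network-scripts/ifcfg-*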

@dguerri

dguerri commented Jul 26, 2015

Same problem with Parallels.
Quick workaround (waiting for a fix on the upstream box):

mkdir kubertemp
cd kubertemp/
vagrant init kube-fedora20
vagrant up
vagrant ssh -- sudo rm -f /etc/udev/rules.d/70-persistent-net.rules /etc/sysconfig/network-scripts/ifcfg-eth*
vagrant ssh -- sudo poweroff
vagrant package 
vagrant destroy -f
vagrant box remove kube-fedora20
vagrant box add --name kube-fedora20 package.box

@3goats

3goats commented Sep 26, 2015

Hmm, the last workaround doesn't work for vmware_fusion and Vagrant; "vagrant package" is not supported.

@tom-haines

I resolved the issue under vmware_fusion by leaving the default insecure Vagrant SSH keys in place while building the intermediate box that @dguerri suggested.

mkdir kubertemp && cd kubertemp
vagrant init kube-fedora20

# add config.ssh.insert_key = false into Vagrantfile

vagrant up

vagrant ssh -- sudo rm -f /etc/udev/rules.d/70-persistent-net.rules
vagrant ssh -- sudo rm -f /etc/sysconfig/network-scripts/ifcfg-eno*
vagrant ssh -- sudo ln -sf /dev/null /etc/udev/rules.d/80-net-name-slot.rules
vagrant ssh -- sudo sed -i '/^HWADDR/d' /etc/sysconfig/network-scripts/ifcfg-*
vagrant ssh -- sudo sed -i "/^UUID/d" /etc/sysconfig/network-scripts/ifcfg-*
# write the ifcfg-eth0 contents (same as @jameskyle's configs() function above) into the guest;
# tee is used so the redirection happens inside the VM rather than on the host
vagrant ssh -- sudo tee /etc/sysconfig/network-scripts/ifcfg-eth0 <<EOF
# Generated by packer image builder
DEVICE="eth0"
ONBOOT=yes
NETBOOT=yes
IPV6INIT=yes
BOOTPROTO=dhcp
TYPE=Ethernet
NAME="eth0"
EOF

vagrant ssh -- sudo poweroff

# package it
export BOX_KTEMP=`pwd`
cd .vagrant/machines/default/vmware_fusion/<machine_uuid>
tar cvzf package.box ./* && mv package.box $BOX_KTEMP && cd $BOX_KTEMP

vagrant destroy -f
vagrant box remove kube-fedora20
vagrant box add --name kube-fedora20 package.box

@markpollack

Just to help a bit, I am on Ubuntu 14.04 with Workstation 10.0.6 and Vagrant 1.7.4 and had the same MAC address issue. I tried upgrading to the Fedora 21 box and got past the network adapter configuration, but ran into another error:

==> master: Configuring network adapters within the VM...
==> master: Waiting for HGFS kernel module to load...
==> master: Enabling and configuring shared folders...
    master: -- /home/mpollack/software/kubernetes: /vagrant
==> master: Running provisioner: shell...
    master: Running: /tmp/vagrant-shell20151110-7889-196ijzy.sh
==> master: grep: 
==> master: /etc/sysconfig/network-scripts/ifcfg-eth1
==> master: : No such file or directory
==> master: Job for network.service failed. See "systemctl status network.service" and "journalctl -xe" for details.

I'll try @tom-haines' suggestion another time.
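
In case it helps narrow this down, the guest can be checked for which interface scripts the image actually ships versus which device names the kernel assigned; the ifcfg path below is the one from the error above:

ls /etc/sysconfig/network-scripts/ifcfg-*   # interface scripts present in the image
ip -o link show                             # device names the kernel actually assigned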

@markpollack

Looks like creating a VMware provider package is rather complicated, so I guess I'll have to try another way to run k8s. Hope this gets sorted out soon.

@k8s-github-robot k8s-github-robot added the needs-sig label May 31, 2017
@0xmichalis
Contributor

@kubernetes/sig-testing-misc @kubernetes/sig-cluster-lifecycle-misc

@k8s-ci-robot k8s-ci-robot added the sig/testing label Jun 9, 2017
@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle label Jun 9, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig label Jun 9, 2017
@ixdy
Member

ixdy commented Jun 9, 2017

Closing as obsolete, please reopen if this is still an issue.

@ixdy ixdy closed this as completed Jun 9, 2017