Mitigate impact of docker 1.8 ErrNoAvailableIPs bug #19477
Comments
cc/ @ArtfulCoder This looks like the issue I ran into a while back, but I never could reproduce it. cc/ @thockin @2opremio Can you run 'docker info' and copy&paste the result here? Can you also run "docker ps -a" and check how many containers are running and how many have exited? One possible issue in Kubelet which could cause this leakage is that Kubelet fails to recycle the POD infra container holding the network namespace for each pod. If there is no POD container leakage as described above, you might be running into this docker network issue. Docker actually does the IP allocation here.
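For reference, a quick way to get the counts being asked for here (a sketch; the status filter is docker's own, the numbers are what matters):

# number of running containers
docker ps -q | wc -l
# number of exited containers
docker ps -aq -f status=exited | wc -l
# daemon details, including storage/network driver and version
docker info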
Sure:
BTW, I am running k8s in AWS using a setup very similar to the one created with https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/aws.md
As I mentioned above, I cleaned up the Exited containers with the docker rm one-liner from the issue description. Before doing this, there were a lot of them in the Exited state due to the kubelet retries.
I don't think that's the case here. There are no dangling POD infra containers.
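A quick way to check that (a sketch; it assumes the POD infra containers are based on the gcr.io/google_containers/pause image kubelet used at the time):

# count POD infra ("pause") containers, running or exited
docker ps -a | grep -c "gcr.io/google_containers/pause"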
I may be missing something, but then I don't see why Docker doesn't deallocate the IP of containers once they Exit. Maybe it's a bug triggered by the hostport. In any case, knowing this is Docker's behaviour, kubelet should probably garbage collect the Exited containers immediately after retrying. Or, at the very least, set a limit on the number of Exited containers to keep.
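As a stop-gap until kubelet (or docker) handles this, something like the following cron-style loop could keep the Exited count bounded (a sketch of the mitigation being proposed, not of anything kubelet actually does; the 5-minute interval is arbitrary):

# periodically remove Exited containers so their IPs can be reclaimed
while true; do
  docker ps -aq -f status=exited | xargs -r docker rm
  sleep 300
done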
This is most likely a docker issue fixed in 1.9. I can repro on 1.8 quite easily with:
# docker run -d -p 80:80 gcr.io/google_containers/nginx
# while true; do docker run -d -p 80:80 gcr.io/google_containers/nginx; done
# while true; do docker run -it busybox sh -c 'ip addr | grep eth0'; done
14765: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue
    inet 10.245.1.104/24 scope global eth0
14771: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue
    inet 10.245.1.225/24 scope global eth0
Error response from daemon: Cannot start container 3415f31b1ac17487c304599a2926af2cabcd7f2544738c7e4d77acf5cebb1850: no available ip addresses on network
If I had to guess I'd say it's something to do with hitting 255 and wrapping around to 10.245.1.0, when the bridge is offset by one. Since 1.8 they seem to have changed their IP allocation policy to reuse IPs as they're released.
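To sanity-check the wrap-around theory, the bridge address and subnet can be inspected directly (a sketch; docker0 is the stock docker bridge, while a kube node would use cbr0):

# the bridge usually sits at .1 of the /24, i.e. the pool is offset by one
ip addr show docker0 | grep "inet "
ip addr show cbr0 | grep "inet "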
Yeah, it's related to the hostport.
In any case, when we move to using CNI's bridge plugin, we will be handling IPAM ourselves.
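For context, that would look roughly like the following CNI network config using the bridge plugin with host-local IPAM (a sketch only; the file name, bridge name, and subnet here are assumptions, not decisions from this thread):

# hypothetical CNI config dropped on a node
cat > /etc/cni/net.d/10-cbr0.conf <<'EOF'
{
  "name": "cbr0",
  "type": "bridge",
  "bridge": "cbr0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.245.1.0/24"
  }
}
EOF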
Related: moby/moby#14788 flannel-io/flannel#315
Closing this, since I am now convinced it's a docker bug (most probably moby/moby#14788). BTW, removing the exited containers doesn't seem to be the solution (it's just a mitigation which deallocates a few IPs). Docker really is leaking IPs from containers which no longer exist: it reaches a point at which removing Exited containers doesn't help and the number of running containers is way below the number of IPs the bridge provides. The solution seems to be upgrading to docker >= 1.9, which we don't want to do due to its performance problems: moby/moby#17720
@kubernetes/goog-node @kubernetes/goog-cluster this is an easy DoS attack. If we go to 1.2 with docker 1.8 we should detect and mitigate.
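One crude way a node could detect that it has hit this state (a sketch, assuming the docker daemon logs to journald; the error string is the one shown earlier in this thread):

# flag nodes whose docker daemon has started refusing IP allocations
journalctl -u docker --since "10 minutes ago" \
  | grep -q "no available ip addresses on network" \
  && echo "IP pool exhausted on $(hostname)"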
Kubernetes has dropped support for docker 1.8.X, so closing the issue.
I am using a kubernetes 1.1 cluster.
It seems that errors creating containers cause kubelet to leak IPs.
See the following excerpt from kubectl describe pod <podname> after creating a replication controller whose pod was exposing an address already in use by the host. The first error is legitimate:
Error starting userland proxy: listen tcp 0.0.0.0:4040: bind: address already in use
because the host was in fact using port 4040. However, after some time of retrying (during which I was investigating why that address was in use and forgot to delete the replication controller), the error changed to
no available ip addresses on network
This is wrong, since no other pod was being started in the meantime and there were plenty of IPs available (the host was only running 10 pods and I am using a /24 CIDR for cbr0, which should allow for 256 pods). I waited a few minutes to see if kubelet somehow garbage collected the IPs of the failed containers, but the IPs were only freed after I manually removed all the Exited docker containers with
sudo docker ps -a | grep Exit | cut -d ' ' -f 1 | xargs sudo docker rm
My guess is kubelet reserves IPs for the failing containers but never deallocates them (at least not in a reasonable amount of time).
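For completeness, a minimal replication controller along the lines of the scenario described above (a sketch; the name, labels, and image are hypothetical, the only relevant part is a hostPort that is already bound on the node):

cat <<'EOF' | kubectl create -f -
apiVersion: v1
kind: ReplicationController
metadata:
  name: hostport-demo
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: hostport-demo
    spec:
      containers:
      - name: web
        image: gcr.io/google_containers/nginx
        ports:
        - containerPort: 80
          hostPort: 4040
EOF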