
Kube-proxy scaling/perf #1277

Closed
thockin opened this issue Sep 11, 2014 · 14 comments

Labels: area/kube-proxy, priority/awaiting-more-evidence, sig/scalability

Comments

@thockin
Member

thockin commented Sep 11, 2014

We should measure and (maybe) optimize kube-proxy at some point. There are some fairly easy opportunities to make it more efficient.
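
One rough way to measure it (an illustrative sketch, not something proposed in this thread; the IPs, the port, and the use of iperf are assumptions) is to benchmark the same backend pod twice, once directly by pod IP and once through the service IP that kube-proxy answers for, while watching what the proxy process itself costs:

# Hypothetical benchmark: compare direct-to-pod vs through-the-proxy throughput.
# The backend pod is assumed to run "iperf -s" on port 5001; IPs are placeholders.
POD_IP=10.244.1.5        # assumed pod IP
SERVICE_IP=10.0.0.42     # assumed service IP handled by kube-proxy

# Baseline: straight to the pod, bypassing kube-proxy.
iperf -c "$POD_IP" -p 5001 -t 30

# Through kube-proxy: same backend, but via the service IP.
iperf -c "$SERVICE_IP" -p 5001 -t 30

# While the second run is active, check the proxy's CPU cost on the node.
top -b -n 1 -p "$(pgrep -f kube-proxy)"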

@bgrant0607 bgrant0607 added the sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. label Oct 4, 2014
@hustcat
Contributor

hustcat commented Oct 10, 2014

Has there been any progress?

@thockin
Member Author

thockin commented Oct 10, 2014

We've had no reports of actual trouble, though there are some obvious things that could be improved code-wise. Are you having trouble?


@hustcat
Contributor

hustcat commented Oct 11, 2014

@thockin No. I just want to know whether kube-proxy performs better than Docker's iptables-based forwarding, so I'm also preparing to test kube-proxy performance.

@thockin
Member Author

thockin commented Oct 11, 2014

Funny, I was just thinking about proxy performance today. As for better or worse, it's just different - not the same goal, though superficially similar.
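
To make the difference concrete (an illustrative sketch, not commands from this thread; the image, the controller name, and current kubectl syntax are assumptions): Docker's -p publishes one container port on one host via a local iptables DNAT rule, while a Kubernetes service gives a stable virtual IP that kube-proxy answers on every node and spreads across all pods behind the service.

docker run -d -p 8080:80 nginx            # Docker: per-container DNAT on this host only
iptables -t nat -L DOCKER -n              # shows the DNAT rule Docker programmed

kubectl expose rc my-frontend --port=80   # Kubernetes: a cluster-wide virtual IP balancing over
                                          # every pod matched by the (hypothetical) my-frontend selector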

@KyleAMathews
Contributor

Fwiw, I set up my web app on Kubernetes this past week. I'm using GCE. I had a load balancer pointed at two Kubernetes minions, both running HAProxy doing SSL termination; HAProxy then routed requests to the web and API services. I switched off of it yesterday because page loads were somewhat slow (a difference of perhaps 100 ms) and I kept getting lots of ERR_CONTENT_LENGTH_MISMATCH errors in Chrome. One page has 10-15 images on it, and at least one or two of them would fail to load every time.

I switched instead to a single CoreOS box with HAProxy again routing to Docker containers. Load times have improved and the content length mismatch errors haven't returned.

@josselin-c
Contributor

Here are some measurements I did today between two hosts (Host A and Host B) with a ~250 Mbps network link between them. The machines are bare metal and each runs a pod (pod A on host A and pod B on host B). I use flannel for the overlay network.

During the whole test, pod B runs a netcat server that accepts connections and discards the output:

pod B# nc -l -p 12345 | dd of=/dev/null

Transfer from Pod A to Pod B:

pod A# dd if=/dev/urandom bs=1M count=100 of=payload
pod A# cat payload | netcat pod_b_ip 12345

Result (from pod B pov):

161167+57980 records in
204800+0 records out
104857600 bytes (105 MB) copied, 14.1918 s, 7.4 MB/s

Transfer from Host A to Pod B:

Host A# dd if=/dev/urandom bs=1M count=100 of=payload
Host A# cat payload | netcat pod_b_ip 12345

Result (from pod B pov):

158299+61844 records in
204800+0 records out
104857600 bytes (105 MB) copied, 4.29805 s, 24.4 MB/s
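
For scale (my arithmetic, not part of the original measurements), those dd figures work out to roughly 59 Mbit/s pod-to-pod and roughly 195 Mbit/s host-to-pod against the ~250 Mbps link:

echo "104857600 / 14.1918 * 8 / 1000000" | bc -l   # pod-to-pod:  ~59.1 Mbit/s
echo "104857600 / 4.29805 * 8 / 1000000" | bc -l   # host-to-pod: ~195.2 Mbit/s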

@thockin
Member Author

thockin commented Dec 9, 2014

Two things: First, this does not test kube-proxy at all. Second, this is exactly in line with what you can expect from standard Docker networking - veth is very expensive. I'm hoping we can switch to something better by default soon.


@thockin
Member Author

thockin commented Dec 9, 2014

I should have been clearer, sorry :)

Pod-to-pod traffic does not cross the proxy - this is just exercising the straight-up Ethernet performance of the system, which is a very interesting measure on its own.

The veth throughput is known to be a performance issue - it's one we've been tracking for a while, and we have a new kernel driver (ipvlan, written by Google's Mahesh Bandewar) that should give you 95-98% of native performance once we can integrate it with Docker.
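
For reference, ipvlan interfaces can be created by hand with iproute2 on a new enough kernel (Linux 3.19+); a minimal sketch, with the interface and namespace names made up for illustration:

ip link add link eth0 name ipvl0 type ipvlan mode l2   # ipvlan slave of eth0, L2 mode
ip netns add pod-ns
ip link set ipvl0 netns pod-ns                         # hand it to a (pod-like) namespace
ip netns exec pod-ns ip addr add 192.168.1.50/24 dev ipvl0
ip netns exec pod-ns ip link set ipvl0 up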


@josselin-c
Contributor

You're right, I should have created a Service to test the proxy code. Should I open a separate issue for my measurements, then? Most veth benchmarks still report results well above what I see here.
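
One way to do that (a sketch using today's kubectl syntax; the pod and service names are assumptions) would be to put a service in front of pod B's netcat listener and repeat the transfer against the service IP, so the connection traverses kube-proxy on top of the same veth path:

kubectl expose pod pod-b --name=nc-test --port=12345        # assumes pod B is named "pod-b"
SERVICE_IP=$(kubectl get svc nc-test -o jsonpath='{.spec.clusterIP}')
cat payload | netcat "$SERVICE_IP" 12345                    # now goes through the proxy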

@thockin
Member Author

thockin commented Dec 9, 2014

If you think there is a problem, go ahead and open an issue, and we can discuss. But your measurements align with my own - about 1/3 of wire throughput.

@goltermann goltermann added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Dec 17, 2014
@dchen1107 dchen1107 added sig/node Categorizes an issue or PR as relevant to SIG Node. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Feb 4, 2015
@roberthbailey
Contributor

Also see #3760 for a performance optimization of kube-proxy.

@bgrant0607
Member

A benchmark would be useful. I had a chat with a user today about proxy performance problems.

@quinton-hoole

@bgrant0607 bgrant0607 added team/cluster and removed sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 13, 2015
@wojtek-t
Member

This is an extremely old issue and a lot has been done since then. Can we close it?

@josselin-c
Contributor

It's okay with me


@fejta fejta closed this as completed Apr 20, 2016