
Kube-proxy scaling/perf #1277

Closed
thockin opened this issue Sep 11, 2014 · 14 comments

Labels: area/kube-proxy, priority/awaiting-more-evidence, sig/scalability

Comments

@thockin
Member

thockin commented Sep 11, 2014

We should measure and (maybe) optimize kube-proxy at some point. There are some fairly easy opportunities to make it more efficient.
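
One rough way to measure it (an illustrative sketch, not something proposed in this thread; the IPs, the port, and the use of iperf are assumptions) is to benchmark the same backend pod twice, once directly by pod IP and once through the service IP that kube-proxy answers for, while watching what the proxy process itself costs:

# Hypothetical benchmark: compare direct-to-pod vs through-the-proxy throughput.
# The backend pod is assumed to run "iperf -s" on port 5001; IPs are placeholders.
POD_IP=10.244.1.5        # assumed pod IP
SERVICE_IP=10.0.0.42     # assumed service IP handled by kube-proxy

# Baseline: straight to the pod, bypassing kube-proxy.
iperf -c "$POD_IP" -p 5001 -t 30

# Through kube-proxy: same backend, but via the service IP.
iperf -c "$SERVICE_IP" -p 5001 -t 30

# While the second run is active, check the proxy's CPU cost on the node.
top -b -n 1 -p "$(pgrep -f kube-proxy)"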

@bgrant0607 bgrant0607 added the sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. label Oct 4, 2014
@hustcat
Contributor

hustcat commented Oct 10, 2014

Has there been any progress?

@thockin
Member Author

thockin commented Oct 10, 2014

We've had no reports of actual trouble, though there are some obvious things that could be improved code-wise. Are you having trouble?


@hustcat
Contributor

hustcat commented Oct 11, 2014

@thockin No. I just want to know whether kube-proxy performs better than Docker's iptables-based forwarding, so I'm also preparing to test kube-proxy performance.

@thockin
Member Author

thockin commented Oct 11, 2014

Funny, I was just thinking about proxy performance today. As for better or worse, it's just different - not the same goal, though superficially similar.
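
To make the difference concrete (an illustrative sketch, not commands from this thread; the image, the controller name, and current kubectl syntax are assumptions): Docker's -p publishes one container port on one host via a local iptables DNAT rule, while a Kubernetes service gives a stable virtual IP that kube-proxy answers on every node and spreads across all pods behind the service.

docker run -d -p 8080:80 nginx            # Docker: per-container DNAT on this host only
iptables -t nat -L DOCKER -n              # shows the DNAT rule Docker programmed

kubectl expose rc my-frontend --port=80   # Kubernetes: a cluster-wide virtual IP balancing over
                                          # every pod matched by the (hypothetical) my-frontend selector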

@KyleAMathews
Contributor

Fwiw, I set up my web app on Kubernetes this past week. I'm using GCE. I had a load balancer pointed at two Kubernetes minions, both running HAProxy doing SSL termination; HAProxy then routed requests to the web and API services. I switched off of it yesterday because page loads were somewhat slow (a difference of perhaps 100 ms) and I kept getting lots of ERR_CONTENT_LENGTH_MISMATCH errors in Chrome. One page has 10-15 images on it, and at least one or two of them would fail to load every time.

I switched instead to a single CoreOS box with HAProxy again routing to Docker containers. Load times have improved and the content length mismatch errors haven't returned.

@josselin-c
Contributor

Here are some measurements I did today between two hosts (Host A and Host B) with a ~250 Mbps network link between them. The machines are bare metal and each runs a pod (pod A on host A and pod B on host B). I use flannel for the overlay network.

During the whole test, pod B runs a netcat server that accepts connections and discards the output:

pod B# nc -l -p 12345 | dd of=/dev/null

Transfer from Pod A to Pod B:

pod A# dd if=/dev/urandom bs=1M count=100 of=payload
pod A# cat payload | netcat pod_b_ip 12345

Result (from pod B pov):

161167+57980 records in
204800+0 records out
104857600 bytes (105 MB) copied, 14.1918 s, 7.4 MB/s

Transfer from Host A to Pod B:

Host A# dd if=/dev/urandom bs=1M count=100 of=payload
Host A# cat payload | netcat pod_b_ip 12345

Result (from pod B pov):

158299+61844 records in
204800+0 records out
104857600 bytes (105 MB) copied, 4.29805 s, 24.4 MB/s
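
For scale (my arithmetic, not part of the original measurements), those dd figures work out to roughly 59 Mbit/s pod-to-pod and roughly 195 Mbit/s host-to-pod against the ~250 Mbps link:

echo "104857600 / 14.1918 * 8 / 1000000" | bc -l   # pod-to-pod:  ~59.1 Mbit/s
echo "104857600 / 4.29805 * 8 / 1000000" | bc -l   # host-to-pod: ~195.2 Mbit/s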

@thockin
Member Author

thockin commented Dec 9, 2014

Two things: First, this does not test kube-proxy at all. Second, this is exactly in line with what you can expect from standard Docker networking - veth is very expensive. I'm hoping we can switch to something better by default soon.


@thockin
Member Author

thockin commented Dec 9, 2014

I should have been clearer, sorry :)

Pod-to-pod traffic does not cross the proxy - this is just exercising the straight-up Ethernet performance of the system, which is a very interesting measure on its own.

The veth throughput is known to be a performance issue - it's one we've been tracking for a while, and we have a new kernel driver (ipvlan, written by Google's Mahesh Bandewar) that should give you 95-98% of native performance once we can integrate it with Docker.
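
For reference, ipvlan interfaces can be created by hand with iproute2 on a new enough kernel (Linux 3.19+); a minimal sketch, with the interface and namespace names made up for illustration:

ip link add link eth0 name ipvl0 type ipvlan mode l2   # ipvlan slave of eth0, L2 mode
ip netns add pod-ns
ip link set ipvl0 netns pod-ns                         # hand it to a (pod-like) namespace
ip netns exec pod-ns ip addr add 192.168.1.50/24 dev ipvl0
ip netns exec pod-ns ip link set ipvl0 up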


@josselin-c
Contributor

You're right, I should have created a Service to test the proxy code. Should I open a separate issue for my measurements, then? Most veth benchmarks still report results well above what I see here.
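
One way to do that (a sketch using today's kubectl syntax; the pod and service names are assumptions) would be to put a service in front of pod B's netcat listener and repeat the transfer against the service IP, so the connection traverses kube-proxy on top of the same veth path:

kubectl expose pod pod-b --name=nc-test --port=12345        # assumes pod B is named "pod-b"
SERVICE_IP=$(kubectl get svc nc-test -o jsonpath='{.spec.clusterIP}')
cat payload | netcat "$SERVICE_IP" 12345                    # now goes through the proxy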

@thockin
Member Author

thockin commented Dec 9, 2014

If you think there is a problem, go ahead and open an issue, and we can discuss. But your measurements align with my own - about 1/3 of wire throughput.

@goltermann goltermann added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Dec 17, 2014
@dchen1107 dchen1107 added sig/node Categorizes an issue or PR as relevant to SIG Node. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Feb 4, 2015
@roberthbailey
Contributor

Also see #3760 for a performance optimization of kube-proxy.

@bgrant0607
Member

A benchmark would be useful. I had a chat with a user today about proxy performance problems.

@quinton-hoole

@bgrant0607 bgrant0607 added team/cluster and removed sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 13, 2015
@wojtek-t
Member

This is an extremely old issue and a lot has been done since then. Can we close it?

@josselin-c
Contributor

It's okay with me


@fejta fejta closed this as completed Apr 20, 2016