-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extend timeout to workaround slow arm64 math #66264
extend timeout to workaround slow arm64 math #66264
Conversation
/assign @liggitt |
You say "if a server uses RSA certificates". Does this also happen with elliptic curves? |
I haven't tested that yet but according to the conversations I found in the golang community, ecdsa should not be a problem. |
(Nice find! I noticed that TLS handshaking was very slow on my rPi2s (arm32), and ended up disabling etcd TLS encryption altogether (!) because etcd at the time didn't offer a way to extend the extremely aggressive 1s connect timeout to peers (typically completed in ~1.1s on my rpi2s, worse when under load). I shall have to retry with ecdsa certs to see how they perform...) |
Hey! I used your changes to build kubernetes for me to test with but how do I actually patch the changes onto my raspberry pi? I have the kube-apiserver binary here but I can't seem to figure out where to place it so that kubernetes uses it - can I create the necessary docker images on my pc or would I have to run that process on my Raspberry Pi? |
/ok-to-test |
/assign @xiang90 |
/test pull-kubernetes-e2e-gce |
@@ -40,7 +40,7 @@ var ( | |||
keepaliveTime = 30 * time.Second | |||
keepaliveTimeout = 10 * time.Second | |||
// dialTimeout is the timeout for failing to establish a connection. | |||
dialTimeout = 10 * time.Second | |||
dialTimeout = 20 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are really manifest constants then vars. We should also cross-link the issue or some meta-data to outline how this magic number was chosen.
We have O(100)s of magic numbers in the system with little/no history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered that as well.
The original number seemed to be arbitrary and the new number was just a double of that so equally as arbitrary. When go1.11 is released and is used to build the binaries, this workaround will be moot and can be reverted.
I did reference the hardware to which this was a problem and that this number was chosen as it alleviated that problem for said hardware. There is an issue linked. What further meta-data would you be interested in seeing?
cc649f8
to
d8bf732
Compare
d8bf732
to
62b9d37
Compare
The math/big functions are slow on arm64. There is improvement coming with go1.11 but in the mean time if a server uses rsa certificates on arm64, the math load for the multitude of watches over taxes the ability of the processor and the TLS connections time out. Retries will also not succeed and serve to exacerbate the problem. By extending the timeout, the TLS connections will eventually be successful and the load will drop. Fixes kubernetes#64649
/test pull-kubernetes-kubemark-e2e-gce-big |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: joejulian, timothysc The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Automatic merge from submit-queue (batch tested with PRs 66341, 66405, 66403, 66264, 66447). If you want to cherry-pick this change to another branch, please follow the instructions here. |
/cc @cheftako |
/sig api-machinery |
What this PR does / why we need it:
The math/big functions are slow on arm64. There is improvement coming
with go1.11 but until such time as that version can be used to build releases,
if a server uses rsa certificates on arm64, the math load for the multitude
of watches over-taxes the ability of the processor and the TLS connections
time out. Retries will also not succeed and serve to exacerbate the problem.
By extending the timeout, the TLS connections will eventually be
successful and the load will drop.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #64649
Special notes for your reviewer:
This was tested on a Raspberry Pi 3
Release note: