
Disable session affinity for internal kubernetes service #56690

Merged
1 commit merged into kubernetes:master on May 10, 2018

Conversation

@redbaron (Contributor) commented Dec 1, 2017

Under the following conditions, session affinity leads to a deadlock:

  • Self-hosted controller-manager that talks to the API servers
    via the kubernetes service ClusterIP
  • the default master-count reconciler is used
  • --apiserver-count is set to >1, following the help message
  • the number of responsive API servers drops below apiserver-count
  • all controller-managers happen to be hashed to API servers which
    are down.

When this happens, the controller managers are never able to
contact an API server, despite a correctly working API server
being available.

Less serious outages are also possible for other consumers of the
kubernetes service, such as operators, kube-dns, flannel & calico, etc.
There is always a non-zero chance that a given consumer is hashed to an
API server which is down.

This reverts PR #23129
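
For context, what is being toggled here is the sessionAffinity field of the default kubernetes Service maintained by the apiserver. A minimal sketch using the public core/v1 Go types (illustrative only; the actual change lives in the apiserver's master service code and is not reproduced here):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// With #23129 applied, SessionAffinity was ServiceAffinityClientIP, pinning
	// every client IP to a single apiserver endpoint. This PR switches it back
	// to ServiceAffinityNone so requests are spread across all endpoints.
	svc := corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "kubernetes", Namespace: metav1.NamespaceDefault},
		Spec: corev1.ServiceSpec{
			Ports:           []corev1.ServicePort{{Name: "https", Port: 443}},
			SessionAffinity: corev1.ServiceAffinityNone, // was: corev1.ServiceAffinityClientIP
		},
	}
	fmt.Println(svc.Name, "sessionAffinity:", svc.Spec.SessionAffinity)
}
```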

/sig api-machinery
CCing:

Release note: NONE


Revert "give the kubernetes service client ip session affinity"
This reverts commit e21ebbc.
@k8s-ci-robot added the sig/api-machinery, do-not-merge/release-note-label-needed, size/M, cncf-cla: yes, and needs-ok-to-test labels on Dec 1, 2017
@k8s-ci-robot added the release-note-none label and removed the do-not-merge/release-note-label-needed label on Dec 1, 2017
@dims (Member) commented Dec 2, 2017

/ok-to-test

/assign @mikedanese
/assign @lavalamp

@enisoc fyi, sounds serious for 1.9 release

@k8s-ci-robot removed the needs-ok-to-test label on Dec 2, 2017
@mikedanese (Member) commented

I think we don't want this for the reasons mentioned in the original PR. The issue you mention (#22609) should be fixed by #51698 which we are currently transitioning to. cc @rphillips @ncdc

@redbaron (Contributor, Author) commented Dec 4, 2017

The lease endpoint reconciler is an alpha feature and is not going to be the default reconciler anytime soon; IMHO it cannot be the "solution" to this deadlock until it is the default.

The original PR addresses hypothetical issues with ordering at the cost of a very real deadlock, which at least 3 users experienced: they spent time debugging, narrowed it down to an individual commit, and then raised their concerns in that PR's comments.

The original PR was merged within a day without any discussion.

The essence of the original PR is that the answer to the following question:

Is having 1/3 of all requests fail for all clients worse than having all requests fail for 1/3 of clients?

was essentially given as "yes, it is better to have all requests of 1/3 of users fail", which is clearly the wrong answer: if you fail all requests for 1/3 of users, they have NO CHANCE of recovery. Given that there are enough singleton users of the kubernetes API, this should be obvious.

TL;DR: The original PR does more damage than good and should be reverted.
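
To put rough numbers on the 1/3 argument above, a back-of-the-envelope sketch (the counts are assumed for illustration only):

```go
package main

import "fmt"

func main() {
	// Assumed example: 3 apiservers behind the kubernetes service, 1 of them down
	// but still listed as an endpoint (the master-count reconciler is slow to drop it).
	apiservers, down := 3.0, 1.0

	// Without affinity, each connection is load-balanced independently, so roughly
	// 1/3 of individual requests fail and a retrying client eventually gets through.
	perRequestFailure := down / apiservers

	// With ClientIP affinity, a client is pinned to one endpoint. If that endpoint
	// is the dead one, every request from that client fails and retries do not help:
	// the client stays stuck until the endpoint list changes.
	stuckClients := down / apiservers

	fmt.Printf("without affinity: ~%.0f%% of requests fail, no client is fully blocked\n", perRequestFailure*100)
	fmt.Printf("with affinity: ~%.0f%% of clients see a 100%% failure rate\n", stuckClients*100)
}
```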

@mikedanese assigned wojtek-t and unassigned pmorie and soltysh on Dec 4, 2017
@wojtek-t (Member) commented Dec 5, 2017

I took a look at the things mentioned in this PR. Some of my comments:

IMHO it cannot be the "solution" to this deadlock until it is the default

+1

Because each apiserver serves from a cache (unless the linearize read flag is passed explicitly) and each apiserver cache could be a different degree behind etcd.

I don't fully understand this original argument. The requests that are served from the apiserver cache are:

  • WATCH (but this is by-design eventual consistency, and I'm not convinced that this matters here)
  • GET, LIST, but ONLY when you explicitly opt in to it (by setting the ResourceVersion param).

And I really hope that all opt-ins for that were done carefully.
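
For reference, the opt-in mentioned above is the client explicitly passing ResourceVersion "0", which allows the apiserver to answer from its watch cache. A minimal client-go sketch (using current client-go method signatures, which take a context and differ from the 2017-era ones):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Default LIST: no ResourceVersion set, served as a consistent read from etcd.
	pods, err := client.CoreV1().Pods(metav1.NamespaceDefault).List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	// Explicit opt-in: ResourceVersion "0" lets the apiserver serve the list from
	// its watch cache, which may lag behind etcd to a different degree per apiserver.
	cached, err := client.CoreV1().Pods(metav1.NamespaceDefault).List(context.TODO(),
		metav1.ListOptions{ResourceVersion: "0"})
	if err != nil {
		panic(err)
	}

	fmt.Println("consistent:", len(pods.Items), "cached:", len(cached.Items))
}
```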

What we should fix is the latency of a bad endpoint being removed from the master service.

This has been significantly improved since it was mentioned. So maybe that's no longer as important now?

if you fail all requests for 1/3 of users, they have NO CHANCE of recovery. Given that there are enough singleton users of the kubernetes API, this should be obvious.

+1

Personally, I have a slight preference for merging it (until we have the new reconciler), but I don't have a strong opinion. I would like to hear opinions from others, including @lavalamp, who LGTM-ed the original PR, and @bowei, because I may be missing something.

@mikedanese (Member) commented

The PR in question was merged a year and a half ago, so I don't think it can be argued that this is urgent. I'm ok with whatever @wojtek-t and @lavalamp say, since they are the experts. However, this is a significant change that affects all in-cluster clients, and I am concerned about invalidating testing this late in a release cycle.

@wojtek-t (Member) commented Dec 5, 2017

The PR in question was merged a year and a half ago, so I don't think it can be argued that this is urgent

+1 - that's why I would like to hear other opinions.
If we decide that it should be reverted, we may cherry-pick it to previous releases if needed.
But that needs to be a conscious decision.

@jsravn (Contributor) commented Dec 5, 2017

The original PR caused an outage for us. We worked around it by manually overriding session affinity on the apiserver service. The original PR breaks HA apiservers, in my opinion, and should be reverted.
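
The comment does not say how the override was applied; one possible way to do it (a sketch, assuming in-cluster credentials with permission to patch Services in the default namespace) is to patch spec.sessionAffinity back to None:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Force session affinity off on the default kubernetes Service. Note that the
	// apiserver itself manages this Service, so a manual override like this may be
	// reverted by the control plane.
	patch := []byte(`{"spec":{"sessionAffinity":"None"}}`)
	svc, err := client.CoreV1().Services(metav1.NamespaceDefault).Patch(
		context.TODO(), "kubernetes", types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("sessionAffinity is now", svc.Spec.SessionAffinity)
}
```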

@redbaron (Contributor, Author) commented

Now that KubeCon is over, any chance this can be looked at? :)

@wojtek-t (Member) commented

@lavalamp - friendly ping

@redbaron (Contributor, Author) commented Feb 6, 2018

@lavalamp, can you have a look and add your opinion on this, please?

@fejta-bot commented

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on May 7, 2018
@redbaron (Contributor, Author) commented May 7, 2018

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label on May 7, 2018
@lavalamp (Member) commented May 7, 2018

The justification I gave for LGTMing the original PR is no longer valid, as consistent reads are on by default now. I think I may have also thought that session affinity was implemented by caching and not by static hashing--I am not sure how it is implemented at the moment.

We should stop using the --master-count mechanism ASAP, though, now that we have something better.

My apologies that you had to wait for an additional kubecon before I noticed this :)

/lgtm
/approve

@k8s-ci-robot added the lgtm label on May 7, 2018
@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, redbaron

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubernetes deleted a comment from k8s-github-robot on May 7, 2018
@k8s-ci-robot added the approved label on May 7, 2018
@wojtek-t (Member) commented

/retest

@redbaron (Contributor, Author) commented

/test pull-kubernetes-e2e-gce

@k8s-github-robot commented

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot commented

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot merged commit 0ba8002 into kubernetes:master on May 10, 2018
k8s-github-robot pushed a commit that referenced this pull request Jun 22, 2018
…y_for_1_10

Automatic merge from submit-queue.

[backport] Disable session affinity for internal kubernetes service


**What this PR does / why we need it**:
Backporting #56690 to 1.10 release branch.

**Which issue(s) this PR fixes** 
Fixes #23129

**Release note**:
```release-note
Disable session affinity for internal kubernetes service - Backport of #56690 to 1.10 release branch
```
k8s-github-robot pushed a commit that referenced this pull request Jun 27, 2018
…690-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #56690: Disable session affinity for internal kubernetes service

Cherry pick of #56690 on release-1.8.

#56690: Disable session affinity for internal kubernetes service
k8s-github-robot pushed a commit that referenced this pull request Jul 14, 2018
…690-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #56690: Disable session affinity for internal kubernetes service

Cherry pick of #56690 on release-1.9.

#56690: Disable session affinity for internal kubernetes service