Fix network connection problem on Azure #72879
Conversation
Hi @marwinski. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: marwinski. If they are not already assigned, you can assign the PR to them by writing /assign in a comment. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/assign @matchstick @thockin
Please also see #72534 - especially the last commit. Would be nice to test your patch on top of that one.
Do we need an equivalent IPVS mode fix?
@m1093782566 for consultation - these are starting to pile up :)
// (immediate reject instead of connection timeout). This also avoids a stale entry in the
// conntrack table.
writeLine(proxier.filterRules,
	"-A", string(kubeForwardChain),
I think this wants to be kubeServicesChain.
kubeServicesChain is called from OUTPUT. To prevent the default FORWARD handling we need to add the rule to the FORWARD chain; it has no effect in the KUBE-SERVICES chain. Not sure whether it would work there, or whether it should instead be added to the raw PREROUTING or mangle PREROUTING chains.
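For readers following along, a rough sketch of the chain wiring being discussed. This is illustrative only: the jump rules are paraphrased from kube-proxy's iptables mode around the time of this PR, with conntrack-state matches omitted.

# Locally generated traffic traverses OUTPUT, which jumps to KUBE-SERVICES;
# routed traffic (e.g. packets arriving from a load balancer and destined for
# pods) traverses FORWARD, which jumps to KUBE-FORWARD.
iptables -t filter -A OUTPUT  -m comment --comment "kubernetes service portals"  -j KUBE-SERVICES
iptables -t filter -A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD

# A REJECT placed only in KUBE-SERVICES is therefore never consulted for
# forwarded traffic; it has to sit in a chain reached from FORWARD.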
We fixed forwarding to be called in both cases - please rebase and retry in kubeServicesChain.
This somehow contradicts the other comment. Sure, I will change this if the other comment is resolved.
I don't understand what you mean - what other comment? In my PR I linked INPUT to kubeServicesChain, too.
IPVS has no such issue.
/ok-to-test
Add a reject iptables rule for services without endpoints to behave consistently for the transparent Azure load balancer. This also avoids the creation of stale conntrack entries, which cause a denial of service as port numbers on Azure are sometimes aggressively re-used.
Force-pushed from 3e88f39 to 6e01139
/retest
1 similar comment
/retest
/lgtm
As discussed here: #48719 (comment), this has to wait for now, so /hold
// Add a reject rule so the behavior is the same for clients going via the load balancer
// (immediate reject instead of connection timeout). This also avoids a stale entry in the
// conntrack table.
writeLine(proxier.filterRules,
I think this rule should be part of the https://github.com/kubernetes/kubernetes/pull/72879/files#diff-d51765b83fe795b469e8a86276b12dc9L911 logic. There is a rule there already for ClusterIPs without endpoints.
I don't think it would work there. This code path is used for the externalIPs feature in Services. It is not used for external load balancers (at least we don't see it being used with load balancers on Azure, only with the externalIPs feature). I admit I still have somewhat limited knowledge of the code, but it appears to me that the rule is exactly where it should be.
Yes, that is the correct path. The problem happens with Services of type LoadBalancer that have ExternalIPs and no endpoints. For Azure (or anything that uses DSR) no traffic will be delivered to the node if the rule is not in place (and the Service has an ExternalIP).
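To make this concrete, a minimal sketch of the kind of reject rule under discussion, assuming a made-up load balancer IP, port, and service name. With DSR-style load balancing the packets arrive at the node with the external IP still as the destination, so they traverse the FORWARD path and hit this rule; without it, the connection attempt is simply never answered and the client times out.

# Illustration only: IP, port, and service name are placeholders.
iptables -t filter -A KUBE-FORWARD \
  -d 52.0.0.1/32 -p tcp --dport 443 \
  -m comment --comment "default/example-svc has no endpoints" \
  -j REJECT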
/lgtm
So let us merge one of them.
We should not merge this without tests. I am a little buried in PRs and docs right now - helping to finish the test work on #74394 would be the way forward.
Closing in favor of #74394
What type of PR is this?
What this PR does / why we need it:
Add a reject iptables rule for LoadBalancer-type services without endpoints to behave consistently for the transparent Azure load balancer.
This PR avoids the creation of stale conntrack entries when there is temporarily no endpoint (which happens when components crash or are updated). This causes a denial of service because source port numbers on Azure are sometimes aggressively re-used within 15 to 20 seconds, and each re-use resets the default 120 second timeout on the stale conntrack table entry. We have seen this persist for 7+ hours, causing requests to fail randomly with timeouts even though the service had been restored almost immediately. Once there is a conntrack entry, the iptables rules are not processed again and the action that was valid when the entry was created is applied; in this case that action is "forward".
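As an illustration of the failure mode (the destination address and port below are placeholders, not from the actual setup), the lingering entries can be inspected and flushed with conntrack-tools:

# Illustration only: destination IP and port are placeholders.
# List conntrack entries towards the load balancer; a stale entry keeps having
# its ~120s timeout refreshed while the NAT gateway re-uses the source port.
conntrack -L -d 52.0.0.1 -p tcp --dport 443

# Deleting the stale entries restores connectivity until the next endpoint
# outage; the reject rule added by this PR prevents them from being created.
conntrack -D -d 52.0.0.1 -p tcp --dport 443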
This resolves a serious issue with the Gardener project. In Gardener the API servers are behind a load balancer, and the shoot clusters access the kube-apiserver via a NAT gateway. In this setup we see many failures of components trying to access the API, resulting in dodgy and sometimes unresponsive clusters.
Which issue(s) this PR fixes:
Fixes #48719
Special notes for your reviewer:
Does this PR introduce a user-facing change?: