fix udp service blackhole problem when number of backends changes from 0 to non-0 #48524

Merged
merged 3 commits into from
Jul 7, 2017

Conversation

freehan
Contributor

@freehan freehan commented Jul 6, 2017

fixes: #48370

Fix udp service blackhole problem when number of backends changes from 0 to non-0

@k8s-ci-robot added the cncf-cla: yes label Jul 6, 2017
@k8s-github-robot added the size/M and release-note-label-needed labels Jul 6, 2017
@freehan added the release-note-none and cherrypick-candidate labels and removed the release-note-label-needed label Jul 6, 2017
@k8s-cherrypick-bot

Removing label cherrypick-candidate because no release milestone was set. This is an invalid state and thus this PR is not being considered for cherry-pick to any release branch. Please add an appropriate release milestone and then re-add the label.

@freehan
Contributor Author

freehan commented Jul 6, 2017

Adding the cherrypick-candidate label because this can potentially solve a bunch of "my kube-dns does not work" cases. @dchen1107

@wojtek-t
Member

wojtek-t commented Jul 6, 2017

This looks sane to me, but I would prefer @bowei or @thockin to take a look.

@freehan
Contributor Author

freehan commented Jul 6, 2017

I will fix the unit test and repush

hostname string) (hcEndpoints map[types.NamespacedName]int, staleSet map[endpointServicePair]bool) {
staleSet = make(map[endpointServicePair]bool)

hostname string) (hcEndpoints map[types.NamespacedName]int, staleEndpoints map[endpointServicePair]bool, staleServiceNames map[proxy.ServicePortName]bool) {
Member

Can we create a struct for the return value so it's not so unwieldy?

type updateResult struct {
  healthcheckEndpoints map[types.NamespacedName]int
  staleEndpoints map[endpointServicePair]bool
  staleServices map[proxy.ServicePortName]bool
}
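
For illustration only, here is a minimal sketch of how a function could return such a struct instead of three separate maps. It is not the code from this PR: `namespacedName`, `servicePortName`, and `diffEndpoints` are hypothetical local stand-ins (the real signature uses `types.NamespacedName` and `proxy.ServicePortName`), and the health-check map is left empty to keep the example short.

```go
package main

// Local stand-ins (hypothetical) for types.NamespacedName,
// proxy.ServicePortName and the proxier's endpointServicePair.
type namespacedName struct {
	namespace, name string
}

type servicePortName struct {
	namespacedName
	port string
}

type endpointServicePair struct {
	endpoint        string
	servicePortName servicePortName
}

// updateResult bundles the values the update function would otherwise
// return as three separate maps.
type updateResult struct {
	healthcheckEndpoints map[namespacedName]int
	staleEndpoints       map[endpointServicePair]bool
	staleServices        map[servicePortName]bool
}

// diffEndpoints (hypothetical) compares two snapshots that map a service
// port to its endpoint addresses. It records endpoints that disappeared and
// services that went from zero endpoints to at least one.
func diffEndpoints(oldEps, newEps map[servicePortName][]string) updateResult {
	res := updateResult{
		healthcheckEndpoints: make(map[namespacedName]int), // not populated in this sketch
		staleEndpoints:       make(map[endpointServicePair]bool),
		staleServices:        make(map[servicePortName]bool),
	}
	// A service that had no endpoints before and has some now is "stale".
	for svc, eps := range newEps {
		if len(eps) > 0 && len(oldEps[svc]) == 0 {
			res.staleServices[svc] = true
		}
	}
	// An endpoint present in the old snapshot but not the new one is stale.
	for svc, eps := range oldEps {
		current := make(map[string]bool, len(newEps[svc]))
		for _, ep := range newEps[svc] {
			current[ep] = true
		}
		for _, ep := range eps {
			if !current[ep] {
				res.staleEndpoints[endpointServicePair{endpoint: ep, servicePortName: svc}] = true
			}
		}
	}
	return res
}
```

With a single return value, callers can pick out only the fields they need (for example `res.staleServices`), and adding another field later does not change the function signature.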

@thockin changed the title from "flush conntrack entry for udp service when # of backend changes from …" to "flush conntrack for udp service when # of backend changes from 0" Jul 6, 2017
@thockin
Member

thockin commented Jul 6, 2017

/approve

@bowei has LGTM

@k8s-github-robot added the approved label Jul 6, 2017
@freehan force-pushed the udp-service-flush branch from ea41ac0 to bd3552b July 6, 2017 22:59
@k8s-github-robot added the size/L label and removed the size/M label Jul 6, 2017
@freehan force-pushed the udp-service-flush branch from bd3552b to 68a2749 July 6, 2017 23:01
@shyamjvs
Member

shyamjvs commented Jul 7, 2017

/test pull-kubernetes-kubemark-e2e-gce
@wojtek-t @gmarek This should work now. IIUC the problem was this test was run before kubernetes/test-infra#3309 was merged but after they migrated to passing --test-args to kubetest directly from e2e-runner.sh. So the "--test" arg was wrongly enabled.

@shyamjvs
Member

shyamjvs commented Jul 7, 2017

Seems like it is still running test against the old e2e-runner.sh:

W0707 01:53:29.702] +(/workspace/e2e-runner.sh:53): main(): e2e_go_args+=(--test)

That line no longer exists in e2e-runner.sh.
@fejta Seems like the kubekins image you pushed yesterday wasn't up to date. I'll push a new one and update the tag.

@shyamjvs
Member

shyamjvs commented Jul 7, 2017

/test pull-kubernetes-kubemark-e2e-gce
It should hopefully work now.

@shyamjvs
Member

shyamjvs commented Jul 7, 2017

/test pull-kubernetes-kubemark-e2e-gce

@shyamjvs
Member

shyamjvs commented Jul 7, 2017

/test pull-kubernetes-kubemark-e2e-gce

@shyamjvs
Member

shyamjvs commented Jul 7, 2017

/test pull-kubernetes-kubemark-e2e-gce

@shyamjvs
Member

shyamjvs commented Jul 7, 2017

/test pull-kubernetes-kubemark-e2e-gce
(Sorry for the spam, I'm working on fixing it.)

@freehan
Contributor Author

freehan commented Jul 7, 2017

Ping

@bowei
Member

bowei commented Jul 7, 2017

/lgtm

@k8s-ci-robot added the lgtm label Jul 7, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bowei, freehan, thockin

Associated issue: 48370

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 48374, 48524, 48519, 42548, 48615)

@k8s-github-robot merged commit f0964b2 into kubernetes:master Jul 7, 2017
@wojtek-t
Member

@freehan - I'm fine with cherrypicking it to 1.7, but please add a release note to this PR (describing the bug this is fixing).

@freehan added the release-note label and removed the release-note-none label Jul 10, 2017
@freehan changed the title from "flush conntrack for udp service when # of backend changes from 0" to "fix udp service blackhole problem when number of backends changes from 0 to non-0" Jul 10, 2017
@wojtek-t added the cherry-pick-approved label Jul 12, 2017
@wojtek-t
Member

#48809 contains a cherrypick of it.

k8s-github-robot pushed a commit that referenced this pull request Jul 14, 2017
…49-upstream-release-1.7

Automatic merge from submit-queue

Automated cherry pick of #48849 upstream release 1.7

Cherry pick of #48524 and #48849 on release-1.7.

#48849 : GCE: Fix panic when service loadbalancer has static IP address
k8s-github-robot pushed a commit that referenced this pull request Jul 18, 2017
#48524-upstream-release-1.7

Automatic merge from submit-queue

Automated cherry pick of #48402 #48524 upstream release 1.7

Cherry pick of #48524 and #48402 on release-1.7.

#48524 : fix udp service blackhole problem when number of backends changes from 0 to non-0
#48402 : Local storage teardown fix
@k8s-cherrypick-bot

Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

openshift-merge-robot added a commit to openshift/origin that referenced this pull request Sep 15, 2017
Automatic merge from submit-queue (batch tested with PRs 15725, 16244, 15796, 16328, 16334)

Fix UDP service blackhole problem when number of endpoints changes from 0 to non-0

When a UDP service goes from 0 endpoints to 1, we need to run "conntrack -D ..." in case there are cached conntrack entries from pods hitting the "-j REJECT" iptables rule that gets installed for services with no endpoints.
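
As a rough sketch of the idea above (not the code from either PR): detect UDP services whose endpoint count went from zero to non-zero, then build the arguments for a `conntrack -D` call against each service IP. The names `serviceInfo`, `servicesNeedingFlush`, and `conntrackDeleteArgs` are hypothetical, and the exact flags (`--orig-dst <ip> -p udp`) are an assumption about what the elided "..." might contain.

```go
package main

import "sort"

// serviceInfo is a hypothetical, simplified view of one service port.
type serviceInfo struct {
	clusterIP string
	protocol  string // "TCP" or "UDP"
}

// servicesNeedingFlush returns the cluster IPs of UDP services whose
// endpoint count went from zero to non-zero between two snapshots.
// Both count maps are keyed by the same service name as services.
func servicesNeedingFlush(services map[string]serviceInfo, oldCount, newCount map[string]int) []string {
	var ips []string
	for name, n := range newCount {
		info, ok := services[name]
		if !ok || info.protocol != "UDP" {
			continue // only UDP services are considered here
		}
		// A missing key in oldCount reads as 0, i.e. "no endpoints before".
		if n > 0 && oldCount[name] == 0 {
			ips = append(ips, info.clusterIP)
		}
	}
	sort.Strings(ips) // deterministic order, since map iteration is random
	return ips
}

// conntrackDeleteArgs builds the argument list for deleting UDP conntrack
// entries whose original destination is the given service IP. The flag
// choice is an assumption; the result is meant to be passed to something
// like exec.Command("conntrack", args...).
func conntrackDeleteArgs(ip string) []string {
	return []string{"-D", "--orig-dst", ip, "-p", "udp"}
}
```

Keeping the detection separate from the command construction makes the 0 to non-0 logic testable without a conntrack binary on the machine.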

Additionally, we need to make sure that OpenShift nodes have conntrack-tools installed so that they can actually run /sbin/conntrack in this and other cases. (There are additional bugs open about fixing the official images.)

Upstream: kubernetes/kubernetes#48524
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1487438
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files.
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
release-note Denotes a PR that will be considered when it comes time to generate release notes.
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
8 participants