
WIP improve kube-proxy conntrack resilience #92122

Closed
aojea wants to merge 2 commits

Conversation

aojea (Member) commented Jun 14, 2020

What type of PR is this?

/kind failing-test

What this PR does / why we need it:

Analyzing the current conntrack test failure
https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/92122/pull-kubernetes-e2e-kind/1272167341655855104

we can see that a client pod cannot connect to a NodePort UDP service, even though the backend is running.
Looking at the kube-proxy logs https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/92122/pull-kubernetes-e2e-kind/1272103421071069186/artifacts/logs/kind-worker2/containers/kube-proxy-vxqmd_kube-system_kube-proxy-771c07013332c57aee7a54f659eee28428265c6bdc33c3c80155c30ea112991e.log
we can see that kube-proxy skips the iptables sync cycle, and with it the logic that deletes the stale conntrack entries:

2020-06-14T10:41:16.713324865Z stderr F Trace[1260496449]: [2.833057484s] [2.833057484s] END
2020-06-14T10:41:37.655915121Z stderr F I0614 10:41:37.629312       1 trace.go:116] Trace[1898108080]: "iptables restore" (started: 2020-06-14 10:41:34.859334571 +0000 UTC m=+1248.192582056) (total time: 2.769920001s):
2020-06-14T10:41:37.6559627Z stderr F Trace[1898108080]: [2.769920001s] [2.769920001s] END
2020-06-14T10:44:53.090771324Z stderr F E0614 10:44:53.087163       1 proxier.go:866] Failed to ensure that filter chain INPUT jumps to KUBE-EXTERNAL-SERVICES: error checking rule: exit status 4: Another app is currently holding the xtables lock; still 4s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:44:53.090804341Z stderr F Another app is currently holding the xtables lock; still 3s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:44:53.090812193Z stderr F Another app is currently holding the xtables lock; still 2s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:44:53.090817632Z stderr F Another app is currently holding the xtables lock; still 1s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:44:53.090822825Z stderr F Another app is currently holding the xtables lock; still 0s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:44:53.090831003Z stderr F Another app is currently holding the xtables lock. Stopped waiting after 5s.
2020-06-14T10:44:53.09083729Z stderr F I0614 10:44:53.087196       1 proxier.go:850] Sync failed; retrying in 30s
2020-06-14T10:45:07.958821003Z stderr F E0614 10:45:07.957005       1 proxier.go:858] Failed to ensure that filter chain KUBE-FORWARD exists: error creating chain "KUBE-FORWARD": exit status 4: Another app is currently holding the xtables lock; still 4s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:45:07.958863181Z stderr F Another app is currently holding the xtables lock; still 3s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:45:07.958870983Z stderr F Another app is currently holding the xtables lock; still 2s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:45:07.958879755Z stderr F Another app is currently holding the xtables lock; still 1s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:45:07.958885596Z stderr F Another app is currently holding the xtables lock; still 0s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:45:07.958893286Z stderr F Another app is currently holding the xtables lock. Stopped waiting after 5s.
2020-06-14T10:45:07.958900667Z stderr F I0614 10:45:07.957038       1 proxier.go:850] Sync failed; retrying in 30s
2020-06-14T10:46:10.783689592Z stderr F E0614 10:46:10.692885       1 proxier.go:866] Failed to ensure that filter chain FORWARD jumps to KUBE-SERVICES: error checking rule: exit status 4: Another app is currently holding the xtables lock; still 4s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:46:10.783766723Z stderr F Another app is currently holding the xtables lock; still 3s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:46:10.783776996Z stderr F Another app is currently holding the xtables lock; still 2s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:46:10.783786709Z stderr F Another app is currently holding the xtables lock; still 1s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:46:10.783793103Z stderr F Another app is currently holding the xtables lock; still 0s 100000us time ahead to have a chance to grab the lock...
2020-06-14T10:46:10.783799786Z stderr F Another app is currently holding the xtables lock. Stopped waiting after 5s.
2020-06-14T10:46:10.783809779Z stderr F I0614 10:46:10.692921       1 proxier.go:850] Sync failed; retrying in 30s
2020-06-14T10:47:28.158771573Z stderr F E0614 10:47:28.157681       1 proxier.go:858] Failed to ensure that filter chain KUBE-EXTERNAL-SERVICES exists: error creating chain "KUBE-EXTERNAL-SERVICES": exit status 4: Another app is currently holding the xtables lock; still 4s 100000us time ahead to have a chance to grab the lock...

This PR deflakes the current e2e test "should be able to preserve UDP traffic when server pod cycles for a NodePort service", reorganizes the related code in the e2e framework, and makes kube-proxy flush the stale conntrack entries before syncing the iptables rules.

Signed-off-by: Antonio Ojea antonio.ojea.garcia@gmail.com

Which issue(s) this PR fixes:

Fixes #91236

Special notes for your reviewer:

related to #92076

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


k8s-ci-robot added the labels kind/failing-test, do-not-merge/work-in-progress, release-note-none, size/L, cncf-cla: yes, needs-sig, needs-priority, area/test, sig/network, and sig/testing, and then removed the needs-sig label on Jun 14, 2020
aojea (Member Author) commented Jun 14, 2020

/cc @thockin @BenTheElder

This is just the first commit and the scaffolding: it replaces a goroutine with a pod that generates the UDP traffic from the same source port (netcat is awesome 😹) and parses the pod logs. A rough sketch of the client loop follows the log excerpt below.
This is an example of the pod log output:

STEP: client pod connecting to the backend 2 on 172.18.0.3
Jun 14 11:41:51.826: INFO: Pod client logs: Sun Jun 14 09:41:02 UTC 2020
Try: 1

Try: 2

Try: 3

Try: 4

Try: 5

Try: 6

Try: 7

Try: 8
pod-server-1
Try: 9
pod-server-1
Try: 10
pod-server-1
Try: 11
pod-server-1
Try: 12
pod-server-2
Try: 13
pod-server-2
Try: 14
pod-server-2
Try: 15
pod-server-2
Try: 16
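
A rough Go sketch of what the client pod does conceptually (the test itself uses a netcat loop inside a pod, so this is only illustrative; the source port 12345, the NodePort 30080, and the "hostname" payload are assumptions): probe the NodePort repeatedly from a fixed source port, so every probe reuses the same conntrack entry, and log which backend answers.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Assumed values for illustration: fixed client source port, node IP, NodePort.
	laddr := &net.UDPAddr{Port: 12345}
	raddr := &net.UDPAddr{IP: net.ParseIP("172.18.0.3"), Port: 30080}

	for try := 1; try <= 16; try++ {
		fmt.Printf("Try: %d\n", try)
		conn, err := net.DialUDP("udp", laddr, raddr)
		if err != nil {
			time.Sleep(time.Second)
			continue
		}
		conn.SetDeadline(time.Now().Add(time.Second))
		// Placeholder payload; the real test asks the backend for its hostname.
		conn.Write([]byte("hostname"))
		buf := make([]byte, 1024)
		if n, err := conn.Read(buf); err == nil {
			fmt.Println(string(buf[:n])) // e.g. "pod-server-1"
		}
		conn.Close()
		time.Sleep(time.Second)
	}
}
```

Keeping the source port fixed is the point of the exercise: the client's UDP 5-tuple never changes, so a stale conntrack entry that still points at a deleted backend keeps black-holing every probe (which shows up as a "Try:" line with no hostname reply) until the entry is flushed.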

If this works well, I need to brainstorm the different scenarios now; my thinking is that we need to cover:

aojea (Member Author) commented Jun 14, 2020

/retest

aojea (Member Author) commented Jun 14, 2020

/test pull-kubernetes-e2e-kind-ipv6
/test pull-kubernetes-e2e-kind
/test pull-kubernetes-e2e-gce
/test pull-kubernetes-e2e-gce-ubuntu-containerd

k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aojea
To complete the pull request process, please assign danwinship
You can assign the PR to them by writing /assign @danwinship in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

aojea changed the title from "WIP e2e conntrack tests" to "WIP improve kube-proxy conntrack resilience" on Jun 14, 2020
aojea (Member Author) commented Jun 14, 2020

no failures

/test pull-kubernetes-e2e-gce
/test pull-kubernetes-e2e-gce-ubuntu-containerd
/test pull-kubernetes-e2e-kind
/test pull-kubernetes-e2e-kind-ipv6

aojea added 2 commits June 15, 2020 10:28
deflake current e2e test
"should be able to preserve UDP traffic when server pod cycles for a
NodePort service" and reorganize the code in the e2e framework

Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
kube-proxy was only flushing the conntrack entries for stale
services and endpoints after it had synced the iptables rules.

However, if the iptables sync fails, the stale entries are not
flushed, and the next resync is only scheduled 30 seconds later
by default.

kube-proxy can instead flush the stale conntrack entries before
syncing the iptables rules.

Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
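
A minimal Go sketch of the idea in the second commit (assumptions: clearStaleUDPConntrack, syncProxyRules, and restoreIPTables are illustrative names, not the actual kube-proxy proxier code; the conntrack CLI invocation is only an approximation of what kube-proxy does for stale UDP NodePorts): delete the stale UDP conntrack entries first, so that even if the iptables restore fails and has to wait for the 30s retry, clients are no longer pinned to a deleted backend.

```go
package main

import (
	"fmt"
	"os/exec"
)

// clearStaleUDPConntrack deletes conntrack entries for a stale UDP NodePort
// by shelling out to the conntrack CLI (hypothetical helper, sketch only).
func clearStaleUDPConntrack(port int) error {
	out, err := exec.Command("conntrack", "-D", "-p", "udp", "--dport", fmt.Sprint(port)).CombinedOutput()
	if err != nil {
		return fmt.Errorf("deleting conntrack entries for UDP port %d: %v (%s)", port, err, out)
	}
	return nil
}

// syncProxyRules shows the proposed ordering: conntrack cleanup first,
// iptables restore second.
func syncProxyRules(staleUDPPorts []int, restoreIPTables func() error) error {
	for _, port := range staleUDPPorts {
		if err := clearStaleUDPConntrack(port); err != nil {
			// Best effort: conntrack also errors when nothing matched, so just log.
			fmt.Println("warning:", err)
		}
	}
	// If this fails (e.g. xtables lock contention, as in the logs above) and is
	// retried 30 seconds later, the stale UDP entries are already gone.
	return restoreIPTables()
}

func main() {
	// Toy usage with a no-op restore so the sketch is self-contained.
	_ = syncProxyRules([]int{30080}, func() error { return nil })
}
```

As the closing comment below notes, the author later concluded this reordering may not be necessary.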
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 15, 2020
k8s-ci-robot (Contributor)

@aojea: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-e2e-kind d169dfb link /test pull-kubernetes-e2e-kind
pull-kubernetes-e2e-gce-ubuntu-containerd d169dfb link /test pull-kubernetes-e2e-gce-ubuntu-containerd
pull-kubernetes-verify d169dfb link /test pull-kubernetes-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

aojea (Member Author) commented Jun 15, 2020

aojea (Member Author) commented Jun 16, 2020

I think I was wrong about this: if the iptables rules are not installed, there is no risk that we drop traffic to the new endpoints. And if we keep sending traffic to the old endpoint, the result is the same for the client, because there is no way the new endpoints would receive it anyway.

aojea closed this on Jun 16, 2020
Labels
area/test
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.)
kind/failing-test (Categorizes issue or PR as related to a consistently or frequently failing test.)
needs-priority (Indicates a PR lacks a `priority/foo` label and requires one.)
release-note-none (Denotes a PR that doesn't merit a release note.)
sig/network (Categorizes an issue or PR as relevant to SIG Network.)
sig/testing (Categorizes an issue or PR as relevant to SIG Testing.)
size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.)
Projects
None yet
2 participants