Explicitly add iptables rule to allow healthcheck nodeport #97824
Conversation
Hi @hanlins. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
As the document says, healthCheckNodePort only applies when the service type is LoadBalancer and externalTrafficPolicy is Local.
Do we need to check this before adding the iptables rule?
Hi @pacoxu, the api-server will ensure that for us. Here's an example spec I tried:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment3
  labels:
    app: nginx3
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx3
        #image: nginx:1.14.2
        image: us.gcr.io/google-containers/nginx
        ports:
        - containerPort: 80
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Equal"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx3
spec:
  type: NodePort
  healthCheckNodePort: 30010
  externalTrafficPolicy: Local
  selector:
    app: nginx3
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 80
And this is the error I got:
That said, I think we can safely assume that the LB type and external traffic policy are set correctly if healthCheckNodePort is set.
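(For reference, a spec that passes that validation would look roughly like the sketch below, written against the core v1 Go types rather than YAML; the name and port values are only illustrative and are not taken from the PR. It assumes the usual imports: k8s.io/api/core/v1 as v1, metav1, and intstr.)

// validHealthCheckService builds a Service the api-server will accept:
// healthCheckNodePort is only allowed together with type LoadBalancer and
// externalTrafficPolicy Local. (Sketch, illustrative values only.)
func validHealthCheckService() *v1.Service {
	return &v1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "nginx3"},
		Spec: v1.ServiceSpec{
			Type:                  v1.ServiceTypeLoadBalancer,
			ExternalTrafficPolicy: v1.ServiceExternalTrafficPolicyTypeLocal,
			HealthCheckNodePort:   30010,
			Selector:              map[string]string{"app": "nginx3"},
			Ports: []v1.ServicePort{{
				Protocol:   v1.ProtocolTCP,
				Port:       8080,
				TargetPort: intstr.FromInt(80),
			}},
		},
	}
}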
Hi @andrewsykim, in the iptables proxier currently, I'm adding rules that accept inbound health check traffic in the KUBE-EXTERNAL-SERVICES chain of the filter table.
/ok-to-test
@hanlins can you also update the IPVS proxier please? Similar iptables rules allowing the health check node port should be sufficient.
pkg/proxy/iptables/proxier.go
Outdated
// no matter if node has local endpoints, healthCheckEndpoints
// need to be exposed
writeLine(proxier.filterRules,
    "-A", string(kubeExternalServicesChain),
KUBE-NODE-PORTS might make more sense here actually because healthCheckNodePort is technically still a "node port"
Yeah, originally there's no KUBE-NODE-PORT chain in the filter table for both the iptables and ipvs versions of kube-proxy. Just FYI that this change will add the node port chain to the filter table.
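(To make that concrete, the per-service rule written into the filter table would then look roughly like the sketch below. It follows the writeLine/svcInfo/svcNameString idioms of the surrounding proxier code; the exact comment text and placement are my illustration, not necessarily what the final PR contains.)

// Accept inbound traffic to this service's healthCheckNodePort via the new
// KUBE-NODEPORTS chain in the filter table, no matter whether the node has
// local endpoints (sketch).
if svcInfo.HealthCheckNodePort() != 0 {
	writeLine(proxier.filterRules,
		"-A", string(kubeNodePortsChain),
		"-m", "comment", "--comment", fmt.Sprintf(`"%s health check node port"`, svcNameString),
		"-m", "tcp", "-p", "tcp",
		"--dport", strconv.Itoa(svcInfo.HealthCheckNodePort()),
		"-j", "ACCEPT",
	)
}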
Force-pushed from 50a4c90 to aca369f
Force-pushed from aca369f to 8df8a4c
/retest
LGTM and thx @hanlins for adding the missing piece!
/lgtm
/assign @danwinship
pkg/proxy/iptables/proxier.go
Outdated
@@ -395,6 +395,7 @@ type iptablesJumpChain struct {
var iptablesJumpChains = []iptablesJumpChain{
    {utiliptables.TableFilter, kubeExternalServicesChain, utiliptables.ChainInput, "kubernetes externally-visible service portals", []string{"-m", "conntrack", "--ctstate", "NEW"}},
    {utiliptables.TableFilter, kubeExternalServicesChain, utiliptables.ChainForward, "kubernetes externally-visible service portals", []string{"-m", "conntrack", "--ctstate", "NEW"}},
    {utiliptables.TableFilter, kubeNodePortsChain, utiliptables.ChainInput, "kubernetes health check service portals", nil},
You shouldn't need to add this; inbound connections already eventually pass through kubeNodePortsChain via the "tail-call" created at the end of syncProxyRules.
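(For context, that "tail-call" lives in the nat table: at the end of syncProxyRules the iptables proxier emits roughly the rule below, simplified here since the real code also honors --nodeport-addresses. This is why plain node ports already work without any filter-table chain; it becomes relevant later in the thread that this jump is in nat, not filter.)

// Tail-call from KUBE-SERVICES (nat) to KUBE-NODEPORTS (nat) for traffic to a
// local address that matched no earlier service rule (simplified sketch).
writeLine(proxier.natRules,
	"-A", string(kubeServicesChain),
	"-m", "comment", "--comment", `"kubernetes service nodeports; NOTE: this must be the last rule in this chain"`,
	"-m", "addrtype", "--dst-type", "LOCAL",
	"-j", string(kubeNodePortsChain),
)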
Is that still true when the default INPUT chain policy is DROP?
I thought it would work without this rule in my early attempts, but it turned out it didn't. The fact is that for NodePort services, the traffic gets DNAT'ed in the nat table, basically in this order:
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A KUBE-SERVICES -d 10.102.149.66/32 -p tcp -m comment --comment "default/nginx cluster IP" -m tcp --dport 8080 -j KUBE-SVC-2CMXP7HKUVJN7L6M
-A KUBE-SVC-2CMXP7HKUVJN7L6M -m comment --comment "default/nginx" -j KUBE-SEP-V6JOQ423IM4BM4HR
-A KUBE-SEP-V6JOQ423IM4BM4HR -s 100.96.0.11/32 -m comment --comment "default/nginx" -j KUBE-MARK-MASQ
-A KUBE-SEP-V6JOQ423IM4BM4HR -p tcp -m comment --comment "default/nginx" -m tcp -j DNAT --to-destination 100.96.0.11:80
In the rules above, 100.96.0.11 is the nginx pod IP.
In the filter table, the traffic is accepted by the KUBE-FORWARD chain, see the third rule:
➜ ~ iptables -L KUBE-FORWARD --line-n -v
Chain KUBE-FORWARD (1 references)
num pkts bytes target prot opt in out source destination
1 0 0 DROP all -- any any anywhere anywhere ctstate INVALID
2 0 0 ACCEPT all -- any any anywhere anywhere /* kubernetes forwarding rules */ mark match 0x4000/0x4000
3 235 42692 ACCEPT all -- any any anywhere anywhere /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
4 0 0 ACCEPT all -- any any anywhere anywhere /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
So, unlike health check node ports, traffic towards node ports gets DNAT'ed, and instead of hitting the INPUT chain it hits the FORWARD chain. However, health check node ports are listening on local addresses, so the incoming packets hit the INPUT chain.
Before this change there's no KUBE-NODEPORTS chain in the filter table at all, so whether the traffic is accepted or not depends purely on the policy of the INPUT chain. That's why we need to add a rule that jumps from the INPUT chain to the newly added KUBE-NODEPORTS chain.
Speaking of the difference between node port traffic that is DNAT'ed and traffic towards health check endpoints that listen on local ports, I realized that if we have pods on hostNetwork, then node ports towards them won't work either, just like for health check node ports. I just validated that in my environment and the issue is really there, and I'm confident I could fix it using the same approach as for the health check node port. I don't want to make more changes in this PR, so maybe open an issue and fix it in another one. What do you guys think? @andrewsykim @danwinship
I think it's better to do that in a follow-up PR.
Ah... there are two completely separate KUBE-NODEPORTS chains now... one in -t nat and one in -t filter.
ok, I think this is right, but don't use the word "portals" in the description. That's archaic pre-kube-1.0 terminology. Actually, just changing it to "ports" is probably what you meant anyway.
Sure, changed here.
@@ -1337,6 +1338,19 @@ func (proxier *Proxier) syncProxyRules() {
        }
    }

    // Capture healthCheckNodePorts.
    if svcInfo.HealthCheckNodePort() != 0 {
should go inside the if svcInfo.NodePort() != 0 section
Node ports are now optional for load balancers (see spec.allocateLoadBalancerNodePorts), so a separate check still makes sense, since healthCheckNodePort is always set regardless of whether node ports are enabled.
Seconding @andrewsykim. I think these are effectively the same for now: HealthCheckNodePort will be non-zero only if the svc is of LoadBalancer type, and if the service is of LoadBalancer type, then it should have a non-zero NodePort. To me it's the api-server's job to validate the spec and ensure all the services the proxy sees are valid (see my previous comment). If we couple the logic here, then if the validation logic ever changes, we'd also need to fix it here.
@@ -913,6 +913,50 @@ func TestNodePort(t *testing.T) {
    }
}

func TestHealthCheckNodePort(t *testing.T) {
My concern is if I just modify the INPUT chain policy to DROP then we might lose ssh connection to that VM instantly
You don't need to change the default INPUT chain policy; you can just do iptables -A INPUT -p tcp --dport <PORT> -j DROP or whatever.
@@ -106,6 +106,7 @@ var iptablesJumpChain = []struct {
    {utiliptables.TableNAT, utiliptables.ChainPrerouting, kubeServicesChain, "kubernetes service portals"},
    {utiliptables.TableNAT, utiliptables.ChainPostrouting, kubePostroutingChain, "kubernetes postrouting rules"},
    {utiliptables.TableFilter, utiliptables.ChainForward, KubeForwardChain, "kubernetes forwarding rules"},
    {utiliptables.TableFilter, utiliptables.ChainInput, KubeNodePortChain, "kubernetes health check rules"},
I'm less familiar with the IPVS rules but I believe the comment I made about iptables should apply here too
Yeah, same reason as in my previous comment for why we need to keep this rule.
@@ -161,6 +163,7 @@ var ipsetInfo = []struct {
    {kubeNodePortLocalSetUDP, utilipset.BitmapPort, kubeNodePortLocalSetUDPComment},
    {kubeNodePortSetSCTP, utilipset.HashIPPort, kubeNodePortSetSCTPComment},
    {kubeNodePortLocalSetSCTP, utilipset.HashIPPort, kubeNodePortLocalSetSCTPComment},
    {kubeHealthCheckNodePortSet, utilipset.BitmapPort, kubeHealthCheckNodePortSetComment},
I'm not sure if you also need to add something to ipsetWithIptablesChain...
Aha, that's a tricky one. At first I thought I should add it here, but then I realized that ipsetWithIptablesChain is for the nat table, whereas we need to add it in the filter table. I've already added the rule below:
// Add rule to accept traffic towards health check node port
utilproxy.WriteLine(proxier.filterRules,
"-A", string(KubeNodePortChain),
"-m", "comment", "--comment", proxier.ipsetList[kubeHealthCheckNodePortSet].getComment(),
"-m", "set", "--match-set", proxier.ipsetList[kubeHealthCheckNodePortSet].Name, "dst",
"-j", "ACCEPT",
)
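(That rule matches on the kubeHealthCheckNodePortSet ipset, so the health check node port also has to be inserted into that set during syncProxyRules. Below is a sketch of how that might look, using the ipvs proxier's utilipset.Entry and activeEntries bookkeeping; the exact code in the PR may differ.)

// Record the service's health check node port in the ipset that the
// filter-table ACCEPT rule above matches on (sketch).
if svcInfo.HealthCheckNodePort() != 0 {
	entry := &utilipset.Entry{
		Port:     svcInfo.HealthCheckNodePort(),
		Protocol: "tcp",
		SetType:  utilipset.BitmapPort,
	}
	nodePortSet := proxier.ipsetList[kubeHealthCheckNodePortSet]
	if nodePortSet.validateEntry(entry) {
		nodePortSet.activeEntries.Insert(entry.String())
	} else {
		klog.Errorf("error adding entry %s to ipset %s", entry.String(), nodePortSet.Name)
	}
}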
1. For iptables mode, add KUBE-NODEPORTS chain in filter table. Add rules to allow healthcheck node port traffic.
2. For ipvs mode, add KUBE-NODE-PORT chain in filter table. Add KUBE-HEALTH-CHECK-NODE-PORT ipset to allow traffic to healthcheck node port.
Force-pushed from 7c78180 to 4cd1eac
/retest
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: andrewsykim, danwinship, hanlins
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.
/test pull-kubernetes-dependencies
@hanlins can you include in the release notes that this fix is only required when the default INPUT chain policy is not ACCEPT?
@andrewsykim just rephrased, feel free to update it if needed!
What type of PR is this?
/kind bug
What this PR does / why we need it:
Explicitly add iptables rule to expose healthcheck nodeport in the filter table.
Which issue(s) this PR fixes:
Fixes #97225
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: