Don't recreate forwarding rule/target pool on master restart #29079
Labels
area/platform/gce
priority/critical-urgent
Highest priority. Must be actively worked on as someone's top priority right now.
Milestone
Restarting the master recreates the forwarding rule and target pool for type=LoadBalancer services. We are saved in the non-restart case because we compare a cached version of the service to decide whether we should even call into the cloudprovider and update resources.
This happens because our reconciliation routines are confused about session affinity and lbip, leading them to recreate the resources, since in-place update isn't allowed.
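To illustrate why the cache only protects us while the process stays up, here is a minimal sketch in Go. It is not the actual controller code: `serviceSpec`, `serviceCache`, and `needsCloudUpdate` are hypothetical names, and the spec is trimmed to the two fields discussed in this issue.

```go
package main

import "fmt"

// serviceSpec is a hypothetical, trimmed-down stand-in for the service
// fields relevant to this issue.
type serviceSpec struct {
	LoadBalancerIP  string
	SessionAffinity string
}

// serviceCache is a hypothetical in-memory cache keyed by service name.
// It lives only in process memory, so it is empty after a restart.
type serviceCache map[string]serviceSpec

// needsCloudUpdate reports whether the cloud provider should be called.
// It returns false only when an identical spec is already cached.
func (c serviceCache) needsCloudUpdate(name string, spec serviceSpec) bool {
	if cached, ok := c[name]; ok && cached == spec {
		return false
	}
	c[name] = spec
	return true
}

func main() {
	spec := serviceSpec{SessionAffinity: "None"}

	cache := serviceCache{}
	fmt.Println(cache.needsCloudUpdate("web", spec)) // first sync: true
	fmt.Println(cache.needsCloudUpdate("web", spec)) // unchanged: false

	// Simulate a controller restart: the cache starts out empty again,
	// so the unchanged service is pushed to the cloud provider anyway.
	cache = serviceCache{}
	fmt.Println(cache.needsCloudUpdate("web", spec)) // after restart: true
}
```

After a restart every service reaches the cloud provider at least once, so the provider-side comparison has to be the one that recognizes an unchanged load balancer.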
The impact is at best a few seconds of radio silence, at worst who knows?
Repro:
Create a load-balanced service, e.g.:
Wait for it to acquire a vip.
Try it out
SSH into the master and restart kube-controller-manager:
$ docker kill $(docker ps | grep kube-controller | awk '{print $1}')
Watch the wgets; you'll notice radio silence after a bit.
This is because the target pool and forwarding rule get recreated, for 2 reasons:
So this check fails: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/gce/gce.go#L816
So this check fails: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/gce/gce.go#L765 (lbip is retrieved here: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/gce/gce.go#L535)
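The issue text does not spell out why each check fails, so the following Go sketch rests on an assumption: that the mismatches come from comparing an unset lbip against the address GCE actually assigned, and from comparing session affinity values that differ only in spelling. The `desired` and `existing` types and `needsRecreate` are hypothetical and are not the functions in gce.go.

```go
package main

import (
	"fmt"
	"strings"
)

// desired holds what the service asks for (hypothetical type).
type desired struct {
	IP       string // empty means "no specific IP requested"
	Affinity string
}

// existing holds what the cloud resource currently has (hypothetical type).
type existing struct {
	IP       string
	Affinity string
}

// needsRecreate reports whether the existing resources really differ
// from the desired state. An empty requested IP matches any assigned
// IP, and affinity is compared case-insensitively (both are assumptions
// about where the spurious mismatches come from).
func needsRecreate(want desired, have existing) bool {
	if want.IP != "" && want.IP != have.IP {
		return true
	}
	return !strings.EqualFold(want.Affinity, have.Affinity)
}

func main() {
	have := existing{IP: "203.0.113.10", Affinity: "NONE"}

	// No IP requested, affinity differs only in case: keep resources.
	fmt.Println(needsRecreate(desired{Affinity: "None"}, have)) // false

	// A different IP was explicitly requested: recreation is warranted.
	fmt.Println(needsRecreate(desired{IP: "203.0.113.20", Affinity: "None"}, have)) // true
}
```

With a comparison along these lines, a restart alone would not tear down the forwarding rule or target pool; only a real change to the requested IP or affinity would.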