Deleting a backend deletes too many DNS records #27989

Closed
mml opened this issue Jun 23, 2016 · 5 comments
mml commented Jun 23, 2016

Step 1. Create a two-cluster federation, with one cluster in zone A and one in zone F. Create a LoadBalancer service in the federation, and create one pod in each cluster that matches the service selector. The result is good:

NAME                                                                                                 TYPE  TTL    DATA
kube.5yetis.net.                                                                                     NS    21600  ns-cloud-b1.googledomains.com., ns-cloud-b2.googledomains.com., ns-cloud-b3.googledomains.com., ns-cloud-b4.googledomains.com.
kube.5yetis.net.                                                                                     SOA   21600  ns-cloud-b1.googledomains.com. cloud-dns-hostmaster.google.com. 4 21600 3600 259200 300
federated-service.e2e-tests-service-lt9rd.federation.svc.kube.5yetis.net.                            A     180    104.197.202.248, 23.236.61.35
federated-service.e2e-tests-service-lt9rd.federation.svc.us-central1.kube.5yetis.net.                A     180    104.197.202.248, 23.236.61.35
federated-service.e2e-tests-service-lt9rd.federation.svc.us-central1-a.us-central1.kube.5yetis.net.  A     180    104.197.202.248
federated-service.e2e-tests-service-lt9rd.federation.svc.us-central1-f.us-central1.kube.5yetis.net.  A     180    23.236.61.35

Step 2. Delete the pod in F.

% kc --context=federation-e2e-gce-us-central1-f delete pod backend --namespace=e2e-tests-service-lt9rd
pod "backend" deleted

Step 3. Zone file looks very sad: every A record for the service except one was deleted, instead of just the us-central1-f record for the deleted backend. The remaining A record feels very lonely. 😭

NAME                                                                                                 TYPE  TTL    DATA
kube.5yetis.net.                                                                                     NS    21600  ns-cloud-b1.googledomains.com., ns-cloud-b2.googledomains.com., ns-cloud-b3.googledomains.com., ns-cloud-b4.googledomains.com.
kube.5yetis.net.                                                                                     SOA   21600  ns-cloud-b1.googledomains.com. cloud-dns-hostmaster.google.com. 4 21600 3600 259200 300
federated-service.e2e-tests-service-lt9rd.federation.svc.us-central1-a.us-central1.kube.5yetis.net.  A     180    104.197.202.248

federation-controller-manager.log.txt

@quinton-hoole @kubernetes/sig-cluster-federation


ghost commented Jun 23, 2016

@mml Did you manage to get any logs from the controller-manager? Any errors reported there?


mml commented Jun 23, 2016

@quinton-hoole Logs are attached. I did not see any out-of-the-ordinary messages in them.


ghost commented Jun 24, 2016

I think this bug was introduced by this commit:

dd78dd8

The endpointReachable logic there is flawed; I'm sorry I did not pick that up in the code review.
Multiple endpoints are passed into ensureDnsRrsets(), so adding a single endpointReachable boolean doesn't make sense. Instead, we need to remove the unreachable endpoint from the list of endpoints passed in by the caller; then the rest of the logic should work correctly. I'll essentially revert the above commit and replace it as described (a rough sketch of the idea is at the end of this comment).

@mfanjie, do you agree?

Will send a fix shortly.
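
In case it's useful, here is roughly what I mean, as a standalone sketch rather than the actual controller code (the endpoint type, the reachableEndpoints helper, and this stub of ensureDnsRrsets are all hypothetical): filter unreachable endpoints out of the slice before reconciling, instead of threading a single endpointReachable boolean through.

// Hypothetical sketch, not the real federation controller code: drop
// unreachable endpoints from the slice before reconciling DNS, rather
// than passing a single endpointReachable boolean alongside the slice.
package main

import "fmt"

// endpoint stands in for a per-cluster service endpoint.
type endpoint struct {
    ip        string
    reachable bool
}

// reachableEndpoints returns only the endpoints that are still reachable.
func reachableEndpoints(eps []endpoint) []endpoint {
    var out []endpoint
    for _, ep := range eps {
        if ep.reachable {
            out = append(out, ep)
        }
    }
    return out
}

// ensureDnsRrsets is a stand-in for the real reconciliation call; it only
// ever sees endpoints whose DNS records should be kept.
func ensureDnsRrsets(dnsName string, eps []endpoint) {
    fmt.Printf("reconciling %s with %d endpoint(s)\n", dnsName, len(eps))
}

func main() {
    eps := []endpoint{
        {ip: "104.197.202.248", reachable: true}, // us-central1-a backend, still healthy
        {ip: "23.236.61.35", reachable: false},   // us-central1-f backend, pod deleted
    }
    // Filter first, then reconcile: the surviving endpoint's records are untouched.
    ensureDnsRrsets("federated-service.example.", reachableEndpoints(eps))
}

With this shape, deleting the backend in us-central1-f only removes that endpoint from the slice, so the records for the surviving endpoint (and the regional and global rrsets that still contain it) are left intact.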

ghost added the team/control-plane and priority/important-soon labels on Jun 24, 2016
ghost modified the milestones: v1.4, v1.3 on Jun 24, 2016
ghost self-assigned this on Jun 24, 2016

ghost commented Jun 24, 2016

#27999 is my proposed fix.


mfanjie commented Jun 24, 2016

@quinton-hoole Thanks for catching this, and sorry for the regression. I am fine with your proposal.
