Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

federation: fix dns provider initialization issues #27252

Merged
merged 5 commits into from
Jun 14, 2016

Conversation

mfanjie
Copy link

@mfanjie mfanjie commented Jun 12, 2016

This PR is based on the integration test with Google DNS API. This is the first time of full integration test.
So multiple issues was found and I combined all of them in this single PR

  1. add dns provider initialization and add ensureDns call when removing federation service.
  2. add new flags federation-name and zone-name to controller manager, both are used as part of the dns record name
  3. fix assertion failure at rrsets.go#L61, which will cause panic
  4. change getFederationDNSZoneName to get zoneName from config instead of hard code
  5. change logic of ensureDnsRrsets, only add new dns record when endpointReachable(set to true when ready address is catched) is true
  6. fix bug in processEndpointUpdate, only call ensuredns when ready address is caught
  7. change behavior of syncService, there is cases that endpoint is created before ingress IP assignment, so before there is defect for this case, ensureDns was not called when service being updated, so if Ingress IP is assigned after endpoint ready address is caught, the corresponding A records can not be created
  8. add a checking before update federation service

@nikhiljindal , can you help to add 1.3 milestone when @quinton-hoole is on leave?
Thanks.

Analytics

@k8s-bot
Copy link

k8s-bot commented Jun 12, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@mfanjie
Copy link
Author

mfanjie commented Jun 12, 2016

@k8s-bot
Copy link

k8s-bot commented Jun 12, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

1 similar comment
@k8s-bot
Copy link

k8s-bot commented Jun 12, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels Jun 12, 2016
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 13, 2016
@mfanjie mfanjie force-pushed the fix-ensuredns branch 2 times, most recently from 5a8a751 to e855199 Compare June 13, 2016 09:01
@mfanjie
Copy link
Author

mfanjie commented Jun 13, 2016

Updated test matrix, the expected result part.
image

@madhusudancs madhusudancs added this to the v1.3 milestone Jun 13, 2016
@ghost
Copy link

ghost commented Jun 13, 2016

Reviewing now...

@ghost ghost added area/cluster-federation priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jun 13, 2016
// federation name.
FederationName string `json:"federationName"`
// zone name, like example.com.
ZoneName string `json:"zoneName"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which zone is this? The zone in which federation control plane is running?
k8s clusters could be in multiple zones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: at other places we just use Zone (or Zones) rather than ZoneName

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikhil this is the DNS zone name of the federation (see comment directly above). We could perhaps prefix it with "dns" to make that clearer. Zone or Zones usually refer to structs or interfaces, not plain names. Make sense?

@nikhiljindal
Copy link
Contributor

@mfanjie Thanks for that great test matrix.
Is there a spreadsheet for it?

We can add more columns noting who is working on adding a test for each of them (@madhusudancs is adding a few in #26636)

@@ -90,6 +94,8 @@ func NewCMServer() *CMServer {
func (s *CMServer) AddFlags(fs *pflag.FlagSet) {
fs.IntVar(&s.Port, "port", s.Port, "The port that the controller-manager's http service runs on")
fs.Var(componentconfig.IPVar{Val: &s.Address}, "address", "The IP address to serve on (set to 0.0.0.0 for all interfaces)")
fs.StringVar(&s.FederationName, "federation-name", s.FederationName, "Federation name.")
fs.StringVar(&s.ZoneName, "zone-name", s.ZoneName, "Zone name, like example.com.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think that we should call this "DNSZoneName", "dns-zone-name" etc, to disambiguate it from an availability zone name. Comment/doc field should similarly include "DNS" to make it clearer.

@ghost
Copy link

ghost commented Jun 13, 2016

@mfanjie Thanks! A few nits that can be addressed in a followup PR. I assume that you ran all of the tests in the attached test matrix manually and they passed? Adding LGTM to get this merged, as it's strictly an improvement over current HEAD, even if all tests are not passing consistently yet.

@ghost ghost added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2016
@ghost
Copy link

ghost commented Jun 13, 2016

ok to test

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2016
@nikhiljindal
Copy link
Contributor

Fixed the release note label (and the title) and added back LGTM

@nikhiljindal
Copy link
Contributor

unit test seems to have failed to come up
@k8s-bot unit test this issue: #IGNORE

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2016
@nikhiljindal
Copy link
Contributor

nikhiljindal commented Jun 14, 2016

@k8s-bot node e2e test this issue: #IGNORE

(Logs say all tests passed)

@nikhiljindal nikhiljindal added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2016
@mfanjie
Copy link
Author

mfanjie commented Jun 14, 2016

I saw UT failure, fixing

@mfanjie
Copy link
Author

mfanjie commented Jun 14, 2016

fix two more issues:

  1. the original fix solution for panic in rrsets.go#L61 will cause UT panic, by reviewing and testing more, the right solution is to change the List() function.
  2. when zone info was added to cluster controller, and some changes were added to service controller, but the test case was not properly updated, this cause the ensureDNS function failed and kept retry, it took over 2.5 minutes to finish the UT. setup the UT correctly so that it can finish in ms level.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2016
@nikhiljindal
Copy link
Contributor

Thanks adding back LGTM

@nikhiljindal nikhiljindal added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2016
@fejta
Copy link
Contributor

fejta commented Jun 14, 2016

@k8s-bot node e2e test this issue #24479

@mfanjie
Copy link
Author

mfanjie commented Jun 14, 2016

I think @quinton-hoole has helped to answer most of the questions.
I have tested the matrix and all cases passed, and I have shared the spreadsheet by email.
The remaining comments are not function related from my perspective, so they can be addressed in later time.

@@ -242,6 +248,24 @@ func (s *ServiceController) Run(workers int, stopCh <-chan struct{}) error {
return nil
}

func (s *ServiceController) init() error {
if s.federationName == "" {
return fmt.Errorf("ServiceController should not be run without federationName.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will need to add these flags to the controller-manager YAML: https://github.com/kubernetes/kubernetes/blob/master/federation/manifests/federation-controller-manager-deployment.yaml#L22. Otherwise, the e2e tests will start failing as soon as this PR is merged.

You can use the FEDERATION_NAME env var for federation name:

export FEDERATION_NAME="${FEDERATION_NAME:-federation}"

You will have to add a similar env var for ZoneName

@mfanjie
Copy link
Author

mfanjie commented Jun 14, 2016

@nikhiljindal thanks for reminding, cloud you please help to review the last commit in this PR.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2016
@nikhiljindal
Copy link
Contributor

lgtm, thx

@nikhiljindal nikhiljindal added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Jun 14, 2016

GCE e2e build/test passed for commit 8e26283.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit c7e3b2f into kubernetes:master Jun 14, 2016
@mml mml mentioned this pull request Jun 14, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants