Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NodeController waits for informer sync before doing anything #34809

Merged
merged 1 commit into from
Oct 14, 2016

Conversation

gmarek
Copy link
Contributor

@gmarek gmarek commented Oct 14, 2016

cc @lavalamp @davidopp

NodeController waits for informer sync before doing anything

This change is Reviewable

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels Oct 14, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins unit/integration failed for commit ef04b9a13fdd7910eb29f5a28164c03eaec8e184. Full PR test history.

The magic incantation to run this job again is @k8s-bot unit test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins Kubemark GCE e2e failed for commit ef04b9a13fdd7910eb29f5a28164c03eaec8e184. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GKE smoke e2e failed for commit 01ed1d6dde209a51f1d4c7673c41f55c1954ec41. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCE e2e failed for commit 01ed1d6dde209a51f1d4c7673c41f55c1954ec41. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GCE e2e failed for commit 01ed1d6dde209a51f1d4c7673c41f55c1954ec41. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@wojtek-t
Copy link
Member

Please add a release note

@wojtek-t wojtek-t added cherrypick-candidate release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Oct 14, 2016
@wojtek-t wojtek-t added this to the v1.4 milestone Oct 14, 2016
@wojtek-t
Copy link
Member

/lgtm

@gmarek gmarek added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 14, 2016
@k8s-github-robot k8s-github-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2016
@@ -351,6 +358,10 @@ func NewNodeController(
func (nc *NodeController) Run() {
// Incorporate the results of node status pushed from kubelet to master.
go wait.Until(func() {
if !nc.nodeInformer.Informer().HasSynced() || !nc.podInformer.Informer().HasSynced() || !nc.daemonSetInformer.Informer().HasSynced() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like

    if !cache.WaitForCacheSync(stopCh, dsc.podStoreSynced, dsc.nodeStoreSynced) {
        return
    }

before starting any additional gofuncs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot about this function. I agree it would be better.
@gmarek - can you please fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my second though - I can do what you suggest, but then I'll need to do NC.Run() in a separate goroutine - which is semantically slightly bigger change than this one. As this is going to be cherry-picked I wanted to do safest one possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - so let's use WaitForCacheSync function instead the if`s you have

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@wojtek-t wojtek-t removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2016
@gmarek gmarek force-pushed the sync2 branch 2 times, most recently from fd6d728 to b7d6fe3 Compare October 14, 2016 12:26
@@ -351,6 +358,10 @@ func NewNodeController(
func (nc *NodeController) Run() {
// Incorporate the results of node status pushed from kubelet to master.
go wait.Until(func() {
if !cache.WaitForCacheSync(wait.NeverStop, nc.nodeInformer.Informer().HasSynced, nc.podInformer.Informer().HasSynced, nc.daemonSetInformer.Informer().HasSynced) {
glog.V(2).Infof("NodeController is waiting for informers to sync...")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log is outdated now - it should be something like "timed out waiting for informers to sync" or sth like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@gmarek gmarek added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Oct 14, 2016
@@ -351,6 +358,10 @@ func NewNodeController(
func (nc *NodeController) Run() {
// Incorporate the results of node status pushed from kubelet to master.
go wait.Until(func() {
if !cache.WaitForCacheSync(wait.NeverStop, nc.nodeInformer.Informer().HasSynced, nc.podInformer.Informer().HasSynced, nc.daemonSetInformer.Informer().HasSynced) {
glog.V(2).Infof("NodeController timed out while waiting for informers to sync...")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - one more thing, these should be Errorf()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@wojtek-t
Copy link
Member

/lgtm

@k8s-github-robot k8s-github-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins verification failed for commit 7fa52e8a50d7219649caa77cddde1d1579001360. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@gmarek
Copy link
Contributor Author

gmarek commented Oct 14, 2016

gofmt.

@gmarek gmarek added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Oct 14, 2016
@jessfraz jessfraz added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Oct 14, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 928b8cb into kubernetes:master Oct 14, 2016
jessfraz added a commit that referenced this pull request Oct 14, 2016
…-upstream-release-1.4

Automated cherry pick of #34809
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.4" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

@wojtek-t
Copy link
Member

It's possible that this PR broke scalability tests - see discussion in:
#34840

I'm trying the revert if this is the fault of this PR.

@jessfraz jessfraz added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Oct 14, 2016
@jessfraz jessfraz added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 14, 2016
shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this pull request Dec 1, 2016
…-of-#34809-upstream-release-1.4

Automated cherry pick of kubernetes#34809
@gmarek gmarek deleted the sync2 branch February 21, 2017 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants