Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace controller presence checking logic #36924

Merged
merged 1 commit into from
Nov 17, 2016

Conversation

krousey
Copy link
Contributor

@krousey krousey commented Nov 16, 2016

fixes #36912

Replace controller presence checking logic

This change is Reviewable

@krousey
Copy link
Contributor Author

krousey commented Nov 16, 2016

@saad-ali This will probably need some combination of cherry picks to 1.5, 1.4, and/or 1.3.

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Nov 16, 2016
@saad-ali
Copy link
Member

Looks fine to me, but I'm not super familiar with this code. Can someone from @kubernetes/sig-testing take a look?

@krousey krousey added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Nov 16, 2016
Copy link
Contributor

@spxtr spxtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks fine to me, but this code is pretty confusing, especially when considering past release branches and upgrade jobs.

// We get the new list of pods and replication controllers in every
// iteration because more pods come online during startup and we want to
// ensure they are also checked.
// We get the new list of pods, replication controllers, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nittiest of nits: the comment for WaitForPodsRunningReady could also use this touch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitted

@spxtr
Copy link
Contributor

spxtr commented Nov 17, 2016

Okay no objection from me. Hopefully we don't cherrypick and then find out it doesn't quite fix the problem.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 17, 2016
@krousey krousey added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 17, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 68539c0 into kubernetes:master Nov 17, 2016
@krousey
Copy link
Contributor Author

krousey commented Nov 17, 2016

@saad-ali I marked this as a cherry pick candidate to be put in the 1.5 release branch.

@saad-ali
Copy link
Member

@saad-ali I marked this as a cherry pick candidate to be put in the 1.5 release branch.

Ack, will try to pick it up today

@saad-ali
Copy link
Member

This PR has been picked up in 1.5 as this afternoon. The upgrade tests are still failing.

I guess this will happen until this fix is cherry-picked to the 1.3 and 1.4 branches?

CC @jessfraz to cherry pick this PR to 1.4

@saad-ali saad-ali modified the milestones: v1.4, v1.5 Nov 18, 2016
@krousey
Copy link
Contributor Author

krousey commented Nov 18, 2016

@saad-ali This had the intended affect. If you notice, about 100 more tests are now being run, and only one is failing now. This is likely a new failure that I will look into.

@jessfraz jessfraz added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Nov 19, 2016
@jessfraz
Copy link
Contributor

cherry-picked in #37140

@jessfraz jessfraz added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Nov 19, 2016
k8s-github-robot pushed a commit that referenced this pull request Nov 23, 2016
Automatic merge from submit-queue

Guard the ready replica checking by server version

I fixed replica readiness checking for 1.4->1.5 upgrades by using a field that only exists in versions >=1.4.0 in #36924 

This fixed a lot of issues in 1.4->1.5 upgrade testing, but did not fix 1.3->1.5 upgrade tests. I've disabled replica checking for 1.3 masters as the old logic was broken anyway.

This will not affect the 1.3 CI tests. Just 1.3 -> {1.4, 1.5} upgrade tests.

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubernetes-e2e-gke-container_vm-1.3-container_vm-1.5-upgrade-cluster-new/330?log
is an example of this breakage. This is the tell-tale logs:

```console
Nov 22 09:40:50.469: INFO: 11 / 11 pods in namespace 'kube-system' are running and ready (506 seconds elapsed)
Nov 22 09:40:50.469: INFO: expected 5 pod replicas in namespace 'kube-system', 0 are Running and Ready.
Nov 22 09:40:50.469: INFO: POD  NODE  PHASE  GRACE  CONDITIONS
```
jessfraz added a commit that referenced this pull request Dec 2, 2016
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.4" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/test cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

e2e logic for matching pods to ReplicationControllers is broken
8 participants