Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix deployment e2e flake: Update DeploymentReaper.Stop to use ObservedGeneration #21857

Merged
merged 3 commits into from
Feb 24, 2016

Conversation

nikhiljindal
Copy link
Contributor

Removing race condition in DeploymentReaper.Stop() by waiting for the ObservedGeneration to be updated.
Support for ObservedGeneration for deployments was added in #21754

Fixes #21753 and #21798

cc @bgrant0607 @janetkuo @Kargakis @mqliang @ironcladlou @soltysh

@k8s-github-robot
Copy link

Labelling this PR as size/M

@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 24, 2016
@nikhiljindal nikhiljindal assigned bgrant0607 and unassigned lavalamp Feb 24, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 24, 2016

GCE e2e build/test failed for commit 2f66087cff6b40805bedde7162c53b523b6eb487.

@k8s-bot
Copy link

k8s-bot commented Feb 24, 2016

GCE e2e build/test failed for commit 54476ddaa5f0291b26b92710f879004ec645cc37.

// https://github.com/kubernetes/kubernetes/issues/20966
// Instead deployment should be Paused at this point and not at next TODO.
d.Spec.Paused = false
d.Spec.Paused = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Paused is set, then Replicas and RevisionHistoryLimit won't be acted upon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't matter much, I suppose.

@bgrant0607
Copy link
Member

Tests failed.

@nikhiljindal
Copy link
Contributor Author

Yes kubectl delete deployment seems to be failing with

Throttling request took 56.206692ms, request: http://127.0.0.1:8080/apis/extensions/v1beta1/namespaces/default/deployments/nginx

@nikhiljindal
Copy link
Contributor Author

ah we are not even updating ObservedGeneration for a paused deployment :)
Am fixing.

@k8s-github-robot
Copy link

PR needs rebase

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2016
@bgrant0607
Copy link
Member

Yes, we really need pause to just halt rollout progress. See #20273.

@k8s-bot
Copy link

k8s-bot commented Feb 24, 2016

GCE e2e build/test failed for commit bde1332ef34bf6ff55c638b4f8b68387647eefb7.

@nikhiljindal nikhiljindal removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2016
@nikhiljindal
Copy link
Contributor Author

Rebased and added another commit to update status for paused deployments.
The tests should pass now.

@k8s-github-robot
Copy link

Labelling this PR as size/L

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 24, 2016
@@ -420,9 +420,11 @@ func (dc *DeploymentController) syncDeployment(key string) error {
d := *obj.(*extensions.Deployment)

if d.Spec.Paused {
// Dont take any action for paused deployment.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put a TODO here to implement scaling: #20273

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@k8s-bot
Copy link

k8s-bot commented Feb 24, 2016

GCE e2e test build/test passed for commit 9b6b3f80661d90e909b6eff56476755389887288.

@nikhiljindal
Copy link
Contributor Author

Updated code as per comments.
Verified that all tests are passing.

@k8s-bot
Copy link

k8s-bot commented Feb 24, 2016

GCE e2e test build/test passed for commit 9d72599.

@soltysh
Copy link
Contributor

soltysh commented Feb 24, 2016

Fixes #20966

@nikhiljindal two comments from my side: one big related to the order of removing Deployment before removing RS and the other one regarding missing test case for paused deployments.

@bgrant0607 bgrant0607 added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. kind/flake Categorizes issue or PR as related to a flaky test. labels Feb 24, 2016
@bgrant0607
Copy link
Member

"paused deployment should be ignored by the controller" is now failing continuously due to timeout since observedGeneration is not set. We should try to get this in ASAP.

@bgrant0607
Copy link
Member

@soltysh has a good point about deleting the Deployment last, so the user can access the Deployment in the case that there's an error in an earlier step. I forgot about that. We need to add a comment, and should document that as a kubectl convention.

@nikhiljindal Please move the delete back to the end, sorry.

@nikhiljindal
Copy link
Contributor Author

Pushed another commit to move deployment delete back at the end.
PTAL.

@nikhiljindal
Copy link
Contributor Author

Re: a test case for paused deployment, we already have a testPausedDeployment() in test/e2e/deployment.go.
@soltysh Do you think that is enough?

@k8s-bot
Copy link

k8s-bot commented Feb 24, 2016

GCE e2e test build/test passed for commit 2a538e3.

return err
}
allRSs := append(controller.FilterActiveReplicaSets(oldRSs), newRS)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the following stanza here so that cleanup policy for paused deployments (#20966) won't be blocked on #20273

if deployment.Spec.RevisionHistoryLimit != nil {
        // Cleanup old replica sets
        dc.cleanupOldReplicaSets(oldRSs, deployment)
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we can have it as a follow-up. If so, I will split it from my PR, once this merges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, but let's do that as a followup.

@bgrant0607
Copy link
Member

LGTM

@bgrant0607 bgrant0607 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 24, 2016
@bgrant0607
Copy link
Member

Merging to (hopefully) fix flakes.

bgrant0607 added a commit that referenced this pull request Feb 24, 2016
fix deployment e2e flake: Update DeploymentReaper.Stop to use ObservedGeneration
@bgrant0607 bgrant0607 merged commit 0b5edab into kubernetes:master Feb 24, 2016
@soltysh
Copy link
Contributor

soltysh commented Feb 25, 2016

Re: a test case for paused deployment, we already have a testPausedDeployment() in test/e2e/deployment.go.
@soltysh Do you think that is enough?

Personally I prefer having unit test for every functionality, these are the only ones I run frequently when working on a piece of code. e2e are great, but I usually run them at the end of dev process. So if you don't mind I'd rather have some unit tests to cover the code. Since this is already merged a followup would be nice :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/flake Categorizes issue or PR as related to a flaky test. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants