Fix "Garbage collector should support cascading deletion of custom resources" flake #61499

Merged
merged 1 commit into from
Mar 22, 2018

jennybuckley

@jennybuckley jennybuckley commented Mar 21, 2018

What this PR does / why we need it:
These webhook configuration objects were intended to have no effect on admission requests, and are just used to ensure that webhook configuration objects can always be deleted.

However, I think they are causing an entirely separate test (Garbage collector should support cascading deletion of custom resources) to flake when the two tests run at around the same time. These two webhooks run on all admission requests, so they interact with #61355 (Unable to create/update CRD when mutating webhook configured).

This PR will make the effects of the webhook e2e test on other tests easier to reason about.

The flakiness is caused by a legitimate failure of the underlying code (CRDs currently don't work with webhooks), but it is not a failure that either "Garbage collector should support cascading deletion of custom resources" or "AdmissionWebhooks Should not be able to prevent deleting validating-webhook-configurations or mutating-webhook-configurations" is supposed to catch reliably.
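
To illustrate why narrowing the rules keeps these no-op webhooks away from other tests, here is a minimal Go sketch. The `rule` and `request` types and the `matches` helper are hypothetical simplifications, not the real Kubernetes API types or matching logic. The all-wildcard rule is an assumption based on this PR's original title, and the `example.com`/`foos` custom resource is made up for illustration.

```go
package main

import "fmt"

// rule is a simplified, hypothetical stand-in for the rule in a webhook
// configuration. It is not the real Kubernetes API type.
type rule struct {
	APIGroups   []string
	APIVersions []string
	Resources   []string
}

// request describes the resource an admission request targets.
type request struct {
	Group, Version, Resource string
}

// contains reports whether list holds want or the "*" wildcard.
func contains(list []string, want string) bool {
	for _, s := range list {
		if s == "*" || s == want {
			return true
		}
	}
	return false
}

// matches reports whether the rule selects the request.
func (r rule) matches(req request) bool {
	return contains(r.APIGroups, req.Group) &&
		contains(r.APIVersions, req.Version) &&
		contains(r.Resources, req.Resource)
}

func main() {
	// Assumed shape of the rule before this PR: wildcards everywhere.
	wildcard := rule{[]string{"*"}, []string{"*"}, []string{"*"}}
	// Shape of the rule after this PR: core group, v1, a resource that does not exist.
	narrowed := rule{[]string{""}, []string{"v1"}, []string{"invalid"}}

	// A made-up custom resource request, like the garbage collector test would issue.
	cr := request{Group: "example.com", Version: "v1", Resource: "foos"}

	fmt.Println(wildcard.matches(cr)) // true
	fmt.Println(narrowed.matches(cr)) // false
}
```

With the narrowed rule, the webhook configuration objects still exist and can still be created and deleted, but under this simplified model no real request is ever routed to them.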

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #59997

Release note:

NONE

/sig api-machinery
/kind flake

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. kind/flake Categorizes issue or PR as related to a flaky test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 21, 2018
@k8s-ci-robot k8s-ci-robot requested review from deads2k and liggitt March 21, 2018 21:24
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 21, 2018
@jennybuckley
Author

jennybuckley commented Mar 21, 2018

To support this claim, https://storage.googleapis.com/k8s-gubernator/triage/index.html?pr=1&text=internal%20error&test=should%20support%20cascading%20deletion%20of%20custom%20resources shows that the error only started occurring around 16 days ago, right when my PR to add that webhook test was merged.

@jennybuckley
Author

/priority failing-test
/priority important-soon

@k8s-ci-robot k8s-ci-robot added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Mar 21, 2018
- Resources: []string{"*"},
+ APIGroups: []string{""},
+ APIVersions: []string{"v1"},
+ Resources: []string{"pods"},
Member

Maybe change it to a nonexistent resource? Intercepting CRUD of "pods" can still break other tests.

@jennybuckley
Author

/test pull-kubernetes-e2e-kops-aws

@jennybuckley
Author

/test pull-kubernetes-integration

@jennybuckley
Author

/test pull-kubernetes-e2e-kops-aws

@caesarxuchao
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 21, 2018
@janetkuo
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: caesarxuchao, janetkuo, jennybuckley

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 21, 2018
@jennybuckley jennybuckley changed the title from 'Remove wildcard matching from no-op e2e test webhooks' to 'Fix "Garbage collector should support cascading deletion of custom resources" flake' Mar 21, 2018
- Resources: []string{"*"},
+ APIGroups: []string{""},
+ APIVersions: []string{"v1"},
+ Resources: []string{"invalid"},
Member

If you want to make sure a failed hook can't prevent its own removal, you should match the webhook configuration resource and the delete operation, right?

Author

@jennybuckley jennybuckley Mar 22, 2018

There is actually one configuration already doing that. These ones are there to test if a hook can prevent deleting another hook's configuration (to prevent a cycle of hooks preventing each other's deletions).

Author

@jennybuckley jennybuckley Mar 22, 2018

Member

these ones are there to test if a hook can prevent deleting another hook's configuration

I don't see how selecting an invalid resource tests that case.

Author

@jennybuckley jennybuckley Mar 22, 2018

These webhook configurations should be able to be created and deleted without interference from the webhook I linked above. If webhooks are able to prevent deletion of other webhook configurations, the webhook I linked above will cause this check to fail, which it did, as shown in this test result from #59840 with just the test commit and not the fix:

test/e2e/apimachinery/webhook.go:147
deleting webhook config e2e-test-should-be-removable-validating-webhook-config with namespace e2e-tests-webhook-qlmqf
Expected error:
    <*errors.StatusError | 0xc421454b40>: {
        ErrStatus: {
            TypeMeta: {Kind: "", APIVersion: ""},
            ListMeta: {SelfLink: "", ResourceVersion: "", Continue: ""},
            Status: "Failure",
            Message: "admission webhook \"deny-webhook-configuration-deletions.k8s.io\" denied the request: this webhook denies all requests",
            Reason: "",
            Details: nil,
            Code: 500,
        },
    }
    admission webhook "deny-webhook-configuration-deletions.k8s.io" denied the request: this webhook denies all requests
not to have occurred
test/e2e/apimachinery/webhook.go:815

Author

@jennybuckley jennybuckley Mar 22, 2018

The test goes like this:

  1. Create a webhook which is configured to prevent deletion of ValidatingWebhookConfiguration and MutatingWebhookConfiguration objects
  2. Attempt to create and delete a dummy ValidatingWebhookConfiguration
  3. Attempt to create and delete a dummy MutatingWebhookConfiguration
  4. Attempt to delete the webhook registered in (1) using this cleanup function which is called here
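
The steps above can be sketched as a toy Go simulation. Everything here is hypothetical: the `webhook` and `server` types, the `deleteObject` method, and especially the `exemptConfigs` flag, which only assumes that the fix makes webhook configuration objects bypass admission webhooks. The conversation does not spell out how the fix works, and the real test lives in test/e2e/apimachinery/webhook.go.

```go
package main

import "fmt"

// webhook is a hypothetical, simplified model of a registered admission webhook.
type webhook struct {
	name      string
	resources map[string]bool // resources whose DELETE requests it denies
}

// server is a toy stand-in for the API server's admission chain.
type server struct {
	hooks []webhook
	// exemptConfigs models an ASSUMED fix: webhook configuration objects
	// skip admission webhooks entirely.
	exemptConfigs bool
}

// isWebhookConfig reports whether the resource is a webhook configuration.
func isWebhookConfig(resource string) bool {
	return resource == "validatingwebhookconfigurations" ||
		resource == "mutatingwebhookconfigurations"
}

// deleteObject returns an error if a matching webhook denies the deletion.
func (s *server) deleteObject(resource string) error {
	if s.exemptConfigs && isWebhookConfig(resource) {
		return nil
	}
	for _, h := range s.hooks {
		if h.resources[resource] {
			return fmt.Errorf("admission webhook %q denied the request", h.name)
		}
	}
	return nil
}

func main() {
	// Step 1: a webhook that denies deletion of both configuration kinds.
	deny := webhook{
		name: "deny-webhook-configuration-deletions.k8s.io",
		resources: map[string]bool{
			"validatingwebhookconfigurations": true,
			"mutatingwebhookconfigurations":   true,
		},
	}
	for _, exempt := range []bool{false, true} {
		s := &server{hooks: []webhook{deny}, exemptConfigs: exempt}
		// Steps 2 and 3: delete the dummy configurations.
		fmt.Println(exempt, s.deleteObject("validatingwebhookconfigurations"))
		fmt.Println(exempt, s.deleteObject("mutatingwebhookconfigurations"))
	}
}
```

Without the exemption, the deletions in steps 2 and 3 fail with a denial like the one in the test output above. With it, they succeed, which is what the e2e check expects.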

Member

I see, thanks.

- Resources: []string{"*"},
+ APIGroups: []string{""},
+ APIVersions: []string{"v1"},
+ Resources: []string{"invalid"},
Member

Same here

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 61378, 60915, 61499, 61507, 61478). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 2919818 into kubernetes:master Mar 22, 2018
Development

Successfully merging this pull request may close these issues.

[upgrade test failure] Garbage collector should support cascading deletion of custom resources
6 participants