e2e: deflake volume tests #129369

Merged (1 commit) on Jan 8, 2025


@carlory carlory commented Dec 23, 2024

What type of PR is this?

/kind flake

What this PR does / why we need it:

Failed Job: https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/122016/pull-kubernetes-e2e-gce/1871036396114808832

audit.log: https://storage.googleapis.com/kubernetes-ci-logs/pr-logs/pull/122016/pull-kubernetes-e2e-gce/1871036396114808832/artifacts/e2e-c539298169-674b9-master/kube-apiserver-audit.log

{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"8872cae8-da58-48a9-9eaf-98bcc404299c","stage":"ResponseComplete","requestURI":"/apis/storage.k8s.io/v1/storageclasses","verb":"create","user":{"username":"kubecfg","groups":["system:masters","system:authenticated"],"extra":{"authentication.kubernetes.io/credential-id":["X509SHA256=51b211b47046b0c4c58738fcbce2004fdee8a7b473a852fb305add4c13a8ab47"]}},"sourceIPs":["35.222.117.50"],"userAgent":"e2e.test/v1.33.0 (linux/amd64) kubernetes/4a1d1c8 -- [sig-storage] CSI Mock volume fsgroup policies CSI FSGroupPolicy [LinuxOnly] should modify fsGroup if fsGroupPolicy=File","objectRef":{"resource":"storageclasses","name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","apiGroup":"storage.k8s.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":201},"requestObject":{"kind":"StorageClass","apiVersion":"storage.k8s.io/v1","metadata":{"name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","creationTimestamp":null},"provisioner":"csi-mock-csi-mock-volumes-fsgroup-policy-5610","reclaimPolicy":"Delete","volumeBindingMode":"Immediate"},"responseObject":{"kind":"StorageClass","apiVersion":"storage.k8s.io/v1","metadata":{"name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","uid":"7d51a2f0-fe79-4ddf-8bab-3480120c7f76","resourceVersion":"9335","creationTimestamp":"2024-12-23T04:01:01Z","managedFields":[{"manager":"e2e.test","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-12-23T04:01:01Z","fieldsType":"FieldsV1","fieldsV1":{"f:provisioner":{},"f:reclaimPolicy":{},"f:volumeBindingMode":{}}}]},"provisioner":"csi-mock-csi-mock-volumes-fsgroup-policy-5610","reclaimPolicy":"Delete","volumeBindingMode":"Immediate"},"requestReceivedTimestamp":"2024-12-23T04:01:01.519483Z","stageTimestamp":"2024-12-23T04:01:01.546747Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"a939cc7c-5136-42b3-bfb4-bed1861afc22","stage":"ResponseComplete","requestURI":"/apis/storage.k8s.io/v1/storageclasses","verb":"create","user":{"username":"kubecfg","groups":["system:masters","system:authenticated"],"extra":{"authentication.kubernetes.io/credential-id":["X509SHA256=51b211b47046b0c4c58738fcbce2004fdee8a7b473a852fb305add4c13a8ab47"]}},"sourceIPs":["35.222.117.50"],"userAgent":"e2e.test/v1.33.0 (linux/amd64) kubernetes/4a1d1c8 -- [sig-storage] CSI Mock volume fsgroup policies CSI FSGroupPolicy Update [LinuxOnly] should update fsGroup if update from None to default","objectRef":{"resource":"storageclasses","name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","apiGroup":"storage.k8s.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"storageclasses.storage.k8s.io \"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610\" already exists","reason":"AlreadyExists","details":{"name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","group":"storage.k8s.io","kind":"storageclasses"},"code":409},"requestObject":{"kind":"StorageClass","apiVersion":"storage.k8s.io/v1","metadata":{"name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","creationTimestamp":null},"provisioner":"csi-mock-csi-mock-volumes-fsgroup-policy-5610","reclaimPolicy":"Delete","volumeBindingMode":"Immediate"},"responseObject":{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"storageclasses.storage.k8s.io \"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610\" already exists","reason":"AlreadyExists","details":{"name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","group":"storage.k8s.io","kind":"storageclasses"},"code":409},"requestReceivedTimestamp":"2024-12-23T04:02:56.788518Z","stageTimestamp":"2024-12-23T04:02:56.797622Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"1d518e9e-530e-4718-91ab-b5eaf1249c74","stage":"ResponseComplete","requestURI":"/apis/storage.k8s.io/v1/storageclasses/csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","verb":"delete","user":{"username":"kubecfg","groups":["system:masters","system:authenticated"],"extra":{"authentication.kubernetes.io/credential-id":["X509SHA256=51b211b47046b0c4c58738fcbce2004fdee8a7b473a852fb305add4c13a8ab47"]}},"sourceIPs":["35.222.117.50"],"userAgent":"e2e.test/v1.33.0 (linux/amd64) kubernetes/4a1d1c8 -- [sig-storage] CSI Mock volume fsgroup policies CSI FSGroupPolicy [LinuxOnly] should modify fsGroup if fsGroupPolicy=File","objectRef":{"resource":"storageclasses","name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","apiGroup":"storage.k8s.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestObject":{"kind":"DeleteOptions","apiVersion":"meta.k8s.io/__internal"},"responseObject":{"kind":"StorageClass","apiVersion":"storage.k8s.io/v1","metadata":{"name":"csi-mock-sc-csi-mock-volumes-fsgroup-policy-5610","uid":"7d51a2f0-fe79-4ddf-8bab-3480120c7f76","resourceVersion":"14592","creationTimestamp":"2024-12-23T04:01:01Z","managedFields":[{"manager":"e2e.test","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-12-23T04:01:01Z","fieldsType":"FieldsV1","fieldsV1":{"f:provisioner":{},"f:reclaimPolicy":{},"f:volumeBindingMode":{}}}]},"provisioner":"csi-mock-csi-mock-volumes-fsgroup-policy-5610","reclaimPolicy":"Delete","volumeBindingMode":"Immediate"},"requestReceivedTimestamp":"2024-12-23T04:02:59.974752Z","stageTimestamp":"2024-12-23T04:02:59.990925Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}

Test 1: "e2e.test/v1.33.0 (linux/amd64) kubernetes/4a1d1c8 -- [sig-storage] CSI Mock volume fsgroup policies CSI FSGroupPolicy [LinuxOnly] should modify fsGroup if fsGroupPolicy=File"

Test 2: "e2e.test/v1.33.0 (linux/amd64) kubernetes/4a1d1c8 -- [sig-storage] CSI Mock volume fsgroup policies CSI FSGroupPolicy Update [LinuxOnly] should update fsGroup if update from None to default"

The two tests create a storage class with the same name. The name is generated from the base name of the sc manifest file and the namespace provided by the framework. These tests share the same base name, and the same namespace is created before the It container is executed.

Timeline from the audit log:

Test 1:

  1. create sc: "requestReceivedTimestamp":"2024-12-23T04:01:01.519483Z","stageTimestamp":"2024-12-23T04:01:01.546747Z"
  2. delete namespace: "requestReceivedTimestamp":"2024-12-23T04:01:47.402392Z","stageTimestamp":"2024-12-23T04:01:47.447485Z"
  3. delete sc: "requestReceivedTimestamp":"2024-12-23T04:02:59.974752Z","stageTimestamp":"2024-12-23T04:02:59.990925Z"

Test 2:

  1. create namespace: "requestReceivedTimestamp":"2024-12-23T04:02:54.243583Z","stageTimestamp":"2024-12-23T04:02:54.310733Z" (namespace is re-created)
  2. create sc: "requestReceivedTimestamp":"2024-12-23T04:02:56.788518Z","stageTimestamp":"2024-12-23T04:02:56.797622Z" (sc is re-created before the previous sc is deleted)

DeferCleanup stack:

  1. utils.CreateFromManifests https://github.com/carlory/kubernetes/blob/fix-118037-2/test/e2e/storage/utils/create.go#L161
  2. generateDriverCleanupFunc https://github.com/carlory/kubernetes/blob/master/test/e2e/storage/drivers/csi.go#L1085

So the test namespace is deleted before the sc is deleted. This can explain the root cause of the question raised in #119431 (comment).
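
The ordering can be sketched with a small stand-in for the cleanup stack. This is not Ginkgo itself: `cleanupStack` is a hypothetical type that only mimics last-in-first-out execution, and it assumes the two entries above are listed in registration order, with the manifest cleanup deleting the sc and the driver cleanup deleting the namespace.

```go
package main

import "fmt"

// cleanupStack mimics LIFO cleanup: callbacks run in reverse registration order.
type cleanupStack struct {
	fns []func()
}

// deferCleanup registers a callback to run later.
func (s *cleanupStack) deferCleanup(fn func()) {
	s.fns = append(s.fns, fn)
}

// run executes the callbacks, most recently registered first.
func (s *cleanupStack) run() {
	for i := len(s.fns) - 1; i >= 0; i-- {
		s.fns[i]()
	}
}

// cleanupOrder registers the two cleanups in the assumed order and
// returns the order in which they actually run.
func cleanupOrder() []string {
	var order []string
	s := &cleanupStack{}
	// Registered first, when the manifests (including the sc) are created.
	s.deferCleanup(func() { order = append(order, "delete sc") })
	// Registered second, by the driver cleanup that also deletes the namespace.
	s.deferCleanup(func() { order = append(order, "delete namespace") })
	s.run()
	return order
}

func main() {
	fmt.Println(cleanupOrder())
}
```

Under these assumptions the namespace deletion runs first, so the namespace name is free for reuse while the cluster-scoped sc still exists.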

Which issue(s) this PR fixes:

Fixes #118037

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/flake Categorizes issue or PR as related to a flaky test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 23, 2024
@k8s-ci-robot

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/test sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 23, 2024
@carlory

carlory commented Dec 23, 2024

/cc @pohly

@k8s-ci-robot k8s-ci-robot requested a review from pohly December 23, 2024 09:29
ginkgo.By(fmt.Sprintf("deleting the test namespace: %s", testns))
// Delete the primary namespace but it's okay to fail here because this namespace will
// also be deleted by framework.Aftereach hook
_ = tryFunc(func() { f.DeleteNamespace(ctx, testns) })

Not deleting this namespace here seems like the right fix to me.

Great persistence with tracking this one down, thanks @carlory!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 8, 2025
@k8s-ci-robot

LGTM label has been added.

Git tree hash: 4e7538bb444a437a1423c98d299e05b11ca3c052

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carlory, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 8, 2025
@k8s-ci-robot k8s-ci-robot merged commit 4ab6035 into kubernetes:master Jan 8, 2025
17 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.33 milestone Jan 8, 2025
@carlory carlory deleted the fix-118037-2 branch January 8, 2025 08:35