Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add regression tests for sandbox pods when no CNI plugins are initialized #10771

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nabeelmohamed
Copy link

What is the problem which i am trying to solve?

As mentioned in PR #10744, when no CNI plugins are initialized, the pod network teardown within sandbox_run.go gets deferred and fails, causing NetNS removal to get blocked. This results in a failed pod creation that cannot be cleaned up.

Describing the solution

To address issue #10768, I've added a regression test to sandbox_run_rollback_test.go to ensure that the sandbox creation fails with the expected error message when no CNI plugins are initialized, as identified in PR #10744. @sameersaeed @mikebrow

@k8s-ci-robot
Copy link

Hi @nabeelmohamed. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

sbConfig := PodSandboxConfig(t.Name(), "nocni", WithPodLabels(labels))

t.Log("Inject CNI plugin initialization failpoint")
injectCNIFailpoint(t, sbConfig, map[string]string{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can make the following change to resolve the syntax error:

conf := &failpointConf{
		Add: "1*error(no CNI plugins initialized))",
}
injectCNIFailpoint(t, sbConfig, conf)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you also please run gofmt -w -s <containerd-path>/integration/sandbox_run_rollback_test.go after making the necessary changes to resolve the linter error?

@sameersaeed
Copy link
Contributor

please also note that PR #10770 has been opened to resolve the issue of pod network teardown failures blocking cleanup

@nabeelmohamed
Copy link
Author

i think that error can be fixed by
t.Log("Inject CNI plugin initialization failpoint")
conf := &failpointConf{
Add: "1*error(no CNI plugins initialized)",
}
injectCNIFailpoint(t, sbConfig, conf)
adding these 'failpointconf' struct to this. since passing a map[string]string to the injectCNIFailpoint function, but it's expecting a pointer to a failpointConf struct

@mikebrow
Copy link
Member

Please rebase now that #10839 has been merged.

@mikebrow
Copy link
Member

@nabeelmohamed can you rebase your test and make it work correctly with the update?

@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from d587e55 to 71bf817 Compare October 23, 2024 04:41
@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from 6cfd2a3 to 37e9587 Compare October 23, 2024 04:53
@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch 4 times, most recently from 92e2b4a to bffc458 Compare October 23, 2024 06:30
@sameersaeed
Copy link
Contributor

Hi @nabeelmohamed, I left some suggestions on your most recent commit - please review and let me know if you have any questions!

t.Log("ListPodSandbox with the specific label")
l, err := runtimeService.ListPodSandbox(&criapiv1.PodSandboxFilter{LabelSelector: labels})
require.NoError(t, err)
require.Len(t, l, 1)
Copy link
Contributor

@sameersaeed sameersaeed Nov 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to make sure that no pods are up if the 'no CNI plugins initialized' error is thrown, so the number of pods returned by ListPodSandbox should be 0, not 1.

@nabeelmohamed
Copy link
Author

Hi @sameersaeed

I’ve made the changes as per your suggestions. Specifically, I’ve:

Removed the detailed PodSandbox state checks.

Updated the test to assert that zero pods are up when the 'no CNI plugins initialized' error is thrown.

Added cleanup logic to remove the pod used for this check.

Included assertions to ensure no pods are up before and after the containerd restart.

Thanks for your guidance. Please let me know if there are any further adjustments needed.

@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from 56db713 to b53fb1b Compare November 9, 2024 10:22
@sameersaeed
Copy link
Contributor

sameersaeed commented Nov 9, 2024

Hi @nabeelmohamed, thanks for making the requested changes. There are still some minor nitpicks I have with the current code based on my previous suggestions.

I've made some additional changes on my own branch. Could you please merge these changes onto your branch?

Please also make sure to rebase and run gofmt if needed, so that the tests pass. Thank you!

@sameersaeed
Copy link
Contributor

sameersaeed commented Nov 11, 2024

Thank you @nabeelmohamed! Just seems like the gofmt you ran in commit 0cbcd3a modified some unexpected files. We would only need to run it on the file we are changing for this test, sandbox_run_rollback_test.go

To fix this, could you please first revert to commit 7526ea1, (i.e. git reset --hard 7526ea1), and then squash the extra 2 commits that you have on this branch (i.e. git reset --soft HEAD~2)?

After doing the above, you can check if you need to run gofmt on sandbox_run_rollback_test.go by running gofmt -l integration from within your containerd directory. If you see the test file being outputted, you can run gofmt -s -w ./integration/sandbox_run_rollback_test.go to format the file before pushing. Otherwise, it should be fine to push without any further changes (i.e., git add . + git commit -s --amend + git push [...]).

One other minor thing - when running git commit -s --amend as mentioned above, in the commit message for 7526ea1, can you please remove any extra text after the Signed-off-by / Co-authored-by lines?

i.e., in the following, you can remove the line modified sandbox_run_roollback_test.go by merging and rebasing:

Add regression tests for sandbox pods when no CNI plugins are initialized

Co-authored-by: Mohamed Nabeel <137497525+nabeelmohamed@users.noreply.github.com>
Signed-off-by: Sameer <sameer.saeed@live.ca>

modified sandbox_run_roollback_test.go by merging and rebasing

@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from 0cbcd3a to a29f1f0 Compare November 12, 2024 04:55
@sameersaeed
Copy link
Contributor

/retest

@k8s-ci-robot
Copy link

@sameersaeed: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

require.NoError(t, err)
require.Len(t, l, 1)
sb := l[0]
require.Equal(t, criapiv1.PodSandboxState_SANDBOX_NOTREADY, sb.State)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can leave out the extra checks. We should just have an assert to pass the test if we see that 0 pods are up after checking the output of ListPodSandbox, and fail it if that isn't the case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 328 - 329, 354 - 381 and 386 - 390 can be removed. We just need to make sure no pods come up and, in the case that this test fails, remove any pods that may get created.

Calling runtimeService.RemovePodSandbox(sb.Id) may also throw an error if no pods are up. It would instead be better to only make the call if the number of pods outputted by ListPodSandbox is greater than 0.

i.e.,

l, err := runtimeService.ListPodSandbox(&criapiv1.PodFilter{LabelSelector: labels})
require.Len(t, l, 0)

if len(l) > 0 {
  t.Log("Cleanup leaky pod(s)")
  cleanupPods(t, runtimeService)
}

injectCNIFailpoint(t, sbConfig, conf)

// Continue with the rest of your test...
t.Log("Create a sandbox")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change the log message to something like: "Create a sandbox pod while having no CNI plugins initialized"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also remove unnecessary comments on lines 337 and 343

require.Error(t, err)
require.ErrorContains(t, err, "no CNI plugins initialized")

t.Log("ListPodSandbox with the specific label")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also change this log message to something like: "No pods should be up, as CNI failure is expected clean up the pod that was attempted to be created"

t.Log("No pods should be up, as CNI failure is expected to clean up the pod that was attempted to be created")
l, err := runtimeService.ListPodSandbox(&criapiv1.PodSandboxFilter{LabelSelector: labels})
require.NoError(t, err)
require.Len(t, l, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From looking over again, it would probably be better to change this require on L349 to an assert, i.e.:
assert.Len(t, l, 0)

With this change, the test will not immediately exit in a failing case, so we can properly cleanup the pod(s) if needed before finishing the test

Please also rebase if necessary and make sure to use git commit -s --amend for the commit message

@mikebrow
Copy link
Member

/ok-to-test

@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch 2 times, most recently from 3b25598 to 4cd3211 Compare November 13, 2024 05:41
@sameersaeed
Copy link
Contributor

Hi @nabeelmohamed, thanks for making the requested changes - seems like all the tests are passing as expected now.

Could you please rebase once again? Please also make sure to set the commit message using git -s --amend once again, as was mentioned in my previous post

@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from f2adeb2 to f562d30 Compare November 21, 2024 04:48
@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from f562d30 to 4f647b3 Compare November 21, 2024 05:39
@nabeelmohamed
Copy link
Author

hey @sameersaeed ,i dont know why all of a sudden there is 5 failing test which is not in the previos check. all i done is rebase with upstream and force pushed with commit amend message.can you guide me....

@sameersaeed
Copy link
Contributor

sameersaeed commented Nov 22, 2024

Hi @nabeelmohamed, it seems that there were failures in the latest CI run that are not related to the changes from this PR's test code. You can rerun the tests with the /retest command if needed.

Before you do so, could you please use git commit -s --amend to set modify the commit message to the following?

Add regression tests for sandbox pods when no CNI plugins are initialized

Co-authored-by: Mohamed Nabeel <137497525+nabeelmohamed@users.noreply.github.com>
Signed-off-by: Sameer <sameer.saeed@live.ca>

@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from 117ca1c to 5ac091f Compare November 23, 2024 10:24
@sameersaeed
Copy link
Contributor

Hi @nabeelmohamed, from checking your most recent commit (5ac091f), it seems that the commit message is still set as the following:

Add regression tests for sandbox pods when no CNI plugins are initialized
Co-authored-by: Mohamed Nabeel <137497525+nabeelmohamed@users.noreply.github.com>

Signed-off-by: nabeelmohamed <mohamednabeel321@gmail.com>

As mentioned, could you please change it to the following by using git commit -s --amend and editing the commit message?

Add regression tests for sandbox pods when no CNI plugins are initialized

Co-authored-by: Sameer <sameer.saeed@live.ca>
Signed-off-by: nabeelmohamed <mohamednabeel321@gmail.com>

@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from 5ac091f to 7ace914 Compare November 25, 2024 04:52
@sameersaeed
Copy link
Contributor

Hi @nabeelmohamed - sorry for the delay, I didn't receive a notification on your most recent rebase

Could you please rebase once again and tag me or @mikebrow after doing so? Thank you

…ized

Co-authored-by: Sameer <sameer.saeed@live.ca>

Signed-off-by: nabeelmohamed <mohamednabeel321@gmail.com>
@nabeelmohamed nabeelmohamed force-pushed the Adding-regression-tests-for-sandbox-pods-when-no-CNI-plugins-are-initialized branch from 2bf4346 to fb33422 Compare December 4, 2024 05:00
@nabeelmohamed
Copy link
Author

Hi @nabeelmohamed - sorry for the delay, I didn't receive a notification on your most recent rebase

Could you please rebase once again and tag me or @mikebrow after doing so? Thank you

hey @sameersaeed just rebased and force pushed. also I didn't forget to use git commit --amend command . could you please review this

@sameersaeed
Copy link
Contributor

/retest

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants