Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bin/test-cleanup: Delete test namespaces in paralllel. #1339

Merged
merged 6 commits into from
Jul 27, 2018

Conversation

briansmith
Copy link
Contributor

@briansmith briansmith commented Jul 17, 2018

My kubectl (1.11.1) waits until the kubectl delete operation completes,
unlike previous versions which issue the deletion request and then
immediate return. As a result of this change, bin/test-cleanup become
much slower than it used to be in my configuration. Fix this by issuing
all the deletions in parallel with a single kubectl delete.

Also, make bin/test-cleanup exit with a failure status code if no
namespaces are deleted. Previously, with kubectl 1.11.1, the script
would exit with a success error code when no namespaces were
deleted.

Signed-off-by: Brian Smith brian@briansmith.org

@briansmith briansmith self-assigned this Jul 17, 2018
@briansmith briansmith requested a review from klingerf July 17, 2018 02:44
Copy link
Contributor

@klingerf klingerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, but I had just one small change request.

bin/test-cleanup Outdated
for ns in $(kubectl get ns | grep "^$linkerd_namespace-" | cut -f1 -d ' '); do
kubectl delete ns "$ns"
done
kubectl get ns | grep "^$linkerd_namespace-" | cut -f1 -d ' ' | xargs kubectl delete ns "$linkerd_namespace"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command won't return an error if the linkerd namespace does not exist. It also won't delete the linkerd namespace unless at least one linkerd- namespace exists. Would you mind splitting it up into two separate commands?

kubectl delete ns "$linkerd_namespace"
kubectl get ns | grep "^$linkerd_namespace-" | cut -f1 -d ' ' | xargs kubectl delete ns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that then kubectl won't start deleting the linkerd-* namespaces until after linkerd is completely deleted. Instead I will try using a different grep pattern that will match linkerd as well as linkerd-*.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that. If you just modify the grep pattern, however, then the script still won't return an error if the linkerd namespace does not exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. If the linkerd namespace doesn't exist but some linkerd-* namespaces exist (e.g. because a previous invocation was interrupted or otherwise failed) then I would want the script to delete the linkerd-* namespaces and return 0. If there are no namespaces that match then I think returning 0 makes sense too, because then you can always do bin/test-cleanup && bin/test-run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I often run tests in multiple different namespaces and sometimes forget which namespace I'm trying to cleanup. The script as written will return an error if I try to cleanup a namespace that doesn't exist, which then prompts me to cleanup the correct namespace. Also, I'm running kubectl v1.9.7 and I don't see the delay issue that you're seeing. Maybe it's a regression?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script as written will return an error if I try to cleanup a namespace that doesn't exist, which then prompts me to cleanup the correct namespace.

I verified that the updated script will exit with a failure exit code (and prints a message) if there is no linkerd namespace and there are no linkerd-* namespaces. I think this is the right behavior; if there are linkerd-* namespaces then they should be deleted even if there is no linkerd namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm running kubectl v1.9.7 and I don't see the delay issue that you're seeing. Maybe it's a regression?

IDK. It might be a configuration setting on my part. However, the waiting behavior is very useful so I'd like to accomodate it.

bin/test-cleanup Outdated
done
# `kubectl delete ns` will fail if there are no matching matching namespaces,
# causing this script to exit with an error status.
kubectl get ns | cut -f1 -d ' ' | grep -E "^$linkerd_namespace(-|$)" | xargs kubectl delete ns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating! This still doesn't return an error for me if there are no matching namespaces:

$ ./bin/test-cleanup asdf; echo $?
cleaning up namespace [asdf] and associated test namespaces
0

I think the issue is that my version of kubectl behaves like this:

$ echo "" | xargs kubectl delete ns; echo $?
0

@briansmith
Copy link
Contributor Author

briansmith commented Jul 25, 2018

kubectl v1.9.7

I believe the reason you're seeing different behavior than me is because you are using an old version of kubectl. I am using v1.11.1:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.4", GitCommit:"bee2d1505c4fe820744d26d41ecd3fdd4a3d6546", GitTreeState:"clean", BuildDate:"2018-03-12T16:21:35Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

The default behavior of kubectl delete changed in kubernetes/kubernetes#64034.

So, I suggest we solve this by keeping the PR as-is and adding a note to TEST.md that one should use kubectl 1.11.1 or later.

@briansmith
Copy link
Contributor Author

I added the note to TEST.md.

I also tested the behavior of bin/test-cleanup on the master branch (without this PR) and found that it actually is broken w.r.t. the exit code:

$ bin/test-cleanup asdfasdf && echo "it worked"
cleaning up namespace [asdfasdf] and associated test namespaces
Error from server (NotFound): namespaces "asdfasdf" not found
it worked

So this PR actually fixes a bug, as well. I've updated the PR description accordingly.

TEST.md Outdated
@@ -84,6 +84,7 @@ cluster. Prior to running the test suite, verify that:

- The Conduit docker images you're trying to test have been built and are
accessible to the Kubernetes cluster to which you are deploying
- `kubectl version` reports the client version as v1.11.1 or later.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use gcloud to manage my kubectl installation, and gcloud's version of kubectl is at v1.9.7 currently (release notes). I have a strong preference for our minimum required version to not exceed the version that's available via gcloud.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a strong preference for our minimum required version to not exceed the version that's available via gcloud.

What's the rationale?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale?

Never mind.

I just installed kubectl through gcloud and the version is "v1.11.0."

Copy link
Contributor

@klingerf klingerf Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My rationale is that we shouldn't require the latest available version of kubectl to delete a few namespaces; I would rather update the script so that it works regardless of kubectl version.

I think something like this would work:

if ! namespaces=$(kubectl get ns -oname | grep "/$linkerd_namespace"); then
  echo "no namespaces found for [$linkerd_namespace]" >&2
  exit 1
else
  kubectl delete $namespaces
fi

That will exit with an error if no namespaces exist, and otherwise it will delete all namespaces in parallel.

My kubectl (1.9.4) waits until the `kubectl delete` operation completes,
unlike previous versions which issue the deletion request and then
immediate return. As a result of this change, bin/test-cleanup become
much slower since each deletion was happening serially. Fix this by
issuing all the deletions in parallel with a single `kubectl delete`.

Signed-off-by: Brian Smith <brian@briansmith.org>
Signed-off-by: Brian Smith <brian@briansmith.org>
Signed-off-by: Brian Smith <brian@briansmith.org>
This reverts commit a9ebd47.

Signed-off-by: Brian Smith <brian@briansmith.org>
@briansmith briansmith force-pushed the b/test-cleanup-parallel branch from a9ebd47 to 285614d Compare July 27, 2018 01:18
Copy link
Contributor

@klingerf klingerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating. I had just two follow-up comments, and it looks like your most recent commit is missing the DCO signoff.

bin/test-cleanup Outdated
done
# `kubectl delete ns` will fail if there are no matching matching namespaces,
# causing this script to exit with an error status.
if ! namespaces=$(kubectl get ns | cut -f1 -d ' ' | grep -E "^$linkerd_namespace(-|$)"); then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use kubectl get ns -o name here, there's no need to cut the output to extract the name field. Note that the name output format also includes a resource type, and the resource type changed from plural to singular in kubectl 1.10 (e.g. namespace/linkerd instead of namespaces/linkerd). But this should work:

if ! namespaces=$(kubectl get ns -oname | grep -E "/$linkerd_namespace(-|$)"); then

bin/test-cleanup Outdated
kubectl delete ns "$ns"
done
# `kubectl delete ns` will fail if there are no matching matching namespaces,
# causing this script to exit with an error status.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is no longer relevant.

Signed-off-by: Brian Smith <brian@briansmith.org>
Signed-off-by: Brian Smith <brian@briansmith.org>
@briansmith briansmith force-pushed the b/test-cleanup-parallel branch from 285614d to 3d04dee Compare July 27, 2018 20:27
@briansmith
Copy link
Contributor Author

I made the changes you suggested.

Copy link
Contributor

@klingerf klingerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⭐️ Great, thanks for updating. Looks good to me.

@briansmith briansmith merged commit 7467e36 into master Jul 27, 2018
@briansmith briansmith deleted the b/test-cleanup-parallel branch July 27, 2018 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants