-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing hang in archive.CopyFileWithTar with invalid dst #17088
Conversation
LGTM |
@swernli Can we also have some tests, please? It looks like it's a severe enough issue to make it worth the effort to add a test. |
@swernli I agree with @unclejack, if you could add a test that'd be fantastic. |
@unclejack @tiborvass I spent a day or so working on a test, but found that the issue is much harder to repro in Linux than on Windows, and these tests do not yet run on Windows. Perhaps I can upload what I have and you can advise the right pattern to use? The issue is that in order to get the problem to repro, the destination of the copy has to be invalid somehow. Linux doesn't have the strict folder naming that Windows does, so that leaves trying to copy into a folder that is inaccessible. Unfortunately, the tests appear to run as root, meaning I can't really create a folder that I can't access. I was going to look for a way to set the user ID to non-root for this part of the test to see if that would work. Another concern is that when the failure does occur, since it causes a deadlock, go sees that no threads are progressing, and then crashes the test with a stack trace instead of giving a more reasonable test failure. Is this acceptable in a CI test? The output on my machine looks weird, since I only see messages for the previous test and then suddenly a crash, without even a clear indication that it was my test that caused it. |
79a050a
to
bc4dbfb
Compare
@unclejack @tiborvass Ok, I think I finally have a unit test that accurately reproduces the bug without my changes and confirms the fix with my changes. Any feedback on the test would be appreciated! |
bc4dbfb
to
dea15da
Compare
@swernli it looks like that test doesn't pass on janky 😔 |
@calavera Are the tests run as root? I'm trying to mark a directory as immutable using chattr, but it seems like that is the step that's failing. This works fine if I run the tests manually, but I'm not familiar with the environment the tests run in for janky. Should I expect chattr to work there? |
bf2e44f
to
6fba626
Compare
@tiborvass @calavera I really need to get more clarity on how these tests are run in order to get a working test. I need to be able to create a directory that I cannot copy into. The solutions I've tried that work locally are creating the folder with 0000 permissions (which only works as non-root users, since root has access to the folder anyway) and modifying the folder using chattr so that it is immutable (which fails in your test environment either because chattr is unavailable or blocked). So far every strategy that I've tried for testing this only works locally, but not in your test environment. Since this is much easier to repro on Windows (simply pick an invalid directory name, like "not:valid") perhaps this is a unit test better left for when we actually have Windows tests for archive? |
that makes sense, we can move that test to |
Yeah that makes sense. |
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
6fba626
to
a150eee
Compare
@tiborvass @calavera Ok, what I've done instead is add the test only in archive_windows_test.go. Right now the tests in this folder don't actually compile on Windows, but if I comment enough of them out so that this one actually gets run, it correctly tests the scenario. When we eventually get to this folder in our Windows CI work, we'll already have at least one working test in place! |
That sounds perfect. The more we can work towards that goal the better 😸 LGTM |
Fixing hang in archive.CopyFileWithTar with invalid dst
Thanks! |
@jhowardmsft @jstarks @vbatts @tiborvass
archive.CopyFileWithTar can hit a hang if the destination path is invalid (either an invalid path name or an inaccessible folder), because archive.Untar will return the error without closing the Reader pipe but after CopyFileWithTar has started the go func that is waiting on the pipe. The fix here is to check the error of archive.Untar and if it failed, explicitly close the pipe to unblock the thread waiting on it.
This issue is hard to hit in Linux, since the unit tests run as root and there aren't any invalid directory names we could give, but in Windows simply including "ADD myfile.txt /invalid:directory/" in a Dockerfile would cause the build to hang and the handles to the files in the layer to be leaked when the daemon is killed.
Signed-off-by: Stefan J. Wernli swernli@microsoft.com