Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: test-thread-priority.c fails on Linux #4382

Open
nikneym opened this issue Apr 6, 2024 · 18 comments
Open

test: test-thread-priority.c fails on Linux #4382

nikneym opened this issue Apr 6, 2024 · 18 comments

Comments

@nikneym
Copy link

nikneym commented Apr 6, 2024

Hi, assertion fails at ASSERT_EQ(priority, 0) in the following:

#ifdef __linux__
  ASSERT_OK(uv_thread_getpriority(pthread_self(), &priority));
  ASSERT_EQ(priority, 0);
  ASSERT_OK(uv_thread_setpriority(pthread_self(), UV_THREAD_PRIORITY_LOWEST));
  ASSERT_OK(uv_thread_getpriority(pthread_self(), &priority));
  ASSERT_EQ(priority, (0 - UV_THREAD_PRIORITY_LOWEST * 2));
#endif

make check output:

not ok 355 - thread_priority
# exit code 134
# Output from process `thread_priority`:
# Assertion failed in test/test-thread-priority.c on line 92: `priority == 0` (6 == 0)
  • Version: HEAD
  • Platform: Linux 6.8.0-76060800daily20240311-generic x86_64 GNU/Linux
@bnoordhuis
Copy link
Member

The test makes the (perhaps somewhat naive) assumption that the main thread is created with a priority of 0. Do you happen to know what scheduling policy and priority were used when you ran the test?

The fix is probably quite simple:

diff --git a/test/test-thread-priority.c b/test/test-thread-priority.c
index 0aaf2977..769b485e 100644
--- a/test/test-thread-priority.c
+++ b/test/test-thread-priority.c
@@ -88,8 +88,6 @@ TEST_IMPL(thread_priority) {
  * test set nice value for the calling thread with default schedule policy
 */
 #ifdef __linux__
-  ASSERT_OK(uv_thread_getpriority(pthread_self(), &priority));
-  ASSERT_EQ(priority, 0);
   ASSERT_OK(uv_thread_setpriority(pthread_self(), UV_THREAD_PRIORITY_LOWEST));
   ASSERT_OK(uv_thread_getpriority(pthread_self(), &priority));
   ASSERT_EQ(priority, (0 - UV_THREAD_PRIORITY_LOWEST * 2));

@nikneym
Copy link
Author

nikneym commented Apr 7, 2024

The test makes the (perhaps somewhat naive) assumption that the main thread is created with a priority of 0. Do you happen to know what scheduling policy and priority were used when you ran the test?

I don't know how to do this. Where should I look while tests are running?

I've applied the diff and it fails with the following:

not ok 355 - thread_priority
# exit code 134
# Output from process `thread_priority`:
# Assertion failed in test/test-thread-priority.c on line 93: `uv_thread_setpriority(pthread_self(), UV_THREAD_PRIORITY_LOWEST) == 0` (-13 == 0)

@bnoordhuis
Copy link
Member

That -13 error suggests it's a permission error. Can you run strace -o trace.log -f build/uv_run_tests_a thread_priority thread_priority and post the contents of trace.log?

@nikneym
Copy link
Author

nikneym commented Apr 12, 2024

Hi, sorry for the late reply. Directory named build has not created actually, I'm using make to build and test things. Should I switch to cmake?

@bnoordhuis
Copy link
Member

No, that's okay. As long as you can strace the test.

@glaubitz
Copy link

The test makes the (perhaps somewhat naive) assumption that the main thread is created with a priority of 0. Do you happen to know what scheduling policy and priority were used when you ran the test?

The fix is probably quite simple:

diff --git a/test/test-thread-priority.c b/test/test-thread-priority.c
index 0aaf2977..769b485e 100644
--- a/test/test-thread-priority.c
+++ b/test/test-thread-priority.c
@@ -88,8 +88,6 @@ TEST_IMPL(thread_priority) {
  * test set nice value for the calling thread with default schedule policy
 */
 #ifdef __linux__
-  ASSERT_OK(uv_thread_getpriority(pthread_self(), &priority));
-  ASSERT_EQ(priority, 0);
   ASSERT_OK(uv_thread_setpriority(pthread_self(), UV_THREAD_PRIORITY_LOWEST));
   ASSERT_OK(uv_thread_getpriority(pthread_self(), &priority));
   ASSERT_EQ(priority, (0 - UV_THREAD_PRIORITY_LOWEST * 2));

This affects Debian as well, see: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071194#10

I agree that the assumption about nice == 0 should just be removed here.

@glaubitz
Copy link

Gentoo is also patching this line out: https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-libs/libuv/files/libuv-1.48.0-test-thread-priority-portage.patch

So, I would suggest removing these two asserts as making assumptions about the build environment is not really good idea.

@dod38fr
Copy link
Contributor

dod38fr commented Jun 19, 2024

ok, I'll patch this out from Debian package.

Thanks for the help

@bnoordhuis
Copy link
Member

If the ASSERT_EQ(priority, (0 - UV_THREAD_PRIORITY_LOWEST * 2)); assert trips up, that suggests the uv_thread_setpriority call isn't working, or isn't working properly. That's somewhat concerning and should be investigated.

@dod38fr
Copy link
Contributor

dod38fr commented Jun 20, 2024

We're not there yet.

I've patched out the lines:

  ASSERT_OK(uv_thread_getpriority(pthread_self(), &priority));
  ASSERT_EQ(priority, 0);

Test on hppa, loong64, powerpc, ppc64, sparc64 and x32 fail on the line after:

not ok 354 - thread_priority
# exit code 6
# Output from process `thread_priority`:
# Assertion failed in ./test/test-thread-priority.c on line 91: `uv_thread_setpriority(pthread_self(), UV_THREAD_PRIORITY_LOWEST) == 0` (-13 == 0)

See detailed logs in https://buildd.debian.org/status/package.php?p=libuv1

@bnoordhuis
Copy link
Member

-13 is EACCES and suggests the setpriority(2) system call refuses to change the process's nice value.

My guess is debian buildbots run thing at low niceness (would make sense for a batch system) and that's what's tripping up the test. If you can strace the test, it should be very easy confirm; you can trace the test with:

strace  <flags> ./build/uv_run_tests_a thread_priority thread_priority

I suppose libuv has a bug in that it sets niceness from -4 to +4 but the full range is -20 to +19.

@dod38fr
Copy link
Contributor

dod38fr commented Jun 25, 2024

Unfortunately, I cannot reproduce this problem on Debian ppc64 machines, so I cannot strace it.

@glaubitz
Copy link

glaubitz commented Jun 25, 2024 via email

@dod38fr
Copy link
Contributor

dod38fr commented Jun 25, 2024

Oops, I did not.

Here the strace output:

(sid_ppc64-dchroot)dod@perotto:~/libuv1-1.48.0$ nice strace ./obj-powerpc64-linux-gnu/uv_run_tests thread_priority
execve("./obj-powerpc64-linux-gnu/uv_run_tests", ["./obj-powerpc64-linux-gnu/uv_run"..., "thread_priority"], 0x7fffd5169808 /* 19 vars */) = 0
brk(NULL)                               = 0x10007280000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/home/dod/libuv1-1.48.0/obj-powerpc64-linux-gnu/libuv.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\0\0\0"..., 832) = 832
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1099664, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 394752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fffae520000
mmap(0x7fffae570000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x40000) = 0x7fffae570000
close(3)                                = 0
openat(AT_FDCWD, "/home/dod/libuv1-1.48.0/obj-powerpc64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=10991, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 10991, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fffae510000
close(3)                                = 0
openat(AT_FDCWD, "/lib/powerpc64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\3\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0!\235\320"..., 832) = 832
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=2306480, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 2353712, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fffae2d0000
mmap(0x7fffae4e0000, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x210000) = 0x7fffae4e0000
close(3)                                = 0
set_tid_address(0x7fffae623950)         = 2462874
set_robust_list(0x7fffae623960, 24)     = 0
rseq(0x7fffae623fa0, 0x20, 0, 0xfe5000b) = 0
mprotect(0x7fffae4e0000, 131072, PROT_READ) = 0
mprotect(0x7fffae570000, 65536, PROT_READ) = 0
mprotect(0x125700000, 65536, PROT_READ) = 0
mprotect(0x7fffae610000, 65536, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7fffae510000, 10991)           = 0
geteuid()                               = 3143
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[PIPE], sa_flags=SA_RESTART}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
getrandom("\x51\x38\x11\x19\x81\x82\x74\xcd", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x10007280000
brk(0x100072b0000)                      = 0x100072b0000
unlink("/tmp/uv-test-sock")             = -1 ENOENT (No such file or directory)
unlink("/tmp/uv-test-sock2")            = -1 ENOENT (No such file or directory)
unlink("/tmp/uv-test-sock3")            = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp", O_RDWR|O_EXCL|O_TMPFILE, 0600) = 3
fcntl(3, F_GETFL)                       = 0x414002 (flags O_RDWR|O_LARGEFILE|O_TMPFILE)
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fffae623950) = 2462875
pipe2([4, 5], 0)                        = 0
rt_sigaction(SIGRT_1, {sa_handler=0x7fffae4efd10, sa_mask=[], sa_flags=SA_ONSTACK|SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
mmap(NULL, 327680, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fffae280000
mprotect(0x7fffae290000, 262144, PROT_READ|PROT_WRITE) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8)   = 0
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fffae2cf1f0, parent_tid=0x7fffae2cf1f0, exit_signal=0, stack=0x7fffae280000, stack_size=0x4e970, tls=0x7fffae2d68e0} => {parent_tid=[2462876]}, 88) = 2462876
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
pselect6(5, [4], NULL, NULL, {tv_sec=5, tv_nsec=0}, NULL) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=2462875, si_uid=3143, si_status=SIGABRT, si_utime=0, si_stime=0} ---
pselect6(5, [4], NULL, NULL, {tv_sec=4, tv_nsec=998629413}, NULL) = 1 (in [4], left {tv_sec=4, tv_nsec=998626653})
close(4)                                = 0
close(5)                                = 0
write(1, "not ok 1 - thread_priority\n", 27not ok 1 - thread_priority
) = 27
write(1, "# exit code 6\n", 14# exit code 6
)         = 14
write(1, "# ", 2# )                       = 2
newfstatat(3, "", {st_mode=S_IFREG|0600, st_size=145, ...}, AT_EMPTY_PATH) = 0
write(1, "Output from process `thread_prio"..., 39Output from process `thread_priority`:
) = 39
newfstatat(3, "", {st_mode=S_IFREG|0600, st_size=145, ...}, AT_EMPTY_PATH) = 0
_llseek(3, 0, [0], SEEK_SET)            = 0
read(3, "Assertion failed in ./test/test-"..., 8192) = 145
read(3, "", 8192)                       = 0
write(1, "# ", 2# )                       = 2
write(1, "Assertion failed in ./test/test-"..., 144Assertion failed in ./test/test-thread-priority.c on line 91: `uv_thread_setpriority(pthread_self(), UV_THREAD_PRIORITY_LOWEST) == 0` (-13 == 0)) = 144
write(1, "\n", 1
)                       = 1
close(3)                                = 0
exit_group(6)                           = ?
+++ exited with 6 +++

@juanarbol
Copy link
Contributor

This is not longer failing.

repeat 3000 ./build/uv_run_tests_a thread_priority thread_priority
in simple_task
in simple_task
....
uname -a
Linux ramanujan 6.12.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 09 Dec 2024 14:31:57 +0000 x86_64 GNU/Linux

Should this kept opened?

@thesamesam
Copy link

Did you run it under a low niceness/chrt setting?

@saghul
Copy link
Member

saghul commented Dec 15, 2024

I'd move to close. If a system is configured in such a way our test suite fails it doesn't necessarily mean we need to change something.

The test makes a reasonable assumption about the default priority...

@bnoordhuis
Copy link
Member

Oh, I missed @dod38fr's reply... unfortunately, it's a trace of the test runner, not the test itself (that's why you have to specify thread_priority twice) so it doesn't help pinpoint the bug.

The test makes a reasonable assumption about the default priority...

I don't know if I'd call it "reasonable", more "works okay most of the time".

I'd at least remove or change that ASSERT_EQ(priority, 0) simply because its assumption is unfounded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants