Closed
Description
It happens occasionally, and the problem disappears after restarting the Node.js process.
- Version: 1.23.2
- Platform: Linux n18-035-207 4.4.0-33.bm.1-amd64 #1 SMP Wed, 24 Jan 2018 15:50:58 +0800 x86_64 GNU/Linux
- Node.js: v8.15.0
$ strace -p PID
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 123, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 123, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
epoll_ctl(4, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=69435872986005521}}) = 0
epoll_pwait(4, {{EPOLLIN|EPOLLHUP, {u32=17, u64=69435872986005521}}}, 1024, 122, NULL, 8) = 1
^CProcess 517 detached
$ lsof -d 17
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
node 517 root 17u unix 0xffff881057cb2000 0t0 760434497 socket
leeight commented on Jan 25, 2019
More info after gcore PID
leeight commented on Jan 25, 2019
saghul commented on Jan 25, 2019
Do you have a (small) test case to reproduce this?
leeight commented on Jan 25, 2019
@saghul Not yet. If I can reproduce it, I'll post it ASAP.
leeight commented on Jan 30, 2019
@saghul The following code can reproduce it.
client.js
server.js
Steps to reproduce:
Now the CPU usage is 100%.
santigimeno commented on Jan 30, 2019
It seems the issue is that, when paused, node doesn't read from the socket so the EOF cannot be detected. The reason for this is described in nodejs/node-v0.x-archive#8200, though tbh, I'm not sure about it. I would raise the issue in the nodejs repository and see how it goes.
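To illustrate the point about EOF, here is a minimal sketch that uses Python's select.epoll and socket.socketpair on Linux as a stand-in for libuv and Node. That substitution is an assumption made for brevity, and the function name eof_needs_a_read is hypothetical. It suggests that polling alone only reports readiness; the closed peer is seen as EOF only when the socket is actually read.

```python
import select
import socket


def eof_needs_a_read():
    """Show that EOF is observed by reading, not by polling alone."""
    reader, peer = socket.socketpair()  # connected AF_UNIX stream sockets
    ep = select.epoll()
    ep.register(reader.fileno(), select.EPOLLIN)

    peer.send(b"bye")
    peer.close()  # the peer hangs up after sending its last bytes

    # Polling reports that the socket is ready, nothing more.
    ready_before_read = bool(ep.poll(0.1))

    data = reader.recv(1024)  # the pending payload
    eof = reader.recv(1024)   # an empty bytes object signals EOF

    # Once EOF has been seen, the reader can stop watching the descriptor.
    ep.unregister(reader.fileno())
    ready_after_unregister = bool(ep.poll(0.1))

    ep.close()
    reader.close()
    return ready_before_read, data, eof, ready_after_unregister
```

A paused stream never reaches the recv calls, so it never learns that the peer is gone while the descriptor stays registered.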
leeight commented on Jan 31, 2019
@santigimeno Ok, thanks
leeight commented on Jan 31, 2019
@santigimeno @saghul Removing the POLLIN flag will resolve this issue: https://github.com/libuv/libuv/blob/v1.x/src/unix/pipe.c#L218
oyyd commented on Jan 31, 2019
It seems that if we don't call uv_read_start() on a pipe socket, the epoll_pwait inside of uv__io_poll won't block the thread, so the loop iterates too frequently. For example, the client below will have high CPU usage if we don't call uv_read_start():
server:
client:
Since it can be reproduced with libuv alone, I guess it's not simply a Node issue.
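The non-blocking wait described above can be observed at the kernel level without libuv. The following is a rough sketch, assuming Linux and using Python's select.epoll in place of uv__io_poll; the function name count_immediate_wakeups is hypothetical. A descriptor registered for EPOLLIN that is never read stays ready, so every wait returns at once instead of sleeping for its timeout.

```python
import select
import socket
import time


def count_immediate_wakeups(rounds=1000, timeout_s=0.1):
    """Poll an unread socket repeatedly and count how often it is ready."""
    reader, peer = socket.socketpair()  # connected AF_UNIX stream sockets
    ep = select.epoll()
    ep.register(reader.fileno(), select.EPOLLIN)  # level-triggered by default

    peer.send(b"x")
    peer.close()  # hang up, as in the strace output (EPOLLIN|EPOLLHUP)

    wakeups = 0
    seen_mask = 0
    start = time.monotonic()
    for _ in range(rounds):
        events = ep.poll(timeout_s)  # would sleep timeout_s if nothing were ready
        if events:
            wakeups += 1
            seen_mask |= events[0][1]
        # Deliberately no recv() here: nothing ever consumes the data.
    elapsed = time.monotonic() - start

    ep.close()
    reader.close()
    return wakeups, seen_mask, elapsed


if __name__ == "__main__":
    wakeups, mask, elapsed = count_immediate_wakeups()
    print(wakeups, bool(mask & select.EPOLLHUP), round(elapsed, 3))
```

If each wait had blocked, 1000 rounds with a 0.1 s timeout would take about 100 seconds; a run that finishes almost instantly is the same busy loop seen in the strace output.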
tsyeyuanfeng commented on Jan 31, 2019
uv_pipe_connect registers the POLLIN event on the io watcher right after the pipe is connected, but at this moment stream->read_cb and stream->flags are not set.
If the socket receives data from the peer, epoll will trigger the POLLIN event. However, the related io watcher will fail to read the socket, because stream->read_cb is null and stream->flags is zero.
Since the data is not read, epoll keeps triggering the POLLIN event, which keeps the event loop spinning.
In contrast, uv__tcp_connect doesn't register the POLLIN event at the very start, so a TCP socket won't trigger this issue.
So I think it is reasonable to remove the POLLIN flag at uv_pipe_connect to resolve this issue.
santigimeno commented on Jan 31, 2019
Yes, it seems reasonable. Good find. Could you open a PR? Thanks
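The idea behind the proposed change, deferring POLLIN until something is ready to read, can be sketched as follows. This is not libuv's code or the eventual patch; it assumes Linux, uses Python's select.epoll, and the function name wakeups_without_reader is hypothetical. The peer is kept open on purpose, because epoll reports EPOLLHUP and EPOLLERR regardless of the requested mask.

```python
import select
import socket


def wakeups_without_reader(register_pollin, rounds=5, timeout_s=0.05):
    """Count wakeups while data is pending but no reader is attached."""
    sock, peer = socket.socketpair()  # connected AF_UNIX stream sockets
    ep = select.epoll()

    # Either ask for EPOLLIN right away (the behavior under discussion)
    # or register with an empty mask until a reader exists.
    ep.register(sock.fileno(), select.EPOLLIN if register_pollin else 0)

    peer.send(b"x")  # data arrives before anyone wants to read it

    wakeups = sum(1 for _ in range(rounds) if ep.poll(timeout_s))

    # A reader appears later (analogous to uv_read_start()): enable EPOLLIN now.
    ep.modify(sock.fileno(), select.EPOLLIN)
    readable_after_start = bool(ep.poll(timeout_s))
    data = sock.recv(1024)

    ep.close()
    sock.close()
    peer.close()
    return wakeups, readable_after_start, data
```

With register_pollin=True every wait wakes up although nothing consumes the data; with False the waits time out quietly, and the data is still delivered once EPOLLIN is enabled.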
tsyeyuanfeng commented on Jan 31, 2019
Ok