Description
Describe the bug
OTP26.0-rc2 includes a change that optimises erlang:port_command. This change, while certainly appreciated, breaks applications that use erlang:port_command as a workaround for the selective receive issue in gen_tcp, as explained in https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit_common/src/rabbit_writer.erl#L388-L405. RabbitMQ can't handle any connections if started on OTP26 (of course, at this point we are not declaring it OTP26-compatible, so users shouldn't do that, but we are looking for a way to ship releases that work on both 25 and 26, and a compile-time check is insufficient here).
Using erlang:port_command like that might be considered unsupported, but we know that at least the following projects do it:
RabbitMQ
emqtt
Erlang PostgreSQL Database Client
Therefore, this is a breaking change for at least some applications out there.
We'd appreciate some help or guidance on how to resolve this.
- Is the whole "erlang:port_command instead of gen_tcp:send" approach a performance myth that should be retired?
- Can erlang:port_command in OTP26 be made backwards compatible, so that apps like the above still work?
- If not, I guess it should at least be added to the Potential Incompatibilities list.
To Reproduce
side note: the following test case was written by AI :)
-module(send_tcp).
-export([start/0]).

start() ->
    {ok, Socket} = gen_tcp:connect("localhost", 3456, [binary, {packet, 0}]),
    erlang:port_command(Socket, "Hello from erlang:port_command"),
    gen_tcp:close(Socket).
This exits with einval (I have nc -l 3456 running in another terminal):
1> send_tcp:start().
** exception exit: einval
Expected behavior
Ideally: erlang:port_command would be backwards compatible while still optimised, but I guess that's not an option.
Alternatively:
- It'd behave the old way in some cases (backwards compatible, but not optimised in some cases?)
- Backport the optimisation to OTP25 so that the updated code works well with older OTP versions
- Mention this change in the potential incompatibilities list
Affected versions
OTP26.0-rc2 and newer
the commit that changed the behaviour: f17f802#diff-86e7e31a3a23b92ea754ad259d8ffd6eb347deb522218fae1803878bf9d15e8b
Additional context
slack thread:
https://erlanger.slack.com/archives/C055DJA49/p1681483290185709
Activity
rickard-green commented on Apr 18, 2023
This change does not change port_command() at all. This is a change in the internal protocol between prim_inet and the inet-driver, which is also completely backwards compatible. I don't see why it should be mentioned as being incompatible when it is not. Misusing the functionality by issuing port_command() operations like that on your own is not supported.
jhogberg commented on Apr 18, 2023
How about?
seriyps commented on Apr 18, 2023
In epgsql, the direct port call was implemented (the idea was borrowed from RabbitMQ) because epgsql tends to accumulate a large message queue (due to active=true by default), and pre-OTP26 gen_tcp:send had performance issues when the process calling it has a long message queue.
I think in epgsql we will use -if(?OTP_RELEASE >= 26). ... use normal gen_tcp:send -else. ... use port_command ... -endif.
However, even on pre-OTP26, gen_tcp:send should become less of a problem, since we now support active=N mode.
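A minimal sketch of the compile-time switch described above. The module name tcp_send_compile_time and the return-value mapping are assumptions for illustration, not epgsql's actual code. Note that the choice is fixed when the module is compiled, so a release built on OTP 25 and run on OTP 26 would still take the port_command branch, which is why the original report calls a compile-time check insufficient for RabbitMQ.

```erlang
%% Hypothetical module: picks the send path at compile time.
-module(tcp_send_compile_time).
-export([send/2]).

-if(?OTP_RELEASE >= 26).
%% OTP 26 and newer: use the supported API.
send(Socket, Data) ->
    gen_tcp:send(Socket, Data).
-else.
%% Pre-OTP 26 workaround: write to the port directly.
%% Here ok only means the data was handed to the driver; to our
%% understanding the driver reports the result asynchronously as an
%% {inet_reply, Socket, Status} message that the caller has to handle.
send(Socket, Data) ->
    try erlang:port_command(Socket, Data) of
        true -> ok
    catch
        error:Reason -> {error, Reason}
    end.
-endif.
```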
mkuratczyk commented on Apr 19, 2023
Thanks for the suggestion with persistent_term. We've been testing it since yesterday and it does the trick and we can't find any measurable cost of this lookup.
I'm happy to close the issue. I hope not too many apps will be in a similar position when upgrading to OTP26. We certainly need to update the network-related parts of RabbitMQ as they are 10+ years old for the most part.
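The code from the persistent_term suggestion is not preserved in this thread, so the following is only a sketch of how such a runtime check might look, not the snippet that was proposed or the code RabbitMQ shipped. The module name tcp_send_compat, the persistent_term key and the helper use_gen_tcp_send/0 are hypothetical. The idea is to decide once, at runtime, which send path to use and to cache the answer in persistent_term so that later lookups are cheap.

```erlang
%% Hypothetical module: picks the send path at runtime, so one build
%% can run on both OTP 25 and OTP 26.
-module(tcp_send_compat).
-export([send/2]).

%% Assumed key name; any unique term would do.
-define(KEY, {?MODULE, use_gen_tcp_send}).

send(Socket, Data) ->
    case use_gen_tcp_send() of
        true ->
            gen_tcp:send(Socket, Data);
        false ->
            %% Pre-OTP 26 workaround. Here ok only means the data was
            %% handed to the driver; to our understanding the result
            %% arrives later as {inet_reply, Socket, Status}.
            try erlang:port_command(Socket, Data) of
                true -> ok
            catch
                error:Reason -> {error, Reason}
            end
    end.

%% Cache the decision so the release string is parsed only once.
use_gen_tcp_send() ->
    case persistent_term:get(?KEY, undefined) of
        undefined ->
            Release = list_to_integer(erlang:system_info(otp_release)),
            Flag = Release >= 26,
            persistent_term:put(?KEY, Flag),
            Flag;
        Flag ->
            Flag
    end.
```

Several processes may race to compute the flag on first use, but they all store the same value, so the race is harmless in this sketch.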
Make tcp send OTP 26 compatible