The documentation does not warn of unsynchronized access to Out_channel operations, as they are generally protected by internal channel locks.
The documentation for close mentions that flush should not raise Sys_error when called on a closed channel but, strictly speaking, does not specify the behaviour of flush in parallel with close:
val close : t -> unit
(** Close the given channel, flushing all buffered write operations. Output
    functions raise a [Sys_error] exception when they are applied to a closed
    output channel, except {!close} and {!flush}, which do nothing when applied
    to an already closed channel. Note that {!close} may raise [Sys_error] if
    the operating system signals an error when flushing or closing. *)
The problem is that flush on a still-open channel ends up in caml_flush_partial, which calls check_pending, which in turn may temporarily unlock the channel to process pending actions. This creates a small window for caml_ml_close_channel to take the lock, close the underlying file descriptor, and offset channel->curr by 1 from channel->buff into dummy_buff. As a result, when caml_flush_partial resumes, it attempts to output 1 character, fails, and raises an exception.
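To make the interleaving concrete, here is a minimal sketch of the kind of program that exercises it. It is not the reproducer from this report: race_once, the temporary file, and the iteration count are all made up for illustration, and it assumes OCaml 5 (for Domain). The window is small, so a given run may well print nothing.

```ocaml
(* Hypothetical sketch, not the reproducer from the report.
   Assumes OCaml 5 (Domain) and a writable temporary directory. *)

(* Run one flush/close race on a fresh channel and report, for each side,
   the Sys_error message if one was raised. *)
let race_once () =
  let path = Filename.temp_file "flush_close" ".tmp" in
  let oc = Out_channel.open_text path in
  Out_channel.output_string oc "some buffered data";
  (* One domain flushes while another closes the same channel. *)
  let flusher =
    Domain.spawn (fun () ->
      match Out_channel.flush oc with
      | () -> None
      | exception Sys_error msg -> Some msg)
  in
  let closer =
    Domain.spawn (fun () ->
      match Out_channel.close oc with
      | () -> None
      | exception Sys_error msg -> Some msg)
  in
  let flush_result = Domain.join flusher in
  let close_result = Domain.join closer in
  Sys.remove path;
  (flush_result, close_result)

let () =
  (* The iteration count is arbitrary; the race may never trigger. *)
  for i = 1 to 200 do
    match race_once () with
    | Some msg, _ ->
      Printf.printf "iteration %d: flush raised Sys_error %S\n" i msg
    | None, Some msg ->
      Printf.printf "iteration %d: close raised Sys_error %S\n" i msg
    | None, None -> ()
  done
```

Both outcomes are caught rather than propagated so that a single run can report how often each side fails, if at all.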
Before jumping into discussions of fixes, we may want to discuss how to proceed.
Here are a few suggestions:
1. Do nothing, as this is misusing the Stdlib and/or invoking unspecified behaviour.
2. Update the documentation to warn of this behaviour.
3. Patch the code to prevent the exception. This option is delicate, as caml_flush_partial is simultaneously responsible for raising an exception for its other callers. After a bit of experimentation, my best bet would be to wrap a handler around flush on the OCaml side in stdlib.ml, which shouldn't be a performance bottleneck.
4. ...
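For illustration, the handler idea in the third suggestion could look roughly like the following user-level wrapper. This is only a sketch under assumptions: flush_tolerating_close is a made-up name, it is not the stdlib.ml patch itself, and as written it swallows every Sys_error from flush, including genuine I/O errors on an open channel, which an actual fix would have to tell apart.

```ocaml
(* Hypothetical user-level helper, not part of the Stdlib and not the
   proposed stdlib.ml change. Caveat: it hides *all* Sys_error exceptions
   raised by flush, not only those caused by a concurrent close. *)
let flush_tolerating_close (oc : Out_channel.t) : unit =
  try Out_channel.flush oc with Sys_error _ -> ()

let () =
  let path = Filename.temp_file "flush_wrap" ".tmp" in
  let oc = Out_channel.open_text path in
  Out_channel.output_string oc "hello";
  (* On an open channel the wrapper behaves like a plain flush. *)
  flush_tolerating_close oc;
  Out_channel.close oc;
  (* On an already closed channel flush is documented to do nothing. *)
  flush_tolerating_close oc;
  let contents = In_channel.with_open_text path In_channel.input_all in
  assert (contents = "hello");
  Sys.remove path
```

The wrapper lives entirely on the OCaml side, so the C code in caml_flush_partial keeps raising for its other callers.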
Note: the above example program can easily be modified to race on two closes, also resulting in a Sys_error. The last sentence of the documentation could account for that behaviour, though.
Thanks to @ncik-roberts for figuring out the above explanation.
We've encountered that Stdlib/Out_channel.flush may raise a Sys_error exception when used in parallel with a close.
Consider this reproducer program:
with this behaviour:
CC to @damiendoligez who last had his fingers in these parts in #12678