-
-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GC-safety issue with writev pointers in File. #2775
Conversation
Prior to this commit, there was a garbage-collection safety issue with the way that the `_pending_writev` array worked in the `File` class. The pending writev data was being stored in a flat array with element type of `USize`, with the byte pointers being converted to type `USize` in order to be stored alongside the buffer length numbers in the same array. Storing these pointers as numbers meant that they are no longer traced by the garbage as being pointers, meaning that the garbage collector would try to free the underlying memory due to no longer being pointed to by any valid pointer. This PR fixes the issue by using an `Array[(Pointer[U8], USize)]` instead of an `Array[USize]`. This also eliminates some complexity in the code that was related to accessing the array data in blocks of 2 elements per logical element. This resolves part of #2526. We could resolve the issue entirely by doing the same fix for TCPConnection, though that may have some addiitonal complexity from needing to be compatible with IOCP on windows.
Nice @jemc |
@@ -478,28 +477,28 @@ class File | |||
let writev_batch_size = @pony_os_writev_max() | |||
while pending_total > 0 do | |||
// Determine the number of bytes and buffers to send. | |||
num_to_send = (_pending_writev.size().i32() / 2) - num_sent.i32() | |||
num_to_send = _pending_writev.size().i32() - num_sent.i32() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before I merge, can you explain the removal of /2
here and *2
etc elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, each io vector in the array took up two indices, one for the pointer itself and one for the size.
So, for example, if there were 4 io vectors, _pending_writev: Array[USize]
would be 8 elements long, with indices 1, 3, 5, & 7 being pointers (converted to USize
), and indices 2, 4, 6, & 8 being the associated sizes to those pointers.
After this change, if there are 4 io vectors, then _pending_writev: Array[(Pointer[U8] tag, USize)]
will be 4 elements long, with each element being a tuple tuple containing both parts that were previously encoded at separate indices in the array.
@jemc i left a question for you before merging. Mostly, I want to make sure I understand the change fully before hitting merge. Re: |
Yes, the idea is to remove the |
Opened #2779. |
Prior to this commit, the TCPConnection `pending_writev` buffer had to have a secondary buffer called `_pending` to ensure that the data wouldn't get GC'd too early by the runtime. This commit removes that hack and replaces it with a slightly different hack based on the strategy used in `File` in PR ponylang#2775. The hacky part is that now we have two buffers for `writev` data. One for windows (`_pending_windows`) and one for non-windows (`_pending_posix`) to account for the difference in order of the struct between windows and posix. This, however, seems less bad than the previous hack of having the secondary buffer to ensure GC safety. resolves ponylang#2782 resolves ponylang#2779
Prior to this commit, the TCPConnection `pending_writev` buffer had to have a secondary buffer called `_pending` to ensure that the data wouldn't get GC'd too early by the runtime. This commit removes that hack and replaces it with a slightly different hack based on the strategy used in `File` in PR ponylang#2775. The hacky part is that now we have two buffers for `writev` data. One for windows (`_pending_windows`) and one for non-windows (`_pending_posix`) to account for the difference in order of the struct between windows and posix. This, however, seems less bad than the previous hack of having the secondary buffer to ensure GC safety. resolves ponylang#2782 resolves ponylang#2779
Prior to this commit, the TCPConnection `pending_writev` buffer had to have a secondary buffer called `_pending` to ensure that the data wouldn't get GC'd too early by the runtime. This commit removes that hack and replaces it with a slightly different hack based on the strategy used in `File` in PR ponylang#2775. The hacky part is that now we have two buffers for `writev` data. One for windows (`_pending_windows`) and one for non-windows (`_pending_posix`) to account for the difference in order of the struct between windows and posix. This, however, seems less bad than the previous hack of having the secondary buffer to ensure GC safety. resolves ponylang#2782 resolves ponylang#2779
Prior to this commit, the TCPConnection `pending_writev` buffer had to have a secondary buffer called `_pending` to ensure that the data wouldn't get GC'd too early by the runtime. This commit removes that hack and replaces it with a slightly different hack based on the strategy used in `File` in PR #2775. The hacky part is that now we have two buffers for `writev` data. One for windows (`_pending_windows`) and one for non-windows (`_pending_posix`) to account for the difference in order of the struct between windows and posix. This, however, seems less bad than the previous hack of having the secondary buffer to ensure GC safety. resolves #2782 resolves #2779
Prior to this commit, the TCPConnection `pending_writev` buffer had to have a secondary buffer called `_pending` to ensure that the data wouldn't get GC'd too early by the runtime. This commit removes that hack and replaces it with a slightly different hack based on the strategy used in `File` in PR ponylang#2775. The hacky part is that now we have two buffers for `writev` data. One for windows (`_pending_windows`) and one for non-windows (`_pending_posix`) to account for the difference in order of the struct between windows and posix. This, however, seems less bad than the previous hack of having the secondary buffer to ensure GC safety. resolves ponylang#2782 resolves ponylang#2779
Prior to this commit, there was a garbage-collection safety issue
with the way that the
_pending_writev
array worked in theFile
class. The pending writev data was being stored in a flat array
with element type of
USize
, with the byte pointers being convertedto type
USize
in order to be stored alongside the buffer lengthnumbers in the same array.
Storing these pointers as numbers meant that they are no longer
traced by the garbage as being pointers, meaning that the garbage
collector would try to free the underlying memory due to no longer
being pointed to by any valid pointer.
This PR fixes the issue by using an
Array[(Pointer[U8], USize)]
instead of an
Array[USize]
. This also eliminates some complexityin the code that was related to accessing the array data in blocks
of 2 elements per logical element.
This resolves part of #2526.
We could resolve the issue entirely by doing the same fix for
TCPConnection, though that may have some addiitonal complexity from
needing to be compatible with IOCP on windows.