This fails because of the underlying call to IO.getZeroTerminatedByteArray, which should look for a double-null terminator rather than a single null byte when the Charset is a wide one.
Activity
headius commented on Apr 23, 2015
This should probably be using Java's charset logic to decode. Will investigate.
headius commented on Apr 23, 2015
Ahh I see, it's just looking for the nulls to peel them off. Will see what I can do.
headius commented on Apr 23, 2015
Ok, I understand now.
getZeroTerminatedByteArray is used to return the bytes of a string sans the null terminator. It does this by taking the given string address and calling strlen on it. strlen only looks for \0, and then that length is used to allocate and populate a Java byte array.
This is a problem whenever the data contains embedded null bytes, which is always the case for UTF-16 text in the ASCII range: every ASCII character is encoded with a zero byte alongside it.
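To make the failure concrete, here is a minimal sketch in plain Java. It does not call jnr-ffi at all: a byte[] stands in for native memory, and strlenLike is a hypothetical helper that imitates what C's strlen does. The class and method names are assumptions made for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EmbeddedNullDemo {
    // Hypothetical stand-in for C strlen: counts bytes before the first 0x00.
    public static int strlenLike(byte[] memory) {
        int n = 0;
        while (n < memory.length && memory[n] != 0) {
            n++;
        }
        return n;
    }

    // Encodes the text and appends terminatorWidth zero bytes,
    // the way a native string would be laid out in memory.
    public static byte[] encodeWithTerminator(String text, Charset cs, int terminatorWidth) {
        byte[] encoded = text.getBytes(cs);
        return Arrays.copyOf(encoded, encoded.length + terminatorWidth);
    }

    public static void main(String[] args) {
        byte[] utf8 = encodeWithTerminator("hi", StandardCharsets.UTF_8, 1);
        byte[] utf16 = encodeWithTerminator("hi", StandardCharsets.UTF_16LE, 2);

        // UTF-8: bytes are 68 69 00, so the scan sees both characters.
        System.out.println(strlenLike(utf8));   // prints 2
        // UTF-16LE: bytes are 68 00 69 00 00 00, so the scan stops after one byte.
        System.out.println(strlenLike(utf16));  // prints 1
    }
}
```

With UTF-16LE the scan stops at the high byte of the first character, so only one of the four payload bytes would be copied into the Java array.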
This is going to be a much more difficult fix, since the actual strlen call happens inside native code. Whenever we change native code, we need to rebuild the native stubs across platforms.
I'm also not sure that just changing strlen is the right fix. These functions have no way of knowing what encoding the bytes are in.
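Here's what I think we should do:
- As a workaround, you could work with the strings as bytes and deal with the nulls yourself. Not ideal, I know.
- Add a second version of this logic that takes either an encoding or an explicit terminator to look for, along the lines of getTerminatedByteArray(addr, [terminator|encoding]).
- Finally figure out how to set up VMs for all the platforms we support, so we can more easily update the native bits (ping @tduehr).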
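A rough sketch of what a terminator-aware getTerminatedByteArray might look like, written in plain Java over a byte[] that stands in for native memory. This is not jnr-ffi's actual API: the class name, the offset parameter, and the terminatorWidth helper (which guesses the terminator size from the charset name) are all assumptions for illustration. A real fix would have to scan native memory instead.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TerminatedBytes {
    // Assumption: the terminator is one zero code unit, sized by charset family.
    public static int terminatorWidth(Charset cs) {
        String name = cs.name();
        if (name.startsWith("UTF-16")) return 2;
        if (name.startsWith("UTF-32")) return 4;
        return 1;
    }

    // Hypothetical analogue of getTerminatedByteArray(addr, terminator):
    // returns the bytes from offset up to the first run of `width` zero bytes.
    // The scan advances in steps of `width`, so a zero pair that straddles
    // two code units is not mistaken for the terminator.
    public static byte[] getTerminatedByteArray(byte[] memory, int offset, int width) {
        for (int i = offset; i + width <= memory.length; i += width) {
            boolean allZero = true;
            for (int j = 0; j < width; j++) {
                if (memory[i + j] != 0) {
                    allZero = false;
                    break;
                }
            }
            if (allZero) {
                return Arrays.copyOfRange(memory, offset, i);
            }
        }
        throw new IllegalArgumentException("no terminator found");
    }

    // Encoding-based variant: picks the terminator width, then decodes.
    public static String readString(byte[] memory, int offset, Charset cs) {
        byte[] raw = getTerminatedByteArray(memory, offset, terminatorWidth(cs));
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        byte[] encoded = "hi".getBytes(StandardCharsets.UTF_16LE);
        byte[] memory = Arrays.copyOf(encoded, encoded.length + 2); // append 00 00
        System.out.println(readString(memory, 0, StandardCharsets.UTF_16LE)); // prints hi
    }
}
```

Taking the width as an explicit argument keeps the low-level scan unaware of encodings, while the readString wrapper covers the case where the caller only knows the Charset.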
blschatz commented on Apr 27, 2015
My fix was as follows:
}
headius commented on Sep 26, 2016
@blschatz Possible for you to turn that into a pull request we can integrate? I'm not sure how you're using that within jnr-ffi and your own code (i.e. I'd like to see some examples and ideally tests in a PR).
fix jnr#30