-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update/relax str/String utf8 safety docs #134598
base: master
Are you sure you want to change the base?
Changes from 3 commits
ce3a5ce
803eca4
bb003e4
a29cb96
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ | |
use super::Utf8Error; | ||
use super::validations::run_utf8_validation; | ||
use crate::{mem, ptr}; | ||
use crate::ptr; | ||
|
||
/// Converts a slice of bytes to a string slice. | ||
/// | ||
|
@@ -145,7 +145,10 @@ pub const fn from_utf8_mut(v: &mut [u8]) -> Result<&mut str, Utf8Error> { | |
/// | ||
/// # Safety | ||
/// | ||
/// The bytes passed in must be valid UTF-8. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should keep this requirement, to preserve the freedom for tooling (miri, ubchecks, kani, whatever) to validate it, even if we don't in release builds. People can always pointer-cast if they want to do turn a |
||
/// This function is unsafe because it does not check that the bytes passed | ||
/// to it are valid UTF-8. If this constraint is violated, it may cause | ||
/// memory unsafety issues with future users of the `str`, as the rest of | ||
/// the standard library [assumes that `str`s are valid UTF-8](str#invariant). | ||
/// | ||
/// # Examples | ||
/// | ||
|
@@ -169,9 +172,11 @@ pub const fn from_utf8_mut(v: &mut [u8]) -> Result<&mut str, Utf8Error> { | |
#[rustc_const_stable(feature = "const_str_from_utf8_unchecked", since = "1.55.0")] | ||
#[rustc_diagnostic_item = "str_from_utf8_unchecked"] | ||
pub const unsafe fn from_utf8_unchecked(v: &[u8]) -> &str { | ||
// SAFETY: the caller must guarantee that the bytes `v` are valid UTF-8. | ||
// Also relies on `&str` and `&[u8]` having the same layout. | ||
unsafe { mem::transmute(v) } | ||
// SAFETY: the pointer dereference is safe because that pointer | ||
// comes from a reference which is guaranteed to be valid for reads. | ||
// If the input bytes are not valid UTF-8, then the returned `&str` will | ||
// have invalid UTF-8, which is unsafe but not immediate UB. | ||
unsafe { &*(v as *const [u8] as *const str) } | ||
} | ||
|
||
/// Converts a slice of bytes to a string slice without checking | ||
|
@@ -197,10 +202,10 @@ pub const unsafe fn from_utf8_unchecked(v: &[u8]) -> &str { | |
#[rustc_const_stable(feature = "const_str_from_utf8_unchecked_mut", since = "1.83.0")] | ||
#[rustc_diagnostic_item = "str_from_utf8_unchecked_mut"] | ||
pub const unsafe fn from_utf8_unchecked_mut(v: &mut [u8]) -> &mut str { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm positive there's a lint for this. Is that not enabled in |
||
// SAFETY: the caller must guarantee that the bytes `v` | ||
// are valid UTF-8, thus the cast to `*mut str` is safe. | ||
// Also, the pointer dereference is safe because that pointer | ||
// SAFETY: the pointer dereference is safe because that pointer | ||
// comes from a reference which is guaranteed to be valid for writes. | ||
// If the input bytes are not valid UTF-8, then the returned `&mut str` will | ||
// have invalid UTF-8, which is unsafe but not immediate UB. | ||
unsafe { &mut *(v as *mut [u8] as *mut str) } | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this list should be cut way down.
There should be a couple of things that are the gateways to using it in ways that temporarily break invariants that make this promise, but everything else should be allowed to check if it feels like.
TBH, I'd be tempted to say that the only thing that explicitly allows it is
as_mut_vec
, since you can always do whatever you wanted on thatVec<u8>
.(Let others make the case why they need anything else.)