-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std: lazily allocate the main thread handle #132654
base: master
Are you sure you want to change the base?
Conversation
r? @ChrisDenton rustbot has assigned @ChrisDenton. Use |
This comment has been minimized.
This comment has been minimized.
EDIT: I was right! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy with this approach but it would be nice to have some consensus on it to (hopefully) avoid further churn. I won't insist on it though.
library/std/src/thread/mod.rs
Outdated
#[cfg(not(target_thread_local))] | ||
#[allow(unused)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this is also used when 64-bit atomics are unavailable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any more specific cfg
that can be used? #[allow(unused)]
, while sometimes necessary, makes me nervous because it could silently become actual dead code in the future.
library/std/src/thread/thread.rs
Outdated
// INVARIANT: must be valid UTF-8 | ||
name: Option<CString>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change seems worse. Having an invariant that can be violated in safe code is not great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wondered if it made sense to have a new-type that held both invariants (UTF-8 & c-string \0 ending) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Utf8CStr is a fairly common type in the ecosystem, it would make sense for us to start tinkering with providing it.
Also could you separate the file move into a different commit? It's easier to diff that way. |
This comment has been minimized.
This comment has been minimized.
☔ The latest upstream changes (presumably #133219) made this pull request unmergeable. Please resolve the merge conflicts. |
I've undone the @rustbot ready |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
If the variable does not need a destructor, `std` uses racy initialization for creating TLS keys on Windows. With just the right timing, this can lead to `TlsFree` being called. Unfortunately, with rust-lang#132654 this is hit quite often, so miri should definitely support `TlsFree` ([documentation](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsfree)). I'm filing this here instead of in the miri repo so that rust-lang#132654 isn't blocked for so long.
Blocked on #133457. This PR times everything just right so that the |
// Safety: We only expose an opaque pointer, which maintains the `Pin` invariant. | ||
let inner = unsafe { Pin::into_inner_unchecked(arc) }; | ||
Arc::into_raw(inner) as *const () | ||
pub(crate) fn with_current_name<F, R>(f: F) -> R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is called from signal handler context, I think there should be a doc comment here explicitly saying that the function may be called from signal handler context. It seems fairly easily to accidentally break this property in the future.
use crate::ffi::{CStr, CString}; | ||
use crate::str; | ||
|
||
/// Like a `String` it's guaranteed UTF-8 and like a `CString` it's null terminated. | ||
pub(crate) struct ThreadNameString { | ||
inner: CString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this was a Cow<'static, CString>
, could the main thread reference a "main\0"
string literal and thus avoid a few else if main_thread::get() == Some(self.inner.id)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thread ID method has the advantage of working inside TLS destructors, so e.g. panicking in a TLS destructor on main now prints the correct name. And since we need the infrastructure anyway to detect the main thread in the first place, I figured this was better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the duplication of logic unfortunate, and IMO it'd be worth reducing.
But I am just commenting as a bystander here, hopefully I never have to dig into this code and figure out how it holds together. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm planning to do a cleanup PR for this module next... it's just way to large to understand, even though the actual logic is mostly quite local.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thread ID method has the advantage of working inside TLS destructors, so e.g. panicking in a TLS destructor on main now prints the correct name. And since we need the infrastructure anyway to detect the main thread in the first place, I figured this was better.
Sorry, that scenario wasn't the issue. The actual reason why this mustn't change is that there is now nothing stopping thread::current
from being called before main, which would lead to an incorrect name being stored. The ID-based detection doesn't have this issue, because the name will be correct as soon as the runtime has initialized the ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that is subtle. More comments would be good. :)
library/std/src/thread/mod.rs
Outdated
return f(Some("main")); | ||
} | ||
} else if let Some(main) = main_thread::get() | ||
&& let Some(id) = current::id::get() | ||
&& id == main | ||
{ | ||
return f(Some("main")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems worth some comments explaining why "main" has two separate cases.
miri: implement `TlsFree` If the variable does not need a destructor, `std` uses racy initialization for creating TLS keys on Windows. With just the right timing, this can lead to `TlsFree` being called. Unfortunately, with rust-lang#132654 this is hit quite often, so miri should definitely support `TlsFree` ([documentation](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsfree)). I'm filing this here instead of in the miri repo so that rust-lang#132654 isn't blocked for so long.
Rollup merge of rust-lang#133457 - joboet:miri-tlsfree, r=saethlin miri: implement `TlsFree` If the variable does not need a destructor, `std` uses racy initialization for creating TLS keys on Windows. With just the right timing, this can lead to `TlsFree` being called. Unfortunately, with rust-lang#132654 this is hit quite often, so miri should definitely support `TlsFree` ([documentation](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsfree)). I'm filing this here instead of in the miri repo so that rust-lang#132654 isn't blocked for so long.
This reverts commit 0747f28.
rust-lang#123550 eliminated the allocation of the main thread handle, but at the cost of greatly increased complexity. This PR proposes another approach: Instead of creating the main thread handle itself, the runtime simply remembers the thread ID of the main thread. The main thread handle is then only allocated when it is used, using the same lazy-initialization mechanism as for non-runtime use of `thread::current`, and the `name` method uses the thread ID to identify the main thread handle and return the correct name ("main") for it. Thereby, we also allow accessing thread::current before main: as the runtime no longer tries to install its own handle, this will no longer trigger an abort. Rather, the name returned from name will only be "main" after the runtime initialization code has run, but I think that is acceptable. This new approach also requires some changes to the signal handling code, as calling `thread::current` would now allocate when called on the main thread, which is not acceptable. I fixed this by adding a new function (`with_current_name`) that performs all the naming logic without allocation or without initializing the thread ID (which could allocate on some platforms).
Originally authored by GnomedDev
I've rebased this on top of #133457 and added some comments explaining everything. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This looks good to me. I do have one query but otherwise r=me.
fn main() { | ||
fn main() | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems... weird. Is that intentional?
☔ The latest upstream changes (presumably #122565) made this pull request unmergeable. Please resolve the merge conflicts. |
#123550 eliminated the allocation of the main thread handle, but at the cost of greatly increased complexity. This PR proposes another approach: Instead of creating the main thread handle itself, the runtime simply remembers the thread ID of the main thread. The main thread handle is then only allocated when it is used, using the same lazy-initialization mechanism as for non-runtime use of
thread::current
, and thename
method uses the thread ID to identify the main thread handle and return the correct name ("main") for it.Thereby, we also allow accessing
thread::current
before main: as the runtime no longer tries to install its own handle, this will no longer trigger an abort. Rather, the name returned fromname
will only be "main" after the runtime initialization code has run, but I think that is acceptable.This new approach also requires some changes to the signal handling code, as calling
thread::current
would now allocate when called on the main thread, which is not acceptable. I fixed this by adding a new function (with_current_name
) that performs all the naming logic without allocation or without initializing the thread ID (which could allocate on some platforms).Reverts #123550, CC @GnomedDev