-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add fenv cache to task struct #51288
Conversation
64aa1de
to
dbafc69
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work
@@ -510,6 +510,8 @@ JL_NO_ASAN static void ctx_switch(jl_task_t *lastt) | |||
jl_set_pgcstack(&t->gcstack); | |||
jl_signal_fence(); | |||
lastt->ptls = NULL; | |||
fegetenv(&lastt->fenv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should come before the task switch, as some task switch implementations will clobber it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where specifically should it go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather, it looks like the fesetenv is the one that needed to move
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where should it go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to push to this branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already done. I don't know about the propagating behavior though. I think we might want to start every task with a call to set fesetenv(FE_DFL_ENV)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not convinced that this is a good thing to do. We have not yet specified what fenv
means on the language level and if we want to support it in these cases. Echoing #51277 (comment)
Doesn't longjmp/setjmp save fenv? E.g. why do we have to do manually?
I don't think so. That is a very significant problem for anyone using this API, but I don't think it means it is wrong to do this PR (e.g. https://github.com/lattera/glibc/blob/master/sysdeps/x86_64/jmpbuf-offsets.h and https://github.com/lattera/glibc/blob/master/sysdeps/x86_64/__longjmp.S have no space to be capable of storing it) |
Two notes: The C standard defines I do still maintain that doing so is preemptive without adding language level support for Also see https://internals.rust-lang.org/t/equivalent-of-pragma-stdc-fenv-access-on/14934 I am okay with merging this partial support under the understanding that there is a declaration of intent to actually develop the full language semantics and that setting or accessing fenv remains UB. Because right now messing with fenv and then entering the compiler that does some constant evaluation sounds like a recipe for causing hard to track down bugs. |
147e605
to
eb91a2e
Compare
might not be good to do, but it at least isn't worse than what we already do
I disagree with this. This has a real (if small) costs and unclear benefit. |
Fair: I didn't expect AMD to seem so bad at this.
|
Fixes #51277, though we give no guarantee that it keeps working this way, or that calling `setrounding_raw` won't lead to other undefined behavior. To give some examples: ```julia julia> t = Base.Rounding.setrounding_raw(Float64, Base.Rounding.to_fenv(RoundDown)) do Task(() -> println(rounding(Float64))) end Task (runnable) @0x000000010dff04c0 julia> rounding(Float64) RoundingMode{:Nearest}() julia> wait(schedule(t)) RoundingMode{:Down}() # currently gives RoundingMode{:Nearest}() julia> rounding(Float64) RoundingMode{:Nearest}() julia> Base.Rounding.setrounding_raw(Float64, Base.Rounding.to_fenv(RoundDown)) do Threads.@threads :static for i = 1:Threads.nthreads() println(Threads.threadid() => rounding(Float64)) end end 1 => RoundingMode{:Down}() 2 => RoundingMode{:Down}() # currently gives RoundingMode{:Nearest}() 4 => RoundingMode{:Down}() # currently gives RoundingMode{:Nearest}() 3 => RoundingMode{:Down}() # currently gives RoundingMode{:Nearest}() ```
For me this more than doubled task switch time, from 120ns to 290ns. This at least needs to be discussed. |
This reverts commit 12aa9de.
Fixes #51277. I know @vchuravy is skeptical, but I wanted to try it out.
To give some examples: