Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-128400: Only show the current thread in faulthandler if the GIL is disabled #128425

Merged
merged 10 commits into from
Jan 3, 2025

Conversation

ZeroIntensity
Copy link
Member

@ZeroIntensity ZeroIntensity commented Jan 2, 2025

A few notes here:

  • Displaying other threads is useful for debugging, so some cases such as where the GIL is re-enabled or during a garbage collection (and therefore threads are stopped) still show all threads.
  • Py_FatalError still displays all threads, but that's a seperate bug--is it safe to stop-the-world during a fatal error?
  • Should this get backported?

📚 Documentation preview 📚: https://cpython-previews--128425.org.readthedocs.build/

{
return 0;
}
if (tstate->interp->gc.collecting)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gc.collecting field doesn't necessarily imply that all threads are paused:

  • gc.collecting is set before the stop-the-world pause
  • The GC resumes threads while calling finalizers

You could check tstate->interp->stoptheworld.world_stopped and maybe also that tstate matches stoptheworld.requester. (The faulting thread may not be attached to the interpreter when it crashes.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fully convinced that this is worth the extra complexity or the extra chance of crashing during the faulthandler:

  • Most crashes probably won't occur during a stop-the-world pause and for those that do, the stack traces of other threads are even less likely to be relevant (because they're paused).
  • Even during a stop-the-world pause, looping over the linked list of thread states is not safe without a HEAD_LOCK(), which you can't do during a signal handler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was on the fence about it too. I threw it in there because it seemed simple enough at first, but I think it's better to remove it too.

@colesbury
Copy link
Contributor

Py_FatalError still displays all threads, but that's a seperate bug--is it safe to stop-the-world during a fatal error?

I think it's best not to stop-the-world. Deadlocks that hide the fatal error message would not be fun.

@ZeroIntensity
Copy link
Member Author

I'll deal with Py_FatalError in a follow-up (after this gets merged to prevent conflicts).

@ZeroIntensity ZeroIntensity changed the title gh-128400: Only show the current thread in faulthandler if other threads are active gh-128400: Only show the current thread in faulthandler if the GIL is disabled Jan 2, 2025
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Some minor formatting suggestions below

Modules/faulthandler.c Outdated Show resolved Hide resolved
Modules/faulthandler.c Outdated Show resolved Hide resolved
ZeroIntensity and others added 2 commits January 2, 2025 17:41
Co-authored-by: Sam Gross <colesbury@gmail.com>
Co-authored-by: Sam Gross <colesbury@gmail.com>
@ZeroIntensity
Copy link
Member Author

ASan failure looks unrelated.

Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@colesbury
Copy link
Contributor

Should this get backported?

I'd lean towards "no" for now

@colesbury colesbury merged commit befcfdf into python:main Jan 3, 2025
45 checks passed
@ZeroIntensity ZeroIntensity deleted the faulthandler-nogil branch January 3, 2025 19:15
WolframAlph pushed a commit to WolframAlph/cpython that referenced this pull request Jan 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants