-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pthread_exit crashes on threads created by std::thread::spawn in 1.84, not 1.83, breaking pyo3-log #135929
Comments
Relevant discussions: rust-lang/unsafe-code-guidelines#404 |
Even if it's bad, I believe we need to have a working way to write Python libraries in Rust, even if the libraries spawn Rust threads that can potentially call into Python. The other argument is that Rust 1.84 has the MSRV-aware resolver, so e.g. reverting the change just for 1.84, leaving people stuck in Rust 1.84, is much better than leaving them stuck in 1.83. I believe it would be possible to make pyo3-log marshal everything to a Python thread, but that would not work if there are non-pyo3-log causes of the problem. |
cc @vorner for the pyo3-log problem |
A solution is also possible in pyo3: Create C++ code that does PyGILState_STATE safe_gil_ensure() {
try {
return PyGILState_Ensure();
} catch(...) {
for(;;) pause();
}
} and replace the calls to |
cc @davidhewitt to discuss best resolution here |
See also vorner/pyo3-log#30 and PyO3/pyo3#2102 (@arielb1 I see you have already commented on the It seems to me that getting a fix into PyO3 is the most reasonable approach. The proposed C++ snippet... seems fine, but is also somewhat heavy. I would imagine that requires adding I don't see exactly which change in 1.84 is responsible for the regression, is it well understood? I also think this has always been UB, and 1.84 didn't change that? See also PyO3/pyo3#3386 - really what I think needs to happen is there needs to be a way to attempt to acquire the GIL which can fail gracefully. |
As far as I can tell it has always been UB. The change in #129582 had made it crash with panic=abort, before that it also crashed with panic=unwind. As far as I can tell Python has no non-racy way of fallibly acquiring the GIL. I think that doing the Python 3.14 thing is good enough.
Sure. But I can't think of an easier non-racy way of doing things. |
Meta
Tested on Ubuntu 24.04 and Amazon Linux 2, x86_64.
The Production Problem
The crate pyo3-log installs a bridge that makes log functions call into Python. This means that all calls to
logging::info!
etc will take the GIL.Python has a stage during interpreter shutdown where attempts to take the GIL will cause a
pthread_exit
. Python 3.14 (still Alpha today, targeted to be released by the end of this year) will change this in python/cpython#87135 - but that will take some time to reach people.This means that if you have a Python program that uses a Rust library and pyo3-log, that spawning a Rust thread, that is calling
logging::info!
in a way unsynchronized with interpreter exit, you'll have unpredicatable crashes in 1.84.Minified Program
This program:
when compiled with the following options
crashes with this confusing error
This crash happens:
When using
-C panic=unwind
instead, on all versions of the compiler, you get this error:I seen the claim in Zulip (https://rust-lang.zulipchat.com/#narrow/channel/122651-general/topic/pthread_exit.20from.20a.20Rust-spawned.20thread) that this is undefined behavior, but I'll rather not break pyo3-log
The text was updated successfully, but these errors were encountered: