-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initialization fixes #30
Conversation
…s single event loop
(can you maybe add me as reviewer to this PR? thanks for the work already though! I just cant judge this with one glance, its a bit of work) |
Yeah, the changes are a bit complex and they're currently in-progress, so feel free to take your time or ask questions. I might push a few more changes later today as well, but it's nowhere near ready to merge yet. I assigned you to the PR, but I have to admit I'm not entirely sure how to make you a reviewer. Do you need to be added as a maintainer of the project before I can add you as a reviewer? Edit: I believe you should be able to comment on specific lines and make changes in a fork. Is there anything else that needs reviewer permissions? |
Weird, no need then, I just wanted to request it if possible as a todo item. No need if it turns out to be a bit difficult/uncomfortable. Assigning works, thanks! I'll do an in-depth look at this later. |
…nt is future_into_py which uses non-cached event loop
Ok, I think I'm done making changes for today. From here on out it should mostly just be documentation work, bikeshedding names, figuring out the best way to go about deprecating 0.13, and any easy optimizations we can take advantage of. I'm not super concerned about performance for this release since it's more of a bugfix, so I don't necessarily think we need to add the benchmarks just yet. I would imagine any optimizations we do will probably not have much effect on the public facing API, but let me know if you think I'm wrong about this. I'll try to resume work on it sometime next week since I have a busy weekend planned. I should still have time to answer any questions on here or gitter if you have them. |
…ttle more complicated now
…arameter on test attributes optional to support previous usage
…is more consistent
…are correct, changing back to get_event_loop for compatibility with Python <3.7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forced myself to give this a go after drabbling around for a while, so this is just a glance with some exra ideas, i dont think much of this is very in-depth, so take it as a preliminary review.
static EVENT_LOOP: OnceCell<PyObject> = OnceCell::new(); | ||
|
||
const EXPECT_INIT: &str = "PyO3 Asyncio has not been initialized"; | ||
static CACHED_EVENT_LOOP: OnceCell<PyObject> = OnceCell::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be thread-local or the likes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CACHED_EVENT_LOOP
is to support the deprecated 0.13
API. It's ignored entirely in the new functions.
Ok(asyncio(py)?.getattr("ensure_future")?.into()) | ||
})? | ||
.as_ref(py) | ||
.call1((awaitable,)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensure_future
either requires a loop=
kv parameter or must be run on the same thread that the asyncio loop is from, as it calls get_event_loop()
internally, is this intended? shouldn't we guard against the possibility that this can be called inside a non-python thread?
Here's the source code for ensure_future
, fwiw;
def ensure_future(coro_or_future, *, loop=None):
"""Wrap a coroutine or an awaitable in a future.
If the argument is a Future, it is returned directly.
"""
if coroutines.iscoroutine(coro_or_future):
if loop is None:
loop = events.get_event_loop()
task = loop.create_task(coro_or_future)
if task._source_traceback:
del task._source_traceback[-1]
return task
elif futures.isfuture(coro_or_future):
if loop is not None and loop is not futures._get_loop(coro_or_future):
raise ValueError('loop argument must agree with Future')
return coro_or_future
elif inspect.isawaitable(coro_or_future):
return ensure_future(_wrap_awaitable(coro_or_future), loop=loop)
else:
raise TypeError('An asyncio.Future, a coroutine or an awaitable is '
'required')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, but I believe it is being used correctly already. ensure_future
is only called within the PyEnsureFuture
callback, which is only called using call_soon_threadsafe
. call_soon_threadsafe
will schedule the callback to run on the event loop's thread.
} | ||
|
||
fn asyncio_get_event_loop(py: Python) -> PyResult<&PyAny> { | ||
asyncio(py)?.call_method0("get_event_loop") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we make a custom wrapper type for event loops that is !Send
, to ensure that methods on it aren't called from alternative threads? Just an idea, to guard against footguns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making the event loop wrapper !Send
probably wouldn't allow us to signal completion on a tokio
or async_std
thread. This library should guard against this issue by proxying everything through call_soon_threadsafe
when it needs to complete.
In principle I don't have a problem with getting rid of potential footguns, I'm just not sure how to go about doing it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks for replying though
I tried out this branch against a pytest based test with async support. from setuptools_rust_starter import sleep_for, foo
import asyncio
async def test_python_sleep(loop):
await asyncio.sleep(1)
async def test_rust_sleep_foo(loop):
await foo()
async def test_rust_sleep(loop):
await sleep_for(1) After updating the rust syntax to match the new structs for building async libs this worked very well. Would this be a useful test and example to incorporate into the library as I think using pytest would be a common proof that the library (rust based) is working. |
We should definitely have some tests that try to do "weird stuff", like the start-an-event-loop-on-a-different-branch as i mentioned earlier. Python has failure modes for those, and expectations of how those things work, maybe we should try to match them. |
I saw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a problem, i'm not sure how much further it goes, but i think the last comment here can at least help inform some design decisions that i realised while reading this.
Basically, task-local variables are the way to go, but i think they're the only way to go, and to not allow assumptions about where and when the asyncio event loop is outside of that. (at least not without explicitly stating those assumptions, and/or wrapping return types in Option<_>
, asyncio's event_loop-related functions work with thread-local variables.)
Here's my proposal;
- Have every task-local reference to "their" event loop be an
OnceCell<Arc<PyObject>>
that holds a strong count to that object - Somewhere, there's a
Vec
ofWeak<PyObject>
s which keep track of seen event loops in the program, when all tasks exit, the weak reference cannot be brought up again.- Getting a value from this vec would be a three-state enum;
None, One(Arc<PyObject>), Multiple(&[Arc<PyObject>])
, forcing the caller to make a choice on how to handle the first and the last state.
- Getting a value from this vec would be a three-state enum;
- Have a
(Weak<PyObject>, PyObject)
somewhere which holds a reference to the (Main OS thread? First?) Event loop, to (when theWeak
becomes inaccessible) perform extra "health checks" on the event loop, to make sure it's still running, before "reviving" the weak reference (somehow?) and returning the main thread/first event loop to the caller.
These scenarios would handle asyncio weirdness eloquently, while it's most likely not needed for 95% of the uses, I think it's worth it to be exhaustive and correct for the 5% here.
src/async_std.rs
Outdated
async_std::task_local! { | ||
static EVENT_LOOP: OnceCell<PyObject> = OnceCell::new() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment on what you did here, i think this is sound, as the task is corresponding to a coroutine that is sent to a single event loop, so caching it this way is entirely correct. 👍 ❤️
src/async_std.rs
Outdated
where | ||
F: Future<Output = R> + Send + 'static, | ||
{ | ||
EVENT_LOOP.with(|c| c.set(event_loop).unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the accesses to the EVENT_LOOP
task-local variable is correct here? is this ran from inside a task in the async-std event loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally, yes it is ran inside an async-std task. There are also public functions scope
and scope_local
that are not necessarily ran inside an async-std task. I think we can probably just add that requirement as a comment in the docs.
src/lib.rs
Outdated
pub fn get_running_loop(py: Python) -> PyResult<&PyAny> { | ||
// Ideally should call get_running_loop, but calls get_event_loop for compatibility between | ||
// versions. | ||
asyncio(py)?.call_method0("get_event_loop") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function might initialise an event loop bound to the current thread or error out (as the "current thread" can possibly not be the main OS thread, are you sure this function should exist at all?
Shouldn't we try to close off access to event loops like this if the function's calling "origin" cannot be ascertained? (random threads or tasks)? Shouldn't we only explicitly set event loop references in the context of the tasks that spawned corresponding coroutine objects?
I'm slightly uncomfortable with this behaviour, as anyone can call this coming from the rust-async threads, and then accidentally create an event loop bound to to the worker thread that task was calling it from. This does not grab the "main thread event loop", and i don't think it should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, get_running_loop
on the python side exists, but it only grabs an event loop from the current thread (which could be used to grab that task-local event loop reference as the coroutine gets passed off onto asyncio).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_running_loop
exists in the asyncio
API, but it was added in 3.7. Since we want to maintain compatibility with 3.6, we are limited to the functionality it provides. We could potentially try get_running_loop
first, then try get_event_loop
for compatibility, but that only fixes it for Python >3.7 (which is probably ok as long as we document it).
I don't think it necessarily needs to be public (although I'm not 100% sure on that). The reason we need it at all is because it's used when Python calls into pyo3. Essentially, when a Python coroutine calls a #[pyfunction]
from the asyncio
event loop, we can use get_running_loop
to get the event loop associated with the current OS thread and store it as a task local.
If we made it private, the same problems would still exist since I believe something like this has to exist internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm slightly uncomfortable with this behaviour, as anyone can call this coming from the rust-async threads, and then accidentally create an event loop bound to to the worker thread that task was calling it from.
The docs for get_event_loop
state that it will only try to create an event loop if the current OS thread is the main thread, so I believe this is not possible. It's true that it would be problematic for people to call this function from any rust thread, but I think the worst it does is error out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and made this function call get_running_loop
if available and fall back to get_event_loop
if not. This will make the function a bit more permissive on Python 3.6, but the difference is documented now.
Just a heads up @sansyrox, I just merged in #33 which may break your code if you're using tokio. #33 makes the tokio initialization lazy, so as long as you're using multithreaded tokio, the only thing you need to do is get rid of the call to |
Quick glance through documentation looks good 👍 |
Thanks! Unfortunately it's not done yet. That was just me integrating the existing pyo3 async/await guide in and making the necessary updates to it. I still need to add a migration guide and improve the recommendations for native modules / alternate event loops / caveats etc. Not sure if I'll be able to get to all of that today, but after I write that up it'd be awesome to get a review on that too! |
…into init-fixes
…ented caveat for asyncio.run
Alright, I'm gonna take a break for now for some family stuff. All that's left to do is provide a small migration guide and open a PR on I think the Primer in the README should be a good enough justification for the |
README.md
Outdated
|
||
This new release should address most the core issues that users have reported in the `v0.13` release, so I think we can expect more stability going forward. | ||
|
||
Also, a special thanks to @ShadowJonathan for helping with the design and review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
README.md
Outdated
|
||
In order to detect the Python event loop at the callsite, we need something like `asyncio.get_running_loop` that works for _both Python and Rust_. In Python, `asyncio.get_running_loop` uses thread-local data to retrieve the event loop associated with the current thread. What we need in Rust is something that can retrieve the Python event loop associated with the current _task_. | ||
|
||
Enter `pyo3_asyncio::<runtime>::get_current_loop`. This function first checks task-local data for a Python event loop, then falls back on `asyncio.get_running_loop` if no task-local event loop is found. This way both bases are convered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random, but this also made me realise that most of my concerns with threadedness, and event loops on multiple threads, will be covered.
There's still 1 edgecase, which is if someone somehow spawned a future that escaped the task scope, or introduced one which hasn't been assigned from pyo3-asyncio at all (from a control flow that never spawned from python, i.e. built-in python). But i think this is good enough for 95% of all usage cases, and so 99.99% of all usages.
Edit: one idea i had to counter most of this is to make get_current_loop
an Option<>
(if it wasn't already one), or Result<>
where E
could be something along the lines of "cannot acquire reference to event loop: no task context and no running event loop on current thread" (CannotGetELRef(NoContextAndNoThreadLocalLoop)
(wow, getting into Java territory here)). This'd force the programmer to guide their program towards python, get "inside" python on a particular thread, and grab a reference to the event loop, before spawning tasks in there or other operations, which in turn makes other areas of pyo3-asyncio more explicit.
^ This is an idea for the future, i'll probably turn it into an issue, but i'm just capturing my thoughts here while reading. This portion is completely fine 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it might be good to make a variant of get_current_loop
that doesn't panic (something like try_get_current_loop
). I guess the main question surrounding this is if there is ever a case where this is a recoverable error rather than a design flaw in the user's code. I agree that this is something that can be addressed later on in a follow-up though.
Just reviewed it, I think this is good enough for release, I wouldn't call this 1.0 ready (my brain demands that all edge-cases are solved first), but this is certainly far better than the assumptions made before this change, and thus a lot more resilient to more common "weird" use-cases. |
Thanks! I appreciate it. I think this is a major improvement over what we had before, although I'm a bit unhappy that it's going to be such a disruptive change. Not much we can do about it though. It's best to rip the band-aid off early on IMO.
For sure, I think there's still room for improvement. I'm not entirely confident we can cover those edge cases practically, but I'm open to ideas on that. I think it helps at least for now that we provide the |
Planning on doing the full doc review sometime tomorrow. I think this PR is ready to merge, but I'm going to hold off for just a little while longer in case anyone else has any last-minute comments or changes they want to make. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Just taken a quick read, looks like a great step forward for the library. Nice work!
src/lib.rs
Outdated
/// Attempt to initialize the Python and Rust event loops | ||
/// | ||
/// - Must be called before any other pyo3-asyncio functions. | ||
/// - Calling `try_init` a second time returns `Ok(())` and does nothing. | ||
/// > In future versions this may return an `Err`. | ||
#[deprecated(since = "0.14.0")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a good idea to add note
to this and other deprecated functions, even if it's pointing to a Github issue or elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe to the migration guide 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, most of the notes should probably at least mention the migration guide. Good catch!
|
||
__Before you get started, I personally recommend taking a look at [Event Loop References and Thread-awareness](#event-loop-references-and-thread-awareness) in order to get a better grasp on the motivation behind these changes and the nuance involved in using the new conversions.__ | ||
|
||
### 0.14 Highlights |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't try_init
also be mentioned here?
Also, for a later issue; maybe it's best to move chunks of this to a dedicated wiki page, or to docs.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll add that
|
||
So what's changed from `v0.13` to `v0.14`? | ||
|
||
Well, a lot actually. There were some pretty major flaws in the initialization behaviour of `v0.13`. While it would have been nicer to address these issues without changing the public API, I decided it'd be better to break some of the old API rather than completely change the underlying behaviour of the existing functions. I realize this is going to be a bit of a headache, so hopefully this section will help you through it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe this file should be wrapped to 100 characters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it probably should. There's quite a few lines that go on forever, but it's always a pain to do this in the README since I don't have a formatter for markdown. I wouldn't be surprised if VSCode has an extension for it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I've got to run an errand real quick, but I'll get back to this in a couple hours. I think what's left to do is:
The README is pretty monolithic now, which is not great. I think the "Event Loop References and Thread-Awareness" section especially should go somewhere else (probably API docs). @davidhewitt was talking about shortening the guide a bit and linking back to this repo for details, which I can get on board with. I like having the quickstart in the pyo3 guide to show examples for both applications and native extensions, but maybe some of the other details such as testing can stay in this README (or just link to the testing module the API docs). |
Alright, looks like things have calmed down on this so I think I'm going to go ahead and merge. I'll try my best to review the docs, check links etc and hopefully release 0.14 sometime this evening! |
Ongoing work for #29
try_init
andtry_close
, but sincetry_init
should only be called when using the 0.13 API andtry_close
only performs the cleanup iftry_init
was called, it has effectively been removed for 0.14asyncio.get_event_loop
if it does not.These changes unfortunately add more nuance to the conversions. It's more important to be aware of whether a conversion is taking place on a Python asyncio-aware thread, an async Rust task, or a plain Rust thread.
I do think these changes make the library more correct, but I also think that it's important to help existing users transition. There was some uncertainty in the discussions for #29 about whether or not a compatibility layer was warranted, but after implementing some of these changes I'm even more inclined to say that it's necessary.
try_init
is slated to be removedtry_close
can keep theThreadPoolExecutor
logic provided that it only closes the executor down iftry_init
was previously calledinto_future
needs to be moved into the runtime-specific modules, the one currently in lib.rs should be renamed tointo_future_with_loop
to denote the explicit loop parameterinto_coroutine
was going to be renamed tofuture_into_py
anyway"compat"
feature gate or deprecation attributes, maybe both with default features=["compat"]?The compat layer should make it much easier for downstream developers to transition to the newer, more correct, API. Ideally upgrading to 0.14 should not break existing users, but still encourage them to start using the more correct API .
asyncio.run
should now be possible)#[main]
attributes callpyo3::prepare_freethreaded_python
regardless of pyo3'sauto-initialize
featureasyncio.run
,get_running_loop
doesn't work. The top level coroutine forasyncio.run
should be a Python coroutine.#[allow(deprecated)]
to suppress warnings)scope
rules, thread awarenesspytest
regression testing (multipleasyncio.run
calls?)