Discussion: System return life time escape hatch to enable async systems #17037

Neo-Zhixing · 2024-12-30T09:24:56Z

Objective

This PR is more of a discussion than it is a PR.

I would like to create a system with a return value that captures the 'world lifetime. The system will then be wrapped in a system wrapper which takes the return value and do something fancy on it.

Specifically, my systems return a Future. The wrapping system stores the returned future and repeatedly call poll on it. It will yield the control back to the scheduler whenever [Future::poll] returns Pending using #17036.

This means that the returned value will have to capture the lifetimes of the system. Intuitively this should be ok, because the wrapping system shares the same lifetimes as the inner system.

However, this can't be done because SystemParamFunction is only implemented for functions that are valid for all lifetimes.

Solution

This PR introduces an escape hatch when the system return value captures lifetimes.

For potentially problematic cases:

fn system1<'w>(a: ResMut<'w, FooBar>) -> &'w mut FooBar {
  return a.into_inner()
}
fn system2<'w>(In(a): In<&'w mut FooBar>, b: ResMut<'w, FooBar>) {
  // two mutable references to the same FooBar!
}
system1.pipe(system2)

We do have mechanisms to detect this at runtime. Bevy will print system1 could not access system parameter ResMut<FooBar>.

The solution introduced in this PR forces Rust to forget about the lifetimes with a transmute. Without the transmute, rustc gives the following error message:

error: lifetime may not live long enough
    --> crates/bevy_ecs/src/system/function_system.rs:1043:17
     |
1020 |         impl<'w, 's, In, Out, Func, $($param: SystemParam),*> SystemParamFunction<(HasSystemInput, fn(In...
     |                  -- lifetime `'s` defined here
...
1033 |             fn run(&mut self, input: In::Inner<'_>, param_value: SystemParamItem< ($($param,)*)>) -> Out {
     |                                                     ----------- has type `<(F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15) as system_param::SystemParam>::Item<'_, '2>`
...
1043 |                 call_inner(self, input, $($param),*)
     |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ argument requires that `'s` must outlive `'2`
...
1051 | all_tuples!(impl_system_function, 0, 16, F);
     | ------------------------------------------- in this macro invocation
     |
     = note: this error originates in the macro `impl_system_function` which comes from the expansion of the macro `all_tuples` (in Nightly builds, run with -Z macro-backtrace for more info)

However, I'm not 100% happy about this. Are there any other methods that we can employ here to properly and safely teach Rust about our lifetime assumption when the system must return a value that captures lifetimes?

hymm · 2024-12-30T21:03:15Z

Holding a reference to anything in the world would be unsound as any exclusive access to &mut World could invalidate the reference. The only way to do this is to convert the references to pointers, but I don't think there's a way for bevy to provide a safe way of dereferencing those pointers for the reason above.

chescock · 2024-12-30T21:49:39Z

It might be possible to make something sound if you made System::Out be a GAT, like

unsafe fn run_unsafe<'w>(&mut self, input, world: UnsafeWorldCell<'w>) -> Self::Out<'w>

And from there you could make a variant of pipe that required the parameters to be disjoint but let the second system take an input with lifetimes, using the SystemInput trait.

None of that would help with futures, though! If the future captures the 'w lifetime then the entire future would need to be dropped before the system returns to the scheduler. But it sounds like you're trying to store the future and then resume it on a later run, by which point all of the borrows would have been invalidated.

hymm · 2024-12-30T21:52:27Z

Thinking about this some more, it might work to create a new smart pointer type that would store the raw pointer, entity, and EntityLocation and then that is checked to make sure the location hasn't changed when dereferenced. I think there's all the info needed in the EntityLocation to make sure the pointer hasn't moved. There would need to be some mechanism to make sure you can't hold the & and &mut references that are handed out across a yield either using lifetimes somehow or using a closure.

Having said that I'm not sure there's that much value in holding a pointer across a yield point. I feel like an async system api will look a lot more like how exclusive systems work, where you hold something like a QueryState that doesn't have any references itself. You would then pass it a reference to the world to work with the data, but then those references would not be allowed to be held across the await. You might then provide a mechanism for pausing and resuming iteration, with some way of informing the user if data has been removed or added to the query.

Neo-Zhixing · 2024-12-31T21:53:12Z

@hymm I see. That make a lot of sense. I'll experiment on this a bit more.

BenjaminBrienen added the S-Needs-Design This issue requires design work to think about how it would best be accomplished label Dec 30, 2024

alice-i-cecile added the A-ECS Entities, components, systems, and events label Dec 30, 2024

alice-i-cecile requested review from chescock and hymm December 30, 2024 19:45

alice-i-cecile added the A-Tasks Tools for parallel and async work label Dec 30, 2024

alice-i-cecile changed the title ~~Discussion: System return life time escape hatch~~ Discussion: System return life time escape hatch to enable async systems Dec 30, 2024

alice-i-cecile added the D-Unsafe Touches with unsafe code in some way label Dec 30, 2024

Neo-Zhixing mentioned this pull request Jan 1, 2025

System Yielding #17036

Open

System return life time escape hatch

954aedd

Neo-Zhixing force-pushed the rhyolite/system-return-lifetime branch from e400d78 to 954aedd Compare January 1, 2025 05:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Discussion: System return life time escape hatch to enable async systems #17037

Discussion: System return life time escape hatch to enable async systems #17037

Neo-Zhixing commented Dec 30, 2024 •

edited

Loading

hymm commented Dec 30, 2024

chescock commented Dec 30, 2024

hymm commented Dec 30, 2024

Neo-Zhixing commented Dec 31, 2024

Discussion: System return life time escape hatch to enable async systems #17037

Are you sure you want to change the base?

Discussion: System return life time escape hatch to enable async systems #17037

Conversation

Neo-Zhixing commented Dec 30, 2024 • edited Loading

Objective

Solution

hymm commented Dec 30, 2024

chescock commented Dec 30, 2024

hymm commented Dec 30, 2024

Neo-Zhixing commented Dec 31, 2024

Neo-Zhixing commented Dec 30, 2024 •

edited

Loading