Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: System return life time escape hatch to enable async systems #17037

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Neo-Zhixing
Copy link
Contributor

@Neo-Zhixing Neo-Zhixing commented Dec 30, 2024

Objective

This PR is more of a discussion than it is a PR.

I would like to create a system with a return value that captures the 'world lifetime. The system will then be wrapped in a system wrapper which takes the return value and do something fancy on it.

Specifically, my systems return a Future. The wrapping system stores the returned future and repeatedly call poll on it. It will yield the control back to the scheduler whenever [Future::poll] returns Pending using #17036.

This means that the returned value will have to capture the lifetimes of the system. Intuitively this should be ok, because the wrapping system shares the same lifetimes as the inner system.

However, this can't be done because SystemParamFunction is only implemented for functions that are valid for all lifetimes.

Solution

  • This PR introduces an escape hatch when the system return value captures lifetimes.

For potentially problematic cases:

fn system1<'w>(a: ResMut<'w, FooBar>) -> &'w mut FooBar {
  return a.into_inner()
}
fn system2<'w>(In(a): In<&'w mut FooBar>, b: ResMut<'w, FooBar>) {
  // two mutable references to the same FooBar!
}
system1.pipe(system2)

We do have mechanisms to detect this at runtime. Bevy will print system1 could not access system parameter ResMut<FooBar>.

The solution introduced in this PR forces Rust to forget about the lifetimes with a transmute. Without the transmute, rustc gives the following error message:

error: lifetime may not live long enough
    --> crates/bevy_ecs/src/system/function_system.rs:1043:17
     |
1020 |         impl<'w, 's, In, Out, Func, $($param: SystemParam),*> SystemParamFunction<(HasSystemInput, fn(In...
     |                  -- lifetime `'s` defined here
...
1033 |             fn run(&mut self, input: In::Inner<'_>, param_value: SystemParamItem< ($($param,)*)>) -> Out {
     |                                                     ----------- has type `<(F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15) as system_param::SystemParam>::Item<'_, '2>`
...
1043 |                 call_inner(self, input, $($param),*)
     |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ argument requires that `'s` must outlive `'2`
...
1051 | all_tuples!(impl_system_function, 0, 16, F);
     | ------------------------------------------- in this macro invocation
     |
     = note: this error originates in the macro `impl_system_function` which comes from the expansion of the macro `all_tuples` (in Nightly builds, run with -Z macro-backtrace for more info)

However, I'm not 100% happy about this. Are there any other methods that we can employ here to properly and safely teach Rust about our lifetime assumption when the system must return a value that captures lifetimes?

@BenjaminBrienen BenjaminBrienen added the S-Needs-Design This issue requires design work to think about how it would best be accomplished label Dec 30, 2024
@alice-i-cecile alice-i-cecile added the A-ECS Entities, components, systems, and events label Dec 30, 2024
@alice-i-cecile alice-i-cecile added the A-Tasks Tools for parallel and async work label Dec 30, 2024
@hymm
Copy link
Contributor

hymm commented Dec 30, 2024

Holding a reference to anything in the world would be unsound as any exclusive access to &mut World could invalidate the reference. The only way to do this is to convert the references to pointers, but I don't think there's a way for bevy to provide a safe way of dereferencing those pointers for the reason above.

@chescock
Copy link
Contributor

It might be possible to make something sound if you made System::Out be a GAT, like

unsafe fn run_unsafe<'w>(&mut self, input, world: UnsafeWorldCell<'w>) -> Self::Out<'w>

And from there you could make a variant of pipe that required the parameters to be disjoint but let the second system take an input with lifetimes, using the SystemInput trait.

None of that would help with futures, though! If the future captures the 'w lifetime then the entire future would need to be dropped before the system returns to the scheduler. But it sounds like you're trying to store the future and then resume it on a later run, by which point all of the borrows would have been invalidated.

@hymm
Copy link
Contributor

hymm commented Dec 30, 2024

Thinking about this some more, it might work to create a new smart pointer type that would store the raw pointer, entity, and EntityLocation and then that is checked to make sure the location hasn't changed when dereferenced. I think there's all the info needed in the EntityLocation to make sure the pointer hasn't moved. There would need to be some mechanism to make sure you can't hold the & and &mut references that are handed out across a yield either using lifetimes somehow or using a closure.

Having said that I'm not sure there's that much value in holding a pointer across a yield point. I feel like an async system api will look a lot more like how exclusive systems work, where you hold something like a QueryState that doesn't have any references itself. You would then pass it a reference to the world to work with the data, but then those references would not be allowed to be held across the await. You might then provide a mechanism for pausing and resuming iteration, with some way of informing the user if data has been removed or added to the query.

@alice-i-cecile alice-i-cecile changed the title Discussion: System return life time escape hatch Discussion: System return life time escape hatch to enable async systems Dec 30, 2024
@alice-i-cecile alice-i-cecile added the D-Unsafe Touches with unsafe code in some way label Dec 30, 2024
@Neo-Zhixing
Copy link
Contributor Author

@hymm I see. That make a lot of sense. I'll experiment on this a bit more.

@Neo-Zhixing Neo-Zhixing mentioned this pull request Jan 1, 2025
@Neo-Zhixing Neo-Zhixing force-pushed the rhyolite/system-return-lifetime branch from e400d78 to 954aedd Compare January 1, 2025 05:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events A-Tasks Tools for parallel and async work D-Unsafe Touches with unsafe code in some way S-Needs-Design This issue requires design work to think about how it would best be accomplished
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants