bevy/crates/bevy_ecs/src/query/par_iter.rs
Chris Russell f7e112a3c9
Let query items borrow from query state to avoid needing to clone (#15396)
# Objective

Improve the performance of `FilteredEntity(Ref|Mut)` and
`Entity(Ref|Mut)Except`.

`FilteredEntityRef` needs an `Access<ComponentId>` to determine what
components it can access. There is one stored in the query state, but
query items cannot borrow from the state, so it has to `clone()` the
access for each row. Cloning the access involves memory allocations and
can be expensive.


## Solution

Let query items borrow from their query state.  

Add an `'s` lifetime to `WorldQuery::Item` and `WorldQuery::Fetch`,
similar to the one in `SystemParam`, and provide `&'s Self::State` to
the fetch so that it can borrow from the state.
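
As a rough illustration of why borrowing helps, here is a minimal sketch in plain Rust with no Bevy dependency. `State`, `OwnedItem`, `BorrowedItem`, and the two `fetch_*` functions are hypothetical stand-ins rather than the real `WorldQuery` API, and a `HashSet<u32>` plays the role of `Access<ComponentId>`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a query state that stores an access set.
struct State {
    access: HashSet<u32>,
}

/// Old shape: the item owns its access set, so every row pays for a clone.
struct OwnedItem {
    access: HashSet<u32>,
}

/// New shape: the item borrows the access set for the state lifetime `'s`.
struct BorrowedItem<'s> {
    access: &'s HashSet<u32>,
}

/// Clones the set, which allocates once per fetched row.
fn fetch_owned(state: &State) -> OwnedItem {
    OwnedItem {
        access: state.access.clone(),
    }
}

/// Hands out a reference into the state, so no allocation is needed.
fn fetch_borrowed<'s>(state: &'s State) -> BorrowedItem<'s> {
    BorrowedItem {
        access: &state.access,
    }
}

fn main() {
    let state = State {
        access: HashSet::from([1, 2, 3]),
    };

    let owned = fetch_owned(&state);
    let borrowed = fetch_borrowed(&state);

    // Both items see the same contents...
    assert!(owned.access.contains(&2));
    assert!(borrowed.access.contains(&2));
    // ...but only the borrowed item points at the state's own set.
    assert!(std::ptr::eq(borrowed.access, &state.access));
}
```

The cost of the borrowed shape is that the item can no longer outlive the state, which is what the rest of this description deals with.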

Unfortunately, there are a few cases where we currently return query
items from temporary query states: the sorted iteration methods create a
temporary state to query the sort keys, and the
`EntityRef::components<Q>()` methods create a temporary state for their
query.

To allow these to continue to work with most `QueryData`
implementations, introduce a new subtrait `ReleaseStateQueryData` that
converts a `QueryItem<'w, 's>` to `QueryItem<'w, 'static>`, and is
implemented for everything except `FilteredEntity(Ref|Mut)` and
`Entity(Ref|Mut)Except`.

`#[derive(QueryData)]` will generate `ReleaseStateQueryData`
implementations that apply when all of the subqueries implement
`ReleaseStateQueryData`.
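
To make the "release" idea concrete, here is a toy sketch in plain Rust. It is not the real trait: `Item`, `State`, `fetch`, `release_state`, and `get_with_temporary_state` are hypothetical names, and the item is deliberately one that never actually borrows from the state, which is the situation where releasing the `'s` lifetime is sound.

```rust
use std::marker::PhantomData;

/// Toy item with a world lifetime `'w` and a state lifetime `'s`.
/// It carries `'s` in its type but holds no data borrowed from the state.
struct Item<'w, 's> {
    value: &'w i32,
    _state: PhantomData<&'s ()>,
}

/// Toy query state: just remembers which slot of the "world" to read.
struct State {
    column: usize,
}

/// Fetching ties the item's `'s` to the borrow of the state.
fn fetch<'w, 's>(state: &'s State, world: &'w [i32]) -> Item<'w, 's> {
    Item {
        value: &world[state.column],
        _state: PhantomData,
    }
}

/// Rebuilds the item with a `'static` state lifetime. This is only possible
/// because nothing inside the item really points into the state.
fn release_state<'w>(item: Item<'w, '_>) -> Item<'w, 'static> {
    Item {
        value: item.value,
        _state: PhantomData,
    }
}

/// Mirrors the "temporary query state" case: the state is a local that is
/// dropped on return, so the item must be released before it is returned.
fn get_with_temporary_state<'w>(world: &'w [i32], column: usize) -> Item<'w, 'static> {
    let state = State { column };
    let item = fetch(&state, world);
    release_state(item)
}

fn main() {
    let world = [10, 20, 30];
    let item = get_with_temporary_state(&world, 1);
    assert_eq!(*item.value, 20);
}
```

An item that stored a real reference into the state, as the filtered entity types are intended to, could not provide such a conversion, which matches the list of excluded types above.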

This PR does not actually change the implementation of
`FilteredEntity(Ref|Mut)` or `Entity(Ref|Mut)Except`! That will be done
as a follow-up PR so that the changes are easier to review. I have
pushed the changes as chescock/bevy#5.

## Testing

I ran performance traces of many_foxes, both against main and against
chescock/bevy#5, both including #15282. These changes do appear to make
generalized animation a bit faster:

(Red is main, yellow is chescock/bevy#5)

![image](https://github.com/user-attachments/assets/de900117-0c6a-431d-ab62-c013834f97a9)


## Migration Guide

The `WorldQuery::Item` and `WorldQuery::Fetch` associated types and the
`QueryItem` and `ROQueryItem` type aliases now have an additional
lifetime parameter corresponding to the `'s` lifetime in `Query`. Manual
implementations of `WorldQuery` will need to update the method
signatures to include the new lifetimes. Other uses of the types will
need to be updated to include a lifetime parameter, although it can
usually be passed as `'_`. In particular, `ROQueryItem` is used when
implementing `RenderCommand`.

Before: 

```rust
fn render<'w>(
    item: &P,
    view: ROQueryItem<'w, Self::ViewQuery>,
    entity: Option<ROQueryItem<'w, Self::ItemQuery>>,
    param: SystemParamItem<'w, '_, Self::Param>,
    pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult;
```

After: 

```rust
fn render<'w>(
    item: &P,
    view: ROQueryItem<'w, '_, Self::ViewQuery>,
    entity: Option<ROQueryItem<'w, '_, Self::ItemQuery>>,
    param: SystemParamItem<'w, '_, Self::Param>,
    pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult;
```

---

Methods on `QueryState` that take `&mut self` may now result in
conflicting borrows if the query items capture the lifetime of the
mutable reference. This affects `get()`, `iter()`, and others. To fix
the errors, first call `QueryState::update_archetypes()`, and then
replace each call of the form `state.foo(world, param)` with
`state.query_manual(world).foo_inner(param)`. Alternatively, you may be
able to restructure the code to call `state.query(world)` once and then
make multiple calls using the returned `Query`.

Before:
```rust
let mut state: QueryState<_, _> = ...;
let d1 = state.get(world, e1);
let d2 = state.get(world, e2); // Error: cannot borrow `state` as mutable more than once at a time
println!("{d1:?}");
println!("{d2:?}");
```

After: 
```rust
let mut state: QueryState<_, _> = ...;

state.update_archetypes(world);
let d1 = state.get_manual(world, e1);
let d2 = state.get_manual(world, e2);
// OR
state.update_archetypes(world);
let d1 = state.query_manual(world).get_inner(e1);
let d2 = state.query_manual(world).get_inner(e2);
// OR
let query = state.query(world);
let d1 = query.get_inner(e1);
let d2 = query.get_inner(e2);

println!("{d1:?}");
println!("{d2:?}");
```
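
The borrow conflict can be reproduced without Bevy. The following is a minimal sketch under stated assumptions: `ToyState` and its methods are hypothetical, with `update` standing in loosely for `update_archetypes`, `get` for a `&mut self` accessor, and `get_manual` for a `&self` accessor.

```rust
/// Hypothetical stand-in for a query state with an internal cache.
struct ToyState {
    cache: Vec<i32>,
}

impl ToyState {
    /// Refreshes the cache from the "world".
    fn update(&mut self, world: &[i32]) {
        self.cache = world.to_vec();
    }

    /// `&mut self` accessor: the result keeps `self` mutably borrowed for `'s`.
    fn get<'s>(&'s mut self, world: &[i32], index: usize) -> Option<&'s i32> {
        self.update(world);
        self.cache.get(index)
    }

    /// `&self` accessor: several results may be alive at the same time.
    fn get_manual<'s>(&'s self, index: usize) -> Option<&'s i32> {
        self.cache.get(index)
    }
}

fn main() {
    let world = [1, 2, 3];
    let mut state = ToyState { cache: Vec::new() };

    // A single result from `get` is fine because the borrow ends here.
    assert_eq!(state.get(&world, 2), Some(&3));

    // Two live results from `get` would be rejected by the borrow checker:
    // let d1 = state.get(&world, 0);
    // let d2 = state.get(&world, 1); // error[E0499]: second mutable borrow
    // println!("{d1:?} {d2:?}");

    // Update once up front, then use the shared-borrow accessor.
    state.update(&world);
    let d1 = state.get_manual(0);
    let d2 = state.get_manual(1);
    assert_eq!((d1, d2), (Some(&1), Some(&2)));
}
```

Splitting the mutable update from the shared reads is the same shape as calling `update_archetypes()` first and then using the `_manual` or `_inner` methods.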
2025-06-16 21:05:41 +00:00


use crate::{
    batching::BatchingStrategy,
    component::Tick,
    entity::{EntityEquivalent, UniqueEntityEquivalentVec},
    world::unsafe_world_cell::UnsafeWorldCell,
};

use super::{QueryData, QueryFilter, QueryItem, QueryState, ReadOnlyQueryData};

use alloc::vec::Vec;

/// A parallel iterator over query results of a [`Query`](crate::system::Query).
///
/// This struct is created by the [`Query::par_iter`](crate::system::Query::par_iter) and
/// [`Query::par_iter_mut`](crate::system::Query::par_iter_mut) methods.
pub struct QueryParIter<'w, 's, D: QueryData, F: QueryFilter> {
    pub(crate) world: UnsafeWorldCell<'w>,
    pub(crate) state: &'s QueryState<D, F>,
    pub(crate) last_run: Tick,
    pub(crate) this_run: Tick,
    pub(crate) batching_strategy: BatchingStrategy,
}

impl<'w, 's, D: QueryData, F: QueryFilter> QueryParIter<'w, 's, D, F> {
    /// Changes the batching strategy used when iterating.
    ///
    /// For more information on how this affects the resultant iteration, see
    /// [`BatchingStrategy`].
    pub fn batching_strategy(mut self, strategy: BatchingStrategy) -> Self {
        self.batching_strategy = strategy;
        self
    }

    /// Runs `func` on each query result in parallel.
    ///
    /// # Panics
    /// If the [`ComputeTaskPool`] is not initialized. If using this from a query that is being
    /// initialized and run from the ECS scheduler, this should never panic.
    ///
    /// [`ComputeTaskPool`]: bevy_tasks::ComputeTaskPool
    #[inline]
    pub fn for_each<FN: Fn(QueryItem<'w, 's, D>) + Send + Sync + Clone>(self, func: FN) {
        self.for_each_init(|| {}, |_, item| func(item));
    }

    /// Runs `func` on each query result in parallel on a value returned by `init`.
    ///
    /// `init` may be called multiple times per thread, and the values returned may be discarded between tasks on any given thread.
    /// Callers should avoid using this function as if it were a parallel version
    /// of [`Iterator::fold`].
    ///
    /// # Example
    ///
    /// ```
    /// use bevy_utils::Parallel;
    /// use crate::{bevy_ecs::prelude::Component, bevy_ecs::system::Query};
    /// #[derive(Component)]
    /// struct T;
    /// fn system(query: Query<&T>){
    ///     let mut queue: Parallel<usize> = Parallel::default();
    ///     // queue.borrow_local_mut() will get or create a thread_local queue for each task/thread;
    ///     query.par_iter().for_each_init(|| queue.borrow_local_mut(),|local_queue, item| {
    ///         **local_queue += 1;
    ///     });
    ///
    ///     // collect value from every thread
    ///     let entity_count: usize = queue.iter_mut().map(|v| *v).sum();
    /// }
    /// ```
    ///
    /// # Panics
    /// If the [`ComputeTaskPool`] is not initialized. If using this from a query that is being
    /// initialized and run from the ECS scheduler, this should never panic.
    ///
    /// [`ComputeTaskPool`]: bevy_tasks::ComputeTaskPool
    #[inline]
    pub fn for_each_init<FN, INIT, T>(self, init: INIT, func: FN)
    where
        FN: Fn(&mut T, QueryItem<'w, 's, D>) + Send + Sync + Clone,
        INIT: Fn() -> T + Sync + Send + Clone,
    {
        let func = |mut init, item| {
            func(&mut init, item);
            init
        };
        #[cfg(any(target_arch = "wasm32", not(feature = "multi_threaded")))]
        {
            let init = init();
            // SAFETY:
            // This method can only be called once per instance of QueryParIter,
            // which ensures that mutable queries cannot be executed multiple times at once.
            // Mutable instances of QueryParIter can only be created via an exclusive borrow of a
            // Query or a World, which ensures that multiple aliasing QueryParIters cannot exist
            // at the same time.
            unsafe {
                self.state
                    .query_unchecked_manual_with_ticks(self.world, self.last_run, self.this_run)
                    .into_iter()
                    .fold(init, func);
            }
        }
        #[cfg(all(not(target_arch = "wasm32"), feature = "multi_threaded"))]
        {
            let thread_count = bevy_tasks::ComputeTaskPool::get().thread_num();
            if thread_count <= 1 {
                let init = init();
                // SAFETY: See the safety comment above.
                unsafe {
                    self.state
                        .query_unchecked_manual_with_ticks(self.world, self.last_run, self.this_run)
                        .into_iter()
                        .fold(init, func);
                }
            } else {
                // Need a batch size of at least 1.
                let batch_size = self.get_batch_size(thread_count).max(1);
                // SAFETY: See the safety comment above.
                unsafe {
                    self.state.par_fold_init_unchecked_manual(
                        init,
                        self.world,
                        batch_size,
                        func,
                        self.last_run,
                        self.this_run,
                    );
                }
            }
        }
    }

    #[cfg(all(not(target_arch = "wasm32"), feature = "multi_threaded"))]
    fn get_batch_size(&self, thread_count: usize) -> u32 {
        let max_items = || {
            let id_iter = self.state.matched_storage_ids.iter();
            if self.state.is_dense {
                // SAFETY: We only access table metadata.
                let tables = unsafe { &self.world.world_metadata().storages().tables };
                id_iter
                    // SAFETY: The if check ensures that matched_storage_ids stores TableIds
                    .map(|id| unsafe { tables[id.table_id].entity_count() })
                    .max()
            } else {
                let archetypes = &self.world.archetypes();
                id_iter
                    // SAFETY: The if check ensures that matched_storage_ids stores ArchetypeIds
                    .map(|id| unsafe { archetypes[id.archetype_id].len() })
                    .max()
            }
            .map(|v| v as usize)
            .unwrap_or(0)
        };
        self.batching_strategy
            .calc_batch_size(max_items, thread_count) as u32
    }
}

/// A parallel iterator over the query items generated from an [`Entity`] list.
///
/// This struct is created by the [`Query::par_iter_many`] method.
///
/// [`Entity`]: crate::entity::Entity
/// [`Query::par_iter_many`]: crate::system::Query::par_iter_many
pub struct QueryParManyIter<'w, 's, D: QueryData, F: QueryFilter, E: EntityEquivalent> {
    pub(crate) world: UnsafeWorldCell<'w>,
    pub(crate) state: &'s QueryState<D, F>,
    pub(crate) entity_list: Vec<E>,
    pub(crate) last_run: Tick,
    pub(crate) this_run: Tick,
    pub(crate) batching_strategy: BatchingStrategy,
}

impl<'w, 's, D: ReadOnlyQueryData, F: QueryFilter, E: EntityEquivalent + Sync>
    QueryParManyIter<'w, 's, D, F, E>
{
    /// Changes the batching strategy used when iterating.
    ///
    /// For more information on how this affects the resultant iteration, see
    /// [`BatchingStrategy`].
    pub fn batching_strategy(mut self, strategy: BatchingStrategy) -> Self {
        self.batching_strategy = strategy;
        self
    }

    /// Runs `func` on each query result in parallel.
    ///
    /// # Panics
    /// If the [`ComputeTaskPool`] is not initialized. If using this from a query that is being
    /// initialized and run from the ECS scheduler, this should never panic.
    ///
    /// [`ComputeTaskPool`]: bevy_tasks::ComputeTaskPool
    #[inline]
    pub fn for_each<FN: Fn(QueryItem<'w, 's, D>) + Send + Sync + Clone>(self, func: FN) {
        self.for_each_init(|| {}, |_, item| func(item));
    }

    /// Runs `func` on each query result in parallel on a value returned by `init`.
    ///
    /// `init` may be called multiple times per thread, and the values returned may be discarded between tasks on any given thread.
    /// Callers should avoid using this function as if it were a parallel version
    /// of [`Iterator::fold`].
    ///
    /// # Example
    ///
    /// ```
    /// use bevy_utils::Parallel;
    /// use crate::{bevy_ecs::prelude::{Component, Res, Resource, Entity}, bevy_ecs::system::Query};
    /// # use core::slice;
    /// use bevy_platform::prelude::Vec;
    /// # fn some_expensive_operation(_item: &T) -> usize {
    /// #     0
    /// # }
    ///
    /// #[derive(Component)]
    /// struct T;
    ///
    /// #[derive(Resource)]
    /// struct V(Vec<Entity>);
    ///
    /// impl<'a> IntoIterator for &'a V {
    /// // ...
    /// # type Item = &'a Entity;
    /// # type IntoIter = slice::Iter<'a, Entity>;
    /// #
    /// # fn into_iter(self) -> Self::IntoIter {
    /// #     self.0.iter()
    /// # }
    /// }
    ///
    /// fn system(query: Query<&T>, entities: Res<V>){
    ///     let mut queue: Parallel<usize> = Parallel::default();
    ///     // queue.borrow_local_mut() will get or create a thread_local queue for each task/thread;
    ///     query.par_iter_many(&entities).for_each_init(|| queue.borrow_local_mut(),|local_queue, item| {
    ///         **local_queue += some_expensive_operation(item);
    ///     });
    ///
    ///     // collect value from every thread
    ///     let final_value: usize = queue.iter_mut().map(|v| *v).sum();
    /// }
    /// ```
    ///
    /// # Panics
    /// If the [`ComputeTaskPool`] is not initialized. If using this from a query that is being
    /// initialized and run from the ECS scheduler, this should never panic.
    ///
    /// [`ComputeTaskPool`]: bevy_tasks::ComputeTaskPool
    #[inline]
    pub fn for_each_init<FN, INIT, T>(self, init: INIT, func: FN)
    where
        FN: Fn(&mut T, QueryItem<'w, 's, D>) + Send + Sync + Clone,
        INIT: Fn() -> T + Sync + Send + Clone,
    {
        let func = |mut init, item| {
            func(&mut init, item);
            init
        };
        #[cfg(any(target_arch = "wasm32", not(feature = "multi_threaded")))]
        {
            let init = init();
            // SAFETY:
            // This method can only be called once per instance of QueryParManyIter,
            // which ensures that mutable queries cannot be executed multiple times at once.
            // Mutable instances of QueryParManyIter can only be created via an exclusive borrow of a
            // Query or a World, which ensures that multiple aliasing QueryParManyIters cannot exist
            // at the same time.
            unsafe {
                self.state
                    .query_unchecked_manual_with_ticks(self.world, self.last_run, self.this_run)
                    .iter_many_inner(&self.entity_list)
                    .fold(init, func);
            }
        }
        #[cfg(all(not(target_arch = "wasm32"), feature = "multi_threaded"))]
        {
            let thread_count = bevy_tasks::ComputeTaskPool::get().thread_num();
            if thread_count <= 1 {
                let init = init();
                // SAFETY: See the safety comment above.
                unsafe {
                    self.state
                        .query_unchecked_manual_with_ticks(self.world, self.last_run, self.this_run)
                        .iter_many_inner(&self.entity_list)
                        .fold(init, func);
                }
            } else {
                // Need a batch size of at least 1.
                let batch_size = self.get_batch_size(thread_count).max(1);
                // SAFETY: See the safety comment above.
                unsafe {
                    self.state.par_many_fold_init_unchecked_manual(
                        init,
                        self.world,
                        &self.entity_list,
                        batch_size,
                        func,
                        self.last_run,
                        self.this_run,
                    );
                }
            }
        }
    }

    #[cfg(all(not(target_arch = "wasm32"), feature = "multi_threaded"))]
    fn get_batch_size(&self, thread_count: usize) -> u32 {
        self.batching_strategy
            .calc_batch_size(|| self.entity_list.len(), thread_count) as u32
    }
}

/// A parallel iterator over the unique query items generated from an [`EntitySet`].
///
/// This struct is created by the [`Query::par_iter_many_unique`] and [`Query::par_iter_many_unique_mut`] methods.
///
/// [`EntitySet`]: crate::entity::EntitySet
/// [`Query::par_iter_many_unique`]: crate::system::Query::par_iter_many_unique
/// [`Query::par_iter_many_unique_mut`]: crate::system::Query::par_iter_many_unique_mut
pub struct QueryParManyUniqueIter<'w, 's, D: QueryData, F: QueryFilter, E: EntityEquivalent + Sync>
{
    pub(crate) world: UnsafeWorldCell<'w>,
    pub(crate) state: &'s QueryState<D, F>,
    pub(crate) entity_list: UniqueEntityEquivalentVec<E>,
    pub(crate) last_run: Tick,
    pub(crate) this_run: Tick,
    pub(crate) batching_strategy: BatchingStrategy,
}

impl<'w, 's, D: QueryData, F: QueryFilter, E: EntityEquivalent + Sync>
    QueryParManyUniqueIter<'w, 's, D, F, E>
{
    /// Changes the batching strategy used when iterating.
    ///
    /// For more information on how this affects the resultant iteration, see
    /// [`BatchingStrategy`].
    pub fn batching_strategy(mut self, strategy: BatchingStrategy) -> Self {
        self.batching_strategy = strategy;
        self
    }

    /// Runs `func` on each query result in parallel.
    ///
    /// # Panics
    /// If the [`ComputeTaskPool`] is not initialized. If using this from a query that is being
    /// initialized and run from the ECS scheduler, this should never panic.
    ///
    /// [`ComputeTaskPool`]: bevy_tasks::ComputeTaskPool
    #[inline]
    pub fn for_each<FN: Fn(QueryItem<'w, 's, D>) + Send + Sync + Clone>(self, func: FN) {
        self.for_each_init(|| {}, |_, item| func(item));
    }

    /// Runs `func` on each query result in parallel on a value returned by `init`.
    ///
    /// `init` may be called multiple times per thread, and the values returned may be discarded between tasks on any given thread.
    /// Callers should avoid using this function as if it were a parallel version
    /// of [`Iterator::fold`].
    ///
    /// # Example
    ///
    /// ```
    /// use bevy_utils::Parallel;
    /// use crate::{bevy_ecs::{prelude::{Component, Res, Resource, Entity}, entity::UniqueEntityVec, system::Query}};
    /// # use core::slice;
    /// # use crate::bevy_ecs::entity::UniqueEntityIter;
    /// # fn some_expensive_operation(_item: &T) -> usize {
    /// #     0
    /// # }
    ///
    /// #[derive(Component)]
    /// struct T;
    ///
    /// #[derive(Resource)]
    /// struct V(UniqueEntityVec);
    ///
    /// impl<'a> IntoIterator for &'a V {
    /// // ...
    /// # type Item = &'a Entity;
    /// # type IntoIter = UniqueEntityIter<slice::Iter<'a, Entity>>;
    /// #
    /// # fn into_iter(self) -> Self::IntoIter {
    /// #     self.0.iter()
    /// # }
    /// }
    ///
    /// fn system(query: Query<&T>, entities: Res<V>){
    ///     let mut queue: Parallel<usize> = Parallel::default();
    ///     // queue.borrow_local_mut() will get or create a thread_local queue for each task/thread;
    ///     query.par_iter_many_unique(&entities).for_each_init(|| queue.borrow_local_mut(),|local_queue, item| {
    ///         **local_queue += some_expensive_operation(item);
    ///     });
    ///
    ///     // collect value from every thread
    ///     let final_value: usize = queue.iter_mut().map(|v| *v).sum();
    /// }
    /// ```
    ///
    /// # Panics
    /// If the [`ComputeTaskPool`] is not initialized. If using this from a query that is being
    /// initialized and run from the ECS scheduler, this should never panic.
    ///
    /// [`ComputeTaskPool`]: bevy_tasks::ComputeTaskPool
    #[inline]
    pub fn for_each_init<FN, INIT, T>(self, init: INIT, func: FN)
    where
        FN: Fn(&mut T, QueryItem<'w, 's, D>) + Send + Sync + Clone,
        INIT: Fn() -> T + Sync + Send + Clone,
    {
        let func = |mut init, item| {
            func(&mut init, item);
            init
        };
        #[cfg(any(target_arch = "wasm32", not(feature = "multi_threaded")))]
        {
            let init = init();
            // SAFETY:
            // This method can only be called once per instance of QueryParManyUniqueIter,
            // which ensures that mutable queries cannot be executed multiple times at once.
            // Mutable instances of QueryParManyUniqueIter can only be created via an exclusive borrow of a
            // Query or a World, which ensures that multiple aliasing QueryParManyUniqueIters cannot exist
            // at the same time.
            unsafe {
                self.state
                    .query_unchecked_manual_with_ticks(self.world, self.last_run, self.this_run)
                    .iter_many_unique_inner(self.entity_list)
                    .fold(init, func);
            }
        }
        #[cfg(all(not(target_arch = "wasm32"), feature = "multi_threaded"))]
        {
            let thread_count = bevy_tasks::ComputeTaskPool::get().thread_num();
            if thread_count <= 1 {
                let init = init();
                // SAFETY: See the safety comment above.
                unsafe {
                    self.state
                        .query_unchecked_manual_with_ticks(self.world, self.last_run, self.this_run)
                        .iter_many_unique_inner(self.entity_list)
                        .fold(init, func);
                }
            } else {
                // Need a batch size of at least 1.
                let batch_size = self.get_batch_size(thread_count).max(1);
                // SAFETY: See the safety comment above.
                unsafe {
                    self.state.par_many_unique_fold_init_unchecked_manual(
                        init,
                        self.world,
                        &self.entity_list,
                        batch_size,
                        func,
                        self.last_run,
                        self.this_run,
                    );
                }
            }
        }
    }

    #[cfg(all(not(target_arch = "wasm32"), feature = "multi_threaded"))]
    fn get_batch_size(&self, thread_count: usize) -> u32 {
        self.batching_strategy
            .calc_batch_size(|| self.entity_list.len(), thread_count) as u32
    }
}