Commit Graph

193 Commits

Author SHA1 Message Date
BD103
c890cc9dc3
Overhaul picking benchmarks (#17033)
# Objective

- Part of #16647.
- This PR goes through our `ray_cast::ray_mesh_intersection()`
benchmarks and overhauls them with more comments and better
extensibility. The code is also a lot less duplicated!

## Solution

- Create a `Benchmarks` enum that describes all of the different kind of
scenarios we want to benchmark.
- Merge all of our existing benchmark functions into a single one,
`bench()`, which sets up the scenarios all at once.
- Add comments to `mesh_creation()` and `ptoxznorm()`, and move some
lines around to be a bit clearer.
- Make the benchmarks use the new `bench!` macro, as part of #16647.
- Rename many functions and benchmarks to be clearer.

## For reviewers

I split this PR up into several, easier to digest commits. You might
find it easier to review by looking through each commit, instead of the
complete file changes.

None of my changes actually modifies the behavior of the benchmarks;
they still track the exact same test cases. There shouldn't be
significant changes in benchmark performance before and after this PR.

## Testing

List all picking benchmarks: `cargo bench -p benches --bench picking --
--list`

Run the benchmarks once in debug mode: `cargo test -p benches --bench
picking`

Run the benchmarks and analyze their performance: `cargo bench -p
benches --bench picking`

- Check out the generated HTML report in
`./target/criterion/report/index.html` once you're done!

---

## Showcase

List of all picking benchmarks, after having been renamed:

<img width="524" alt="image"
src="https://github.com/user-attachments/assets/a1b53daf-4a8b-4c45-a25a-c6306c7175d1"
/>

Example report for
`picking::ray_mesh_intersection::cull_intersect/100_vertices`:

<img width="992" alt="image"
src="https://github.com/user-attachments/assets/a1aaf53f-ce21-4bef-89c4-b982bb158f5d"
/>
2024-12-30 21:04:24 +00:00
BD103
291cb31798
Overhaul bezier curve benchmarks (#17016)
# Objective

- Part of #16647.
- The benchmarks for bezier curves have several issues and do not yet
use the new `bench!` naming scheme.

## Solution

- Make all `bevy_math` benchmarks use the `bench!` macro for their name.
- Delete the `build_accel_cubic()` benchmark, since it was an exact
duplicate of `build_pos_cubic()`.
- Remove `collect::<Vec<_>>()` call in `build_pos_cubic()` and replace
it with a `for` loop.
- Combine all of the benchmarks that measure `curve.position()` under a
single group, `curve_position`, and extract the common bench routine
into a helper function.
- Move the time calculation for the `curve.ease()` benchmark into the
setup closure so it is not tracked.
- Rename the benchmarks to be more descriptive on what they do.
  - `easing_1000` -> `segment_ease`
  - `cubic_position_Vec2` -> `curve_position/vec2`
  - `cubic_position_Vec3A` -> `curve_position/vec3a`
  - `cubic_position_Vec3` -> `curve_position/vec3`
  - `build_pos_cubic_100_points` -> `curve_iter_positions`

## Testing

- `cargo test -p benches --bench math`
- `cargo bench -p benches --bench math`
  - Then open `./target/criterion/report/index.html` to see the report!

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-12-29 20:55:08 +00:00
BD103
17ad855653
Migrate to core::hint::black_box() (#16980)
# Objective

Many of our benchmarks use
[`criterion::black_box()`](https://docs.rs/criterion/latest/criterion/fn.black_box.html),
which is used to prevent the compiler from optimizing away computation
that we're trying to time. This can be slow, though, because
`criterion::black_box()` forces a point read each time it is called
through
[`ptr::road_volatile()`](https://doc.rust-lang.org/stable/std/ptr/fn.read_volatile.html).

In Rust 1.66, the standard library introduced
[`core::hint::black_box()`](https://doc.rust-lang.org/nightly/std/hint/fn.black_box.html)
(and `std::hint::black_box()`). This is an intended replacement for
`criterion`'s version that uses compiler intrinsics instead of volatile
pointer reads, and thus has no runtime overhead. This increases
benchmark accuracy, which is always nice 👍

Note that benchmarks may _appear_ to improve in performance after this
change, but that's just because we are eliminating the pointer read
overhead.

## Solution

- Deny `criterion::black_box` in `clippy.toml`.
- Fix all imports.

## Testing

- `cargo clippy -p benches --benches`
2024-12-29 19:33:42 +00:00
BD103
f391522483
Test benchmarks in CI (#16987)
# Objective

- This reverts #16833, and completely goes against #16803.
- Turns out running `cargo test --benches` runs each benchmark once,
without timing it, just to ensure nothing panics. This is actually
desired because we can use it to verify benchmarks are working correctly
without all the time constraints of actual benchmarks.

## Solution

- Add the `--benches` flag to the CI test command.

## Testing

- `cargo run -p ci -- test`
2024-12-29 19:33:06 +00:00
BD103
c03e494a26
Migrate reflection benchmarks to new naming system (#16986)
# Objective

- Please see #16647 for the full reasoning behind this change.

## Solution

- Create the `bench!` macro, which generates the name of the benchmark
at compile time.

Migrating is a single line change, and it will automatically update if
you move the benchmark to a different module:

  ```diff
  + use benches::bench;

  fn my_benchmark(c: &mut Criterion) {
  -   c.bench_function("my_benchmark", |b| {});
  +   c.bench_function(bench!("my_benchmark"), |b| {});
  }
  ```

- Migrate all reflection benchmarks to use `bench!`.
- Fix a few places where `black_box()` or Criterion is misused.

## Testing

```sh
cd benches

# Will take a long time!
cargo bench --bench reflect

# List out the names of all reflection benchmarks, to ensure I didn't miss anything.
cargo bench --bench reflect -- --list

# Check for linter warnings.
cargo clippy --bench reflect

# Run each benchmark once.
cargo test --bench reflect
```
2024-12-26 22:28:09 +00:00
Oliver Maskery
a7ae0807ac
Fix panic in benches caused by missing resources (#16956)
# Objective

- To fix the benches panicking on `main`

## Solution

- It appears that systems requiring access to a non-existing `Res` now
causes a panic
- Some of the benches run systems that access resources that have not
been inserted into the world
- I have made it so that those resources are inserted into the world

## Testing

- I ran all the ecs benches and they all run without panicking

Co-authored-by: Oliver Maskery <oliver@wellplayed.games>
2024-12-24 17:18:03 +00:00
BD103
20277006ce
Add benchmarks and compile_fail tests back to workspace (#16858)
# Objective

- Our benchmarks and `compile_fail` tests lag behind the rest of the
engine because they are not in the Cargo workspace, so not checked by
CI.
- Fixes #16801, please see it for further context!

## Solution

- Add benchmarks and `compile_fail` tests to the Cargo workspace.
- Fix any leftover formatting issues and documentation.

## Testing

- I think CI should catch most things!

## Questions

<details>
<summary>Outdated issue I was having with function reflection being
optional</summary>

The `reflection_types` example is failing in Rust-Analyzer for me, but
not a normal check.

```rust
error[E0004]: non-exhaustive patterns: `ReflectRef::Function(_)` not covered
   --> examples/reflection/reflection_types.rs:81:11
    |
81  |     match value.reflect_ref() {
    |           ^^^^^^^^^^^^^^^^^^^ pattern `ReflectRef::Function(_)` not covered
    |
note: `ReflectRef<'_>` defined here
   --> /Users/bdeep/dev/bevy/bevy/crates/bevy_reflect/src/kind.rs:178:1
    |
178 | pub enum ReflectRef<'a> {
    | ^^^^^^^^^^^^^^^^^^^^^^^
...
188 |     Function(&'a dyn Function),
    |     -------- not covered
    = note: the matched value is of type `ReflectRef<'_>`
help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern or an explicit pattern as shown
    |
126 ~         ReflectRef::Opaque(_) => {},
127 +         ReflectRef::Function(_) => todo!()
    |
```

I think it is because the following line is feature-gated:


cc0f6a8db4/examples/reflection/reflection_types.rs (L117-L122)

My theory for why this is happening is because the benchmarks enabled
`bevy_reflect`'s `function` feature, which gets merged with the rest of
the features when RA checks the workspace, but the `#[cfg(...)]` gate in
the example isn't detecting it:


cc0f6a8db4/benches/Cargo.toml (L19)

Any thoughts on how to fix this? It's not blocking, since the example
still compiles as normal, but it's just RA and the command `cargo check
--workspace --all-targets` appears to fail.

</summary>
2024-12-21 22:30:29 +00:00
eugineerd
20049d4c34
Faster entity cloning (#16717)
# Objective

#16132 introduced entity cloning functionality, and while it works and
is useful, it can be made faster. This is the promised follow-up to
improve performance.

## Solution

**PREFACE**: This is my first time writing `unsafe` in rust and I have
only vague idea about what I'm doing. I would encourage reviewers to
scrutinize `unsafe` parts in particular.

The solution is to clone component data to an intermediate buffer and
use `EntityWorldMut::insert_by_ids` to insert components without
additional archetype moves.

To facilitate this, `EntityCloner::clone_entity` now reads all
components of the source entity and provides clone handlers with the
ability to read component data straight from component storage using
`read_source_component` and write to an intermediate buffer using
`write_target_component`. `ComponentId` is used to check that requested
type corresponds to the type available on source entity.

Reflect-based handler is a little trickier to pull of: we only have
`&dyn Reflect` and no direct access to the underlying data.
`ReflectFromPtr` can be used to get `&dyn Reflect` from concrete
component data, but to write it we need to create a clone of the
underlying data using `Reflect`. For this reason only components that
have `ReflectDefault` or `ReflectFromReflect` or `ReflectFromWorld` can
be cloned, all other components will be skipped. The good news is that
this is actually only a temporary limitation: once #13432 lands we will
be able to clone component without requiring one of these `type data`s.

This PR also introduces `entity_cloning` benchmark to better compare
changes between the PR and main, you can see the results in the
**showcase** section.

## Testing

- All previous tests passing
- Added test for fast reflect clone path (temporary, will be removed
after reflection-based cloning lands)
- Ran miri

## Showcase
Here's a table demonstrating the improvement:

| **benchmark** | **main, avg** | **PR, avg** | **change, avg** |
| ----------------------- | ------------- | ----------- |
--------------- |
| many components reflect | 18.505 µs | 2.1351 µs | -89.095% |
| hierarchy wide reflect* | 22.778 ms | 4.1875 ms | -81.616% |
| hierarchy tall reflect* | 107.24 µs | 26.322 µs | -77.141% |
| hierarchy many reflect | 78.533 ms | 9.7415 ms | -87.596% |
| many components clone | 1.3633 µs | 758.17 ns | -45.937% |
| hierarchy wide clone* | 2.7716 ms | 3.3411 ms | +20.546% |
| hierarchy tall clone* | 17.646 µs | 20.190 µs | +17.379% |
| hierarchy many clone | 5.8779 ms | 4.2650 ms | -27.439% |

*: these benchmarks have entities with only 1 component

## Considerations
Once #10154 is resolved a large part of the functionality in this PR
will probably become obsolete. It might still be a little bit faster
than using command batching, but the complexity might not be worth it.

## Migration Guide
- `&EntityCloner` in component clone handlers is changed to `&mut
ComponentCloneCtx` to better separate data.
- Changed `EntityCloneHandler` from enum to struct and added convenience
functions to add default clone and reflect handler more easily.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Chris Russell <8494645+chescock@users.noreply.github.com>
2024-12-18 20:03:39 +00:00
BD103
6178ce93e8
Fix Clippy lints in benchmarks (#16808)
# Objective

- Closes #16804.
- This copies over our lint configuration to our benchmarks and fixes
any lints.

## Solution

- Copied over our Clippy configuration from the root `Cargo.toml` to
`benches/Cargo.toml`.
- Fixed any warnings that Clippy emitted.

## Testing

- `cd benches && cargo clippy --benches`
2024-12-14 05:53:17 +00:00
BD103
c4a24d5b51
Make benchmark setup consistent (#16733)
# Objective

- Benchmarks are inconsistently setup, there are several that do not
compile, and several others that need to be reformatted.
- Related / precursor to #16647, this is part of my attempt migrating
[`bevy-bencher`](https://github.com/TheBevyFlock/bevy-bencher) to the
official benchmarks.

## Solution

> [!TIP]
>
> I recommend reviewing this PR commit-by-commit, instead of all at
once!

In 5d26f56eb9 I reorganized how benches
were registered. Now this is one `[[bench]]` per Bevy crate. In each
crate benchmark folder, there is a `main.rs` that calls
`criterion_main!`. I also disabled automatic benchmark discovery, which
isn't necessarily required, but may clear up confusion with our custom
setup. I also fixed a few errors that were causing the benchmarks to
fail to compile.

In afc8d33a87 I ran `rustfmt` on all of
the benchmarks.

In d6cdf960ab I fixed all of the Clippy
warnings.

In ee94d48f50 I fixed some of the
benchmarks' usage of `black_box()`. I ended up opening
https://github.com/rust-lang/rust/pull/133942 due to this, which should
help prevent this in the future.

In cbe1688dcd I renamed all of the ECS
benchmark groups to be called `benches`, to be consistent with the other
crate benchmarks.

In e701c212cd and
8815bb78b0 I re-ordered some imports and
module definitions, and uplifted `fragmentation/mod.rs` to
`fragementation.rs`.

Finally, in b0065e0b0b I organized
`Cargo.toml` and bumped Criterion to v0.5.

## Testing

- `cd benches && cargo clippy --benches`
- `cd benches && cargo fmt --all`
2024-12-10 20:39:14 +00:00
Gino Valente
d21c7a1911
bevy_reflect: Function Overloading (Generic & Variadic Functions) (#15074)
# Objective

Currently function reflection requires users to manually monomorphize
their generic functions. For example:

```rust
fn add<T: Add<Output=T>>(a: T, b: T) -> T {
    a + b
}

// We have to specify the type of `T`:
let reflect_add = add::<i32>.into_function();
```

This PR doesn't aim to solve that problem—this is just a limitation in
Rust. However, it also means that reflected functions can only ever work
for a single monomorphization. If we wanted to support other types for
`T`, we'd have to create a separate function for each one:

```rust
let reflect_add_i32 = add::<i32>.into_function();
let reflect_add_u32 = add::<u32>.into_function();
let reflect_add_f32 = add::<f32>.into_function();
// ...
```

So in addition to requiring manual monomorphization, we also lose the
benefit of having a single function handle multiple argument types.

If a user wanted to create a small modding script that utilized function
reflection, they'd have to either:
- Store all sets of supported monomorphizations and require users to
call the correct one
- Write out some logic to find the correct function based on the given
arguments

While the first option would work, it wouldn't be very ergonomic. The
second option is better, but it adds additional complexity to the user's
logic—complexity that `bevy_reflect` could instead take on.

## Solution

Introduce [function
overloading](https://en.wikipedia.org/wiki/Function_overloading).

A `DynamicFunction` can now be overloaded with other `DynamicFunction`s.
We can rewrite the above code like so:

```rust
let reflect_add = add::<i32>
    .into_function()
    .with_overload(add::<u32>)
    .with_overload(add::<f32>);
```

When invoked, the `DynamicFunction` will attempt to find a matching
overload for the given set of arguments.

And while I went into this PR only looking to improve generic function
reflection, I accidentally added support for variadic functions as well
(hence why I use the broader term "overload" over "generic").

```rust
// Supports 1 to 4 arguments
let multiply_all = (|a: i32| a)
    .into_function()
    .with_overload(|a: i32, b: i32| a * b)
    .with_overload(|a: i32, b: i32, c: i32| a * b * c)
    .with_overload(|a: i32, b: i32, c: i32, d: i32| a * b * c * d);
```

This is simply an added bonus to this particular implementation. ~~Full
variadic support (i.e. allowing for an indefinite number of arguments)
will be added in a later PR.~~ I actually decided to limit the maximum
number of arguments to 63 to supplement faster lookups, a reduced memory
footprint, and faster cloning.

### Alternatives & Rationale

I explored a few options for handling generic functions. This PR is the
one I feel the most confident in, but I feel I should mention the others
and why I ultimately didn't move forward with them.

#### Adding `GenericDynamicFunction`

**TL;DR:** Adding a distinct `GenericDynamicFunction` type unnecessarily
splits and complicates the API.

<details>
<summary>Details</summary>

My initial explorations involved a dedicated `GenericDynamicFunction` to
contain and handle the mappings.

This was initially started back when `DynamicFunction` was distinct from
`DynamicClosure`. My goal was to not prevent us from being able to
somehow make `DynamicFunction` implement `Copy`. But once we reverted
back to a single `DynamicFunction`, that became a non-issue.

But that aside, the real problem was that it created a split in the API.
If I'm using a third-party library that uses function reflection, I have
to know whether to request a `DynamicFunction` or a
`GenericDynamicFunction`. I might not even know ahead of time which one
I want. It might need to be determined at runtime.

And if I'm creating a library, I might want a type to contain both
`DynamicFunction` and `GenericDynamicFunction`. This might not be
possible if, for example, I need to store the function in a `HashMap`.

The other concern is with `IntoFunction`. Right now `DynamicFunction`
trivially implements `IntoFunction` since it can just return itself. But
what should `GenericDynamicFunction` do? It could return itself wrapped
into a `DynamicFunction`, but then the API for `DynamicFunction` would
have to account for this. So then what was the point of having a
separate `GenericDynamicFunction` anyways?

And even apart from `IntoFunction`, there's nothing stopping someone
from manually creating a generic `DynamicFunction` through lying about
its `FunctionInfo` and wrapping a `GenericDynamicFunction`.

That being said, this is probably the "best" alternative if we added a
`Function` trait and stored functions as `Box<dyn Function>`.

However, I'm not convinced we gain much from this. Sure, we could keep
the API for `DynamicFunction` the same, but consumers of `Function` will
need to account for `GenericDynamicFunction` regardless (e.g. handling
multiple `FunctionInfo`, a ranged argument count, etc.). And for all
cases, except where using `DynamicFunction` directly, you end up
treating them all like `GenericDynamicFunction`.

Right now, if we did go with `GenericDynamicFunction`, the only major
benefit we'd gain would be saving 24 bytes. If memory ever does become
an issue here, we could swap over. But I think for the time being it's
better for us to pursue a clearer mental model and end-user ergonomics
through unification.

</details>

##### Using the `FunctionRegistry`

**TL;DR:** Having overloads only exist in the `FunctionRegistry`
unnecessarily splits and complicates the API.

<details>
<summary>Details</summary>

Another idea was to store the overloads in the `FunctionRegistry`. Users
would then just call functions directly through the registry (i.e.
`registry.call("my_func", my_args)`).

I didn't go with this option because of how it specifically relies on
the functions being registered. You'd not only always need access to the
registry, but you'd need to ensure that the functions you want to call
are even registered.

It also means you can't just store a generic `DynamicFunction` on a
type. Instead, you'll need to store the function's name and use that to
look up the function in the registry—even if it's only ever used by that
type.

Doing so also removes all the benefits of `DynamicFunction`, such as the
ability to pass it to functions accepting `IntoFunction`, modify it if
needed, and so on.

Like `GenericDynamicFunction` this introduces a split in the ecosystem:
you either store `DynamicFunction`, store a string to look up the
function, or force `DynamicFunction` to wrap your generic function
anyways. Or worse yet: have `DynamicFunction` wrap the lookup function
using `FunctionRegistryArc`.

</details>

#### Generic `ArgInfo`

**TL;DR:** Allowing `ArgInfo` and `ReturnInfo` to store the generic
information introduces a footgun when interpreting `FunctionInfo`.

<details>
<summary>Details</summary>

Regardless of how we represent a generic function, one thing is clear:
we need to be able to represent the information for such a function.

This PR does so by introducing a `FunctionInfoType` enum to wrap one or
more `FunctionInfo` values.

Originally, I didn't do this. I had `ArgInfo` and `ReturnInfo` allow for
generic types. This allowed us to have a single `FunctionInfo` to
represent our function, but then I realized that it actually lies about
our function.

If we have two `ArgInfo` that both allow for either `i32` or `u32`, what
does this tell us about our function? It turns out: nothing! We can't
know whether our function takes `(i32, i32)`, `(u32, u32)`, `(i32,
u32)`, or `(u32, i32)`.

It therefore makes more sense to just represent a function with multiple
`FunctionInfo` since that's really what it's made up of.

</details>

#### Flatten `FunctionInfo`

**TL;DR:** Flattening removes additional per-overload information some
users may desire and prevents us from adding more information in the
future.

<details>
<summary>Details</summary>

Why don't we just flatten multiple `FunctionInfo` into just one that can
contain multiple signatures?

This is something we could do, but I decided against it for a few
reasons:
- The only thing we'd be able to get rid of for each signature would be
the `name`. While not enough to not do it, it doesn't really suggest we
*have* to either.
- Some consumers may want access to the names of the functions that make
up the overloaded function. For example, to track a bug where an
undesirable function is being added as an overload. Or to more easily
locate the original function of an overload.
- We may eventually allow for more information to be stored on
`FunctionInfo`. For example, we may allow for documentation to be stored
like we do for `TypeInfo`. Consumers of this documentation may want
access to the documentation of each overload as they may provide
documentation specific to that overload.

</details>

## Testing

This PR adds lots of tests and benchmarks, and also adds to the example.

To run the tests:

```
cargo test --package bevy_reflect --all-features
```

To run the benchmarks:

```
cargo bench --bench reflect_function --all-features
```

To run the example:

```
cargo run --package bevy --example function_reflection --all-features
```

### Benchmarks

One of my goals with this PR was to leave the typical case of
non-overloaded functions largely unaffected by the changes introduced in
this PR. ~~And while the static size of `DynamicFunction` has increased
by 17% (from 136 to 160 bytes), the performance has generally stayed the
same~~ The static size of `DynamicFunction` has decreased from 136 to
112 bytes, while calling performance has generally stayed the same:

|                                     | `main` | 7d293ab | 252f3897d |
|-------------------------------------|--------|---------|-----------|
| `into/function`                     | 37 ns  | 46 ns   | 142 ns    |
| `with_overload/01_simple_overload`  | -      | 149 ns  | 268 ns    |
| `with_overload/01_complex_overload` | -      | 332 ns  | 431 ns    |
| `with_overload/10_simple_overload`  | -      | 1266 ns | 2618 ns   |
| `with_overload/10_complex_overload` | -      | 2544 ns | 4170 ns   |
| `call/function`                     | 57 ns  | 58 ns   | 61 ns     |
| `call/01_simple_overload`           | -      | 255 ns  | 242 ns    |
| `call/01_complex_overload`          | -      | 595 ns  | 431 ns    |
| `call/10_simple_overload`           | -      | 740 ns  | 699 ns    |
| `call/10_complex_overload`          | -      | 1824 ns | 1618 ns   |

For the overloaded function tests, the leading number indicates how many
overloads there are: `01` indicates 1 overload, `10` indicates 10
overloads. The `complex` cases have 10 unique generic types and 10
arguments, compared to the `simple` 1 generic type and 2 arguments.

I aimed to prioritize the performance of calling the functions over
creating them, hence creation speed tends to be a bit slower.

There may be other optimizations we can look into but that's probably
best saved for a future PR.

The important bit is that the standard ~~`into/function`~~ and
`call/function` benchmarks show minimal regressions. Since the latest
changes, `into/function` does have some regressions, but again the
priority was `call/function`. We can probably optimize `into/function`
if needed in the future.

---

## Showcase

Function reflection now supports [function
overloading](https://en.wikipedia.org/wiki/Function_overloading)! This
can be used to simulate generic functions:

```rust
fn add<T: Add<Output=T>>(a: T, b: T) -> T {
    a + b
}

let reflect_add = add::<i32>
    .into_function()
    .with_overload(add::<u32>)
    .with_overload(add::<f32>);

let args = ArgList::default().push_owned(25_i32).push_owned(75_i32);  
let result = func.call(args).unwrap().unwrap_owned();  
assert_eq!(result.try_take::<i32>().unwrap(), 100);  
  
let args = ArgList::default().push_owned(25.0_f32).push_owned(75.0_f32);  
let result = func.call(args).unwrap().unwrap_owned();  
assert_eq!(result.try_take::<f32>().unwrap(), 100.0);
```

You can also simulate variadic functions:

```rust
#[derive(Reflect, PartialEq, Debug)]
struct Player {
    name: Option<String>,
    health: u32,
}

// Creates a `Player` with one of the following:  
// - No name and 100 health  
// - A name and 100 health  
// - No name and custom health  
// - A name and custom health
let create_player = (|| Player {
        name: None,
        health: 100,
    })
    .into_function()
    .with_overload(|name: String| Player {
        name: Some(name),
        health: 100,
    })
    .with_overload(|health: u32| Player {
        name: None,
        health
    })
    .with_overload(|name: String, health: u32| Player {
        name: Some(name),
        health,
    });

let args = ArgList::default()
    .push_owned(String::from("Urist"))
    .push_owned(55_u32);
    
let player = create_player
    .call(args)
    .unwrap()
    .unwrap_owned()
    .try_take::<Player>()
    .unwrap();
	
assert_eq!(
    player,
    Player {
        name: Some(String::from("Urist")),
        health: 55
    }
);
```
2024-12-10 01:51:47 +00:00
Zachary Harrold
a35811d088
Add Immutable Component Support (#16372)
# Objective

- Fixes #16208

## Solution

- Added an associated type to `Component`, `Mutability`, which flags
whether a component is mutable, or immutable. If `Mutability= Mutable`,
the component is mutable. If `Mutability= Immutable`, the component is
immutable.
- Updated `derive_component` to default to mutable unless an
`#[component(immutable)]` attribute is added.
- Updated `ReflectComponent` to check if a component is mutable and, if
not, panic when attempting to mutate.

## Testing

- CI
- `immutable_components` example.

---

## Showcase

Users can now mark a component as `#[component(immutable)]` to prevent
safe mutation of a component while it is attached to an entity:

```rust
#[derive(Component)]
#[component(immutable)]
struct Foo {
    // ...
}
```

This prevents creating an exclusive reference to the component while it
is attached to an entity. This is particularly powerful when combined
with component hooks, as you can now fully track a component's value,
ensuring whatever invariants you desire are upheld. Before this would be
done my making a component private, and manually creating a `QueryData`
implementation which only permitted read access.

<details>
  <summary>Using immutable components as an index</summary>
  
```rust
/// This is an example of a component like [`Name`](bevy::prelude::Name), but immutable.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Component)]
#[component(
    immutable,
    on_insert = on_insert_name,
    on_replace = on_replace_name,
)]
pub struct Name(pub &'static str);

/// This index allows for O(1) lookups of an [`Entity`] by its [`Name`].
#[derive(Resource, Default)]
struct NameIndex {
    name_to_entity: HashMap<Name, Entity>,
}

impl NameIndex {
    fn get_entity(&self, name: &'static str) -> Option<Entity> {
        self.name_to_entity.get(&Name(name)).copied()
    }
}

fn on_insert_name(mut world: DeferredWorld<'_>, entity: Entity, _component: ComponentId) {
    let Some(&name) = world.entity(entity).get::<Name>() else {
        unreachable!()
    };
    let Some(mut index) = world.get_resource_mut::<NameIndex>() else {
        return;
    };

    index.name_to_entity.insert(name, entity);
}

fn on_replace_name(mut world: DeferredWorld<'_>, entity: Entity, _component: ComponentId) {
    let Some(&name) = world.entity(entity).get::<Name>() else {
        unreachable!()
    };
    let Some(mut index) = world.get_resource_mut::<NameIndex>() else {
        return;
    };

    index.name_to_entity.remove(&name);
}

// Setup our name index
world.init_resource::<NameIndex>();

// Spawn some entities!
let alyssa = world.spawn(Name("Alyssa")).id();
let javier = world.spawn(Name("Javier")).id();

// Check our index
let index = world.resource::<NameIndex>();

assert_eq!(index.get_entity("Alyssa"), Some(alyssa));
assert_eq!(index.get_entity("Javier"), Some(javier));

// Changing the name of an entity is also fully capture by our index
world.entity_mut(javier).insert(Name("Steven"));

// Javier changed their name to Steven
let steven = javier;

// Check our index
let index = world.resource::<NameIndex>();

assert_eq!(index.get_entity("Javier"), None);
assert_eq!(index.get_entity("Steven"), Some(steven));
```
  
</details>

Additionally, users can use `Component<Mutability = ...>` in trait
bounds to enforce that a component _is_ mutable or _is_ immutable. When
using `Component` as a trait bound without specifying `Mutability`, any
component is applicable. However, methods which only work on mutable or
immutable components are unavailable, since the compiler must be
pessimistic about the type.

## Migration Guide

- When implementing `Component` manually, you must now provide a type
for `Mutability`. The type `Mutable` provides equivalent behaviour to
earlier versions of `Component`:
```rust
impl Component for Foo {
    type Mutability = Mutable;
    // ...
}
```
- When working with generic components, you may need to specify that
your generic parameter implements `Component<Mutability = Mutable>`
rather than `Component` if you require mutable access to said component.
- The entity entry API has had to have some changes made to minimise
friction when working with immutable components. Methods which
previously returned a `Mut<T>` will now typically return an
`OccupiedEntry<T>` instead, requiring you to add an `into_mut()` to get
the `Mut<T>` item again.

## Draft Release Notes

Components can now be made immutable while stored within the ECS.

Components are the fundamental unit of data within an ECS, and Bevy
provides a number of ways to work with them that align with Rust's rules
around ownership and borrowing. One part of this is hooks, which allow
for defining custom behavior at key points in a component's lifecycle,
such as addition and removal. However, there is currently no way to
respond to _mutation_ of a component using hooks. The reasons for this
are quite technical, but to summarize, their addition poses a
significant challenge to Bevy's core promises around performance.
Without mutation hooks, it's relatively trivial to modify a component in
such a way that breaks invariants it intends to uphold. For example, you
can use `core::mem::swap` to swap the components of two entities,
bypassing the insertion and removal hooks.

This means the only way to react to this modification is via change
detection in a system, which then begs the question of what happens
_between_ that alteration and the next run of that system?
Alternatively, you could make your component private to prevent
mutation, but now you need to provide commands and a custom `QueryData`
implementation to allow users to interact with your component at all.

Immutable components solve this problem by preventing the creation of an
exclusive reference to the component entirely. Without an exclusive
reference, the only way to modify an immutable component is via removal
or replacement, which is fully captured by component hooks. To make a
component immutable, simply add `#[component(immutable)]`:

```rust
#[derive(Component)]
#[component(immutable)]
struct Foo {
    // ...
}
```

When implementing `Component` manually, there is an associated type
`Mutability` which controls this behavior:

```rust
impl Component for Foo {
    type Mutability = Mutable;
    // ...
}
```

Note that this means when working with generic components, you may need
to specify that a component is mutable to gain access to certain
methods:

```rust
// Before
fn bar<C: Component>() {
    // ...
}

// After
fn bar<C: Component<Mutability = Mutable>>() {
    // ...
}
```

With this new tool, creating index components, or caching data on an
entity should be more user friendly, allowing libraries to provide APIs
relying on components and hooks to uphold their invariants.

## Notes

- ~~I've done my best to implement this feature, but I'm not happy with
how reflection has turned out. If any reflection SMEs know a way to
improve this situation I'd greatly appreciate it.~~ There is an
outstanding issue around the fallibility of mutable methods on
`ReflectComponent`, but the DX is largely unchanged from `main` now.
- I've attempted to prevent all safe mutable access to a component that
does not implement `Component<Mutability = Mutable>`, but there may
still be some methods I have missed. Please indicate so and I will
address them, as they are bugs.
- Unsafe is an escape hatch I am _not_ attempting to prevent. Whatever
you do with unsafe is between you and your compiler.
- I am marking this PR as ready, but I suspect it will undergo fairly
major revisions based on SME feedback.
- I've marked this PR as _Uncontroversial_ based on the feature, not the
implementation.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Benjamin Brienen <benjamin.brienen@outlook.com>
Co-authored-by: Gino Valente <49806985+MrGVSV@users.noreply.github.com>
Co-authored-by: Nuutti Kotivuori <naked@iki.fi>
2024-12-05 14:27:48 +00:00
andriyDev
b1e4512648
Fix the picking backend features not actually disabling the features (#16470)
# Objective

- Fixes #16469.

## Solution

- Make the picking backend features not enabled by default in each
sub-crate.
- Make features in `bevy_internal` to set the backend features
- Make the root `bevy` crate set the features by default.

## Testing

- The mesh and sprite picking examples still work correctly.
2024-11-22 18:14:16 +00:00
Joona Aalto
3ada15ee1c
Add more Glam types and constructors to prelude (#16261)
# Objective

Glam has some common and useful types and helpers that are not in the
prelude of `bevy_math`. This includes shorthand constructors like
`vec3`, or even `Vec3A`, the aligned version of `Vec3`.

```rust
// The "normal" way to create a 3D vector
let vec = Vec3::new(2.0, 1.0, -3.0);

// Shorthand version
let vec = vec3(2.0, 1.0, -3.0);
```

## Solution

Add the following types and methods to the prelude:

- `vec2`, `vec3`, `vec3a`, `vec4`
- `uvec2`, `uvec3`, `uvec4`
- `ivec2`, `ivec3`, `ivec4`
- `bvec2`, `bvec3`, `bvec3a`, `bvec4`, `bvec4a`
- `mat2`, `mat3`, `mat3a`, `mat4`
- `quat` (not sure if anyone uses this, but for consistency)
- `Vec3A`
- `BVec3A`, `BVec4A`
- `Mat3A`

I did not add the u16, i16, or f64 variants like `dvec2`, since there
are currently no existing types like those in the prelude.

The shorthand constructors are currently used a lot in some places in
Bevy, and not at all in others. In a follow-up, we might want to
consider if we have a preference for the shorthand, and make a PR to
change the codebase to use it more consistently.
2024-11-11 18:47:16 +00:00
Benjamin Brienen
4df8b1998e
Allow or fix dead code in benches (#16282)
# Objective

Fixes #15806

## Solution

Fix an undeclared module and expect `dead_code`.

## Testing

Run this command and see no `dead_code` warnings.

`cargo +nightly check --benches --target-dir ../target --manifest-path
./benches/Cargo.toml`
2024-11-07 22:19:07 +00:00
Aevyrie
54b323ec80
Mesh picking fixes (#16110)
# Objective

- Mesh picking is noisy when a non triangle list is used
- Mesh picking runs even when users don't need it
- Resolve #16065 

## Solution

- Don't add the mesh picking plugin by default
- Remove error spam
2024-10-27 19:03:48 +00:00
MiniaczQ
e5e44888c6
Validate param benchmarks (#15885)
# Objective

Benchmark overhead of validation for:
- `DynSystemParam`,
- `ParamSet`,
- combinator systems.

Needed for #15606

## Solution

As noted in objective, I've added 3 benchmarks, where each uses an
excessive amount of the specific functionality.
I benchmark on the level of schedules, rather than individual
`validate_param` calls, so we get a better idea how changes to the code
impact memory-lookup, etc. related side effects.

## Testing

```
param/combinator_system/8_piped_systems
                        time:   [1.7560 µs 1.7865 µs 1.8180 µs]
                        change: [+4.5244% +6.7955% +9.1413%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

param/combinator_system/8_dyn_params_system
                        time:   [89.354 ns 89.790 ns 90.300 ns]
                        change: [+0.6751% +1.6825% +2.6842%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

param/combinator_system/8_variant_param_set_system
                        time:   [88.295 ns 89.202 ns 90.208 ns]
                        change: [+0.1320% +1.0060% +1.8482%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
```

2 back-to-back runs of the benchmarks, there is quire a lot of noise,
can use feedback on fixing that
2024-10-15 02:38:22 +00:00
Pablo Reinhardt
d96a9d15f6
Migrate from Query::single and friends to Single (#15872)
# Objective

- closes #15866

## Solution

- Simply migrate where possible.

## Testing

- Expect that CI will do most of the work. Examples is another way of
testing this, as most of the work is in that area.
---

## Notes
For now, this PR doesn't migrate `QueryState::single` and friends as for
now, this look like another issue. So for example, QueryBuilders that
used single or `World::query` that used single wasn't migrated. If there
is a easy way to migrate those, please let me know.

Most of the uses of `Query::single` were removed, the only other uses
that I found was related to tests of said methods, so will probably be
removed when we remove `Query::single`.
2024-10-13 20:32:06 +00:00
JaySpruce
3d6b24880e
Add insert_batch and variations (#15702)
# Objective

`insert_or_spawn_batch` exists, but a version for just inserting doesn't
- Closes #2693 
- Closes #8384 
- Adopts/supersedes #8600 

## Solution

Add `insert_batch`, along with the most common `insert` variations:
- `World::insert_batch`
- `World::insert_batch_if_new`
- `World::try_insert_batch`
- `World::try_insert_batch_if_new`
- `Commands::insert_batch`
- `Commands::insert_batch_if_new`
- `Commands::try_insert_batch`
- `Commands::try_insert_batch_if_new`

## Testing

Added tests, and added a benchmark for `insert_batch`.
Performance is slightly better than `insert_or_spawn_batch` when only
inserting:


![Code_HPnUN0QeWe](https://github.com/user-attachments/assets/53091e4f-6518-43f4-a63f-ae57d5470c66)

<details>
<summary>old benchmark</summary>

This was before reworking it to remove the `UnsafeWorldCell`:


![Code_QhXJb8sjlJ](https://github.com/user-attachments/assets/1061e2a7-a521-48e1-a799-1b6b8d1c0b93)
</details>

---

## Showcase

Usage is the same as `insert_or_spawn_batch`:
```
use bevy_ecs::{entity::Entity, world::World, component::Component};
#[derive(Component)]
struct A(&'static str);
#[derive(Component, PartialEq, Debug)]
struct B(f32);

let mut world = World::new();
let entity_a = world.spawn_empty().id();
let entity_b = world.spawn_empty().id();
world.insert_batch([
    (entity_a, (A("a"), B(0.0))),
    (entity_b, (A("b"), B(1.0))),
]);

assert_eq!(world.get::<B>(entity_a), Some(&B(0.0)));

```
2024-10-13 18:14:16 +00:00
Joona Aalto
0e30b68b20
Add mesh picking backend and MeshRayCast system parameter (#15800)
# Objective

Closes #15545.

`bevy_picking` supports UI and sprite picking, but not mesh picking.
Being able to pick meshes would be extremely useful for various games,
tools, and our own examples, as well as scene editors and inspectors.
So, we need a mesh picking backend!

Luckily,
[`bevy_mod_picking`](https://github.com/aevyrie/bevy_mod_picking) (which
`bevy_picking` is based on) by @aevyrie already has a [backend for
it](74f0c3c0fb/backends/bevy_picking_raycast/src/lib.rs)
using [`bevy_mod_raycast`](https://github.com/aevyrie/bevy_mod_raycast).
As a side product of adding mesh picking, we also get support for
performing ray casts on meshes!

## Solution

Upstream a large chunk of the immediate-mode ray casting functionality
from `bevy_mod_raycast`, and add a mesh picking backend based on
`bevy_mod_picking`. Huge thanks to @aevyrie who did all the hard work on
these incredible crates!

All meshes are pickable by default. Picking can be disabled for
individual entities by adding `PickingBehavior::IGNORE`, like normal.
Or, if you want mesh picking to be entirely opt-in, you can set
`MeshPickingBackendSettings::require_markers` to `true` and add a
`RayCastPickable` component to the desired camera and target entities.

You can also use the new `MeshRayCast` system parameter to cast rays
into the world manually:

```rust
fn ray_cast_system(mut ray_cast: MeshRayCast, foo_query: Query<(), With<Foo>>) {
    let ray = Ray3d::new(Vec3::ZERO, Dir3::X);

    // Only ray cast against entities with the `Foo` component.
    let filter = |entity| foo_query.contains(entity);

    // Never early-exit. Note that you can change behavior per-entity.
    let early_exit_test = |_entity| false;

    // Ignore the visibility of entities. This allows ray casting hidden entities.
    let visibility = RayCastVisibility::Any;

    let settings = RayCastSettings::default()
        .with_filter(&filter)
        .with_early_exit_test(&early_exit_test)
        .with_visibility(visibility);

    // Cast the ray with the settings, returning a list of intersections.
    let hits = ray_cast.cast_ray(ray, &settings);
}
```

This is largely a direct port, but I did make several changes to match
our APIs better, remove things we don't need or that I think are
unnecessary, and do some general improvements to code quality and
documentation.

### Changes Relative to `bevy_mod_raycast` and `bevy_mod_picking`

- Every `Raycast` and "raycast" has been renamed to `RayCast` and "ray
cast" (similar reasoning as the "Naming" section in #15724)
- `Raycast` system param has been renamed to `MeshRayCast` to avoid
naming conflicts and to be explicit that it is not for colliders
- `RaycastBackend` has been renamed to `MeshPickingBackend`
- `RayCastVisibility` variants are now `Any`, `Visible`, and
`VisibleInView` instead of `Ignore`, `MustBeVisible`, and
`MustBeVisibleAndInView`
- `NoBackfaceCulling` has been renamed to `RayCastBackfaces`, to avoid
implying that it affects the rendering of backfaces for meshes (it
doesn't)
- `SimplifiedMesh` and `RayCastBackfaces` live near other ray casting
API types, not in their own 10 LoC module
- All intersection logic and types are in the same `intersections`
module, not split across several modules
- Some intersection types have been renamed to be clearer and more
consistent
	- `IntersectionData` -> `RayMeshHit` 
	- `RayHit` -> `RayTriangleHit`
- General documentation and code quality improvements

### Removed / Not Ported

- Removed unused ray helpers and types, like `PrimitiveIntersection`
- Removed getters on intersection types, and made their properties
public
- There is no `2d` feature, and `Raycast::mesh_query` and
`Raycast::mesh2d_query` have been merged into `MeshRayCast::mesh_query`,
which handles both 2D and 3D
- I assume this existed previously because `Mesh2dHandle` used to be in
`bevy_sprite`. Now both the 2D and 3D mesh are in `bevy_render`.
- There is no `debug` feature or ray debug rendering
- There is no deferred API (`RaycastSource`)
- There is no `CursorRayPlugin` (the picking backend handles this)

### Note for Reviewers

In case it's helpful, the [first
commit](281638ef10)
here is essentially a one-to-one port. The rest of the commits are
primarily refactoring and cleaning things up in the ways listed earlier,
as well as changes to the module structure.

It may also be useful to compare the original [picking
backend](74f0c3c0fb/backends/bevy_picking_raycast/src/lib.rs)
and [`bevy_mod_raycast`](https://github.com/aevyrie/bevy_mod_raycast) to
this PR. Feel free to mention if there are any changes that I should
revert or something I should not include in this PR.

## Testing

I tested mesh picking and relevant components in some examples, for both
2D and 3D meshes, and added a new `mesh_picking` example. I also
~~stole~~ ported over the [ray-mesh intersection
benchmark](dbc5ef32fe/benches/ray_mesh_intersection.rs)
from `bevy_mod_raycast`.

---

## Showcase

Below is a version of the `2d_shapes` example modified to demonstrate 2D
mesh picking. This is not included in this PR.


https://github.com/user-attachments/assets/7742528c-8630-4c00-bacd-81576ac432bf

And below is the new `mesh_picking` example:


https://github.com/user-attachments/assets/b65c7a5a-fa3a-4c2d-8bbd-e7a2c772986e

There is also a really cool new `mesh_ray_cast` example ported over from
`bevy_mod_raycast`:


https://github.com/user-attachments/assets/3c5eb6c0-bd94-4fb0-bec6-8a85668a06c9

---------

Co-authored-by: Aevyrie <aevyrie@gmail.com>
Co-authored-by: Trent <2771466+tbillington@users.noreply.github.com>
Co-authored-by: François Mockers <mockersf@gmail.com>
2024-10-13 17:24:19 +00:00
Christian Hughes
219b5930f1
Rename App/World::observe to add_observer, EntityWorldMut::observe_entity to observe. (#15754)
# Objective

- Closes #15752

Calling the functions `App::observe` and `World::observe` doesn't make
sense because you're not "observing" the `App` or `World`, you're adding
an observer that listens for an event that occurs *within* the `World`.
We should rename them to better fit this.

## Solution

Renames:
- `App::observe` -> `App::add_observer`
- `World::observe` -> `World::add_observer`
- `Commands::observe` -> `Commands::add_observer`
- `EntityWorldMut::observe_entity` -> `EntityWorldMut::observe`

(Note this isn't a breaking change as the original rename was introduced
earlier this cycle.)

## Testing

Reusing current tests.
2024-10-09 15:39:29 +00:00
Trashtalk217
d1bd46d45e
Deprecate get_or_spawn (#15652)
# Objective

After merging retained rendering world #15320, we now have a good way of
creating a link between worlds (*HIYAA intensifies*). This means that
`get_or_spawn` is no longer necessary for that function. Entity should
be opaque as the warning above `get_or_spawn` says. This is also part of
#15459.

I'm deprecating `get_or_spawn_batch` in a different PR in order to keep
the PR small in size.

## Solution

Deprecate `get_or_spawn` and replace it with `get_entity` in most
contexts. If it's possible to query `&RenderEntity`, then the entity is
synced and `render_entity.id()` is initialized in the render world.

## Migration Guide

If you are given an `Entity` and you want to do something with it, use
`Commands.entity(...)` or `World.entity(...)`. If instead you want to
spawn something use `Commands.spawn(...)` or `World.spawn(...)`. If you
are not sure if an entity exists, you can always use `get_entity` and
match on the `Option<...>` that is returned.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-10-07 16:08:22 +00:00
Kristoffer Søholm
336c23c1aa
Rename observe to observe_entity on EntityWorldMut (#15616)
# Objective

The current observers have some unfortunate footguns where you can end
up confused about what is actually being observed. For apps you can
chain observe like `app.observe(..).observe(..)` which works like you
would expect, but if you try the same with world the first `observe()`
will return the `EntityWorldMut` for the created observer, and the
second `observe()` will only observe on the observer entity. It took
several hours for multiple people on discord to figure this out, which
is not a great experience.

## Solution

Rename `observe` on entities to `observe_entity`. It's slightly more
verbose when you know you have an entity, but it feels right to me that
observers for specific things have more specific naming, and it prevents
this issue completely.

Another possible solution would be to unify `observe` on `App` and
`World` to have the same kind of return type, but I'm not sure exactly
what that would look like.

## Testing

Simple name change, so only concern is docs really.

---


## Migration Guide

The `observe()` method on entities has been renamed to
`observe_entity()` to prevent confusion about what is being observed in
some cases.
2024-10-03 17:05:26 +00:00
rudderbucky
2da8d17a44
Add try_despawn methods to World/Commands (#15480)
# Objective

Fixes #14511.

`despawn` allows you to remove entities from the world. However, if the
entity does not exist, it emits a warning. This may not be intended
behavior for many users who have use cases where they need to call
`despawn` regardless of if the entity actually exists (see the issue),
or don't care in general if the entity already doesn't exist.

(Also trying to gauge interest on if this feature makes sense, I'd
personally love to have it, but I could see arguments that this might be
a footgun. Just trying to help here 😄 If there's no contention I could
also implement this for `despawn_recursive` and `despawn_descendants` in
the same PR)

## Solution

Add `try_despawn`, `try_despawn_recursive` and
`try_despawn_descendants`.

Modify `World::despawn_with_caller` to also take in a `warn` boolean
argument, which is then considered when logging the warning. Set
`log_warning` to `true` in the case of `despawn`, and `false` in the
case of `try_despawn`.

## Testing

Ran `cargo run -p ci` on macOS, it seemed fine.
2024-10-03 16:21:05 +00:00
rudderbucky
5e81154e9c
Despawn and despawn_recursive benchmarks (#15610)
# Objective

Add despawn and despawn_recursive benchmarks in a similar vein to the
spawn benchmark.

## Testing

Ran `cargo bench` from `benches` and it compiled fine.

On my machine:
```
despawn_world/1_entities
                        time:   [3.1495 ns 3.1574 ns 3.1652 ns]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
despawn_world/10_entities
                        time:   [28.629 ns 28.674 ns 28.720 ns]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
despawn_world/100_entities
                        time:   [286.95 ns 287.41 ns 287.90 ns]
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
despawn_world/1000_entities
                        time:   [2.8739 µs 2.9001 µs 2.9355 µs]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe
despawn_world/10000_entities
                        time:   [28.535 µs 28.617 µs 28.698 µs]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

despawn_world_recursive/1_entities
                        time:   [5.2270 ns 5.2507 ns 5.2907 ns]
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  4 (4.00%) high severe
despawn_world_recursive/10_entities
                        time:   [57.495 ns 57.590 ns 57.691 ns]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
despawn_world_recursive/100_entities
                        time:   [514.43 ns 518.91 ns 526.88 ns]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
despawn_world_recursive/1000_entities
                        time:   [5.0362 µs 5.0463 µs 5.0578 µs]
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe
despawn_world_recursive/10000_entities
                        time:   [51.159 µs 51.603 µs 52.215 µs]
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
```
2024-10-03 14:59:37 +00:00
Tim
461305b3d7
Revert "Have EntityCommands methods consume self for easier chaining" (#15523)
As discussed in #15521

- Partial revert of #14897, reverting the change to the methods to
consume `self`
- The `insert_if` method is kept

The migration guide of #14897 should be removed
Closes #15521

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-10-02 12:47:26 +00:00
Zachary Harrold
d70595b667
Add core and alloc over std Lints (#15281)
# Objective

- Fixes #6370
- Closes #6581

## Solution

- Added the following lints to the workspace:
  - `std_instead_of_core`
  - `std_instead_of_alloc`
  - `alloc_instead_of_core`
- Used `cargo +nightly fmt` with [item level use
formatting](https://rust-lang.github.io/rustfmt/?version=v1.6.0&search=#Item%5C%3A)
to split all `use` statements into single items.
- Used `cargo clippy --workspace --all-targets --all-features --fix
--allow-dirty` to _attempt_ to resolve the new linting issues, and
intervened where the lint was unable to resolve the issue automatically
(usually due to needing an `extern crate alloc;` statement in a crate
root).
- Manually removed certain uses of `std` where negative feature gating
prevented `--all-features` from finding the offending uses.
- Used `cargo +nightly fmt` with [crate level use
formatting](https://rust-lang.github.io/rustfmt/?version=v1.6.0&search=#Crate%5C%3A)
to re-merge all `use` statements matching Bevy's previous styling.
- Manually fixed cases where the `fmt` tool could not re-merge `use`
statements due to conditional compilation attributes.

## Testing

- Ran CI locally

## Migration Guide

The MSRV is now 1.81. Please update to this version or higher.

## Notes

- This is a _massive_ change to try and push through, which is why I've
outlined the semi-automatic steps I used to create this PR, in case this
fails and someone else tries again in the future.
- Making this change has no impact on user code, but does mean Bevy
contributors will be warned to use `core` and `alloc` instead of `std`
where possible.
- This lint is a critical first step towards investigating `no_std`
options for Bevy.

---------

Co-authored-by: François Mockers <francois.mockers@vleue.com>
2024-09-27 00:59:59 +00:00
targrub
de3c70a8d3
Update `glam to 0.29, encase` to 0.10. (#15249)
# Objective

Updating ``glam`` to 0.29, ``encase`` to 0.10.

## Solution

Update the necessary ``Cargo.toml`` files.

## Testing

Ran ``cargo run -p ci`` on Windows; no issues came up.

---------

Co-authored-by: aecsocket <aecsocket@tutanota.com>
2024-09-23 19:44:02 +00:00
Benjamin Brienen
27bea6abf7
Bubbling observers traversal should use query data (#15385)
# Objective

Fixes #14331

## Solution

- Make `Traversal` a subtrait of `ReadOnlyQueryData`
- Update implementations and usages

## Testing

- Updated unit tests

## Migration Guide

Update implementations of `Traversal`.

---------

Co-authored-by: Christian Hughes <9044780+ItsDoot@users.noreply.github.com>
2024-09-23 18:08:36 +00:00
Gino Valente
59c0521690
bevy_reflect: Add Function trait (#15205)
# Objective

While #13152 added function reflection, it didn't really make functions
reflectable. Instead, it made it so that they can be called with
reflected arguments and return reflected data. But functions themselves
cannot be reflected.

In other words, we can't go from `DynamicFunction` to `dyn
PartialReflect`.

## Solution

Allow `DynamicFunction` to actually be reflected.

This PR adds the `Function` reflection subtrait (and corresponding
`ReflectRef`, `ReflectKind`, etc.). With this new trait, we're able to
implement `PartialReflect` on `DynamicFunction`.

### Implementors

`Function` is currently only implemented for `DynamicFunction<'static>`.
This is because we can't implement it generically over all
functions—even those that implement `IntoFunction`.

What about `DynamicFunctionMut`? Well, this PR does **not** implement
`Function` for `DynamicFunctionMut`.

The reasons for this are a little complicated, but it boils down to
mutability. `DynamicFunctionMut` requires `&mut self` to be invoked
since it wraps a `FnMut`. However, we can't really model this well with
`Function`. And if we make `DynamicFunctionMut` wrap its internal
`FnMut` in a `Mutex` to allow for `&self` invocations, then we run into
either concurrency issues or recursion issues (or, in the worst case,
both).

So for the time-being, we won't implement `Function` for
`DynamicFunctionMut`. It will be better to evaluate it on its own. And
we may even consider the possibility of removing it altogether if it
adds too much complexity to the crate.

### Dynamic vs Concrete

One of the issues with `DynamicFunction` is the fact that it's both a
dynamic representation (like `DynamicStruct` or `DynamicList`) and the
only way to represent a function.

Because of this, it's in a weird middle ground where we can't easily
implement full-on `Reflect`. That would require `Typed`, but what static
`TypeInfo` could it provide? Just that it's a `DynamicFunction`? None of
the other dynamic types implement `Typed`.

However, by not implementing `Reflect`, we lose the ability to downcast
back to our `DynamicStruct`. Our only option is to call
`Function::clone_dynamic`, which clones the data rather than by simply
downcasting. This works in favor of the `PartialReflect::try_apply`
implementation since it would have to clone anyways, but is definitely
not ideal. This is also the reason I had to add `Debug` as a supertrait
on `Function`.

For now, this PR chooses not to implement `Reflect` for
`DynamicFunction`. We may want to explore this in a followup PR (or even
this one if people feel strongly that it's strictly required).

The same is true for `FromReflect`. We may decide to add an
implementation there as well, but it's likely out-of-scope of this PR.

## Testing

You can test locally by running:

```
cargo test --package bevy_reflect --all-features
```

---

## Showcase

You can now pass around a `DynamicFunction` as a `dyn PartialReflect`!
This also means you can use it as a field on a reflected type without
having to ignore it (though you do need to opt out of `FromReflect`).

```rust
#[derive(Reflect)]
#[reflect(from_reflect = false)]
struct ClickEvent {
    callback: DynamicFunction<'static>,
}

let event: Box<dyn Struct> = Box::new(ClickEvent {
    callback: (|| println!("Clicked!")).into_function(),
});

// We can access our `DynamicFunction` as a `dyn PartialReflect`
let callback: &dyn PartialReflect = event.field("callback").unwrap();

// And access function-related methods via the new `Function` trait
let ReflectRef::Function(callback) = callback.reflect_ref() else {
    unreachable!()
};

// Including calling the function
callback.reflect_call(ArgList::new()).unwrap(); // Prints: Clicked!
```
2024-09-22 14:19:12 +00:00
Benjamin Brienen
1b8c1c1242
simplify std::mem references (#15315)
# Objective
- Fixes #15314

## Solution

- Remove unnecessary usings and simplify references to those functions.

## Testing

CI
2024-09-19 21:28:16 +00:00
Benjamin Brienen
b45d83ebda
Rename Add to Queue for methods with deferred semantics (#15234)
# Objective

- Fixes #15106

## Solution

- Trivial refactor to rename the method. The duplicate method `push` was
removed as well. This will simpify the API and make the semantics more
clear. `Add` implies that the action happens immediately, whereas in
reality, the command is queued to be run eventually.
- `ChildBuilder::add_command` has similarly been renamed to
`queue_command`.

## Testing

Unit tests should suffice for this simple refactor.

---

## Migration Guide

- `Commands::add` and `Commands::push` have been replaced with
`Commnads::queue`.
- `ChildBuilder::add_command` has been renamed to
`ChildBuilder::queue_command`.
2024-09-17 00:17:49 +00:00
Adam
9bda913e36
Remove redundent information and optimize dynamic allocations in Table (#12929)
# Objective

- fix #12853
- Make `Table::allocate` faster

## Solution
The PR consists of multiple steps:

1) For the component data: create a new data-structure that's similar to
`BlobVec` but doesn't store `len` & `capacity` inside of it: "BlobArray"
(name suggestions welcome)
2) For the `Tick` data: create a new data-structure that's similar to
`ThinSlicePtr` but supports dynamic reallocation: "ThinArrayPtr" (name
suggestions welcome)
3) Create a new data-structure that's very similar to `Column` that
doesn't store `len` & `capacity` inside of it: "ThinColumn"
4) Adjust the `Table` implementation to use `ThinColumn` instead of
`Column`

The result is that only one set of `len` & `capacity` is stored in
`Table`, in `Table::entities`

### Notes Regarding Performance
Apart from shaving off some excess memory in `Table`, the changes have
also brought noteworthy performance improvements:
The previous implementation relied on `Vec::reserve` &
`BlobVec::reserve`, but that redundantly repeated the same if statement
(`capacity` == `len`). Now that check could be made at the `Table` level
because the capacity and length of all the columns are synchronized;
saving N branches per allocation. The result is a respectable
performance improvement per every `Table::reserve` (and subsequently
`Table::allocate`) call.

I'm hesitant to give exact numbers because I don't have a lot of
experience in profiling and benchmarking, but these are the results I
got so far:

*`add_remove_big/table` benchmark after the implementation:*


![after_add_remove_big_table](https://github.com/bevyengine/bevy/assets/46227443/b667da29-1212-4020-8bb0-ec0f15bb5f8a)

*`add_remove_big/table` benchmark in main branch (measured in comparison
to the implementation):*


![main_add_remove_big_table](https://github.com/bevyengine/bevy/assets/46227443/41abb92f-3112-4e01-b935-99696eb2fe58)

*`add_remove_very_big/table` benchmark after the implementation:*


![after_add_remove_very_big](https://github.com/bevyengine/bevy/assets/46227443/f268a155-295b-4f55-ab02-f8a9dcc64fc2)

*`add_remove_very_big/table` benchmark in main branch (measured in
comparison to the implementation):*


![main_add_remove_very_big](https://github.com/bevyengine/bevy/assets/46227443/78b4e3a6-b255-47c9-baee-1a24c25b9aea)

cc @james7132 to verify

---

## Changelog

- New data-structure that's similar to `BlobVec` but doesn't store `len`
& `capacity` inside of it: `BlobArray`
- New data-structure that's similar to `ThinSlicePtr` but supports
dynamic allocation:`ThinArrayPtr`
- New data-structure that's very similar to `Column` that doesn't store
`len` & `capacity` inside of it: `ThinColumn`
- Adjust the `Table` implementation to use `ThinColumn` instead of
`Column`
- New benchmark: `add_remove_very_big` to benchmark the performance of
spawning a lot of entities with a lot of components (15) each

## Migration Guide

`Table` now uses `ThinColumn` instead of `Column`. That means that
methods that previously returned `Column`, will now return `ThinColumn`
instead.

`ThinColumn` has a much more limited and low-level API, but you can
still achieve the same things in `ThinColumn` as you did in `Column`.
For example, instead of calling `Column::get_added_tick`, you'd call
`ThinColumn::get_added_ticks_slice` and index it to get the specific
added tick.

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2024-09-16 22:52:05 +00:00
re0312
739007f148
Opportunistically use dense iter for archetypal iteration in Par_iter (#14673)
# Objective

- follow of #14049 ,we could use it on our Parallel Iterator,this pr
also unified the used function in both regular iter and parallel
iterations.


## Performance 


![image](https://github.com/user-attachments/assets/cba700bc-169c-4b58-b504-823bdca8ec05)

no performance regression for regular itertaion

3.5X faster in hybrid parallel iteraion,this number is far greater than
the benefits obtained in regular iteration(~1.81) because mutable
iterations on continuous memory can effectively reduce the cost of
mataining core cache coherence
2024-09-03 23:41:10 +00:00
Shane
484721be80
Have EntityCommands methods consume self for easier chaining (#14897)
# Objective

Fixes #14883

## Solution

Pretty simple update to `EntityCommands` methods to consume `self` and
return it rather than taking `&mut self`. The things probably worth
noting:

* I added `#[allow(clippy::should_implement_trait)]` to the `add` method
because it causes a linting conflict with `std::ops::Add`.
* `despawn` and `log_components` now return `Self`. I'm not sure if
that's exactly the desired behavior so I'm happy to adjust if that seems
wrong.

## Testing

Tested with `cargo run -p ci`. I think that should be sufficient to call
things good.

## Migration Guide

The most likely migration needed is changing code from this:

```
        let mut entity = commands.get_or_spawn(entity);

        if depth_prepass {
            entity.insert(DepthPrepass);
        }
        if normal_prepass {
            entity.insert(NormalPrepass);
        }
        if motion_vector_prepass {
            entity.insert(MotionVectorPrepass);
        }
        if deferred_prepass {
            entity.insert(DeferredPrepass);
        }
```

to this:

```
        let mut entity = commands.get_or_spawn(entity);

        if depth_prepass {
            entity = entity.insert(DepthPrepass);
        }
        if normal_prepass {
            entity = entity.insert(NormalPrepass);
        }
        if motion_vector_prepass {
            entity = entity.insert(MotionVectorPrepass);
        }
        if deferred_prepass {
            entity.insert(DeferredPrepass);
        }
```

as can be seen in several of the example code updates here. There will
probably also be instances where mutable `EntityCommands` vars no longer
need to be mutable.
2024-08-26 18:24:59 +00:00
EdJoPaTo
938d810766
Apply unused_qualifications lint (#14828)
# Objective

Fixes #14782

## Solution

Enable the lint and fix all upcoming hints (`--fix`). Also tried to
figure out the false-positive (see review comment). Maybe split this PR
up into multiple parts where only the last one enables the lint, so some
can already be merged resulting in less many files touched / less
potential for merge conflicts?

Currently, there are some cases where it might be easier to read the
code with the qualifier, so perhaps remove the import of it and adapt
its cases? In the current stage it's just a plain adoption of the
suggestions in order to have a base to discuss.

## Testing

`cargo clippy` and `cargo run -p ci` are happy.
2024-08-21 12:29:33 +00:00
Gino Valente
2b4180ca8f
bevy_reflect: Function reflection terminology refactor (#14813)
# Objective

One of the changes in #14704 made `DynamicFunction` effectively the same
as `DynamicClosure<'static>`. This change meant that the de facto
function type would likely be `DynamicClosure<'static>` instead of the
intended `DynamicFunction`, since the former is much more flexible.

We _could_ explore ways of making `DynamicFunction` implement `Copy`
using some unsafe code, but it likely wouldn't be worth it. And users
would likely still reach for the convenience of
`DynamicClosure<'static>` over the copy-ability of `DynamicFunction`.

The goal of this PR is to fix this confusion between the two types.

## Solution

Firstly, the `DynamicFunction` type was removed. Again, it was no
different than `DynamicClosure<'static>` so it wasn't a huge deal to
remove.

Secondly, `DynamicClosure<'env>` and `DynamicClosureMut<'env>` were
renamed to `DynamicFunction<'env>` and `DynamicFunctionMut<'env>`,
respectively.

Yes, we still ultimately kept the naming of `DynamicFunction`, but
changed its behavior to that of `DynamicClosure<'env>`. We need a term
to refer to both functions and closures, and "function" was the best
option.


[Originally](https://discord.com/channels/691052431525675048/1002362493634629796/1274091992162242710),
I was going to go with "callable" as the replacement term to encompass
both functions and closures (e.g. `DynamciCallable<'env>`). However, it
was
[suggested](https://discord.com/channels/691052431525675048/1002362493634629796/1274653581777047625)
by @SkiFire13 that the simpler "function" term could be used instead.

While "callable" is perhaps the better umbrella term—being truly
ambiguous over functions and closures— "function" is more familiar, used
more often, easier to discover, and is subjectively just
"better-sounding".

## Testing

Most changes are purely swapping type names or updating documentation,
but you can verify everything still works by running the following
command:

```
cargo test --package bevy_reflect
```
2024-08-19 21:52:36 +00:00
re0312
3bd039e821
Skip empty archetype/table (#14749)
# Objective

- As sander commneted on discord
[link](https://discord.com/channels/691052431525675048/749335865876021248/1273414144091230228),

![image](https://github.com/user-attachments/assets/62f2b6f3-1aaf-49d9-bafa-bf62b83b10be)





## Performance

![image](https://github.com/user-attachments/assets/11122940-1547-42ae-9576-0e1a93fd9f5f)

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Giacomo Stevanato <giaco.stevanato@gmail.com>
2024-08-15 14:07:20 +00:00
Mike
3d460e98ec
Fix CI bench compile check (#14728)
# Objective

- Fixes #14723 

## Solution

- add the manifest path to the cargo command

## Testing

- ran `cargo run -p ci -- bench-check` locally
2024-08-14 13:23:00 +00:00
Matty
4ace888e4b
Fix broken bezier curve benchmark (#14677)
# Objective

Apparently #14382 broke this, but it's not a part of CI, so it wasn't
found until earlier today.

## Solution

Update the benchmark like we updated the examples.

## Testing

Running `cargo bench` actually works now.
2024-08-12 16:10:11 +00:00
Gino Valente
91fa4bb649
bevy_reflect: Function reflection benchmarks (#14647)
# Objective

It would be good to have benchmarks handy for function reflection as it
continues to be worked on.

## Solution

Add some basic benchmarks for function reflection.

## Testing

To test locally, run the following in the `benches` directory:

```
cargo bench --bench reflect_function
```

## Results

Here are a couple of the results (M1 Max MacBook Pro):

<img width="936" alt="Results of benching calling functions vs closures
via reflection. Closures average about 40ns, while functions average
about 55ns"
src="https://github.com/user-attachments/assets/b9a6c585-5fbe-43db-9a7b-f57dbd3815e3">
<img width="936" alt="Results of benching converting functions vs
closures into their dynamic representations. Closures average about
34ns, while functions average about 37ns"
src="https://github.com/user-attachments/assets/4614560a-7192-4c1e-9ade-7bc5a4ca68e3">

Currently, it seems `DynamicClosure` is just a bit more performant. This
is likely due to the fact that `DynamicFunction` stores its function
object in an `Arc` instead of a `Box` so that it can be `Send + Sync`
(and also `Clone`).

We'll likely need to do the same for `DynamicClosure` so I suspect these
results to change in the near future.
2024-08-11 03:02:06 +00:00
Robert Walter
70a18d26e2
Glam 0.28 update - adopted (#14613)
Basically it's https://github.com/bevyengine/bevy/pull/13792 with the
bumped versions of `encase` and `hexasphere`.

---------

Co-authored-by: Robert Swain <robert.swain@gmail.com>
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-08-06 01:28:00 +00:00
Periwink
ec4cf024f8
Add a ComponentIndex and update QueryState creation/update to use it (#13460)
# Objective

To implement relations we will need to add a `ComponentIndex`, which is
a map from a Component to the list of archetypes that contain this
component.
One of the reasons is that with fragmenting relations the number of
archetypes will explode, so it will become inefficient to create and
update the query caches by iterating through the list of all archetypes.

In this PR, we introduce the `ComponentIndex`, and we update the
`QueryState` to make use of it:
- if a query has at least 1 required component (i.e. something other
than `()`, `Entity` or `Option<>`, etc.): for each of the required
components we find the list of archetypes that contain it (using the
ComponentIndex). Then, we select the smallest list among these. This
gives a small subset of archetypes to iterate through compared with
iterating through all new archetypes
- if it doesn't, then we keep using the current approach of iterating
through all new archetypes


# Implementation
- This breaks query iteration order, in the sense that we are not
guaranteed anymore to return results in the order in which the
archetypes were created. I think this should be fine because this wasn't
an explicit bevy guarantee so users should not be relying on this. I
updated a bunch of unit tests that were failing because of this.

- I had an issue with the borrow checker because iterating the list of
potential archetypes requires access to `&state.component_access`, which
was conflicting with the calls to
```
  if state.new_archetype_internal(archetype) {
      state.update_archetype_component_access(archetype, access);
  }
```
which need a mutable access to the state.

The solution I chose was to introduce a `QueryStateView` which is a
temporary view into the `QueryState` which enables a "split-borrows"
kind of approach. It is described in detail in this blog post:
https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/

# Test

The unit tests pass.

Benchmark results:
```
❯ critcmp main pr
group                                  main                                   pr
-----                                  ----                                   --
iter_fragmented/base                   1.00   342.2±25.45ns        ? ?/sec    1.02   347.5±16.24ns        ? ?/sec
iter_fragmented/foreach                1.04   165.4±11.29ns        ? ?/sec    1.00    159.5±4.27ns        ? ?/sec
iter_fragmented/foreach_wide           1.03      3.3±0.04µs        ? ?/sec    1.00      3.2±0.06µs        ? ?/sec
iter_fragmented/wide                   1.03      3.1±0.06µs        ? ?/sec    1.00      3.0±0.08µs        ? ?/sec
iter_fragmented_sparse/base            1.00      6.5±0.14ns        ? ?/sec    1.02      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach         1.00      6.3±0.08ns        ? ?/sec    1.04      6.6±0.08ns        ? ?/sec
iter_fragmented_sparse/foreach_wide    1.00     43.8±0.15ns        ? ?/sec    1.02     44.6±0.53ns        ? ?/sec
iter_fragmented_sparse/wide            1.00     29.8±0.44ns        ? ?/sec    1.00     29.8±0.26ns        ? ?/sec
iter_simple/base                       1.00      8.2±0.10µs        ? ?/sec    1.00      8.2±0.09µs        ? ?/sec
iter_simple/foreach                    1.00      3.8±0.02µs        ? ?/sec    1.02      3.9±0.03µs        ? ?/sec
iter_simple/foreach_sparse_set         1.00     19.0±0.26µs        ? ?/sec    1.01     19.3±0.16µs        ? ?/sec
iter_simple/foreach_wide               1.00     17.8±0.24µs        ? ?/sec    1.00     17.9±0.31µs        ? ?/sec
iter_simple/foreach_wide_sparse_set    1.06     95.6±6.23µs        ? ?/sec    1.00     90.6±0.59µs        ? ?/sec
iter_simple/sparse_set                 1.00     19.3±1.63µs        ? ?/sec    1.01     19.5±0.29µs        ? ?/sec
iter_simple/system                     1.00      8.1±0.10µs        ? ?/sec    1.00      8.1±0.09µs        ? ?/sec
iter_simple/wide                       1.05     37.7±2.53µs        ? ?/sec    1.00     35.8±0.57µs        ? ?/sec
iter_simple/wide_sparse_set            1.00     95.7±1.62µs        ? ?/sec    1.00     95.9±0.76µs        ? ?/sec
par_iter_simple/with_0_fragment        1.04     35.0±2.51µs        ? ?/sec    1.00     33.7±0.49µs        ? ?/sec
par_iter_simple/with_1000_fragment     1.00     50.4±2.52µs        ? ?/sec    1.01     51.0±3.84µs        ? ?/sec
par_iter_simple/with_100_fragment      1.02     40.3±2.23µs        ? ?/sec    1.00     39.5±1.32µs        ? ?/sec
par_iter_simple/with_10_fragment       1.14     38.8±7.79µs        ? ?/sec    1.00     34.0±0.78µs        ? ?/sec
```
2024-08-06 00:57:15 +00:00
re0312
8235daaea0
Opportunistically use dense iteration for archetypal iteration (#14049)
# Objective
- currently, bevy employs sparse iteration if any of the target
components in the query are stored in a sparse set. it may lead to
increased cache misses in some cases, potentially impacting performance.
- partial fixes #12381 

## Solution

- use dense iteration when an archetype and its table have the same
entity count.
- to avoid introducing complicate unsafe noise, this pr only implement
for `for_each ` style iteration.
- added a benchmark to test performance for hybrid iteration.


## Performance


![image](https://github.com/bevyengine/bevy/assets/45868716/5cce13cf-6ff2-4861-9576-e75edc63bd46)

nearly 2x win in specific scenarios, and no performance degradation in
other test cases.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Christian Hughes <9044780+ItsDoot@users.noreply.github.com>
2024-08-02 21:18:15 +00:00
re0312
e5bf59d712
Recalibrated observe benchmark (#14381)
# Objective

- The event propagation benchmark is largely derived from
bevy_eventlistener. However, it doesn't accurately reflect performance
of bevy side, as our event bubble propagation is based on observer.


## Solution

- added several new benchmarks that focuse on observer itself rather
than event bubble
2024-07-18 18:25:33 +00:00
Miles Silberling-Cook
ed2b8e0f35
Minimal Bubbling Observers (#13991)
# Objective

Add basic bubbling to observers, modeled off `bevy_eventlistener`.

## Solution

- Introduce a new `Traversal` trait for components which point to other
entities.
- Provide a default `TraverseNone: Traversal` component which cannot be
constructed.
- Implement `Traversal` for `Parent`.
- The `Event` trait now has an associated `Traversal` which defaults to
`TraverseNone`.
- Added a field `bubbling: &mut bool` to `Trigger` which can be used to
instruct the runner to bubble the event to the entity specified by the
event's traversal type.
- Added an associated constant `SHOULD_BUBBLE` to `Event` which
configures the default bubbling state.
- Added logic to wire this all up correctly.

Introducing the new associated information directly on `Event` (instead
of a new `BubblingEvent` trait) lets us dispatch both bubbling and
non-bubbling events through the same api.

## Testing

I have added several unit tests to cover the common bugs I identified
during development. Running the unit tests should be enough to validate
correctness. The changes effect unsafe portions of the code, but should
not change any of the safety assertions.

## Changelog

Observers can now bubble up the entity hierarchy! To create a bubbling
event, change your `Derive(Event)` to something like the following:

```rust
#[derive(Component)]
struct MyEvent;

impl Event for MyEvent {
    type Traverse = Parent; // This event will propagate up from child to parent.
    const AUTO_PROPAGATE: bool = true; // This event will propagate by default.
}
```

You can dispatch a bubbling event using the normal
`world.trigger_targets(MyEvent, entity)`.

Halting an event mid-bubble can be done using
`trigger.propagate(false)`. Events with `AUTO_PROPAGATE = false` will
not propagate by default, but you can enable it using
`trigger.propagate(true)`.

If there are multiple observers attached to a target, they will all be
triggered by bubbling. They all share a bubbling state, which can be
accessed mutably using `trigger.propagation_mut()` (`trigger.propagate`
is just sugar for this).

You can choose to implement `Traversal` for your own types, if you want
to bubble along a different structure than provided by `bevy_hierarchy`.
Implementers must be careful never to produce loops, because this will
cause bevy to hang.

## Migration Guide
+ Manual implementations of `Event` should add associated type `Traverse
= TraverseNone` and associated constant `AUTO_PROPAGATE = false`;
+ `Trigger::new` has new field `propagation: &mut Propagation` which
provides the bubbling state.
+ `ObserverRunner` now takes the same `&mut Propagation` as a final
parameter.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Torstein Grindvik <52322338+torsteingrindvik@users.noreply.github.com>
Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2024-07-15 13:39:41 +00:00
re0312
f0bdce7425
Fair Change Detection Benchmarking (#11173)
# Objective

- #4972 introduce a benchmark to measure chang detection performance
- However,it uses `iter_batch ` cause a lot of overhead in clone data to
each routine closure(it feels like a bug in`iter_batch `) and constructs
new query in every iter.This overhead masks the real change detection
throughput we want to measure. Instead of evaluating raw change
detection, the benchmark ends up dominated by data cloning and
allocation costs.


## Solution

- Use iter_batch_ref to reduce the benchmark overload 
- Use cached query to better reflect real-world usage scenarios.
- Add more benmark

---

## Changelog
2024-06-26 12:46:41 +00:00
charlotte
4c3b7679ec
#12502 Remove limit on RenderLayers. (#13317)
# Objective

Remove the limit of `RenderLayer` by using a growable mask using
`SmallVec`.

Changes adopted from @UkoeHB's initial PR here
https://github.com/bevyengine/bevy/pull/12502 that contained additional
changes related to propagating render layers.

Changes

## Solution

The main thing needed to unblock this is removing `RenderLayers` from
our shader code. This primarily affects `DirectionalLight`. We are now
computing a `skip` field on the CPU that is then used to skip the light
in the shader.

## Testing

Checked a variety of examples and did a quick benchmark on `many_cubes`.
There were some existing problems identified during the development of
the original pr (see:
https://discord.com/channels/691052431525675048/1220477928605749340/1221190112939872347).
This PR shouldn't change any existing behavior besides removing the
layer limit (sans the comment in migration about `all` layers no longer
being possible).

---

## Changelog

Removed the limit on `RenderLayers` by using a growable bitset that only
allocates when layers greater than 64 are used.

## Migration Guide

- `RenderLayers::all()` no longer exists. Entities expecting to be
visible on all layers, e.g. lights, should compute the active layers
that are in use.

---------

Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
2024-05-16 16:15:47 +00:00
andristarr
bb76a2c69c
multi_threaded feature rename (#12997)
# Objective

Fixes #12966

## Solution

Renaming multi_threaded feature to match snake case

## Migration Guide

Bevy feature multi-threaded should be refered to multi_threaded from now
on.
2024-05-06 20:49:32 +00:00
Martín Maita
32cd0c5dc1
Update glam version requirement from 0.25 to 0.27 (#12757)
# Objective

- Update glam version requirement to latest version.

## Solution

- Updated `glam` version requirement from 0.25 to 0.27.
- Updated `encase` and `encase_derive_impl` version requirement from 0.7
to 0.8.
- Updated `hexasphere` version requirement from 10.0 to 12.0.
- Breaking changes from glam changelog:
- [0.26.0] Minimum Supported Rust Version bumped to 1.68.2 for impl
From<bool> for {f32,f64} support.
- [0.27.0] Changed implementation of vector fract method to match the
Rust implementation instead of the GLSL implementation, that is self -
self.trunc() instead of self - self.floor().

---

## Migration Guide

- When using `glam` exports, keep in mind that `vector` `fract()` method
now matches Rust implementation (that is `self - self.trunc()` instead
of `self - self.floor()`). If you want to use the GLSL implementation
you should now use `fract_gl()`.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-05-02 18:42:34 +00:00
BD103
9ee02e87d3
Remove version field for non-publish crates and update descriptions (#13100)
# Objective

- The [`version`] field in `Cargo.toml` is optional for crates not
published on <https://crates.io>.
- We have several `publish = false` tools in this repository that still
have a version field, even when it's not useful.

[`version`]:
https://doc.rust-lang.org/cargo/reference/manifest.html#the-version-field

## Solution

- Remove the [`version`] field for all crates where `publish = false`.
- Update the description on a few crates and remove extra newlines as
well.
2024-04-26 11:55:03 +00:00
re0312
4ca8cf5d66
Cluster small table/archetype into single Task in parallel iteration (#12846)
# Objective

- Fix #7303
- bevy would spawn a lot of tasks in parallel iteration when it matchs a
large storage and many small storage ,it significantly increase the
overhead of schedule.

## Solution

- collect small storage into one task
2024-04-04 07:09:26 +00:00
Cameron
01649f13e2
Refactor App and SubApp internals for better separation (#9202)
# Objective

This is a necessary precursor to #9122 (this was split from that PR to
reduce the amount of code to review all at once).

Moving `!Send` resource ownership to `App` will make it unambiguously
`!Send`. `SubApp` must be `Send`, so it can't wrap `App`.

## Solution

Refactor `App` and `SubApp` to not have a recursive relationship. Since
`SubApp` no longer wraps `App`, once `!Send` resources are moved out of
`World` and into `App`, `SubApp` will become unambiguously `Send`.

There could be less code duplication between `App` and `SubApp`, but
that would break `App` method chaining.

## Changelog

- `SubApp` no longer wraps `App`.
- `App` fields are no longer publicly accessible.
- `App` can no longer be converted into a `SubApp`.
- Various methods now return references to a `SubApp` instead of an
`App`.
## Migration Guide

- To construct a sub-app, use `SubApp::new()`. `App` can no longer
convert into `SubApp`.
- If you implemented a trait for `App`, you may want to implement it for
`SubApp` as well.
- If you're accessing `app.world` directly, you now have to use
`app.world()` and `app.world_mut()`.
- `App::sub_app` now returns `&SubApp`.
- `App::sub_app_mut`  now returns `&mut SubApp`.
- `App::get_sub_app` now returns `Option<&SubApp>.`
- `App::get_sub_app_mut` now returns `Option<&mut SubApp>.`
2024-03-31 03:16:10 +00:00
W. Black
622f9a35b6
Torus benchmark (#12781)
# Objective

- Primitive meshing is suboptimal
- Improve primitive meshing

## Solution

- Add primitive meshing benchmark
- Allows measuring future improvements

---

First of a few PRs to refactor and improve primitive meshing.
2024-03-29 20:30:26 +00:00
targrub
13cbb9cf10
Move commands module into bevy::ecs::world (#12234)
# Objective

Fixes https://github.com/bevyengine/bevy/issues/11628

## Migration Guide

`Command` and `CommandQueue` have migrated from `bevy_ecs::system` to
`bevy_ecs::world`, so `use bevy_ecs::world::{Command, CommandQueue};`
when necessary.
2024-03-02 23:13:45 +00:00
James Liu
bc82749012
Remove APIs deprecated in 0.13 (#11974)
# Objective
We deprecated quite a few APIs in 0.13. 0.13 has shipped already. It
should be OK to remove them in 0.14's release. Fixes #4059. Fixes #9011.

## Solution
Remove them.
2024-02-19 19:04:47 +00:00
David M. Lary
0dccfb5788
Stepping disabled performance fix (#11959)
# Objective

* Fixes #11932 (performance impact when stepping is disabled)

## Solution

The `Option<FixedBitSet>` argument added to `ScheduleExecutor::run()` in
#8453 caused a measurable performance impact even when stepping is
disabled. This can be seen by the benchmark of running `Schedule:run()`
on an empty schedule in a tight loop
(https://github.com/bevyengine/bevy/issues/11932#issuecomment-1950970236).

I was able to get the same performance results as on 0.12.1 by changing
the argument
`ScheduleExecutor::run()` from `Option<FixedBitSet>` to
`Option<&FixedBitSet>`. The down-side of this change is that
`Schedule::run()` now takes about 6% longer (3.7319 ms vs 3.9855ns) when
stepping is enabled

---

## Changelog
* Change `ScheduleExecutor::run()` `_skipped_systems` from
`Option<FixedBitSet>` to `Option<&FixedBitSet>`
* Added a few benchmarks to measure `Schedule::run()` performance with
various executors
2024-02-19 17:02:14 +00:00
Doonv
1c67e020f7
Move EntityHash related types into bevy_ecs (#11498)
# Objective

Reduce the size of `bevy_utils`
(https://github.com/bevyengine/bevy/issues/11478)

## Solution

Move `EntityHash` related types into `bevy_ecs`. This also allows us
access to `Entity`, which means we no longer need `EntityHashMap`'s
first generic argument.

---

## Changelog

- Moved `bevy::utils::{EntityHash, EntityHasher, EntityHashMap,
EntityHashSet}` into `bevy::ecs::entity::hash` .
- Removed `EntityHashMap`'s first generic argument. It is now hardcoded
to always be `Entity`.

## Migration Guide

- Uses of `bevy::utils::{EntityHash, EntityHasher, EntityHashMap,
EntityHashSet}` now have to be imported from `bevy::ecs::entity::hash`.
- Uses of `EntityHashMap` no longer have to specify the first generic
parameter. It is now hardcoded to always be `Entity`.
2024-02-12 15:02:24 +00:00
BD103
3d2d61d063
Use batch spawn in benchmarks (#11611)
# Objective

- The benchmarks for `bevy_ecs`' `iter_simple` group use `for` loops
instead of `World::spawn_batch`.
- There's a TODO comment that says to batch spawn them.

## Solution

- Replace the `for` loops with `World::spawn_batch`.
2024-02-01 19:23:09 +00:00
Doonv
99449931d2
Add README to benches (#11508)
# Objective

It is unclear how to run Bevy's benchmarks

## Solution

Add a README to the benches, with documentation that tells you what the
benchmarks are, and how to run them.

---------

Co-authored-by: Rob Parrett <robparrett@gmail.com>
2024-01-24 17:11:28 +00:00
Natalie Bonnibel Baker
b257fffef8
Change Entity::generation from u32 to NonZeroU32 for niche optimization (#9907)
# Objective

- Implements change described in
https://github.com/bevyengine/bevy/issues/3022
- Goal is to allow Entity to benefit from niche optimization, especially
in the case of Option<Entity> to reduce memory overhead with structures
with empty slots

## Discussion
- First PR attempt: https://github.com/bevyengine/bevy/pull/3029
- Discord:
https://discord.com/channels/691052431525675048/1154573759752183808/1154573764240093224

## Solution

- Change `Entity::generation` from u32 to NonZeroU32 to allow for niche
optimization.
- The reason for changing generation rather than index is so that the
costs are only encountered on Entity free, instead of on Entity alloc
- There was some concern with generations being used, due to there being
some desire to introduce flags. This was more to do with the original
retirement approach, however, in reality even if generations were
reduced to 24-bits, we would still have 16 million generations available
before wrapping and current ideas indicate that we would be using closer
to 4-bits for flags.
- Additionally, another concern was the representation of relationships
where NonZeroU32 prevents us using the full address space, talking with
Joy it seems unlikely to be an issue. The majority of the time these
entity references will be low-index entries (ie. `ChildOf`, `Owes`),
these will be able to be fast lookups, and the remainder of the range
can use slower lookups to map to the address space.
- It has the additional benefit of being less visible to most users,
since generation is only ever really set through `from_bits` type
methods.
- `EntityMeta` was changed to match
- On free, generation now explicitly wraps:
- Originally, generation would panic in debug mode and wrap in release
mode due to using regular ops.
- The first attempt at this PR changed the behavior to "retire" slots
and remove them from use when generations overflowed. This change was
controversial, and likely needs a proper RFC/discussion.
- Wrapping matches current release behaviour, and should therefore be
less controversial.
- Wrapping also more easily migrates to the retirement approach, as
users likely to exhaust the exorbitant supply of generations will code
defensively against aliasing and that defensive code is less likely to
break than code assuming that generations don't wrap.
- We use some unsafe code here when wrapping generations, to avoid
branch on NonZeroU32 construction. It's guaranteed safe due to how we
perform wrapping and it results in significantly smaller ASM code.
    - https://godbolt.org/z/6b6hj8PrM 

## Migration

- Previous `bevy_scene` serializations have a high likelihood of being
broken, as they contain 0th generation entities.

## Current Issues
 
- `Entities::reserve_generations` and `EntityMapper` wrap now, even in
debug - although they technically did in release mode already so this
probably isn't a huge issue. It just depends if we need to change
anything here?

---------

Co-authored-by: Natalie Baker <natalie.baker@advancednavigation.com>
2024-01-08 23:03:00 +00:00
irate
ec14e946b8
Update glam, encase and hexasphere (#11082)
Update to `glam` 0.25, `encase` 0.7 and `hexasphere` to 10.0

## Changelog
Added the `FloatExt` trait to the `bevy_math` prelude which adds `lerp`,
`inverse_lerp` and `remap` methods to the `f32` and `f64` types.
2024-01-08 22:58:45 +00:00
James Liu
2148518758
Override QueryIter::fold to port Query::for_each perf gains to select Iterator combinators (#6773)
# Objective
After #6547, `Query::for_each` has been capable of automatic
vectorization on certain queries, which is seeing a notable (>50% CPU
time improvements) for iteration. However, `Query::for_each` isn't
idiomatic Rust, and lacks the flexibility of iterator combinators.

Ideally, `Query::iter` and friends should be able to achieve the same
results. However, this does seem to blocked upstream
(rust-lang/rust#104914) by Rust's loop optimizations.

## Solution
This is an intermediate solution and refactor. This moves the
`Query::for_each` implementation onto the `Iterator::fold`
implementation for `QueryIter` instead. This should result in the same
automatic vectorization optimization on all `Iterator` functions that
internally use fold, including `Iterator::for_each`, `Iterator::count`,
etc.

With this, it should close the gap between the two completely.
Internally, this PR changes `Query::for_each` to use
`query.iter().for_each(..)` instead of the duplicated implementation.

Separately, the duplicate implementations of internal iteration (i.e.
`Query::par_for_each`) now use portions of the current `Query::for_each`
implementation factored out into their own functions.

This also massively cleans up our internal fragmentation of internal
iteration options, deduplicating the iteration code used in `for_each`
and `par_iter().for_each()`.

---

## Changelog
Changed: `Query::for_each`, `Query::for_each_mut`, `Query::for_each`,
and `Query::for_each_mut` have been moved to `QueryIter`'s
`Iterator::for_each` implementation, and still retains their performance
improvements over normal iteration. These APIs are deprecated in 0.13
and will be removed in 0.14.

---------

Co-authored-by: JoJoJet <21144246+JoJoJet@users.noreply.github.com>
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2023-12-01 09:09:55 +00:00
Zachary Harrold
52c11f6986
Ran cargo fmt on benches crate (#10758)
# Objective

- Format `benches` crate to match current Rust standards.

## Solution

- Ran `cargo fmt` in the `benches` crate.

## Notes

I accidentally came across this when working on the `Drop`
implementation for `CommandQueue` and rather embarrassingly let it sneak
into my PR there. I think it makes sense to ensure this crate is also
well formatted to avoid it in the future.
2023-11-27 18:41:35 +00:00
scottmcm
713b1d8fa4
Optimize Entity::eq (#10519)
(This is my first PR here, so I've probably missed some things. Please
let me know what else I should do to help you as a reviewer!)

# Objective

Due to https://github.com/rust-lang/rust/issues/117800, the `derive`'d
`PartialEq::eq` on `Entity` isn't as good as it could be. Since that's
used in hashtable lookup, let's improve it.

## Solution

The derived `PartialEq::eq` short-circuits if the generation doesn't
match. However, having a branch there is sub-optimal, especially on
64-bit systems like x64 that could just load the whole `Entity` in one
load anyway.

Due to complications around `poison` in LLVM and the exact details of
what unsafe code is allowed to do with reference in Rust
(https://github.com/rust-lang/unsafe-code-guidelines/issues/346), LLVM
isn't allowed to completely remove the short-circuiting. `&Entity` is
marked `dereferencable(8)` so LLVM knows it's allowed to *load* all 8
bytes -- and does so -- but it has to assume that the `index` might be
undef/poison if the `generation` doesn't match, and thus while it finds
a way to do it without needing a branch, it has to do something slightly
more complicated than optimal to combine the results. (LLVM is allowed
to change non-short-circuiting code to use branches, but not the other
way around.)

Here's a link showing the codegen today:
<https://rust.godbolt.org/z/9WzjxrY7c>
```rust
#[no_mangle]
pub fn demo_eq_ref(a: &Entity, b: &Entity) -> bool {
    a == b
}
```
ends up generating the following assembly:
```asm
demo_eq_ref:
        movq    xmm0, qword ptr [rdi]
        movq    xmm1, qword ptr [rsi]
        pcmpeqd xmm1, xmm0
        pshufd  xmm0, xmm1, 80
        movmskpd        eax, xmm0
        cmp     eax, 3
        sete    al
        ret
```
(It's usually not this bad in real uses after inlining and LTO, but it
makes a strong demo.)

This PR manually implements `PartialEq::eq` *without* short-circuiting,
and because that tells LLVM that neither the generations nor the index
can be poison, it doesn't need to be so careful and can generate the
"just compare the two 64-bit values" code you'd have probably already
expected:
```asm
demo_eq_ref:
        mov     rax, qword ptr [rsi]
        cmp     qword ptr [rdi], rax
        sete    al
        ret
```

Since this doesn't change the representation of `Entity`, if it's
instead passed by *value*, then each `Entity` is two `u32` registers,
and the old and the new code do exactly the same thing. (Other
approaches, like changing `Entity` to be `[u32; 2]` or `u64`, affect
this case.)

This should hopefully merge easily with changes like
https://github.com/bevyengine/bevy/pull/9907 that also want to change
`Entity`.

## Benchmarks

I'm not super-confident that I got my machine fully consistent for
benchmarking, but whether I run the old or the new one first I get
reasonably consistent results.

Here's a fairly typical example of the benchmarks I added in this PR:

![image](https://github.com/bevyengine/bevy/assets/18526288/24226308-4616-4082-b0ff-88fc06285ef1)

Building the sets seems to be basically the same. It's usually reported
as noise, but sometimes I see a few percent slower or faster.

But lookup hits in particular -- since a hit checks that the key is
equal -- consistently shows around 10% improvement.

`cargo run --example many_cubes --features bevy/trace_tracy --release --
--benchmark` showed as slightly faster with this change, though if I had
to bet I'd probably say it's more noise than meaningful (but at least
it's not worse either):

![image](https://github.com/bevyengine/bevy/assets/18526288/58bb8c96-9c45-487f-a5ab-544bbfe9fba0)

This is my first PR here -- and my first time running Tracy -- so please
let me know what else I should run, or run things on your own more
reliable machines to double-check.

---

## Changelog

(probably not worth including)

Changed: micro-optimized `Entity::eq` to help LLVM slightly.

## Migration Guide

(I really hope nobody was using this on uninitialized entities where
sufficiently tortured `unsafe` could could technically notice that this
has changed.)
2023-11-14 02:06:21 +00:00
Pixelstorm
faa1b57de5
Global TaskPool API improvements (#10008)
# Objective

Reduce code duplication and improve APIs of Bevy's [global
taskpools](https://github.com/bevyengine/bevy/blob/main/crates/bevy_tasks/src/usages.rs).

## Solution

- As all three of the global taskpools have identical implementations
and only differ in their identifiers, this PR moves the implementation
into a macro to reduce code duplication.
- The `init` method is renamed to `get_or_init` to more accurately
reflect what it really does.
- Add a new `try_get` method that just returns `None` when the pool is
uninitialized, to complement the other getter methods.
- Minor documentation improvements to accompany the above changes.

---

## Changelog

- Added a new `try_get` method to the global TaskPools
- The global TaskPools' `init` method has been renamed to `get_or_init`
for clarity
- Documentation improvements

## Migration Guide

- Uses of `ComputeTaskPool::init`, `AsyncComputeTaskPool::init` and
`IoTaskPool::init` should be changed to `::get_or_init`.
2023-10-23 20:48:48 +00:00
Nicola Papale
47409c8a72
Add inline(never) to bench systems (#9824)
# Objective

It is difficult to inspect the generated assembly of benchmark systems
using a tool such as `cargo-asm`

## Solution

Mark the related functions as `#[inline(never)]`. This way, you can pass
the module name as argument to `cargo-asm` to get the generated assembly
for the given function.

It may have as side effect to make benchmarks a bit more predictable and
useful too. As it prevents inlining where in bevy no inlining could
possibly take place.

### Measurements

Following the recommendations in
<https://easyperf.net/blog/2019/08/02/Perf-measurement-environment-on-Linux>,
I

1. Put my CPU in "AMD ECO" mode, which surprisingly is the equivalent of
disabling turboboost, giving more consistent performances
2. Disabled all hyperthreading cores using `echo 0 >
/sys/devices/system/cpu/cpu{11,12…}/online`
3. Set the scaling governor to `performance`
4. Manually disabled AMD boost with `echo 0 >
/sys/devices/system/cpu/cpufreq/boost`
5. Set the nice level of the criterion benchmark using `cargo bench … &
sudo renice -n -5 -p $! ; fg`
6. Not running any other program than the benchmarks (outside of system
daemons and the X11 server)

With this setup, running multiple times the same benchmarks on `main`
gives me a lot of "regression" and "improvement" messages, which is
absurd given that no code changed.

On this branch, there is still some spurious performance change
detection, but they are much less frequent.

This only accounts for `iter_simple` and `iter_frag` benchmarks of
course.
2023-10-02 12:52:18 +00:00
Nicola Papale
359e6c718d
Use single threaded executor for archetype benches (#9835)
# Objective

`no_archetype` benchmark group results were very noisy

## Solution

Use the `SingeThreaded` executor.

On my machine, this makes the `no_archetype` bench group 20 to 30 times
faster. Meaning that most of the runtime was accounted by the
multithreaded scheduler. ie: the benchmark was not testing system
archetype update, but the overhead of multithreaded scheduling.

With this change, the benchmark results are more meaningful.

The add_archetypes function is also simplified.
2023-09-18 16:06:42 +00:00
Mike
c61faf35b8
fix deprecation warning in bench (#9823)
# Objective

- Fix deprecation warning when running benches.

## Solution

- iter -> read
2023-09-16 09:27:13 +00:00
Joseph
8eb6ccdd87
Remove useless single tuples and trailing commas (#9720)
# Objective

Title
2023-09-08 21:46:54 +00:00
Nicola Papale
ee3cc8ca86
Fix erronenous glam version (#9653)
# Objective

- Fix compilation issue with wrongly specified glam version
- bevy uses `Vec2::INFINITY`, depends on `0.24` (equivalent to `0.24.0`)
yet it was only introduced in version `0.24.1`

Context:
https://discord.com/channels/691052431525675048/692572690833473578/1146586570787397794

## Solution

- Bump glam version.
2023-08-31 12:55:17 +00:00
Mike
33fdc5f5db
Move schedule name into Schedule (#9600)
# Objective

- Move schedule name into `Schedule` to allow the schedule name to be
used for errors and tracing in Schedule methods
- Fixes #9510

## Solution

- Move label onto `Schedule` and adjust api's on `World` and `Schedule`
to not pass explicit label where it makes sense to.
- add name to errors and tracing.
- `Schedule::new` now takes a label so either add the label or use
`Schedule::default` which uses a default label. `default` is mostly used
in doc examples and tests.

---

## Changelog

- move label onto `Schedule` to improve error message and logging for
schedules.

## Migration Guide

`Schedule::new` and `App::add_schedule`
```rust
// old
let schedule = Schedule::new();
app.add_schedule(MyLabel, schedule);

// new
let schedule = Schedule::new(MyLabel);
app.add_schedule(schedule);
```

if you aren't using a label and are using the schedule struct directly
you can use the default constructor.
```rust
// old
let schedule = Schedule::new();
schedule.run(world);

// new
let schedule = Schedule::default();
schedule.run(world);
```

`Schedules:insert`
```rust
// old
let schedule = Schedule::new();
schedules.insert(MyLabel, schedule);

// new
let schedule = Schedule::new(MyLabel);
schedules.insert(schedule);
```

`World::add_schedule`
```rust
// old
let schedule = Schedule::new();
world.add_schedule(MyLabel, schedule);

// new
let schedule = Schedule::new(MyLabel);
world.add_schedule(schedule);
```
2023-08-28 20:44:48 +00:00
jnhyatt
087a345579
Rename Bezier to CubicBezier for clarity (#9554)
# Objective

A Bezier curve is a curve defined by two or more control points. In the
simplest form, it's just a line. The (arguably) most common type of
Bezier curve is a cubic Bezier, defined by four control points. These
are often used in animation, etc. Bevy has a Bezier curve struct called
`Bezier`. However, this is technically a misnomer as it only represents
cubic Bezier curves.

## Solution

This PR changes the struct name to `CubicBezier` to more accurately
reflect the struct's usage. Since it's exposed in Bevy's prelude, it can
potentially collide with other `Bezier` implementations. While that
might instead be an argument for removing it from the prelude, there's
also something to be said for adding a more general `Bezier` into Bevy,
in which case we'd likely want to use the name `Bezier`. As a final
motivator, not only is the struct located in `cubic_spines.rs`, there
are also several other spline-related structs which follow the
`CubicXxx` naming convention where applicable. For example,
`CubicSegment` represents a cubic Bezier curve (with coefficients
pre-baked).

---

## Migration Guide

- Change all `Bezier` references to `CubicBezier`
2023-08-28 17:37:42 +00:00
Nicola Papale
7b0809b532
Add reflect path parsing benchmark (#9364)
# Objective

We want to measure performance on path reflection parsing.

## Solution

Benchmark path-based reflection:
- Add a benchmark for `ParsedPath::parse`

It's fairly noisy, this is why I added the 3% threshold.

Ideally we would fix the noisiness though. Supposedly I'm seeding the
RNG correctly, so there shouldn't be much observable variance. Maybe
someone can help spot the issue.
2023-08-25 14:30:26 +00:00
Mike
ac8f36743e
enable multithreading on benches (#9388)
# Objective

- Enable the new multithreading feature on benches
2023-08-11 22:08:36 +00:00
Joseph
ddbfa48711
Simplify parallel iteration methods (#8854)
# Objective

The `QueryParIter::for_each_mut` function is required when doing
parallel iteration with mutable queries.
This results in an unfortunate stutter:
`query.par_iter_mut().par_for_each_mut()` ('mut' is repeated).

## Solution

- Make `for_each` compatible with mutable queries, and deprecate
`for_each_mut`. In order to prevent `for_each` from being called
multiple times in parallel, we take ownership of the QueryParIter.

---

## Changelog

- `QueryParIter::for_each` is now compatible with mutable queries.
`for_each_mut` has been deprecated as it is now redundant.

## Migration Guide

The method `QueryParIter::for_each_mut` has been deprecated and is no
longer functional. Use `for_each` instead, which now supports mutable
queries.

```rust
// Before:
query.par_iter_mut().for_each_mut(|x| ...);

// After:
query.par_iter_mut().for_each(|x| ...);
```

The method `QueryParIter::for_each` now takes ownership of the
`QueryParIter`, rather than taking a shared reference.

```rust
// Before:
let par_iter = my_query.par_iter().batching_strategy(my_batching_strategy);
par_iter.for_each(|x| {
    // ...Do stuff with x...
    par_iter.for_each(|y| {
        // ...Do nested stuff with y...
    });
});

// After:
my_query.par_iter().batching_strategy(my_batching_strategy).for_each(|x| {
    // ...Do stuff with x...
    my_query.par_iter().batching_strategy(my_batching_strategy).for_each(|y| {
        // ...Do nested stuff with y...
    });
});
```
2023-07-23 11:09:24 +00:00
JoJoJet
db8d3651e0
Migrate the rest of the engine to UnsafeWorldCell (#8833)
# Objective

Follow-up to #6404 and #8292.

Mutating the world through a shared reference is surprising, and it
makes the meaning of `&World` unclear: sometimes it gives read-only
access to the entire world, and sometimes it gives interior mutable
access to only part of it.

This is an up-to-date version of #6972.

## Solution

Use `UnsafeWorldCell` for all interior mutability. Now, `&World`
*always* gives you read-only access to the entire world.

---

## Changelog

TODO - do we still care about changelogs?

## Migration Guide

Mutating any world data using `&World` is now considered unsound -- the
type `UnsafeWorldCell` must be used to achieve interior mutability. The
following methods now accept `UnsafeWorldCell` instead of `&World`:

- `QueryState`: `get_unchecked`, `iter_unchecked`,
`iter_combinations_unchecked`, `for_each_unchecked`,
`get_single_unchecked`, `get_single_unchecked_manual`.
- `SystemState`: `get_unchecked_manual`

```rust
let mut world = World::new();
let mut query = world.query::<&mut T>();

// Before:
let t1 = query.get_unchecked(&world, entity_1);
let t2 = query.get_unchecked(&world, entity_2);

// After:
let world_cell = world.as_unsafe_world_cell();
let t1 = query.get_unchecked(world_cell, entity_1);
let t2 = query.get_unchecked(world_cell, entity_2);
```

The methods `QueryState::validate_world` and
`SystemState::matches_world` now take a `WorldId` instead of `&World`:

```rust
// Before:
query_state.validate_world(&world);

// After:
query_state.validate_world(world.id());
```

The methods `QueryState::update_archetypes` and
`SystemState::update_archetypes` now take `UnsafeWorldCell` instead of
`&World`:

```rust
// Before:
query_state.update_archetypes(&world);

// After:
query_state.update_archetypes(world.as_unsafe_world_cell_readonly());
```
2023-06-15 01:31:56 +00:00
Natanael Mojica
f135535cd6
Rename Command's "write" method to "apply" (#8814)
# Objective

- Fixes #8811 .

## Solution

- Rename "write" method to "apply" in Command trait definition.
- Rename other implementations of command trait throughout bevy's code
base.

---

## Changelog

- Changed: `Command::write` has been changed to `Command::apply`
- Changed: `EntityCommand::write` has been changed to
`EntityCommand::apply`

## Migration Guide

- `Command::write` implementations need to be changed to implement
`Command::apply` instead. This is a mere name change, with no further
actions needed.
- `EntityCommand::write` implementations need to be changed to implement
`EntityCommand::apply` instead. This is a mere name change, with no
further actions needed.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2023-06-12 17:53:47 +00:00
CatThingy
89cbc78d3d
Require #[derive(Event)] on all Events (#7086)
# Objective

Be consistent with `Resource`s and `Components` and have `Event` types
be more self-documenting.
Although not susceptible to accidentally using a function instead of a
value due to `Event`s only being initialized by their type, much of the
same reasoning for removing the blanket impl on `Resource` also applies
here.

* Not immediately obvious if a type is intended to be an event
* Prevent invisible conflicts if the same third-party or primitive types
are used as events
* Allows for further extensions (e.g. opt-in warning for missed events)

## Solution

Remove the blanket impl for the `Event` trait. Add a derive macro for
it.

---

## Changelog

- `Event` is no longer implemented for all applicable types. Add the
`#[derive(Event)]` macro for events.

## Migration Guide

* Add the `#[derive(Event)]` macro for events. Third-party types used as
events should be wrapped in a newtype.
2023-06-06 14:44:32 +00:00
JoJoJet
85a918a8dd
Improve safety for the multi-threaded executor using UnsafeWorldCell (#8292)
# Objective

Fix #7833.

Safety comments in the multi-threaded executor don't really talk about
system world accesses, which makes it unclear if the code is actually
valid.

## Solution

Update the `System` trait to use `UnsafeWorldCell`. This type's API is
written in a way that makes it much easier to cleanly maintain safety
invariants. Use this type throughout the multi-threaded executor, with a
liberal use of safety comments.

---

## Migration Guide

The `System` trait now uses `UnsafeWorldCell` instead of `&World`. This
type provides a robust API for interior mutable world access.
- The method `run_unsafe` uses this type to manage world mutations
across multiple threads.
- The method `update_archetype_component_access` uses this type to
ensure that only world metadata can be used.

```rust
let mut system = IntoSystem::into_system(my_system);
system.initialize(&mut world);

// Before:
system.update_archetype_component_access(&world);
unsafe { system.run_unsafe(&world) }

// After:
system.update_archetype_component_access(world.as_unsafe_world_cell_readonly());
unsafe { system.run_unsafe(world.as_unsafe_world_cell()) }
```

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2023-05-29 15:22:10 +00:00
François
0736195a1e
update syn, encase, glam and hexasphere (#8573)
# Objective

- Fixes #8282 
- Update `syn` to 2.0, `encase` to 0.6, `glam` to 0.24 and `hexasphere`
to 9.0


Blocked ~~on https://github.com/teoxoy/encase/pull/42~~ and ~~on
https://github.com/OptimisticPeach/hexasphere/pull/17~~

---------

Co-authored-by: Nicola Papale <nicopap@users.noreply.github.com>
Co-authored-by: JoJoJet <21144246+JoJoJet@users.noreply.github.com>
2023-05-16 01:24:17 +00:00
Gino Valente
75130bd5ec
bevy_reflect: Better proxies (#6971)
# Objective

> This PR is based on discussion from #6601

The Dynamic types (e.g. `DynamicStruct`, `DynamicList`, etc.) act as
both:
1. Dynamic containers which may hold any arbitrary data
2. Proxy types which may represent any other type

Currently, the only way we can represent the proxy-ness of a Dynamic is
by giving it a name.

```rust
// This is just a dynamic container
let mut data = DynamicStruct::default();

// This is a "proxy"
data.set_name(std::any::type_name::<Foo>());
```

This type name is the only way we check that the given Dynamic is a
proxy of some other type. When we need to "assert the type" of a `dyn
Reflect`, we call `Reflect::type_name` on it. However, because we're
only using a string to denote the type, we run into a few gotchas and
limitations.

For example, hashing a Dynamic proxy may work differently than the type
it proxies:

```rust
#[derive(Reflect, Hash)]
#[reflect(Hash)]
struct Foo(i32);

let concrete = Foo(123);
let dynamic = concrete.clone_dynamic();

let concrete_hash = concrete.reflect_hash();
let dynamic_hash = dynamic.reflect_hash();

// The hashes are not equal because `concrete` uses its own `Hash` impl
// while `dynamic` uses a reflection-based hashing algorithm
assert_ne!(concrete_hash, dynamic_hash);
```

Because the Dynamic proxy only knows about the name of the type, it's
unaware of any other information about it. This means it also differs on
`Reflect::reflect_partial_eq`, and may include ignored or skipped fields
in places the concrete type wouldn't.

## Solution

Rather than having Dynamics pass along just the type name of proxied
types, we can instead have them pass around the `TypeInfo`.

Now all Dynamic types contain an `Option<&'static TypeInfo>` rather than
a `String`:

```diff
pub struct DynamicTupleStruct {
-    type_name: String,
+    represented_type: Option<&'static TypeInfo>,
    fields: Vec<Box<dyn Reflect>>,
}
```

By changing `Reflect::get_type_info` to
`Reflect::represented_type_info`, hopefully we make this behavior a
little clearer. And to account for `None` values on these dynamic types,
`Reflect::represented_type_info` now returns `Option<&'static
TypeInfo>`.

```rust
let mut data = DynamicTupleStruct::default();

// Not proxying any specific type
assert!(dyn_tuple_struct.represented_type_info().is_none());

let type_info = <Foo as Typed>::type_info();
dyn_tuple_struct.set_represented_type(Some(type_info));
// Alternatively:
// let dyn_tuple_struct = foo.clone_dynamic();

// Now we're proxying `Foo`
assert!(dyn_tuple_struct.represented_type_info().is_some());
```

This means that we can have full access to all the static type
information for the proxied type. Future work would include
transitioning more static type information (trait impls, attributes,
etc.) over to the `TypeInfo` so it can actually be utilized by Dynamic
proxies.

### Alternatives & Rationale

> **Note** 
> These alternatives were written when this PR was first made using a
`Proxy` trait. This trait has since been removed.

<details>
<summary>View</summary>

#### Alternative: The `Proxy<T>` Approach

I had considered adding something like a `Proxy<T>` type where `T` would
be the Dynamic and would contain the proxied type information.

This was nice in that it allows us to explicitly determine whether
something is a proxy or not at a type level. `Proxy<DynamicStruct>`
proxies a struct. Makes sense.

The reason I didn't go with this approach is because (1) tuples, (2)
complexity, and (3) `PartialReflect`.

The `DynamicTuple` struct allows us to represent tuples at runtime. It
also allows us to do something you normally can't with tuples: add new
fields. Because of this, adding a field immediately invalidates the
proxy (e.g. our info for `(i32, i32)` doesn't apply to `(i32, i32,
NewField)`). By going with this PR's approach, we can just remove the
type info on `DynamicTuple` when that happens. However, with the
`Proxy<T>` approach, it becomes difficult to represent this behavior—
we'd have to completely control how we access data for `T` for each `T`.

Secondly, it introduces some added complexities (aside from the manual
impls for each `T`). Does `Proxy<T>` impl `Reflect`? Likely yes, if we
want to represent it as `dyn Reflect`. What `TypeInfo` do we give it?
How would we forward reflection methods to the inner type (remember, we
don't have specialization)? How do we separate this from Dynamic types?
And finally, how do all this in a way that's both logical and intuitive
for users?

Lastly, introducing a `Proxy` trait rather than a `Proxy<T>` struct is
actually more inline with the [Unique Reflect
RFC](https://github.com/bevyengine/rfcs/pull/56). In a way, the `Proxy`
trait is really one part of the `PartialReflect` trait introduced in
that RFC (it's technically not in that RFC but it fits well with it),
where the `PartialReflect` serves as a way for proxies to work _like_
concrete types without having full access to everything a concrete
`Reflect` type can do. This would help bridge the gap between the
current state of the crate and the implementation of that RFC.

All that said, this is still a viable solution. If the community
believes this is the better path forward, then we can do that instead.
These were just my reasons for not initially going with it in this PR.

#### Alternative: The Type Registry Approach

The `Proxy` trait is great and all, but how does it solve the original
problem? Well, it doesn't— yet!

The goal would be to start moving information from the derive macro and
its attributes to the generated `TypeInfo` since these are known
statically and shouldn't change. For example, adding `ignored: bool` to
`[Un]NamedField` or a list of impls.

However, there is another way of storing this information. This is, of
course, one of the uses of the `TypeRegistry`. If we're worried about
Dynamic proxies not aligning with their concrete counterparts, we could
move more type information to the registry and require its usage.

For example, we could replace `Reflect::reflect_hash(&self)` with
`Reflect::reflect_hash(&self, registry: &TypeRegistry)`.

That's not the _worst_ thing in the world, but it is an ergonomics loss.

Additionally, other attributes may have their own requirements, further
restricting what's possible without the registry. The `Reflect::apply`
method will require the registry as well now. Why? Well because the
`map_apply` function used for the `Reflect::apply` impls on `Map` types
depends on `Map::insert_boxed`, which (at least for `DynamicMap`)
requires `Reflect::reflect_hash`. The same would apply when adding
support for reflection-based diffing, which will require
`Reflect::reflect_partial_eq`.

Again, this is a totally viable alternative. I just chose not to go with
it for the reasons above. If we want to go with it, then we can close
this PR and we can pursue this alternative instead.

#### Downsides

Just to highlight a quick potential downside (likely needs more
investigation): retrieving the `TypeInfo` requires acquiring a lock on
the `GenericTypeInfoCell` used by the `Typed` impls for generic types
(non-generic types use a `OnceBox which should be faster). I am not sure
how much of a performance hit that is and will need to run some
benchmarks to compare against.

</details>

### Open Questions

1. Should we use `Cow<'static, TypeInfo>` instead? I think that might be
easier for modding? Perhaps, in that case, we need to update
`Typed::type_info` and friends as well?
2. Are the alternatives better than the approach this PR takes? Are
there other alternatives?

---

## Changelog

### Changed

- `Reflect::get_type_info` has been renamed to
`Reflect::represented_type_info`
- This method now returns `Option<&'static TypeInfo>` rather than just
`&'static TypeInfo`

### Added

- Added `Reflect::is_dynamic` method to indicate when a type is dynamic
- Added a `set_represented_type` method on all dynamic types

### Removed

- Removed `TypeInfo::Dynamic` (use `Reflect::is_dynamic` instead)
- Removed `Typed` impls for all dynamic types

## Migration Guide

- The Dynamic types no longer take a string type name. Instead, they
require a static reference to `TypeInfo`:

    ```rust
    #[derive(Reflect)]
    struct MyTupleStruct(f32, f32);
    
    let mut dyn_tuple_struct = DynamicTupleStruct::default();
    dyn_tuple_struct.insert(1.23_f32);
    dyn_tuple_struct.insert(3.21_f32);
    
    // BEFORE:
    let type_name = std::any::type_name::<MyTupleStruct>();
    dyn_tuple_struct.set_name(type_name);
    
    // AFTER:
    let type_info = <MyTupleStruct as Typed>::type_info();
    dyn_tuple_struct.set_represented_type(Some(type_info));
    ```

- `Reflect::get_type_info` has been renamed to
`Reflect::represented_type_info` and now also returns an
`Option<&'static TypeInfo>` (instead of just `&'static TypeInfo`):

    ```rust
    // BEFORE:
    let info: &'static TypeInfo = value.get_type_info();
    // AFTER:
let info: &'static TypeInfo = value.represented_type_info().unwrap();
    ```

- `TypeInfo::Dynamic` and `DynamicInfo` has been removed. Use
`Reflect::is_dynamic` instead:
   
    ```rust
    // BEFORE:
    if matches!(value.get_type_info(), TypeInfo::Dynamic) {
      // ...
    }
    // AFTER:
    if value.is_dynamic() {
      // ...
    }
    ```

---------

Co-authored-by: radiish <cb.setho@gmail.com>
2023-04-26 12:17:46 +00:00
Gino Valente
c320f54b50
bevy_reflect: Add benches for Struct::clone_dynamic and Reflect::get_type_info (#7764)
# Objective

There aren't any reflection bench tests for `Struct::clone_dynamic` or
`Reflect::get_type_info`.

## Solution

Add benches for `Struct::clone_dynamic` and `Reflect::get_type_info`.
2023-04-17 15:19:08 +00:00
James Liu
8e82c88131
Basic event benchmarks (#8251)
# Objective
Fix #7731. Add basic Event sending and iteration benchmarks to
bevy_ecs's benchmark suite.

## Solution
Add said benchmarks scaling from 100 to 50,000 events.

Not sure if I want to include a randomization of the events going in,
the current implementation might be too easy for the compiler to
optimize.

---------

Co-authored-by: JoJoJet <21144246+JoJoJet@users.noreply.github.com>
2023-03-31 07:12:18 +00:00
Carter Anderson
aefe1f0739
Schedule-First: the new and improved add_systems (#8079)
Co-authored-by: Mike <mike.hsu@gmail.com>
2023-03-18 01:45:34 +00:00
JoJoJet
fd1af7c8b8
Replace multiple calls to add_system with add_systems (#8001) 2023-03-10 18:15:22 +00:00
Edgar Geier
cbbf8ac575 Update glam to 0.23 (#7883)
# Objective

- Update `glam` to the latest version.

## Solution

- Update `glam` to version `0.23`.

Since the breaking change in `glam` only affects the `scalar-math` feature, this should cause no issues.
2023-03-04 11:42:27 +00:00
Aevyrie
2ea0061018 Add generic cubic splines to bevy_math (#7683)
# Objective

- Make cubic splines more flexible and more performant
- Remove the existing spline implementation that is generic over many degrees
  - This is a potential performance footgun and adds type complexity for negligible gain.
- Add implementations of:
  - Bezier splines
  - Cardinal splines (inc. Catmull-Rom)
  - B-Splines
  - Hermite splines

https://user-images.githubusercontent.com/2632925/221780519-495d1b20-ab46-45b4-92a3-32c46da66034.mp4


https://user-images.githubusercontent.com/2632925/221780524-2b154016-699f-404f-9c18-02092f589b04.mp4


https://user-images.githubusercontent.com/2632925/221780525-f934f99d-9ad4-4999-bae2-75d675f5644f.mp4


## Solution

- Implements the concept that splines are curve generators (e.g. https://youtu.be/jvPPXbo87ds?t=3488) via the `CubicGenerator` trait.
- Common splines are bespoke data types that implement this trait. This gives us flexibility to add custom spline-specific methods on these types, while ultimately all generating a `CubicCurve`.
- All splines generate `CubicCurve`s, which are a chain of precomputed polynomial coefficients. This means that all splines have the same evaluation cost, as the calculations for determining position, velocity, and acceleration are all identical. In addition, `CubicCurve`s are simply a list of `CubicSegment`s, which are evaluated from t=0 to t=1. This also means cubic splines of different type can be chained together, as ultimately they all are simply a collection of `CubicSegment`s.
- Because easing is an operation on a singe segment of a Bezier curve, we can simply implement easing on `Beziers` that use the `Vec2` type for points. Higher level crates such as `bevy_ui` can wrap this in a more ergonomic interface as needed.

### Performance
Measured on a desktop i5 8600K (6-year-old CPU):
- easing: 2.7x faster (19ns)
- cubic vec2 position sample: 1.5x faster (1.8ns)
- cubic vec3 position sample: 1.5x faster (2.6ns)
- cubic vec3a position sample: 1.9x faster (1.4ns)

On a laptop i7 11800H:
- easing: 16ns
- cubic vec2 position sample: 1.6ns
- cubic vec3 position sample: 2.3ns
- cubic vec3a position sample: 1.2ns

---

## Changelog

- Added a generic cubic curve trait, and implementation for Cardinal splines (including Catmull-Rom), B-Splines, Beziers, and Hermite Splines. 2D cubic curve segments also implement easing functionality for animation.
2023-03-03 22:06:42 +00:00
JoJoJet
9733613c07 Add marker traits to distinguish base sets from regular system sets (#7863)
# Objective

Base sets, added in #7466  are a special type of system set. Systems can only be added to base sets via `in_base_set`, while non-base sets can only be added via `in_set`. Unfortunately this is currently guarded by a runtime panic, which presents an unfortunate toe-stub when the wrong method is used. The delayed response between writing code and encountering the error (possibly hours) makes the distinction between base sets and other sets much more difficult to learn.

## Solution

Add the marker traits `BaseSystemSet` and `FreeSystemSet`. `in_base_set` and `in_set` now respectively accept these traits, which moves the runtime panic to a compile time error.

---

## Changelog

+ Added the marker trait `BaseSystemSet`, which is distinguished from a `FreeSystemSet`. These are both subtraits of `SystemSet`.

## Migration Guide

None if merged with 0.10
2023-03-02 13:22:58 +00:00
Aevyrie
2a598d3e5a Add Beziers to bevy_math (#7653)
# Objective

- Adds foundational math for Bezier curves, useful for UI/2D/3D animation and smooth paths.

https://user-images.githubusercontent.com/2632925/218883143-e138f994-1795-40da-8c59-21d779666991.mp4

## Solution

- Adds the generic `Bezier` type, and a `Point` trait. The `Point` trait allows us to use control points of any dimension, as long as they support vector math. I've implemented it for `f32`(1D), `Vec2`(2D), and `Vec3`/`Vec3A`(3D).
- Adds `CubicBezierEasing` on top of `Bezier` with the addition of an implementation of cubic Bezier easing, which is a foundational tool for UI animation.
  - This involves solving for $t$ in the parametric Bezier function $B(t)$ using the Newton-Raphson method to find a value with error $\leq$ 1e-7, capped at 8 iterations.
- Added type aliases for common Bezier curves: `CubicBezier2d`, `CubicBezier3d`, `QuadraticBezier2d`, and `QuadraticBezier3d`. These types use `Vec3A` to represent control points, as this was found to have an 80-90% speedup over using `Vec3`.
- Benchmarking shows quadratic/cubic Bezier evaluations $B(t)$ take \~1.8/2.4ns respectively. Easing, which requires an iterative solve takes \~50ns for cubic Beziers. 

---

## Changelog

- Added `CubicBezier2d`, `CubicBezier3d`, `QuadraticBezier2d`, and `QuadraticBezier3d` types with methods for sampling position, velocity, and acceleration. The generic `Bezier` type is also available, and generic over any degree of Bezier curve.
- Added `CubicBezierEasing`, with additional methods to allow for smooth easing animations.
2023-02-20 18:34:52 +00:00
Niklas Eicker
0bce78439b Cleanup system sets called labels (#7678)
# Objective

We have a few old system labels that are now system sets but are still named or documented as labels. Documentation also generally mentioned system labels in some places.


## Solution

- Clean up naming and documentation regarding system sets

## Migration Guide

`PrepareAssetLabel` is now called `PrepareAssetSet`
2023-02-14 21:46:07 +00:00
JoJoJet
f7fbfaf9c7 Rename an outdated benchmark from "run criteria" to "run condition" (#7645)
# Objective

The benchmark for run conditions still uses the outdated terminology "run criteria".

## Solution

Update the name.
2023-02-12 19:15:44 +00:00
张林伟
aa4170d9a4 Rename schedule v3 to schedule (#7519)
# Objective

- Follow up of https://github.com/bevyengine/bevy/pull/7267

## Solution

- Rename schedule_v3 to schedule
- Suppress "module inception" lint
2023-02-06 18:44:40 +00:00
Alice Cecile
206c7ce219 Migrate engine to Schedule v3 (#7267)
Huge thanks to @maniwani, @devil-ira, @hymm, @cart, @superdump and @jakobhellermann for the help with this PR.

# Objective

- Followup #6587.
- Minimal integration for the Stageless Scheduling RFC: https://github.com/bevyengine/rfcs/pull/45

## Solution

- [x]  Remove old scheduling module
- [x] Migrate new methods to no longer use extension methods
- [x] Fix compiler errors
- [x] Fix benchmarks
- [x] Fix examples
- [x] Fix docs
- [x] Fix tests

## Changelog

### Added

- a large number of methods on `App` to work with schedules ergonomically
- the `CoreSchedule` enum
- `App::add_extract_system` via the `RenderingAppExtension` trait extension method
- the private `prepare_view_uniforms` system now has a public system set for scheduling purposes, called `ViewSet::PrepareUniforms`

### Removed

- stages, and all code that mentions stages
- states have been dramatically simplified, and no longer use a stack
- `RunCriteriaLabel`
- `AsSystemLabel` trait
- `on_hierarchy_reports_enabled` run criteria (now just uses an ad hoc resource checking run condition)
- systems in `RenderSet/Stage::Extract` no longer warn when they do not read data from the main world
- `RunCriteriaLabel`
- `transform_propagate_system_set`: this was a nonstandard pattern that didn't actually provide enough control. The systems are already `pub`: the docs have been updated to ensure that the third-party usage is clear.

### Changed

- `System::default_labels` is now `System::default_system_sets`.
- `App::add_default_labels` is now `App::add_default_sets`
- `CoreStage` and `StartupStage` enums are now `CoreSet` and `StartupSet`
- `App::add_system_set` was renamed to `App::add_systems`
- The `StartupSchedule` label is now defined as part of the `CoreSchedules` enum
-  `.label(SystemLabel)` is now referred to as `.in_set(SystemSet)`
- `SystemLabel` trait was replaced by `SystemSet`
- `SystemTypeIdLabel<T>` was replaced by `SystemSetType<T>`
- The `ReportHierarchyIssue` resource now has a public constructor (`new`), and implements `PartialEq`
- Fixed time steps now use a schedule (`CoreSchedule::FixedTimeStep`) rather than a run criteria.
- Adding rendering extraction systems now panics rather than silently failing if no subapp with the `RenderApp` label is found.
- the `calculate_bounds` system, with the `CalculateBounds` label, is now in `CoreSet::Update`, rather than in `CoreSet::PostUpdate` before commands are applied. 
- `SceneSpawnerSystem` now runs under `CoreSet::Update`, rather than `CoreStage::PreUpdate.at_end()`.
- `bevy_pbr::add_clusters` is no longer an exclusive system
- the top level `bevy_ecs::schedule` module was replaced with `bevy_ecs::scheduling`
- `tick_global_task_pools_on_main_thread` is no longer run as an exclusive system. Instead, it has been replaced by `tick_global_task_pools`, which uses a `NonSend` resource to force running on the main thread.

## Migration Guide

- Calls to `.label(MyLabel)` should be replaced with `.in_set(MySet)`
- Stages have been removed. Replace these with system sets, and then add command flushes using the `apply_system_buffers` exclusive system where needed.
- The `CoreStage`, `StartupStage, `RenderStage` and `AssetStage`  enums have been replaced with `CoreSet`, `StartupSet, `RenderSet` and `AssetSet`. The same scheduling guarantees have been preserved.
  - Systems are no longer added to `CoreSet::Update` by default. Add systems manually if this behavior is needed, although you should consider adding your game logic systems to `CoreSchedule::FixedTimestep` instead for more reliable framerate-independent behavior.
  - Similarly, startup systems are no longer part of `StartupSet::Startup` by default. In most cases, this won't matter to you.
  - For example, `add_system_to_stage(CoreStage::PostUpdate, my_system)` should be replaced with 
  - `add_system(my_system.in_set(CoreSet::PostUpdate)`
- When testing systems or otherwise running them in a headless fashion, simply construct and run a schedule using `Schedule::new()` and `World::run_schedule` rather than constructing stages
- Run criteria have been renamed to run conditions. These can now be combined with each other and with states.
- Looping run criteria and state stacks have been removed. Use an exclusive system that runs a schedule if you need this level of control over system control flow.
- For app-level control flow over which schedules get run when (such as for rollback networking), create your own schedule and insert it under the `CoreSchedule::Outer` label.
- Fixed timesteps are now evaluated in a schedule, rather than controlled via run criteria. The `run_fixed_timestep` system runs this schedule between `CoreSet::First` and `CoreSet::PreUpdate` by default.
- Command flush points introduced by `AssetStage` have been removed. If you were relying on these, add them back manually.
- Adding extract systems is now typically done directly on the main app. Make sure the `RenderingAppExtension` trait is in scope, then call `app.add_extract_system(my_system)`.
- the `calculate_bounds` system, with the `CalculateBounds` label, is now in `CoreSet::Update`, rather than in `CoreSet::PostUpdate` before commands are applied. You may need to order your movement systems to occur before this system in order to avoid system order ambiguities in culling behavior.
- the `RenderLabel` `AppLabel` was renamed to `RenderApp` for clarity
- `App::add_state` now takes 0 arguments: the starting state is set based on the `Default` impl.
- Instead of creating `SystemSet` containers for systems that run in stages, simply use `.on_enter::<State::Variant>()` or its `on_exit` or `on_update` siblings.
- `SystemLabel` derives should be replaced with `SystemSet`. You will also need to add the `Debug`, `PartialEq`, `Eq`, and `Hash` traits to satisfy the new trait bounds.
- `with_run_criteria` has been renamed to `run_if`. Run criteria have been renamed to run conditions for clarity, and should now simply return a bool.
- States have been dramatically simplified: there is no longer a "state stack". To queue a transition to the next state, call `NextState::set`

## TODO

- [x] remove dead methods on App and World
- [x] add `App::add_system_to_schedule` and `App::add_systems_to_schedule`
- [x] avoid adding the default system set at inappropriate times
- [x] remove any accidental cycles in the default plugins schedule
- [x] migrate benchmarks
- [x] expose explicit labels for the built-in command flush points
- [x] migrate engine code
- [x] remove all mentions of stages from the docs
- [x] verify docs for States
- [x] fix uses of exclusive systems that use .end / .at_start / .before_commands
- [x] migrate RenderStage and AssetStage
- [x] migrate examples
- [x] ensure that transform propagation is exported in a sufficiently public way (the systems are already pub)
- [x] ensure that on_enter schedules are run at least once before the main app
- [x] re-enable opt-in to execution order ambiguities
- [x] revert change to `update_bounds` to ensure it runs in `PostUpdate`
- [x] test all examples
  - [x] unbreak directional lights
  - [x] unbreak shadows (see 3d_scene, 3d_shape, lighting, transparaency_3d examples)
  - [x] game menu example shows loading screen and menu simultaneously
  - [x] display settings menu is a blank screen
  - [x] `without_winit` example panics
- [x] ensure all tests pass
  - [x] SubApp doc test fails
  - [x] runs_spawn_local tasks fails
  - [x] [Fix panic_when_hierachy_cycle test hanging](https://github.com/alice-i-cecile/bevy/pull/120)

## Points of Difficulty and Controversy

**Reviewers, please give feedback on these and look closely**

1.  Default sets, from the RFC, have been removed. These added a tremendous amount of implicit complexity and result in hard to debug scheduling errors. They're going to be tackled in the form of "base sets" by @cart in a followup.
2. The outer schedule controls which schedule is run when `App::update` is called.
3. I implemented `Label for `Box<dyn Label>` for our label types. This enables us to store schedule labels in concrete form, and then later run them. I ran into the same set of problems when working with one-shot systems. We've previously investigated this pattern in depth, and it does not appear to lead to extra indirection with nested boxes.
4. `SubApp::update` simply runs the default schedule once. This sucks, but this whole API is incomplete and this was the minimal changeset.
5. `time_system` and `tick_global_task_pools_on_main_thread` no longer use exclusive systems to attempt to force scheduling order
6. Implemetnation strategy for fixed timesteps
7. `AssetStage` was migrated to `AssetSet` without reintroducing command flush points. These did not appear to be used, and it's nice to remove these bottlenecks.
8. Migration of `bevy_render/lib.rs` and pipelined rendering. The logic here is unusually tricky, as we have complex scheduling requirements.

## Future Work (ideally before 0.10)

- Rename schedule_v3 module to schedule or scheduling
- Add a derive macro to states, and likely a `EnumIter` trait of some form
- Figure out what exactly to do with the "systems added should basically work by default" problem
- Improve ergonomics for working with fixed timesteps and states
- Polish FixedTime API to match Time
- Rebase and merge #7415
- Resolve all internal ambiguities (blocked on better tools, especially #7442)
- Add "base sets" to replace the removed default sets.
2023-02-06 02:04:50 +00:00
James Liu
dfea88c64d Basic adaptive batching for parallel query iteration (#4777)
# Objective
Fixes #3184. Fixes #6640. Fixes #4798. Using `Query::par_for_each(_mut)` currently requires a `batch_size` parameter, which affects how it chunks up large archetypes and tables into smaller chunks to run in parallel. Tuning this value is difficult, as the performance characteristics entirely depends on the state of the `World` it's being run on. Typically, users will just use a flat constant and just tune it by hand until it performs well in some benchmarks. However, this is both error prone and risks overfitting the tuning on that benchmark.

This PR proposes a naive automatic batch-size computation based on the current state of the `World`.

## Background
`Query::par_for_each(_mut)` schedules a new Task for every archetype or table that it matches. Archetypes/tables larger than the batch size are chunked into smaller tasks. Assuming every entity matched by the query has an identical workload, this makes the worst case scenario involve using a batch size equal to the size of the largest matched archetype or table. Conversely, a batch size of `max {archetype, table} size / thread count * COUNT_PER_THREAD` is likely the sweetspot where the overhead of scheduling tasks is minimized, at least not without grouping small archetypes/tables together.

There is also likely a strict minimum batch size below which the overhead of scheduling these tasks is heavier than running the entire thing single-threaded.

## Solution

- [x] Remove the `batch_size` from `Query(State)::par_for_each`  and friends.
- [x] Add a check to compute `batch_size = max {archeytpe/table} size / thread count  * COUNT_PER_THREAD`
- [x] ~~Panic if thread count is 0.~~ Defer to `for_each` if the thread count is 1 or less.
- [x] Early return if there is no matched table/archetype. 
- [x] Add override option for users have queries that strongly violate the initial assumption that all iterated entities have an equal workload.

---

## Changelog
Changed: `Query::par_for_each(_mut)` has been changed to `Query::par_iter(_mut)` and will now automatically try to produce a batch size for callers based on the current `World` state.

## Migration Guide
The `batch_size` parameter for `Query(State)::par_for_each(_mut)` has been removed. These calls will automatically compute a batch size for you. Remove these parameters from all calls to these functions.

Before:
```rust
fn parallel_system(query: Query<&MyComponent>) {
   query.par_for_each(32, |comp| {
        ...
   });
}
```

After:

```rust
fn parallel_system(query: Query<&MyComponent>) {
   query.par_iter().for_each(|comp| {
        ...
   });
}
```

Co-authored-by: Arnav Choubey <56453634+x-52@users.noreply.github.com>
Co-authored-by: Robert Swain <robert.swain@gmail.com>
Co-authored-by: François <mockersf@gmail.com>
Co-authored-by: Corey Farwell <coreyf@rwell.org>
Co-authored-by: Aevyrie <aevyrie@gmail.com>
2023-01-20 08:47:20 +00:00
张林伟
bb79903938 Fix clippy issue for benches crate (#6806)
# Objective

- https://github.com/bevyengine/bevy/pull/3505 marked `S-Adopt-Me` , this pr is to continue his work.

## Solution

- run `cargo clippy --workspace --all-targets --all-features -- -Aclippy::type_complexity -Wclippy::doc_markdown -Wclippy::redundant_else -Wclippy::match_same_arms -Wclippy::semicolon_if_nothing_returned -Wclippy::explicit_iter_loop -Wclippy::map_flatten -Dwarnings` under benches dir.
- fix issue according to suggestion.
2023-01-11 09:32:07 +00:00
François
0aab699a84 Update glam 0.22, hexasphere 8.0, encase 0.4 (#6427)
# Objective

- Update glam to 0.22, hexasphere to 8.0, encase to 0.4

## Solution

- Update glam to 0.22, hexasphere to 8.0, encase to 0.4
- ~~waiting on https://github.com/teoxoy/encase/pull/17 and https://github.com/OptimisticPeach/hexasphere/pull/13~~
2022-11-07 19:44:13 +00:00
Thierry Berger
9dd019bf86 Change Detection Benchmarks (#4972)
# Objective

Fixes #4883.

## Solution

- Add benchmarks for Changed and Added

Disclaimer: I'm not that much familiar with benchmarking.
2022-11-04 17:53:54 +00:00
James Liu
ec8c8fbc8a Remove unnecesary branches/panics from Query accesses (#6461)
# Objective
Supercedes #6452. Upon inspection of the [generated assembly](https://gist.github.com/james7132/c2740c6941b80d7912f1e8888e223cbb#file-original-s) of a [simple Bevy binary](https://gist.github.com/james7132/c2740c6941b80d7912f1e8888e223cbb#file-source-rs) compiled with `cargo rustc --release -- --emit asm`, it's apparent that there are multiple unnecessary branches in the generated assembly:

```assembly
.LBB5_5:
	cmpq	%r10, %r11
	je	.LBB5_15
	movq	(%r11), %rcx
	movq	328(%r15), %rdx
	cmpq	%rdx, %rcx
	jae	.LBB5_14
	movq	312(%r15), %rdi
	leaq	(%rcx,%rcx,2), %rcx
	shlq	$5, %rcx
	movq	336(%r12), %rdx
	movq	64(%rdi,%rcx), %rax
	cmpq	%rdx, %rax
	jbe	.LBB5_4
	leaq	(%rdi,%rcx), %rsi
	movq	48(%rsi), %rbp
	shlq	$4, %rdx
	cmpq	$0, (%rbp,%rdx)
	je	.LBB5_4
	movq	344(%r12), %rbx
	cmpq	%rbx, %rax
	jbe	.LBB5_4
	shlq	$4, %rbx
	cmpq	$0, (%rbp,%rbx)
	je	.LBB5_4
	addq	$8, %r11
	movq	88(%rdi,%rcx), %rcx
	testq	%rcx, %rcx
	je	.LBB5_5
	movq	(%rsi), %rax
	movq	8(%rbp,%rdx), %rdx
	leaq	(%rdx,%rdx,4), %rdi
	shlq	$4, %rdi
	movq	32(%rax,%rdi), %rdx
	movq	56(%rax,%rdi), %r8
	movq	8(%rbp,%rbx), %rbp
	leaq	(%rbp,%rbp,4), %rbp
	shlq	$4, %rbp
	movq	32(%rax,%rbp), %r9
	xorl	%ebp, %ebp
	jmp	.LBB5_13
	.p2align	4, 0x90
```

Almost every one of the instructions starting with `j` is a potential branch, which can significantly slow down accesses. Of these, two labels are both common and never used:

```asm
.LBB5_14:
	leaq	__unnamed_2(%rip), %r8
	callq	_ZN4core9panicking18panic_bounds_check17h70367088e72af65aE
	ud2
.LBB5_4:
	callq	_ZN8bevy_ecs5query25debug_checked_unreachable17h0855ff520ceaea77E
	ud2
	.seh_endproc
```

These correpsond to subprocedure calls to panicking due to out of bounds from indexing `Tables` and `debug_checked_unreadable`. Both of which should be inlined and optimized out, but are not.

## Solution
Make `debug_checked_unreachable` a macro to forcibly inline either `unreachable!()` in debug builds, and `std::hint::unreachable_unchecked()` in release mode. Replace the `Tables` and `Archetype` index access with `get(id).unwrap_or_else(|| debug_checked_unreachable!())` to assume that the table or archetype provided exists.

This has no external breaking change of any kind.

The equivalent section of code with these changes removes most of the conditional jump instructions:

```asm
.LBB5_5:
	movss	(%rbx,%rbp,4), %xmm0
	movl	%r14d, 4(%r8,%rbp,8)
	addss	(%rdi,%rbp,4), %xmm0
	movss	%xmm0, (%rdi,%rbp,4)
	incq	%rbp
.LBB5_1:
	cmpq	%rdx, %rbp
	jne	.LBB5_5
	.p2align	4, 0x90
.LBB5_2:
	cmpq	%rcx, %rax
	je	.LBB5_6
	movq	(%rax), %rdx
	addq	$8, %rax
	movq	312(%rsi), %rbp
	leaq	(%rdx,%rdx,2), %rbx
	shlq	$5, %rbx
	movq	88(%rbp,%rbx), %rdx
	testq	%rdx, %rdx
	je	.LBB5_2
	leaq	(%rbx,%rbp), %r8
	movq	336(%r15), %rdi
	movq	344(%r15), %r9
	movq	48(%rbp,%rbx), %r10
	shlq	$4, %rdi
	movq	(%r8), %rbx
	movq	8(%r10,%rdi), %rdi
	leaq	(%rdi,%rdi,4), %rbp
	shlq	$4, %rbp
	movq	32(%rbx,%rbp), %rdi
	movq	56(%rbx,%rbp), %r8
	shlq	$4, %r9
	movq	8(%r10,%r9), %rbp
	leaq	(%rbp,%rbp,4), %rbp
	shlq	$4, %rbp
	movq	32(%rbx,%rbp), %rbx
	xorl	%ebp, %ebp
	jmp	.LBB5_5
.LBB5_6:
	addq	$40, %rsp
	popq	%rbx
	popq	%rbp
	popq	%rdi
	popq	%rsi
	popq	%r14
	popq	%r15
	retq
	.seh_endproc

```

## Performance

Microbenchmarks results:

<details>

```
group                                                    main                                     no-panic-query
-----                                                    ----                                     --------------
busy_systems/01x_entities_03_systems                     1.20     42.4±2.66µs        ? ?/sec      1.00     35.3±1.68µs        ? ?/sec
busy_systems/01x_entities_06_systems                     1.32     83.8±3.50µs        ? ?/sec      1.00     63.6±1.72µs        ? ?/sec
busy_systems/01x_entities_09_systems                     1.15    113.3±8.90µs        ? ?/sec      1.00     98.2±6.15µs        ? ?/sec
busy_systems/01x_entities_12_systems                     1.27   160.8±32.44µs        ? ?/sec      1.00    126.6±4.70µs        ? ?/sec
busy_systems/01x_entities_15_systems                     1.12    179.6±3.71µs        ? ?/sec      1.00   160.3±11.03µs        ? ?/sec
busy_systems/02x_entities_03_systems                     1.18     76.8±3.14µs        ? ?/sec      1.00     65.2±3.17µs        ? ?/sec
busy_systems/02x_entities_06_systems                     1.16    144.6±6.10µs        ? ?/sec      1.00    124.5±5.14µs        ? ?/sec
busy_systems/02x_entities_09_systems                     1.19    215.3±9.18µs        ? ?/sec      1.00    181.5±5.67µs        ? ?/sec
busy_systems/02x_entities_12_systems                     1.20    266.7±8.33µs        ? ?/sec      1.00    222.0±9.53µs        ? ?/sec
busy_systems/02x_entities_15_systems                     1.23   338.8±10.53µs        ? ?/sec      1.00    276.3±6.94µs        ? ?/sec
busy_systems/03x_entities_03_systems                     1.43    113.5±5.06µs        ? ?/sec      1.00     79.6±1.49µs        ? ?/sec
busy_systems/03x_entities_06_systems                     1.38   217.3±12.67µs        ? ?/sec      1.00    157.5±3.07µs        ? ?/sec
busy_systems/03x_entities_09_systems                     1.23   308.8±24.75µs        ? ?/sec      1.00    251.6±8.93µs        ? ?/sec
busy_systems/03x_entities_12_systems                     1.05   347.7±12.43µs        ? ?/sec      1.00   330.6±11.43µs        ? ?/sec
busy_systems/03x_entities_15_systems                     1.13   455.5±13.88µs        ? ?/sec      1.00   401.7±17.29µs        ? ?/sec
busy_systems/04x_entities_03_systems                     1.24    144.7±5.89µs        ? ?/sec      1.00    116.9±6.29µs        ? ?/sec
busy_systems/04x_entities_06_systems                     1.24   282.8±21.40µs        ? ?/sec      1.00   228.6±21.31µs        ? ?/sec
busy_systems/04x_entities_09_systems                     1.35   431.8±14.10µs        ? ?/sec      1.00    319.6±9.83µs        ? ?/sec
busy_systems/04x_entities_12_systems                     1.16   493.8±22.87µs        ? ?/sec      1.00   424.9±15.24µs        ? ?/sec
busy_systems/04x_entities_15_systems                     1.10   587.5±23.25µs        ? ?/sec      1.00   531.7±16.32µs        ? ?/sec
busy_systems/05x_entities_03_systems                     1.14    148.2±9.61µs        ? ?/sec      1.00    129.5±4.32µs        ? ?/sec
busy_systems/05x_entities_06_systems                     1.31   359.7±17.46µs        ? ?/sec      1.00   273.6±10.55µs        ? ?/sec
busy_systems/05x_entities_09_systems                     1.22   473.5±23.11µs        ? ?/sec      1.00   389.3±13.62µs        ? ?/sec
busy_systems/05x_entities_12_systems                     1.05   562.9±20.76µs        ? ?/sec      1.00   536.5±24.35µs        ? ?/sec
busy_systems/05x_entities_15_systems                     1.23   818.5±28.70µs        ? ?/sec      1.00   666.6±45.87µs        ? ?/sec
contrived/01x_entities_03_systems                        1.27     27.5±0.49µs        ? ?/sec      1.00     21.6±1.71µs        ? ?/sec
contrived/01x_entities_06_systems                        1.22     49.9±1.18µs        ? ?/sec      1.00     40.7±2.62µs        ? ?/sec
contrived/01x_entities_09_systems                        1.30     72.3±2.39µs        ? ?/sec      1.00     55.4±2.60µs        ? ?/sec
contrived/01x_entities_12_systems                        1.28     94.3±9.44µs        ? ?/sec      1.00     73.7±3.62µs        ? ?/sec
contrived/01x_entities_15_systems                        1.25    118.0±2.43µs        ? ?/sec      1.00     94.1±3.99µs        ? ?/sec
contrived/02x_entities_03_systems                        1.23     41.6±1.71µs        ? ?/sec      1.00     33.7±2.30µs        ? ?/sec
contrived/02x_entities_06_systems                        1.19     78.6±2.63µs        ? ?/sec      1.00     65.9±2.35µs        ? ?/sec
contrived/02x_entities_09_systems                        1.28    113.6±3.60µs        ? ?/sec      1.00     88.6±3.60µs        ? ?/sec
contrived/02x_entities_12_systems                        1.20    146.4±5.75µs        ? ?/sec      1.00    121.7±3.35µs        ? ?/sec
contrived/02x_entities_15_systems                        1.23    178.5±4.86µs        ? ?/sec      1.00    145.7±4.00µs        ? ?/sec
contrived/03x_entities_03_systems                        1.42     58.3±2.77µs        ? ?/sec      1.00     41.1±1.54µs        ? ?/sec
contrived/03x_entities_06_systems                        1.32    108.5±7.30µs        ? ?/sec      1.00     82.4±4.86µs        ? ?/sec
contrived/03x_entities_09_systems                        1.23    153.7±4.61µs        ? ?/sec      1.00    125.0±4.76µs        ? ?/sec
contrived/03x_entities_12_systems                        1.18    197.5±5.12µs        ? ?/sec      1.00    166.8±8.14µs        ? ?/sec
contrived/03x_entities_15_systems                        1.23    238.8±6.38µs        ? ?/sec      1.00    194.6±4.55µs        ? ?/sec
contrived/04x_entities_03_systems                        1.34     66.4±3.42µs        ? ?/sec      1.00     49.5±1.98µs        ? ?/sec
contrived/04x_entities_06_systems                        1.27    134.3±4.86µs        ? ?/sec      1.00    105.8±3.58µs        ? ?/sec
contrived/04x_entities_09_systems                        1.26    193.2±3.83µs        ? ?/sec      1.00    153.0±5.60µs        ? ?/sec
contrived/04x_entities_12_systems                        1.16    237.1±5.78µs        ? ?/sec      1.00   204.9±18.77µs        ? ?/sec
contrived/04x_entities_15_systems                        1.17    289.2±4.76µs        ? ?/sec      1.00    246.3±8.57µs        ? ?/sec
contrived/05x_entities_03_systems                        1.26     80.4±2.90µs        ? ?/sec      1.00     63.7±3.07µs        ? ?/sec
contrived/05x_entities_06_systems                        1.27   161.6±13.47µs        ? ?/sec      1.00    127.2±5.59µs        ? ?/sec
contrived/05x_entities_09_systems                        1.22    228.0±7.76µs        ? ?/sec      1.00    186.2±7.68µs        ? ?/sec
contrived/05x_entities_12_systems                        1.20    289.5±6.21µs        ? ?/sec      1.00    241.8±7.52µs        ? ?/sec
contrived/05x_entities_15_systems                        1.18   357.3±11.24µs        ? ?/sec      1.00    302.7±7.21µs        ? ?/sec
heavy_compute/base                                       1.01    302.4±3.52µs        ? ?/sec      1.00    300.2±3.40µs        ? ?/sec
iter_fragmented/base                                     1.00    348.1±7.51ns        ? ?/sec      1.01    351.9±8.32ns        ? ?/sec
iter_fragmented/foreach                                  1.03   239.8±23.78ns        ? ?/sec      1.00   233.8±18.12ns        ? ?/sec
iter_fragmented/foreach_wide                             1.00      3.9±0.13µs        ? ?/sec      1.02      4.0±0.22µs        ? ?/sec
iter_fragmented/wide                                     1.18      4.6±0.15µs        ? ?/sec      1.00      3.9±0.10µs        ? ?/sec
iter_fragmented_sparse/base                              1.02      8.1±0.15ns        ? ?/sec      1.00      7.9±0.56ns        ? ?/sec
iter_fragmented_sparse/foreach                           1.00      7.8±0.22ns        ? ?/sec      1.01      7.9±0.62ns        ? ?/sec
iter_fragmented_sparse/foreach_wide                      1.00     37.2±1.17ns        ? ?/sec      1.10     40.9±0.95ns        ? ?/sec
iter_fragmented_sparse/wide                              1.09     48.4±2.13ns        ? ?/sec      1.00    44.5±18.34ns        ? ?/sec
iter_simple/base                                         1.02      8.4±0.10µs        ? ?/sec      1.00      8.2±0.14µs        ? ?/sec
iter_simple/foreach                                      1.01      8.3±0.07µs        ? ?/sec      1.00      8.2±0.09µs        ? ?/sec
iter_simple/foreach_sparse_set                           1.00     25.3±0.32µs        ? ?/sec      1.02     25.7±0.42µs        ? ?/sec
iter_simple/foreach_wide                                 1.03     41.1±0.94µs        ? ?/sec      1.00     39.9±0.41µs        ? ?/sec
iter_simple/foreach_wide_sparse_set                      1.05    123.6±2.05µs        ? ?/sec      1.00    118.1±2.78µs        ? ?/sec
iter_simple/sparse_set                                   1.14     30.5±1.40µs        ? ?/sec      1.00     26.9±0.64µs        ? ?/sec
iter_simple/system                                       1.01      8.4±0.25µs        ? ?/sec      1.00      8.4±0.11µs        ? ?/sec
iter_simple/wide                                         1.18     48.2±0.62µs        ? ?/sec      1.00     40.7±0.38µs        ? ?/sec
iter_simple/wide_sparse_set                              1.12   140.8±21.56µs        ? ?/sec      1.00    126.0±2.30µs        ? ?/sec
query_get/50000_entities_sparse                          1.17    378.6±7.60µs        ? ?/sec      1.00   324.1±23.17µs        ? ?/sec
query_get/50000_entities_table                           1.08   330.9±10.90µs        ? ?/sec      1.00    306.8±4.98µs        ? ?/sec
query_get_component/50000_entities_sparse                1.00   976.7±19.55µs        ? ?/sec      1.00   979.8±35.87µs        ? ?/sec
query_get_component/50000_entities_table                 1.00  1029.0±15.11µs        ? ?/sec      1.05  1080.0±59.18µs        ? ?/sec
query_get_component_simple/system                        1.13   839.7±14.18µs        ? ?/sec      1.00   742.8±10.72µs        ? ?/sec
query_get_component_simple/unchecked                     1.01   909.0±15.17µs        ? ?/sec      1.00   898.0±13.56µs        ? ?/sec
query_get_many_10/50000_calls_sparse                     1.04      5.5±0.54ms        ? ?/sec      1.00      5.3±0.67ms        ? ?/sec
query_get_many_10/50000_calls_table                      1.01      4.9±0.49ms        ? ?/sec      1.00      4.8±0.45ms        ? ?/sec
query_get_many_2/50000_calls_sparse                      1.28  848.4±210.89µs        ? ?/sec      1.00   664.8±47.69µs        ? ?/sec
query_get_many_2/50000_calls_table                       1.05   779.0±73.85µs        ? ?/sec      1.00   739.2±83.02µs        ? ?/sec
query_get_many_5/50000_calls_sparse                      1.05      2.4±0.37ms        ? ?/sec      1.00      2.3±0.33ms        ? ?/sec
query_get_many_5/50000_calls_table                       1.00  1939.9±75.22µs        ? ?/sec      1.04      2.0±0.19ms        ? ?/sec
run_criteria/yes_using_query/001_systems                 1.00      3.7±0.38µs        ? ?/sec      1.30      4.9±0.14µs        ? ?/sec
run_criteria/yes_using_query/006_systems                 1.00      8.9±0.40µs        ? ?/sec      1.17     10.3±0.57µs        ? ?/sec
run_criteria/yes_using_query/011_systems                 1.00     13.9±0.49µs        ? ?/sec      1.08     15.0±0.89µs        ? ?/sec
run_criteria/yes_using_query/016_systems                 1.00     18.8±0.74µs        ? ?/sec      1.00     18.8±1.43µs        ? ?/sec
run_criteria/yes_using_query/021_systems                 1.07     24.1±0.87µs        ? ?/sec      1.00     22.6±1.58µs        ? ?/sec
run_criteria/yes_using_query/026_systems                 1.04     27.9±0.62µs        ? ?/sec      1.00     26.8±1.71µs        ? ?/sec
run_criteria/yes_using_query/031_systems                 1.09     33.3±1.03µs        ? ?/sec      1.00     30.5±2.18µs        ? ?/sec
run_criteria/yes_using_query/036_systems                 1.14     38.7±0.80µs        ? ?/sec      1.00     33.9±1.75µs        ? ?/sec
run_criteria/yes_using_query/041_systems                 1.18     43.7±1.07µs        ? ?/sec      1.00     37.0±2.39µs        ? ?/sec
run_criteria/yes_using_query/046_systems                 1.14     47.6±1.16µs        ? ?/sec      1.00     41.9±2.09µs        ? ?/sec
run_criteria/yes_using_query/051_systems                 1.17     52.9±2.04µs        ? ?/sec      1.00     45.3±1.75µs        ? ?/sec
run_criteria/yes_using_query/056_systems                 1.25     59.2±2.38µs        ? ?/sec      1.00     47.2±2.01µs        ? ?/sec
run_criteria/yes_using_query/061_systems                 1.28    66.1±15.84µs        ? ?/sec      1.00     51.5±2.47µs        ? ?/sec
run_criteria/yes_using_query/066_systems                 1.28     70.2±2.57µs        ? ?/sec      1.00     54.7±2.58µs        ? ?/sec
run_criteria/yes_using_query/071_systems                 1.30     75.5±2.27µs        ? ?/sec      1.00     58.2±3.31µs        ? ?/sec
run_criteria/yes_using_query/076_systems                 1.26     81.5±2.66µs        ? ?/sec      1.00     64.5±3.13µs        ? ?/sec
run_criteria/yes_using_query/081_systems                 1.29     89.7±2.58µs        ? ?/sec      1.00     69.3±3.47µs        ? ?/sec
run_criteria/yes_using_query/086_systems                 1.33     95.6±3.39µs        ? ?/sec      1.00     71.8±3.48µs        ? ?/sec
run_criteria/yes_using_query/091_systems                 1.25    102.0±3.67µs        ? ?/sec      1.00     81.4±4.82µs        ? ?/sec
run_criteria/yes_using_query/096_systems                 1.33    111.7±3.29µs        ? ?/sec      1.00     83.8±4.15µs        ? ?/sec
run_criteria/yes_using_query/101_systems                 1.29   113.2±12.04µs        ? ?/sec      1.00     87.7±5.15µs        ? ?/sec
world_query_for_each/50000_entities_sparse               1.00     47.4±0.51µs        ? ?/sec      1.00     47.3±0.33µs        ? ?/sec
world_query_for_each/50000_entities_table                1.00     27.2±0.50µs        ? ?/sec      1.00     27.2±0.17µs        ? ?/sec
world_query_get/50000_entities_sparse_wide               1.09    210.5±1.78µs        ? ?/sec      1.00    192.5±2.61µs        ? ?/sec
world_query_get/50000_entities_table                     1.00    127.7±2.09µs        ? ?/sec      1.07    136.2±5.95µs        ? ?/sec
world_query_get/50000_entities_table_wide                1.00    209.8±2.37µs        ? ?/sec      1.15    240.6±2.04µs        ? ?/sec
world_query_iter/50000_entities_sparse                   1.00     54.2±0.36µs        ? ?/sec      1.01     54.7±0.61µs        ? ?/sec
world_query_iter/50000_entities_table                    1.00     27.2±0.31µs        ? ?/sec      1.00     27.3±0.64µs        ? ?/sec
```
</details>

NOTE: This PR includes a change to enable LTO on our benchmarks to get a "fully optimized" baseline for our benchmarks. Both the main and the current PR's results were with LTO enabled.
2022-11-04 06:04:55 +00:00
JoJoJet
3d6706f86d Speed up Query::get_many and add benchmarks (#6400)
# Objective

* Add benchmarks for `Query::get_many`.
* Speed up `Query::get_many`.

## Solution

Previously, `get_many` and `get_many_mut` used the method `array::map`, which tends to optimize very poorly. This PR replaces uses of that method with loops.

## Benchmarks

| Benchmark name                       | Execution time | Change from this PR |
|--------------------------------------|----------------|---------------------|
| query_get_many_2/50000_calls_table   | 1.3732 ms      | -24.967%            |
| query_get_many_2/50000_calls_sparse  | 1.3826 ms      | -24.572%            |
| query_get_many_5/50000_calls_table   | 2.6833 ms      | -30.681%            |
| query_get_many_5/50000_calls_sparse  | 2.9936 ms      | -30.672%            |
| query_get_many_10/50000_calls_table  | 5.7771 ms      | -36.950%            |
| query_get_many_10/50000_calls_sparse | 7.4345 ms      | -36.987%            |
2022-11-01 03:51:41 +00:00