bevy/crates
Patrick Walton 4dadebd9c4
Improve performance by binning together opaque items instead of sorting them. (#12453)
Today, we sort all entities added to all phases, even the phases that
don't strictly need sorting, such as the opaque and shadow phases. This
results in a performance loss because our `PhaseItem`s are rather large
in memory, so sorting is slow. Additionally, determining the boundaries
of batches is an O(n) process.

This commit instead makes Bevy place applicable phase items into *bins*
keyed by *bin keys*, which have the invariant that everything in the
same bin is potentially batchable. This makes determining batch
boundaries O(1), because everything in the same bin can be batched.
Instead of sorting each entity, we now sort only the bin keys. This
drops the sorting time to near-zero on workloads with few bins like
`many_cubes --no-frustum-culling`. Memory usage is improved too, with
batch boundaries and dynamic indices now implicit instead of explicit.
The improved memory usage results in a significant win even on
unbatchable workloads like `many_cubes --no-frustum-culling
--vary-material-data-per-instance`, presumably due to cache effects.
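
The binning idea described above can be sketched as follows. This is a minimal, self-contained illustration and not Bevy's actual implementation: `BinKey`, its `pipeline_id` and `material_id` fields, `BinnedPhase`, and the use of plain `u32` entity IDs are all hypothetical names chosen for the example. It shows only the core point, that entities are grouped by key and only the keys get sorted.

```rust
use std::collections::HashMap;

/// Hypothetical bin key (not Bevy's type): items sharing a key are
/// assumed to be batchable together.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct BinKey {
    pipeline_id: u32,
    material_id: u32,
}

/// Hypothetical binned phase: entities are grouped per key instead of
/// being stored as one flat list that must be sorted.
#[derive(Default)]
struct BinnedPhase {
    bins: HashMap<BinKey, Vec<u32>>, // entity IDs per bin
}

impl BinnedPhase {
    /// Adding an item is a hash lookup plus a push; no per-item sort key.
    fn add(&mut self, key: BinKey, entity: u32) {
        self.bins.entry(key).or_default().push(entity);
    }

    /// Sorts only the bin keys. Each bin is returned whole, so batch
    /// boundaries are implicit: one bin, one candidate batch.
    fn batches(&self) -> Vec<(BinKey, &[u32])> {
        let mut keys: Vec<BinKey> = self.bins.keys().copied().collect();
        keys.sort_unstable();
        keys.into_iter()
            .map(|k| (k, self.bins[&k].as_slice()))
            .collect()
    }
}

fn main() {
    let mut phase = BinnedPhase::default();
    let a = BinKey { pipeline_id: 1, material_id: 7 };
    let b = BinKey { pipeline_id: 0, material_id: 3 };
    phase.add(a, 10);
    phase.add(b, 11);
    phase.add(a, 12);

    for (key, entities) in phase.batches() {
        println!("{key:?}: {entities:?}");
    }
}
```

With few distinct keys, as in `many_cubes --no-frustum-culling`, the sort in `batches` touches only a handful of elements regardless of how many entities were added.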

Not all phases can be binned; some, such as transparent and transmissive
phases, must still be sorted. To handle this, this commit splits
`PhaseItem` into `BinnedPhaseItem` and `SortedPhaseItem`. Most of the
logic that today deals with `PhaseItem`s has been moved to
`SortedPhaseItem`. `BinnedPhaseItem` has the new logic.
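
For contrast, a sorted phase still needs a per-item ordering. The sketch below assumes, as is conventional for alpha blending, that transparent items are drawn back to front by view distance; the commit message itself does not state the ordering. `TransparentItem`, its fields, and `sort_back_to_front` are hypothetical and are not part of Bevy's API.

```rust
/// Hypothetical transparent item (not Bevy's type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct TransparentItem {
    entity: u32,
    distance: f32, // assumed: distance from the camera
}

/// Sorts every item, farthest first. Unlike binning, the cost grows
/// with the number of items, not the number of distinct keys.
fn sort_back_to_front(items: &mut [TransparentItem]) {
    // `total_cmp` gives a total order on f32, so NaN cannot panic the sort.
    items.sort_by(|a, b| b.distance.total_cmp(&a.distance));
}

fn main() {
    let mut items = vec![
        TransparentItem { entity: 1, distance: 1.0 },
        TransparentItem { entity: 2, distance: 5.0 },
        TransparentItem { entity: 3, distance: 3.0 },
    ];
    sort_back_to_front(&mut items);
    println!("{items:?}");
}
```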

Frame time results (in ms/frame) are as follows:

| Benchmark              | `binning` | `main`  | Speedup |
| ---------------------- | --------- | ------- | ------- |
| `many_cubes -nfc -vpi` | 232.179   | 312.123 | 34.43%  |
| `many_cubes -nfc`      | 25.874    | 30.117  | 16.40%  |
| `many_foxes`           | 3.276     | 3.515   | 7.30%   |

(`-nfc` is short for `--no-frustum-culling`; `-vpi` is short for
`--vary-material-data-per-instance`.)
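
The Speedup column appears to be the ratio of the `main` frame time to the `binning` frame time, minus one. The short check below reproduces the three percentages under that assumption; `speedup_percent` is a hypothetical helper written for this example.

```rust
/// Relative speedup as a percentage: how much longer a frame took on
/// `main` compared with the `binning` branch.
fn speedup_percent(main_ms: f64, binning_ms: f64) -> f64 {
    (main_ms / binning_ms - 1.0) * 100.0
}

fn main() {
    // (benchmark, binning ms/frame, main ms/frame), copied from the table.
    let rows = [
        ("many_cubes -nfc -vpi", 232.179, 312.123),
        ("many_cubes -nfc", 25.874, 30.117),
        ("many_foxes", 3.276, 3.515),
    ];
    for (name, binning_ms, main_ms) in rows {
        println!("{name}: {:.2}%", speedup_percent(main_ms, binning_ms));
    }
}
```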

---

## Changelog

### Changed

* Render phases have been split into binned and sorted phases. Binned
phases, such as the common opaque phase, achieve improved CPU
performance by avoiding the sorting step.

## Migration Guide

- `PhaseItem` has been split into `BinnedPhaseItem` and
`SortedPhaseItem`. If your code has custom `PhaseItem`s, you will need
to migrate them to one of these two types. `SortedPhaseItem` requires
the fewest code changes, but you may want to pick `BinnedPhaseItem` if
your phase doesn't require sorting, as that enables higher performance.

## Tracy graphs

`many_cubes --no-frustum-culling`, `main` branch:
<img width="1064" alt="Screenshot 2024-03-12 180037"
src="https://github.com/bevyengine/bevy/assets/157897/e1180ce8-8e89-46d2-85e3-f59f72109a55">

`many_cubes --no-frustum-culling`, this branch:
<img width="1064" alt="Screenshot 2024-03-12 180011"
src="https://github.com/bevyengine/bevy/assets/157897/0899f036-6075-44c5-a972-44d95895f46c">

You can see that `batch_and_prepare_binned_render_phase` is a much
smaller fraction of the time. Zooming in on that function, with yellow
being this branch and red being `main`, we see:

<img width="1064" alt="Screenshot 2024-03-12 175832"
src="https://github.com/bevyengine/bevy/assets/157897/0dfc8d3f-49f4-496e-8825-a66e64d356d0">

The binning happens in `queue_material_meshes`. Again with yellow being
this branch and red being `main`:
<img width="1064" alt="Screenshot 2024-03-12 175755"
src="https://github.com/bevyengine/bevy/assets/157897/b9b20dc1-11c8-400c-a6cc-1c2e09c1bb96">

We can see that there is a small regression in `queue_material_meshes`
performance, but it's not nearly enough to outweigh the large gains in
`batch_and_prepare_binned_render_phase`.

---------

Co-authored-by: James Liu <contact@jamessliu.com>
2024-03-30 02:55:02 +00:00
..
bevy_a11y Set the logo and favicon for all of Bevy's published crates (#12696) 2024-03-25 18:52:50 +00:00
bevy_animation Move FloatOrd into bevy_math (#12732) 2024-03-27 18:30:11 +00:00
bevy_app Move PanicHandlerPlugin into bevy_app (#12640) 2024-03-29 02:04:56 +00:00
bevy_asset Allow converting mutable handle borrows to AssetId. (#12759) 2024-03-28 15:53:26 +00:00
bevy_audio updated audio_source.rs documentation (#12765) 2024-03-28 19:10:09 +00:00
bevy_color Move Point out of cubic splines module and expand it (#12747) 2024-03-28 13:40:26 +00:00
bevy_core Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_core_pipeline Improve performance by binning together opaque items instead of sorting them. (#12453) 2024-03-30 02:55:02 +00:00
bevy_derive Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_dev_tools Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_diagnostic Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_dylib Set the logo and favicon for all of Bevy's published crates (#12696) 2024-03-25 18:52:50 +00:00
bevy_dynamic_plugin Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_ecs Add QueryState::contains, document complexity, and make as_nop pub(crate) (#12776) 2024-03-29 14:49:43 +00:00
bevy_ecs_compile_fail_tests Fix Ci failing over dead code in tests (#12623) 2024-03-21 18:08:47 +00:00
bevy_encase_derive Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_gilrs Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_gizmos Improve performance by binning together opaque items instead of sorting them. (#12453) 2024-03-30 02:55:02 +00:00
bevy_gltf Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_hierarchy Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_input Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_internal Fix ambiguities causing a crash (#12780) 2024-03-29 16:00:13 +00:00
bevy_log Fix unhandled null characters in Android logs (#12743) 2024-03-29 03:04:46 +00:00
bevy_macro_utils Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_macros_compile_fail_tests Fix Ci failing over dead code in tests (#12623) 2024-03-21 18:08:47 +00:00
bevy_math Fix Triangle2d/Triangle3d interior sampling to correctly follow triangle (#12766) 2024-03-29 13:10:23 +00:00
bevy_mikktspace Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_pbr Improve performance by binning together opaque items instead of sorting them. (#12453) 2024-03-30 02:55:02 +00:00
bevy_ptr Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_reflect Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_reflect_compile_fail_tests Fix Ci failing over dead code in tests (#12623) 2024-03-21 18:08:47 +00:00
bevy_render Improve performance by binning together opaque items instead of sorting them. (#12453) 2024-03-30 02:55:02 +00:00
bevy_scene Update the Children component of the parent entity when a scene gets deleted (#12710) 2024-03-29 13:13:32 +00:00
bevy_sprite Improve performance by binning together opaque items instead of sorting them. (#12453) 2024-03-30 02:55:02 +00:00
bevy_tasks Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_text Move FloatOrd into bevy_math (#12732) 2024-03-27 18:30:11 +00:00
bevy_time Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_transform Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_ui Improve performance by binning together opaque items instead of sorting them. (#12453) 2024-03-30 02:55:02 +00:00
bevy_utils Move FloatOrd into bevy_math (#12732) 2024-03-27 18:30:11 +00:00
bevy_window Forbid unsafe in most crates in the engine (#12684) 2024-03-27 03:30:08 +00:00
bevy_winit Add Clone to WinitSettings (#12787) 2024-03-29 17:24:11 +00:00