Fix performance regression with shadow mapping (#7914)

# Objective

- @mockersf identified a performance regression introduced by #7784: frame times were about 25% longer in a complex scene built from the Amazon Lumberyard bistro scene (both exterior and interior variants) with a number of point lights that have shadow mapping enabled
  - The additional time seemed to be spent in the `ShadowPassNode`
  - `ShadowPassNode` encodes the draw commands for the shadow phase. Roughly the same number of entities were having draw commands encoded, so something about the way they were being encoded had changed.
  - One thing that definitely changed was that the pipeline used now depends on the alpha mode, and the scene has lots of entities with opaque and blend materials. This suggested that the pipeline might be changing frequently between draws, so I tried a quick hack to see whether that was the problem.

## Solution

- Sort the shadow phase items by their pipeline id
  - This groups phase items by their pipeline id, which reduces pipeline rebinding so significantly that the performance regression was gone.
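
The effect of grouping can be illustrated with a minimal, self-contained sketch. It is not Bevy code: `Item` and `count_pipeline_binds` are hypothetical names, and it assumes a draw loop that rebinds only when the pipeline id differs from the previous item's. The standard library's `sort_by_key` stands in for `radsort::sort_by_key`.

```rust
/// Hypothetical stand-in for a shadow phase item; only the pipeline id matters here.
#[derive(Clone, Copy, Debug)]
struct Item {
    pipeline_id: usize,
}

/// Counts how many pipeline binds a draw loop would issue, assuming it
/// rebinds only when the id differs from the previously drawn item's id.
fn count_pipeline_binds(items: &[Item]) -> usize {
    let mut binds = 0;
    let mut current: Option<usize> = None;
    for item in items {
        if current != Some(item.pipeline_id) {
            binds += 1;
            current = Some(item.pipeline_id);
        }
    }
    binds
}

fn main() {
    // Eight items alternating between two pipelines (e.g. opaque and blend),
    // which is the worst case for an order that ignores the pipeline.
    let mut items: Vec<Item> = (0..8).map(|i| Item { pipeline_id: i % 2 }).collect();
    let before = count_pipeline_binds(&items);

    // Grouping by pipeline id; std's stable sort is used here for illustration.
    items.sort_by_key(|item| item.pipeline_id);
    let after = count_pipeline_binds(&items);

    println!("binds before sorting: {before}, after sorting: {after}");
}
```

In this toy case the number of binds drops from one per item to one per distinct pipeline. The real saving depends on how interleaved the pipelines are in the unsorted phase.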
Robert Swain 2023-03-06 00:00:40 +00:00
parent 10e6122c64
commit 2c0ff950d1
2 changed files with 16 additions and 4 deletions


@@ -26,7 +26,6 @@ use bevy_render::{
     Extract,
 };
 use bevy_transform::{components::GlobalTransform, prelude::Transform};
-use bevy_utils::FloatOrd;
 use bevy_utils::{
     tracing::{error, warn},
     HashMap,
@@ -1653,7 +1652,7 @@ pub struct Shadow {
 }
 impl PhaseItem for Shadow {
-    type SortKey = FloatOrd;
+    type SortKey = usize;
     #[inline]
     fn entity(&self) -> Entity {
@@ -1662,7 +1661,7 @@ impl PhaseItem for Shadow {
     #[inline]
     fn sort_key(&self) -> Self::SortKey {
-        FloatOrd(self.distance)
+        self.pipeline.id()
     }
     #[inline]
@@ -1672,7 +1671,10 @@ impl PhaseItem for Shadow {
     #[inline]
     fn sort(items: &mut [Self]) {
-        radsort::sort_by_key(items, |item| item.distance);
+        // The shadow phase is sorted by pipeline id for performance reasons.
+        // Grouping all draw commands using the same pipeline together performs
+        // better than rebinding everything at a high rate.
+        radsort::sort_by_key(items, |item| item.pipeline.id());
     }
 }
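
As a rough illustration of the pattern in this file, the sketch below mirrors the shape of the change using simplified stand-ins. `PipelineId`, `SortedItem` and `ShadowItem` are hypothetical names rather than Bevy's real `CachedRenderPipelineId`, `PhaseItem` and `Shadow` types, and the standard library's stable `sort_by_key` is used in place of `radsort::sort_by_key`.

```rust
/// Hypothetical, simplified stand-in for a cached pipeline id newtype.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PipelineId(usize);

impl PipelineId {
    /// Exposes the wrapped index so it can be used as a sort key.
    #[inline]
    fn id(&self) -> usize {
        self.0
    }
}

/// Simplified stand-in for a phase item trait; not Bevy's real `PhaseItem`.
trait SortedItem: Sized {
    type SortKey: Ord;

    fn sort_key(&self) -> Self::SortKey;

    /// Default ordering: stable sort by the item's sort key.
    fn sort(items: &mut [Self]) {
        items.sort_by_key(|item| item.sort_key());
    }
}

#[derive(Debug)]
struct ShadowItem {
    pipeline: PipelineId,
    entity: u32,
}

impl SortedItem for ShadowItem {
    // The key is the pipeline index rather than a view-space distance.
    type SortKey = usize;

    #[inline]
    fn sort_key(&self) -> Self::SortKey {
        self.pipeline.id()
    }
}

fn main() {
    let mut items = vec![
        ShadowItem { pipeline: PipelineId(2), entity: 10 },
        ShadowItem { pipeline: PipelineId(0), entity: 11 },
        ShadowItem { pipeline: PipelineId(2), entity: 12 },
        ShadowItem { pipeline: PipelineId(0), entity: 13 },
    ];
    ShadowItem::sort(&mut items);

    // Items sharing a pipeline are now adjacent.
    let order: Vec<u32> = items.iter().map(|item| item.entity).collect();
    println!("{order:?}");
}
```

Because the key is an integer index, it can be compared directly without a wrapper such as `FloatOrd`, which the old distance-based key needed.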


@@ -56,6 +56,11 @@ pub struct CachedRenderPipelineId(CachedPipelineId);
 impl CachedRenderPipelineId {
     /// An invalid cached render pipeline index, often used to initialize a variable.
     pub const INVALID: Self = CachedRenderPipelineId(usize::MAX);
+    #[inline]
+    pub fn id(&self) -> usize {
+        self.0
+    }
 }
 /// Index of a cached compute pipeline in a [`PipelineCache`].
@@ -65,6 +70,11 @@ pub struct CachedComputePipelineId(CachedPipelineId);
 impl CachedComputePipelineId {
     /// An invalid cached compute pipeline index, often used to initialize a variable.
     pub const INVALID: Self = CachedComputePipelineId(usize::MAX);
+    #[inline]
+    pub fn id(&self) -> usize {
+        self.0
+    }
 }
 pub struct CachedPipeline {