
# Objective Upgrade to `wgpu` version `25.0`. Depends on https://github.com/bevyengine/naga_oil/pull/121 ## Solution ### Problem The biggest issue we face upgrading is the following requirement: > To facilitate this change, there was an additional validation rule put in place: if there is a binding array in a bind group, you may not use dynamic offset buffers or uniform buffers in that bind group. This requirement comes from vulkan rules on UpdateAfterBind descriptors. This is a major difficulty for us, as there are a number of binding arrays that are used in the view bind group. Note, this requirement does not affect merely uniform buffors that use dynamic offset but the use of *any* uniform in a bind group that also has a binding array. ### Attempted fixes The easiest fix would be to change uniforms to be storage buffers whenever binding arrays are in use: ```wgsl #ifdef BINDING_ARRAYS_ARE_USED @group(0) @binding(0) var<uniform> view: View; @group(0) @binding(1) var<uniform> lights: types::Lights; #else @group(0) @binding(0) var<storage> view: array<View>; @group(0) @binding(1) var<storage> lights: array<types::Lights>; #endif ``` This requires passing the view index to the shader so that we know where to index into the buffer: ```wgsl struct PushConstants { view_index: u32, } var<push_constant> push_constants: PushConstants; ``` Using push constants is no problem because binding arrays are only usable on native anyway. However, this greatly complicates the ability to access `view` in shaders. For example: ```wgsl #ifdef BINDING_ARRAYS_ARE_USED mesh_view_bindings::view.view_from_world[0].z #else mesh_view_bindings::view[mesh_view_bindings::view_index].view_from_world[0].z #endif ``` Using this approach would work but would have the effect of polluting our shaders with ifdef spam basically *everywhere*. Why not use a function? Unfortunately, the following is not valid wgsl as it returns a binding directly from a function in the uniform path. ```wgsl fn get_view() -> View { #if BINDING_ARRAYS_ARE_USED let view_index = push_constants.view_index; let view = views[view_index]; #endif return view; } ``` This also poses problems for things like lights where we want to return a ptr to the light data. Returning ptrs from wgsl functions isn't allowed even if both bindings were buffers. The next attempt was to simply use indexed buffers everywhere, in both the binding array and non binding array path. This would be viable if push constants were available everywhere to pass the view index, but unfortunately they are not available on webgpu. This means either passing the view index in a storage buffer (not ideal for such a small amount of state) or using push constants sometimes and uniform buffers only on webgpu. However, this kind of conditional layout infects absolutely everything. Even if we were to accept just using storage buffer for the view index, there's also the additional problem that some dynamic offsets aren't actually per-view but per-use of a setting on a camera, which would require passing that uniform data on *every* camera regardless of whether that rendering feature is being used, which is also gross. As such, although it's gross, the simplest solution just to bump binding arrays into `@group(1)` and all other bindings up one bind group. This should still bring us under the device limit of 4 for most users. ### Next steps / looking towards the future I'd like to avoid needing split our view bind group into multiple parts. In the future, if `wgpu` were to add `@builtin(draw_index)`, we could build a list of draw state in gpu processing and avoid the need for any kind of state change at all (see https://github.com/gfx-rs/wgpu/issues/6823). This would also provide significantly more flexibility to handle things like offsets into other arrays that may not be per-view. ### Testing Tested a number of examples, there are probably more that are still broken. --------- Co-authored-by: François Mockers <mockersf@gmail.com> Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com>
323 lines
11 KiB
Rust
323 lines
11 KiB
Rust
//! A shader that renders a mesh multiple times in one draw call.
|
|
//!
|
|
//! Bevy will automatically batch and instance your meshes assuming you use the same
|
|
//! `Handle<Material>` and `Handle<Mesh>` for all of your instances.
|
|
//!
|
|
//! This example is intended for advanced users and shows how to make a custom instancing
|
|
//! implementation using bevy's low level rendering api.
|
|
//! It's generally recommended to try the built-in instancing before going with this approach.
|
|
|
|
use bevy::pbr::SetMeshViewBindingArrayBindGroup;
|
|
use bevy::{
|
|
core_pipeline::core_3d::Transparent3d,
|
|
ecs::{
|
|
query::QueryItem,
|
|
system::{lifetimeless::*, SystemParamItem},
|
|
},
|
|
pbr::{
|
|
MeshPipeline, MeshPipelineKey, RenderMeshInstances, SetMeshBindGroup, SetMeshViewBindGroup,
|
|
},
|
|
prelude::*,
|
|
render::{
|
|
extract_component::{ExtractComponent, ExtractComponentPlugin},
|
|
mesh::{
|
|
allocator::MeshAllocator, MeshVertexBufferLayoutRef, RenderMesh, RenderMeshBufferInfo,
|
|
},
|
|
render_asset::RenderAssets,
|
|
render_phase::{
|
|
AddRenderCommand, DrawFunctions, PhaseItem, PhaseItemExtraIndex, RenderCommand,
|
|
RenderCommandResult, SetItemPipeline, TrackedRenderPass, ViewSortedRenderPhases,
|
|
},
|
|
render_resource::*,
|
|
renderer::RenderDevice,
|
|
sync_world::MainEntity,
|
|
view::{ExtractedView, NoFrustumCulling, NoIndirectDrawing},
|
|
Render, RenderApp, RenderSystems,
|
|
},
|
|
};
|
|
use bytemuck::{Pod, Zeroable};
|
|
|
|
/// This example uses a shader source file from the assets subdirectory
|
|
const SHADER_ASSET_PATH: &str = "shaders/instancing.wgsl";
|
|
|
|
fn main() {
|
|
App::new()
|
|
.add_plugins((DefaultPlugins, CustomMaterialPlugin))
|
|
.add_systems(Startup, setup)
|
|
.run();
|
|
}
|
|
|
|
fn setup(mut commands: Commands, mut meshes: ResMut<Assets<Mesh>>) {
|
|
commands.spawn((
|
|
Mesh3d(meshes.add(Cuboid::new(0.5, 0.5, 0.5))),
|
|
InstanceMaterialData(
|
|
(1..=10)
|
|
.flat_map(|x| (1..=10).map(move |y| (x as f32 / 10.0, y as f32 / 10.0)))
|
|
.map(|(x, y)| InstanceData {
|
|
position: Vec3::new(x * 10.0 - 5.0, y * 10.0 - 5.0, 0.0),
|
|
scale: 1.0,
|
|
color: LinearRgba::from(Color::hsla(x * 360., y, 0.5, 1.0)).to_f32_array(),
|
|
})
|
|
.collect(),
|
|
),
|
|
// NOTE: Frustum culling is done based on the Aabb of the Mesh and the GlobalTransform.
|
|
// As the cube is at the origin, if its Aabb moves outside the view frustum, all the
|
|
// instanced cubes will be culled.
|
|
// The InstanceMaterialData contains the 'GlobalTransform' information for this custom
|
|
// instancing, and that is not taken into account with the built-in frustum culling.
|
|
// We must disable the built-in frustum culling by adding the `NoFrustumCulling` marker
|
|
// component to avoid incorrect culling.
|
|
NoFrustumCulling,
|
|
));
|
|
|
|
// camera
|
|
commands.spawn((
|
|
Camera3d::default(),
|
|
Transform::from_xyz(0.0, 0.0, 15.0).looking_at(Vec3::ZERO, Vec3::Y),
|
|
// We need this component because we use `draw_indexed` and `draw`
|
|
// instead of `draw_indirect_indexed` and `draw_indirect` in
|
|
// `DrawMeshInstanced::render`.
|
|
NoIndirectDrawing,
|
|
));
|
|
}
|
|
|
|
#[derive(Component, Deref)]
|
|
struct InstanceMaterialData(Vec<InstanceData>);
|
|
|
|
impl ExtractComponent for InstanceMaterialData {
|
|
type QueryData = &'static InstanceMaterialData;
|
|
type QueryFilter = ();
|
|
type Out = Self;
|
|
|
|
fn extract_component(item: QueryItem<'_, '_, Self::QueryData>) -> Option<Self> {
|
|
Some(InstanceMaterialData(item.0.clone()))
|
|
}
|
|
}
|
|
|
|
struct CustomMaterialPlugin;
|
|
|
|
impl Plugin for CustomMaterialPlugin {
|
|
fn build(&self, app: &mut App) {
|
|
app.add_plugins(ExtractComponentPlugin::<InstanceMaterialData>::default());
|
|
app.sub_app_mut(RenderApp)
|
|
.add_render_command::<Transparent3d, DrawCustom>()
|
|
.init_resource::<SpecializedMeshPipelines<CustomPipeline>>()
|
|
.add_systems(
|
|
Render,
|
|
(
|
|
queue_custom.in_set(RenderSystems::QueueMeshes),
|
|
prepare_instance_buffers.in_set(RenderSystems::PrepareResources),
|
|
),
|
|
);
|
|
}
|
|
|
|
fn finish(&self, app: &mut App) {
|
|
app.sub_app_mut(RenderApp).init_resource::<CustomPipeline>();
|
|
}
|
|
}
|
|
|
|
#[derive(Clone, Copy, Pod, Zeroable)]
|
|
#[repr(C)]
|
|
struct InstanceData {
|
|
position: Vec3,
|
|
scale: f32,
|
|
color: [f32; 4],
|
|
}
|
|
|
|
fn queue_custom(
|
|
transparent_3d_draw_functions: Res<DrawFunctions<Transparent3d>>,
|
|
custom_pipeline: Res<CustomPipeline>,
|
|
mut pipelines: ResMut<SpecializedMeshPipelines<CustomPipeline>>,
|
|
pipeline_cache: Res<PipelineCache>,
|
|
meshes: Res<RenderAssets<RenderMesh>>,
|
|
render_mesh_instances: Res<RenderMeshInstances>,
|
|
material_meshes: Query<(Entity, &MainEntity), With<InstanceMaterialData>>,
|
|
mut transparent_render_phases: ResMut<ViewSortedRenderPhases<Transparent3d>>,
|
|
views: Query<(&ExtractedView, &Msaa)>,
|
|
) {
|
|
let draw_custom = transparent_3d_draw_functions.read().id::<DrawCustom>();
|
|
|
|
for (view, msaa) in &views {
|
|
let Some(transparent_phase) = transparent_render_phases.get_mut(&view.retained_view_entity)
|
|
else {
|
|
continue;
|
|
};
|
|
|
|
let msaa_key = MeshPipelineKey::from_msaa_samples(msaa.samples());
|
|
|
|
let view_key = msaa_key | MeshPipelineKey::from_hdr(view.hdr);
|
|
let rangefinder = view.rangefinder3d();
|
|
for (entity, main_entity) in &material_meshes {
|
|
let Some(mesh_instance) = render_mesh_instances.render_mesh_queue_data(*main_entity)
|
|
else {
|
|
continue;
|
|
};
|
|
let Some(mesh) = meshes.get(mesh_instance.mesh_asset_id) else {
|
|
continue;
|
|
};
|
|
let key =
|
|
view_key | MeshPipelineKey::from_primitive_topology(mesh.primitive_topology());
|
|
let pipeline = pipelines
|
|
.specialize(&pipeline_cache, &custom_pipeline, key, &mesh.layout)
|
|
.unwrap();
|
|
transparent_phase.add(Transparent3d {
|
|
entity: (entity, *main_entity),
|
|
pipeline,
|
|
draw_function: draw_custom,
|
|
distance: rangefinder.distance_translation(&mesh_instance.translation),
|
|
batch_range: 0..1,
|
|
extra_index: PhaseItemExtraIndex::None,
|
|
indexed: true,
|
|
});
|
|
}
|
|
}
|
|
}
|
|
|
|
#[derive(Component)]
|
|
struct InstanceBuffer {
|
|
buffer: Buffer,
|
|
length: usize,
|
|
}
|
|
|
|
fn prepare_instance_buffers(
|
|
mut commands: Commands,
|
|
query: Query<(Entity, &InstanceMaterialData)>,
|
|
render_device: Res<RenderDevice>,
|
|
) {
|
|
for (entity, instance_data) in &query {
|
|
let buffer = render_device.create_buffer_with_data(&BufferInitDescriptor {
|
|
label: Some("instance data buffer"),
|
|
contents: bytemuck::cast_slice(instance_data.as_slice()),
|
|
usage: BufferUsages::VERTEX | BufferUsages::COPY_DST,
|
|
});
|
|
commands.entity(entity).insert(InstanceBuffer {
|
|
buffer,
|
|
length: instance_data.len(),
|
|
});
|
|
}
|
|
}
|
|
|
|
#[derive(Resource)]
|
|
struct CustomPipeline {
|
|
shader: Handle<Shader>,
|
|
mesh_pipeline: MeshPipeline,
|
|
}
|
|
|
|
impl FromWorld for CustomPipeline {
|
|
fn from_world(world: &mut World) -> Self {
|
|
let mesh_pipeline = world.resource::<MeshPipeline>();
|
|
|
|
CustomPipeline {
|
|
shader: world.load_asset(SHADER_ASSET_PATH),
|
|
mesh_pipeline: mesh_pipeline.clone(),
|
|
}
|
|
}
|
|
}
|
|
|
|
impl SpecializedMeshPipeline for CustomPipeline {
|
|
type Key = MeshPipelineKey;
|
|
|
|
fn specialize(
|
|
&self,
|
|
key: Self::Key,
|
|
layout: &MeshVertexBufferLayoutRef,
|
|
) -> Result<RenderPipelineDescriptor, SpecializedMeshPipelineError> {
|
|
let mut descriptor = self.mesh_pipeline.specialize(key, layout)?;
|
|
|
|
descriptor.vertex.shader = self.shader.clone();
|
|
descriptor.vertex.buffers.push(VertexBufferLayout {
|
|
array_stride: size_of::<InstanceData>() as u64,
|
|
step_mode: VertexStepMode::Instance,
|
|
attributes: vec![
|
|
VertexAttribute {
|
|
format: VertexFormat::Float32x4,
|
|
offset: 0,
|
|
shader_location: 3, // shader locations 0-2 are taken up by Position, Normal and UV attributes
|
|
},
|
|
VertexAttribute {
|
|
format: VertexFormat::Float32x4,
|
|
offset: VertexFormat::Float32x4.size(),
|
|
shader_location: 4,
|
|
},
|
|
],
|
|
});
|
|
descriptor.fragment.as_mut().unwrap().shader = self.shader.clone();
|
|
Ok(descriptor)
|
|
}
|
|
}
|
|
|
|
type DrawCustom = (
|
|
SetItemPipeline,
|
|
SetMeshViewBindGroup<0>,
|
|
SetMeshViewBindingArrayBindGroup<1>,
|
|
SetMeshBindGroup<2>,
|
|
DrawMeshInstanced,
|
|
);
|
|
|
|
struct DrawMeshInstanced;
|
|
|
|
impl<P: PhaseItem> RenderCommand<P> for DrawMeshInstanced {
|
|
type Param = (
|
|
SRes<RenderAssets<RenderMesh>>,
|
|
SRes<RenderMeshInstances>,
|
|
SRes<MeshAllocator>,
|
|
);
|
|
type ViewQuery = ();
|
|
type ItemQuery = Read<InstanceBuffer>;
|
|
|
|
#[inline]
|
|
fn render<'w>(
|
|
item: &P,
|
|
_view: (),
|
|
instance_buffer: Option<&'w InstanceBuffer>,
|
|
(meshes, render_mesh_instances, mesh_allocator): SystemParamItem<'w, '_, Self::Param>,
|
|
pass: &mut TrackedRenderPass<'w>,
|
|
) -> RenderCommandResult {
|
|
// A borrow check workaround.
|
|
let mesh_allocator = mesh_allocator.into_inner();
|
|
|
|
let Some(mesh_instance) = render_mesh_instances.render_mesh_queue_data(item.main_entity())
|
|
else {
|
|
return RenderCommandResult::Skip;
|
|
};
|
|
let Some(gpu_mesh) = meshes.into_inner().get(mesh_instance.mesh_asset_id) else {
|
|
return RenderCommandResult::Skip;
|
|
};
|
|
let Some(instance_buffer) = instance_buffer else {
|
|
return RenderCommandResult::Skip;
|
|
};
|
|
let Some(vertex_buffer_slice) =
|
|
mesh_allocator.mesh_vertex_slice(&mesh_instance.mesh_asset_id)
|
|
else {
|
|
return RenderCommandResult::Skip;
|
|
};
|
|
|
|
pass.set_vertex_buffer(0, vertex_buffer_slice.buffer.slice(..));
|
|
pass.set_vertex_buffer(1, instance_buffer.buffer.slice(..));
|
|
|
|
match &gpu_mesh.buffer_info {
|
|
RenderMeshBufferInfo::Indexed {
|
|
index_format,
|
|
count,
|
|
} => {
|
|
let Some(index_buffer_slice) =
|
|
mesh_allocator.mesh_index_slice(&mesh_instance.mesh_asset_id)
|
|
else {
|
|
return RenderCommandResult::Skip;
|
|
};
|
|
|
|
pass.set_index_buffer(index_buffer_slice.buffer.slice(..), 0, *index_format);
|
|
pass.draw_indexed(
|
|
index_buffer_slice.range.start..(index_buffer_slice.range.start + count),
|
|
vertex_buffer_slice.range.start as i32,
|
|
0..instance_buffer.length as u32,
|
|
);
|
|
}
|
|
RenderMeshBufferInfo::NonIndexed => {
|
|
pass.draw(vertex_buffer_slice.range, 0..instance_buffer.length as u32);
|
|
}
|
|
}
|
|
RenderCommandResult::Success
|
|
}
|
|
}
|