This commit allows Bevy to bind 16 lightmaps at a time, if the current
platform supports bindless textures. Naturally, if bindless textures
aren't supported, Bevy falls back to binding only a single lightmap at a
time. As lightmaps are usually heavily atlased, I doubt many scenes will
use more than 16 lightmap textures.
This has little performance impact now, but it's desirable for us to
reap the benefits of multidraw and bindless textures on scenes that use
lightmaps. Otherwise, we might have to break batches in order to switch
those lightmaps.
Additionally, this PR slightly reduces the cost of binning because it
makes the lightmap index in `Opaque3dBinKey` 32 bits instead of an
`AssetId`.
## Migration Guide
* The `Opaque3dBinKey::lightmap_image` field is now
`Opaque3dBinKey::lightmap_slab`, which is a lightweight identifier for
an entire binding array of lightmaps.
I forgot to set `BINDLESS_SLOT_COUNT` in `ExtendedMaterial`'s
implementation of `AsBindGroup`, so it didn't actually become bindless.
In fact, it would usually crash with a shader/bind group layout
mismatch, because some parts of Bevy's renderer thought that the
resulting material was bindless while other parts didn't. This commit
corrects the situation.
I had to make `BINDLESS_SLOT_COUNT` a function instead of a constant
because the `ExtendedMaterial` version needs some logic. Unfortunately,
trait methods can't be `const fn`s, so it has to be a runtime function.
Updating dependencies; adopted version of #15696. (Supercedes #15696.)
Long answer: hashbrown is no longer using ahash by default, meaning that
we can't use the default-hasher methods with ahasher. So, we have to use
the longer-winded versions instead. This takes the opportunity to also
switch our default hasher as well, but without actually enabling the
default-hasher feature for hashbrown, meaning that we'll be able to
change our hasher more easily at the cost of all of these method calls
being obnoxious forever.
One large change from 0.15 is that `insert_unique_unchecked` is now
`unsafe`, and for cases where unsafe code was denied at the crate level,
I replaced it with `insert`.
## Migration Guide
`bevy_utils` has updated its version of `hashbrown` to 0.15 and now
defaults to `foldhash` instead of `ahash`. This means that if you've
hard-coded your hasher to `bevy_utils::AHasher` or separately used the
`ahash` crate in your code, you may need to switch to `foldhash` to
ensure that everything works like it does in Bevy.
This commit makes `StandardMaterial` use bindless textures, as
implemented in PR #16368. Non-bindless mode, as used for example in
Metal and WebGL 2, remains fully supported via a plethora of `#ifdef
BINDLESS` preprocessor definitions.
Unfortunately, this PR introduces quite a bit of unsightliness into the
PBR shaders. This is a result of the fact that WGSL supports neither
passing binding arrays to functions nor passing individual *elements* of
binding arrays to functions, except directly to texture sample
functions. Thus we're unable to use the `sample_texture` abstraction
that helped abstract over the meshlet and non-meshlet paths. I don't
think there's anything we can do to help this other than to suggest
improvements to upstream Naga.
The bindless PR (#16368) broke some examples:
* `specialized_mesh_pipeline` and `custom_shader_instancing` failed
because they expect to be able to render a mesh with no material, by
overriding enough of the render pipeline to be able to do so. This PR
fixes the issue by restoring the old behavior in which we extract meshes
even if they have no material.
* `texture_binding_array` broke because it doesn't implement
`AsBindGroup::unprepared_bind_group`. This was tricky to fix because
there's a very good reason why `texture_binding_array` doesn't implement
that method: there's no sensible way to do so with `wgpu`'s current
bindless API, due to its multiple levels of borrowed references. To fix
the example, I split `MaterialBindGroup` into
`MaterialBindlessBindGroup` and `MaterialNonBindlessBindGroup`, and
allow direct custom implementations of `AsBindGroup::as_bind_group` for
the latter type of bind groups. To opt in to the new behavior, return
the `AsBindGroupError::CreateBindGroupDirectly` error from your
`AsBindGroup::unprepared_bind_group` implementation, and Bevy will call
your custom `AsBindGroup::as_bind_group` method as before.
## Migration Guide
* Bevy will now unconditionally call
`AsBindGroup::unprepared_bind_group` for your materials, so you must no
longer panic in that function. Instead, return the new
`AsBindGroupError::CreateBindGroupDirectly` error, and Bevy will fall
back to calling `AsBindGroup::as_bind_group` as before.
This patch adds the infrastructure necessary for Bevy to support
*bindless resources*, by adding a new `#[bindless]` attribute to
`AsBindGroup`.
Classically, only a single texture (or sampler, or buffer) can be
attached to each shader binding. This means that switching materials
requires breaking a batch and issuing a new drawcall, even if the mesh
is otherwise identical. This adds significant overhead not only in the
driver but also in `wgpu`, as switching bind groups increases the amount
of validation work that `wgpu` must do.
*Bindless resources* are the typical solution to this problem. Instead
of switching bindings between each texture, the renderer instead
supplies a large *array* of all textures in the scene up front, and the
material contains an index into that array. This pattern is repeated for
buffers and samplers as well. The renderer now no longer needs to switch
binding descriptor sets while drawing the scene.
Unfortunately, as things currently stand, this approach won't quite work
for Bevy. Two aspects of `wgpu` conspire to make this ideal approach
unacceptably slow:
1. In the DX12 backend, all binding arrays (bindless resources) must
have a constant size declared in the shader, and all textures in an
array must be bound to actual textures. Changing the size requires a
recompile.
2. Changing even one texture incurs revalidation of all textures, a
process that takes time that's linear in the total size of the binding
array.
This means that declaring a large array of textures big enough to
encompass the entire scene is presently unacceptably slow. For example,
if you declare 4096 textures, then `wgpu` will have to revalidate all
4096 textures if even a single one changes. This process can take
multiple frames.
To work around this problem, this PR groups bindless resources into
small *slabs* and maintains a free list for each. The size of each slab
for the bindless arrays associated with a material is specified via the
`#[bindless(N)]` attribute. For instance, consider the following
declaration:
```rust
#[derive(AsBindGroup)]
#[bindless(16)]
struct MyMaterial {
#[buffer(0)]
color: Vec4,
#[texture(1)]
#[sampler(2)]
diffuse: Handle<Image>,
}
```
The `#[bindless(N)]` attribute specifies that, if bindless arrays are
supported on the current platform, each resource becomes a binding array
of N instances of that resource. So, for `MyMaterial` above, the `color`
attribute is exposed to the shader as `binding_array<vec4<f32>, 16>`,
the `diffuse` texture is exposed to the shader as
`binding_array<texture_2d<f32>, 16>`, and the `diffuse` sampler is
exposed to the shader as `binding_array<sampler, 16>`. Inside the
material's vertex and fragment shaders, the applicable index is
available via the `material_bind_group_slot` field of the `Mesh`
structure. So, for instance, you can access the current color like so:
```wgsl
// `uniform` binding arrays are a non-sequitur, so `uniform` is automatically promoted
// to `storage` in bindless mode.
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 4>;
...
@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
let color = material_color[mesh[in.instance_index].material_bind_group_slot];
...
}
```
Note that portable shader code can't guarantee that the current platform
supports bindless textures. Indeed, bindless mode is only available in
Vulkan and DX12. The `BINDLESS` shader definition is available for your
use to determine whether you're on a bindless platform or not. Thus a
portable version of the shader above would look like:
```wgsl
#ifdef BINDLESS
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 4>;
#else // BINDLESS
@group(2) @binding(0) var<uniform> material_color: Color;
#endif // BINDLESS
...
@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
#ifdef BINDLESS
let color = material_color[mesh[in.instance_index].material_bind_group_slot];
#else // BINDLESS
let color = material_color;
#endif // BINDLESS
...
}
```
Importantly, this PR *doesn't* update `StandardMaterial` to be bindless.
So, for example, `scene_viewer` will currently not run any faster. I
intend to update `StandardMaterial` to use bindless mode in a follow-up
patch.
A new example, `shaders/shader_material_bindless`, has been added to
demonstrate how to use this new feature.
Here's a Tracy profile of `submit_graph_commands` of this patch and an
additional patch (not submitted yet) that makes `StandardMaterial` use
bindless. Red is those patches; yellow is `main`. The scene was Bistro
Exterior with a hack that forces all textures to opaque. You can see a
1.47x mean speedup.

## Migration Guide
* `RenderAssets::prepare_asset` now takes an `AssetId` parameter.
* Bin keys now have Bevy-specific material bind group indices instead of
`wgpu` material bind group IDs, as part of the bindless change. Use the
new `MaterialBindGroupAllocator` to map from bind group index to bind
group ID.