348779850b
12 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
dda97880c4
|
Implement experimental GPU two-phase occlusion culling for the standard 3D mesh pipeline. (#17413)
*Occlusion culling* allows the GPU to skip the vertex and fragment shading overhead for objects that can be quickly proved to be invisible because they're behind other geometry. A depth prepass already eliminates most fragment shading overhead for occluded objects, but the vertex shading overhead, as well as the cost of testing and rejecting fragments against the Z-buffer, is presently unavoidable for standard meshes. We currently perform occlusion culling only for meshlets. But other meshes, such as skinned meshes, can benefit from occlusion culling too in order to avoid the transform and skinning overhead for unseen meshes. This commit adapts the same [*two-phase occlusion culling*] technique that meshlets use to Bevy's standard 3D mesh pipeline when the new `OcclusionCulling` component, as well as the `DepthPrepass` component, are present on the camera. It has these steps: 1. *Early depth prepass*: We use the hierarchical Z-buffer from the previous frame to cull meshes for the initial depth prepass, effectively rendering only the meshes that were visible in the last frame. 2. *Early depth downsample*: We downsample the depth buffer to create another hierarchical Z-buffer, this time with the current view transform. 3. *Late depth prepass*: We use the new hierarchical Z-buffer to test all meshes that weren't rendered in the early depth prepass. Any meshes that pass this check are rendered. 4. *Late depth downsample*: Again, we downsample the depth buffer to create a hierarchical Z-buffer in preparation for the early depth prepass of the next frame. This step is done after all the rendering, in order to account for custom phase items that might write to the depth buffer. Note that this patch has no effect on the per-mesh CPU overhead for occluded objects, which remains high for a GPU-driven renderer due to the lack of `cold-specialization` and retained bins. If `cold-specialization` and retained bins weren't on the horizon, then a more traditional approach like potentially visible sets (PVS) or low-res CPU rendering would probably be more efficient than the GPU-driven approach that this patch implements for most scenes. However, at this point the amount of effort required to implement a PVS baking tool or a low-res CPU renderer would probably be greater than landing `cold-specialization` and retained bins, and the GPU driven approach is the more modern one anyway. It does mean that the performance improvements from occlusion culling as implemented in this patch *today* are likely to be limited, because of the high CPU overhead for occluded meshes. Note also that this patch currently doesn't implement occlusion culling for 2D objects or shadow maps. Those can be addressed in a follow-up. Additionally, note that the techniques in this patch require compute shaders, which excludes support for WebGL 2. This PR is marked experimental because of known precision issues with the downsampling approach when applied to non-power-of-two framebuffer sizes (i.e. most of them). These precision issues can, in rare cases, cause objects to be judged occluded that in fact are not. (I've never seen this in practice, but I know it's possible; it tends to be likelier to happen with small meshes.) As a follow-up to this patch, we desire to switch to the [SPD-based hi-Z buffer shader from the Granite engine], which doesn't suffer from these problems, at which point we should be able to graduate this feature from experimental status. I opted not to include that rewrite in this patch for two reasons: (1) @JMS55 is planning on doing the rewrite to coincide with the new availability of image atomic operations in Naga; (2) to reduce the scope of this patch. A new example, `occlusion_culling`, has been added. It demonstrates objects becoming quickly occluded and disoccluded by dynamic geometry and shows the number of objects that are actually being rendered. Also, a new `--occlusion-culling` switch has been added to `scene_viewer`, in order to make it easy to test this patch with large scenes like Bistro. [*two-phase occlusion culling*]: https://medium.com/@mil_kru/two-pass-occlusion-culling-4100edcad501 [Aaltonen SIGGRAPH 2015]: https://www.advances.realtimerendering.com/s2015/aaltonenhaar_siggraph2015_combined_final_footer_220dpi.pdf [Some literature]: https://gist.github.com/reduz/c5769d0e705d8ab7ac187d63be0099b5?permalink_comment_id=5040452#gistcomment-5040452 [SPD-based hi-Z buffer shader from the Granite engine]: https://github.com/Themaister/Granite/blob/master/assets/shaders/post/hiz.comp ## Migration guide * When enqueuing a custom mesh pipeline, work item buffers are now created with `bevy::render::batching::gpu_preprocessing::get_or_create_work_item_buffer`, not `PreprocessWorkItemBuffers::new`. See the `specialized_mesh_pipeline` example. ## Showcase Occlusion culling example:  Bistro zoomed out, before occlusion culling:  Bistro zoomed out, after occlusion culling:  In this scene, occlusion culling reduces the number of meshes Bevy has to render from 1591 to 585. |
||
|
|
a6adced9ed
|
Deny derive_more error feature and replace it with thiserror (#16684)
# Objective - Remove `derive_more`'s error derivation and replace it with `thiserror` ## Solution - Added `derive_more`'s `error` feature to `deny.toml` to prevent it sneaking back in. - Reverted to `thiserror` error derivation ## Notes Merge conflicts were too numerous to revert the individual changes, so this reversion was done manually. Please scrutinise carefully during review. |
||
|
|
8718adc74f
|
Remove thiserror from bevy_render (#15765)
# Objective - Contributes to #15460 ## Solution - Removed `thiserror` from `bevy_render` |
||
|
|
d70595b667
|
Add core and alloc over std Lints (#15281)
# Objective - Fixes #6370 - Closes #6581 ## Solution - Added the following lints to the workspace: - `std_instead_of_core` - `std_instead_of_alloc` - `alloc_instead_of_core` - Used `cargo +nightly fmt` with [item level use formatting](https://rust-lang.github.io/rustfmt/?version=v1.6.0&search=#Item%5C%3A) to split all `use` statements into single items. - Used `cargo clippy --workspace --all-targets --all-features --fix --allow-dirty` to _attempt_ to resolve the new linting issues, and intervened where the lint was unable to resolve the issue automatically (usually due to needing an `extern crate alloc;` statement in a crate root). - Manually removed certain uses of `std` where negative feature gating prevented `--all-features` from finding the offending uses. - Used `cargo +nightly fmt` with [crate level use formatting](https://rust-lang.github.io/rustfmt/?version=v1.6.0&search=#Crate%5C%3A) to re-merge all `use` statements matching Bevy's previous styling. - Manually fixed cases where the `fmt` tool could not re-merge `use` statements due to conditional compilation attributes. ## Testing - Ran CI locally ## Migration Guide The MSRV is now 1.81. Please update to this version or higher. ## Notes - This is a _massive_ change to try and push through, which is why I've outlined the semi-automatic steps I used to create this PR, in case this fails and someone else tries again in the future. - Making this change has no impact on user code, but does mean Bevy contributors will be warned to use `core` and `alloc` instead of `std` where possible. - This lint is a critical first step towards investigating `no_std` options for Bevy. --------- Co-authored-by: François Mockers <francois.mockers@vleue.com> |
||
|
|
d7080369a7
|
Fix intra-doc links and make CI test them (#14076)
# Objective - Bevy currently has lot of invalid intra-doc links, let's fix them! - Also make CI test them, to avoid future regressions. - Helps with #1983 (but doesn't fix it, as there could still be explicit links to docs.rs that are broken) ## Solution - Make `cargo r -p ci -- doc-check` check fail on warnings (could also be changed to just some specific lints) - Manually fix all the warnings (note that in some cases it was unclear to me what the fix should have been, I'll try to highlight them in a self-review) |
||
|
|
d87505899f
|
Update render graph docs (#13495)
# Objective I'm reading some of the rendering code for the first time; and using this opportunity to flesh out some docs for the parts that I did not understand. rather than a questionable design choice is not a breaking change. --------- Co-authored-by: BD103 <59022059+BD103@users.noreply.github.com> |
||
|
|
7b8d502083
|
Fix beta lints (#12980)
# Objective - Fixes #12976 ## Solution This one is a doozy. - Run `cargo +beta clippy --workspace --all-targets --all-features` and fix all issues - This includes: - Moving inner attributes to be outer attributes, when the item in question has both inner and outer attributes - Use `ptr::from_ref` in more scenarios - Extend the valid idents list used by `clippy:doc_markdown` with more names - Use `Clone::clone_from` when possible - Remove redundant `ron` import - Add backticks to **so many** identifiers and items - I'm sorry whoever has to review this --- ## Changelog - Added links to more identifiers in documentation. |
||
|
|
16d28ccb91
|
RenderGraph Labelization (#10644)
# Objective
The whole `Cow<'static, str>` naming for nodes and subgraphs in
`RenderGraph` is a mess.
## Solution
Replaces hardcoded and potentially overlapping strings for nodes and
subgraphs inside `RenderGraph` with bevy's labelsystem.
---
## Changelog
* Two new labels: `RenderLabel` and `RenderSubGraph`.
* Replaced all uses for hardcoded strings with those labels
* Moved `Taa` label from its own mod to all the other `Labels3d`
* `add_render_graph_edges` now needs a tuple of labels
* Moved `ScreenSpaceAmbientOcclusion` label from its own mod with the
`ShadowPass` label to `LabelsPbr`
* Removed `NodeId`
* Renamed `Edges.id()` to `Edges.label()`
* Removed `NodeLabel`
* Changed examples according to the new label system
* Introduced new `RenderLabel`s: `Labels2d`, `Labels3d`, `LabelsPbr`,
`LabelsUi`
* Introduced new `RenderSubGraph`s: `SubGraph2d`, `SubGraph3d`,
`SubGraphUi`
* Removed `Reflect` and `Default` derive from `CameraRenderGraph`
component struct
* Improved some error messages
## Migration Guide
For Nodes and SubGraphs, instead of using hardcoded strings, you now
pass labels, which can be derived with structs and enums.
```rs
// old
#[derive(Default)]
struct MyRenderNode;
impl MyRenderNode {
pub const NAME: &'static str = "my_render_node"
}
render_app
.add_render_graph_node::<ViewNodeRunner<MyRenderNode>>(
core_3d::graph::NAME,
MyRenderNode::NAME,
)
.add_render_graph_edges(
core_3d::graph::NAME,
&[
core_3d::graph::node::TONEMAPPING,
MyRenderNode::NAME,
core_3d::graph::node::END_MAIN_PASS_POST_PROCESSING,
],
);
// new
use bevy::core_pipeline::core_3d::graph::{Labels3d, SubGraph3d};
#[derive(Debug, Hash, PartialEq, Eq, Clone, RenderLabel)]
pub struct MyRenderLabel;
#[derive(Default)]
struct MyRenderNode;
render_app
.add_render_graph_node::<ViewNodeRunner<MyRenderNode>>(
SubGraph3d,
MyRenderLabel,
)
.add_render_graph_edges(
SubGraph3d,
(
Labels3d::Tonemapping,
MyRenderLabel,
Labels3d::EndMainPassPostProcessing,
),
);
```
### SubGraphs
#### in `bevy_core_pipeline::core_2d::graph`
| old string-based path | new label |
|-----------------------|-----------|
| `NAME` | `SubGraph2d` |
#### in `bevy_core_pipeline::core_3d::graph`
| old string-based path | new label |
|-----------------------|-----------|
| `NAME` | `SubGraph3d` |
#### in `bevy_ui::render`
| old string-based path | new label |
|-----------------------|-----------|
| `draw_ui_graph::NAME` | `graph::SubGraphUi` |
### Nodes
#### in `bevy_core_pipeline::core_2d::graph`
| old string-based path | new label |
|-----------------------|-----------|
| `node::MSAA_WRITEBACK` | `Labels2d::MsaaWriteback` |
| `node::MAIN_PASS` | `Labels2d::MainPass` |
| `node::BLOOM` | `Labels2d::Bloom` |
| `node::TONEMAPPING` | `Labels2d::Tonemapping` |
| `node::FXAA` | `Labels2d::Fxaa` |
| `node::UPSCALING` | `Labels2d::Upscaling` |
| `node::CONTRAST_ADAPTIVE_SHARPENING` |
`Labels2d::ConstrastAdaptiveSharpening` |
| `node::END_MAIN_PASS_POST_PROCESSING` |
`Labels2d::EndMainPassPostProcessing` |
#### in `bevy_core_pipeline::core_3d::graph`
| old string-based path | new label |
|-----------------------|-----------|
| `node::MSAA_WRITEBACK` | `Labels3d::MsaaWriteback` |
| `node::PREPASS` | `Labels3d::Prepass` |
| `node::DEFERRED_PREPASS` | `Labels3d::DeferredPrepass` |
| `node::COPY_DEFERRED_LIGHTING_ID` | `Labels3d::CopyDeferredLightingId`
|
| `node::END_PREPASSES` | `Labels3d::EndPrepasses` |
| `node::START_MAIN_PASS` | `Labels3d::StartMainPass` |
| `node::MAIN_OPAQUE_PASS` | `Labels3d::MainOpaquePass` |
| `node::MAIN_TRANSMISSIVE_PASS` | `Labels3d::MainTransmissivePass` |
| `node::MAIN_TRANSPARENT_PASS` | `Labels3d::MainTransparentPass` |
| `node::END_MAIN_PASS` | `Labels3d::EndMainPass` |
| `node::BLOOM` | `Labels3d::Bloom` |
| `node::TONEMAPPING` | `Labels3d::Tonemapping` |
| `node::FXAA` | `Labels3d::Fxaa` |
| `node::UPSCALING` | `Labels3d::Upscaling` |
| `node::CONTRAST_ADAPTIVE_SHARPENING` |
`Labels3d::ContrastAdaptiveSharpening` |
| `node::END_MAIN_PASS_POST_PROCESSING` |
`Labels3d::EndMainPassPostProcessing` |
#### in `bevy_core_pipeline`
| old string-based path | new label |
|-----------------------|-----------|
| `taa::draw_3d_graph::node::TAA` | `Labels3d::Taa` |
#### in `bevy_pbr`
| old string-based path | new label |
|-----------------------|-----------|
| `draw_3d_graph::node::SHADOW_PASS` | `LabelsPbr::ShadowPass` |
| `ssao::draw_3d_graph::node::SCREEN_SPACE_AMBIENT_OCCLUSION` |
`LabelsPbr::ScreenSpaceAmbientOcclusion` |
| `deferred::DEFFERED_LIGHTING_PASS` | `LabelsPbr::DeferredLightingPass`
|
#### in `bevy_render`
| old string-based path | new label |
|-----------------------|-----------|
| `main_graph::node::CAMERA_DRIVER` | `graph::CameraDriverLabel` |
#### in `bevy_ui::render`
| old string-based path | new label |
|-----------------------|-----------|
| `draw_ui_graph::node::UI_PASS` | `graph::LabelsUi::UiPass` |
---
## Future work
* Make `NodeSlot`s also use types. Ideally, we have an enum with unit
variants where every variant resembles one slot. Then to make sure you
are using the right slot enum and make rust-analyzer play nicely with
it, we should make an associated type in the `Node` trait. With today's
system, we can introduce 3rd party slots to a node, and i wasnt sure if
this was used, so I didn't do this in this PR.
## Unresolved Questions
When looking at the `post_processing` example, we have a struct for the
label and a struct for the node, this seems like boilerplate and on
discord, @IceSentry (sowy for the ping)
[asked](https://discord.com/channels/691052431525675048/743663924229963868/1175197016947699742)
if a node could automatically introduce a label (or i completely
misunderstood that). The problem with that is, that nodes like
`EmptyNode` exist multiple times *inside the same* (sub)graph, so there
we need extern labels to distinguish between those. Hopefully we can
find a way to reduce boilerplate and still have everything unique. For
EmptyNode, we could maybe make a macro which implements an "empty node"
for a type, but for nodes which contain code and need to be present
multiple times, this could get nasty...
|
||
|
|
2c21d423fd
|
Make render graph slots optional for most cases (#8109)
# Objective
- Currently, the render graph slots are only used to pass the
view_entity around. This introduces significant boilerplate for very
little value. Instead of using slots for this, make the view_entity part
of the `RenderGraphContext`. This also means we won't need to have
`IN_VIEW` on every node and and we'll be able to use the default impl of
`Node::input()`.
## Solution
- Add `view_entity: Option<Entity>` to the `RenderGraphContext`
- Update all nodes to use this instead of entity slot input
---
## Changelog
- Add optional `view_entity` to `RenderGraphContext`
## Migration Guide
You can now get the view_entity directly from the `RenderGraphContext`.
When implementing the Node:
```rust
// 0.10
struct FooNode;
impl FooNode {
const IN_VIEW: &'static str = "view";
}
impl Node for FooNode {
fn input(&self) -> Vec<SlotInfo> {
vec![SlotInfo::new(Self::IN_VIEW, SlotType::Entity)]
}
fn run(
&self,
graph: &mut RenderGraphContext,
// ...
) -> Result<(), NodeRunError> {
let view_entity = graph.get_input_entity(Self::IN_VIEW)?;
// ...
Ok(())
}
}
// 0.11
struct FooNode;
impl Node for FooNode {
fn run(
&self,
graph: &mut RenderGraphContext,
// ...
) -> Result<(), NodeRunError> {
let view_entity = graph.view_entity();
// ...
Ok(())
}
}
```
When adding the node to the graph, you don't need to specify a slot_edge
for the view_entity.
```rust
// 0.10
let mut graph = RenderGraph::default();
graph.add_node(FooNode::NAME, node);
let input_node_id = draw_2d_graph.set_input(vec![SlotInfo::new(
graph::input::VIEW_ENTITY,
SlotType::Entity,
)]);
graph.add_slot_edge(
input_node_id,
graph::input::VIEW_ENTITY,
FooNode::NAME,
FooNode::IN_VIEW,
);
// add_node_edge ...
// 0.11
let mut graph = RenderGraph::default();
graph.add_node(FooNode::NAME, node);
// add_node_edge ...
```
## Notes
This PR paired with #8007 will help reduce a lot of annoying boilerplate
with the render nodes. Depending on which one gets merged first. It will
require a bit of clean up work to make both compatible.
I tagged this as a breaking change, because using the old system to get
the view_entity will break things because it's not a node input slot
anymore.
## Notes for reviewers
A lot of the diffs are just removing the slots in every nodes and graph
creation. The important part is mostly in the
graph_runner/CameraDriverNode.
|
||
|
|
daa57fe489 |
Add try_* to add_slot_edge, add_node_edge (#6720)
# Objective `add_node_edge` and `add_slot_edge` are fallible methods, but are always used with `.unwrap()`. `input_node` is often unwrapped as well. This points to having an infallible behaviour as default, with an alternative fallible variant if needed. Improves readability and ergonomics. ## Solution - Change `add_node_edge` and `add_slot_edge` to panic on error. - Change `input_node` to panic on `None`. - Add `try_add_node_edge` and `try_add_slot_edge` in case fallible methods are needed. - Add `get_input_node` to still be able to get an `Option`. --- ## Changelog ### Added - `try_add_node_edge` - `try_add_slot_edge` - `get_input_node` ### Changed - `add_node_edge` is now infallible (panics on error) - `add_slot_edge` is now infallible (panics on error) - `input_node` now panics on `None` ## Migration Guide Remove `.unwrap()` from `add_node_edge` and `add_slot_edge`. For cases where the error was handled, use `try_add_node_edge` and `try_add_slot_edge` instead. Remove `.unwrap()` from `input_node`. For cases where the option was handled, use `get_input_node` instead. Co-authored-by: Torstein Grindvik <52322338+torsteingrindvik@users.noreply.github.com> |
||
|
|
cba9bcc7ba |
improve error messages for render graph runner (#3930)
# Objective Currently, errors in the render graph runner are exposed via a `Result::unwrap()` panic message, which dumps the debug representation of the error. ## Solution This PR updates `render_system` to log the chain of errors, followed by an explicit panic: ``` ERROR bevy_render::renderer: Error running render graph: ERROR bevy_render::renderer: > encountered an error when running a sub-graph ERROR bevy_render::renderer: > tried to pass inputs to sub-graph "outline_graph", which has no input slots thread 'main' panicked at 'Error running render graph: encountered an error when running a sub-graph', /[redacted]/bevy/crates/bevy_render/src/renderer/mod.rs:44:9 ``` Some errors' `Display` impls (via `thiserror`) have also been updated to provide more detail about the cause of the error. |
||
|
|
ffecb05a0a |
Replace old renderer with new renderer (#3312)
This makes the [New Bevy Renderer](#2535) the default (and only) renderer. The new renderer isn't _quite_ ready for the final release yet, but I want as many people as possible to start testing it so we can identify bugs and address feedback prior to release. The examples are all ported over and operational with a few exceptions: * I removed a good portion of the examples in the `shader` folder. We still have some work to do in order to make these examples possible / ergonomic / worthwhile: #3120 and "high level shader material plugins" are the big ones. This is a temporary measure. * Temporarily removed the multiple_windows example: doing this properly in the new renderer will require the upcoming "render targets" changes. Same goes for the render_to_texture example. * Removed z_sort_debug: entity visibility sort info is no longer available in app logic. we could do this on the "render app" side, but i dont consider it a priority. |