Build performance advice (#18339)

# Objective

- Fixes #18331

## Solution

- Add some info on other tools than `cargo timings`
2025-03-16 12:37:41 +01:00
commit 70514229da
parent 528e68f5bb
1 changed file with 34 additions and 1 deletion
--- a/docs/profiling.md
+++ b/docs/profiling.md
@@ -59,6 +59,7 @@ For more details, check out the [tracing span docs](https://docs.rs/tracing/*/tr
 ### Tracy profiler

 The [Tracy profiling tool](https://github.com/wolfpld/tracy) is:
+
 > A real time, nanosecond resolution, remote telemetry, hybrid frame and sampling profiler for games and other applications.

 There are binaries available for Windows, and installation / build instructions for other operating systems can be found in the [Tracy documentation PDF](https://github.com/wolfpld/tracy/releases/latest/download/tracy.pdf).
@@ -146,10 +147,42 @@ Graphics related work is not all CPU work or all GPU work, but a mix of both, an

 ## Compile time

+### General advice
+
+- Run `cargo clean` before timing a command.
+- If you are using a rustc wrapper (like `sccache`), disable it by setting `RUSTC_WRAPPER=""`.
+- Build times are noisy, so run each command more than once and take the average. [`hyperfine`](https://github.com/sharkdp/hyperfine) can do that for you and run a preparation command before each timed run (`hyperfine --prepare "sleep 1; cargo clean" "cargo build"`). Note that its `--cleanup` option only runs once, after all runs of a command have finished.
+- Avoid running benchmarks on a computer that can do power throttling or thermal throttling, like a laptop.
+- Avoid running benchmarks with a processor that has different types of cores (efficiency vs performance), unless you can force the processor to use only one type of core.
+
+### Cargo timings
+
 Append `--timings` to your app's cargo command (ex: `cargo build --timings`).
 If you want a "full" profile, make sure you run `cargo clean` first (note: this will clear previously generated reports).
 The command will tell you where it saved the report, which will be in your target directory under `cargo-timings/`.
 The report is a `.html` file and can be opened and viewed in your browser.
 This will show how much time each crate in your app's dependency tree took to build.

-![image](https://user-images.githubusercontent.com/2694663/141657811-f4e15e3b-c9fc-491b-9313-236fd8c01288.png)
+![Cargo timings](https://user-images.githubusercontent.com/2694663/141657811-f4e15e3b-c9fc-491b-9313-236fd8c01288.png)
+
+### rustc self-profile
+
+The Rust compiler (`rustc`) can generate a self-profile while compiling a crate. This is an unstable feature, but it can be used on a stable toolchain by setting `RUSTC_BOOTSTRAP=1`.
+
+The following command will generate a self-profile for the `bevy_render` crate:
+
+```sh
+RUSTC_BOOTSTRAP=1 cargo rustc --package bevy_render -- -Z self-profile -Z self-profile-events=default,args
+```
+
+This will generate a file named something like `bevy_render-<id>.mm_profdata` in the current directory. You can convert this file to a Chrome profiler trace with [`crox`](https://github.com/rust-lang/measureme/blob/master/crox/README.md) and view the resulting trace in [Perfetto](https://ui.perfetto.dev/).
+
+![rustc self-profile](https://github.com/user-attachments/assets/12645add-c647-4611-b533-9145cbcbac1c)
+
+### cargo-llvm-lines
+
+[`cargo-llvm-lines`](https://github.com/dtolnay/cargo-llvm-lines) counts the lines of LLVM IR that each generic function in your code generates across all of its instantiations. Functions that generate a lot of IR are good places to start when looking for build performance issues.
+
+### cargo-bloat
+
+[`cargo-bloat`](https://github.com/RazrFalcon/cargo-bloat) can show the size of each function in your code. This can help you identify large functions that end up in the final binary.
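
Note on the "General advice" bullets in the diff above: they could be combined into a small helper along these lines. This is only a sketch and is not part of the commit. `bench_clean_build` and `DRY_RUN` are hypothetical names introduced here for illustration. It assumes `hyperfine` is installed and uses its `--prepare` option, which runs before every timed run, so each run starts from a clean target directory.

```sh
#!/bin/sh
# Sketch only: `bench_clean_build` and `DRY_RUN` are hypothetical names,
# not something provided by cargo, hyperfine, or Bevy.

bench_clean_build() {
    # Default to `cargo build` when no command is given.
    if [ "$#" -eq 0 ]; then
        set -- cargo build
    fi
    build_cmd="$*"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it (no tools required).
        printf 'RUSTC_WRAPPER="" hyperfine --prepare "cargo clean" "%s"\n' "$build_cmd"
        return 0
    fi

    # An empty RUSTC_WRAPPER disables wrappers such as sccache for this run.
    # --prepare runs `cargo clean` before each timed run, so every run is a
    # full build and hyperfine reports the mean and spread across runs.
    RUSTC_WRAPPER="" hyperfine --prepare "cargo clean" "$build_cmd"
}

# Usage (commented out so that sourcing this file has no side effects):
# bench_clean_build cargo build --timings
```

Setting `DRY_RUN=1` prints the `hyperfine` invocation without running it, which is a cheap way to check the command before starting a long series of clean builds.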