bevy/crates/bevy_ecs/src/schedule/graph_utils.rs
Mike 6b84ba97a3
Auto insert sync points (#9822)
# Objective

- Users are often confused when their command effects are not visible in
the next system. This PR auto inserts sync points if there are deferred
buffers on a system and there are dependents on that system (systems
with after relationships).
- Manual sync points can lead to users adding more than needed and it's
hard for the user to have a global understanding of their system graph
to know which sync points can be merged. However we can easily calculate
which sync points can be merged automatically.

## Solution

1. Add new edge types to allow opting out of new behavior
2. Insert an sync point for each edge whose initial node has deferred
system params.
3. Reuse nodes if they're at the number of sync points away.

* add opt outs for specific edges with `after_ignore_deferred`,
`before_ignore_deferred` and `chain_ignore_deferred`. The
`auto_insert_apply_deferred` boolean on `ScheduleBuildSettings` can be
set to false to opt out for the whole schedule.

## Perf
This has a small negative effect on schedule build times.
```text
group                                           auto-sync                              main-for-auto-sync
-----                                           -----------                            ------------------
build_schedule/1000_schedule                    1.06       2.8±0.15s        ? ?/sec    1.00       2.7±0.06s        ? ?/sec
build_schedule/1000_schedule_noconstraints      1.01     26.2±0.88ms        ? ?/sec    1.00     25.8±0.36ms        ? ?/sec
build_schedule/100_schedule                     1.02     13.1±0.33ms        ? ?/sec    1.00     12.9±0.28ms        ? ?/sec
build_schedule/100_schedule_noconstraints       1.08   505.3±29.30µs        ? ?/sec    1.00   469.4±12.48µs        ? ?/sec
build_schedule/500_schedule                     1.00    485.5±6.29ms        ? ?/sec    1.00    485.5±9.80ms        ? ?/sec
build_schedule/500_schedule_noconstraints       1.00      6.8±0.10ms        ? ?/sec    1.02      6.9±0.16ms        ? ?/sec
```
---

## Changelog

- Auto insert sync points and added `after_ignore_deferred`,
`before_ignore_deferred`, `chain_no_deferred` and
`auto_insert_apply_deferred` APIs to opt out of this behavior

## Migration Guide

- `apply_deferred` points are added automatically when there is ordering
relationship with a system that has deferred parameters like `Commands`.
If you want to opt out of this you can switch from `after`, `before`,
and `chain` to the corresponding `ignore_deferred` API,
`after_ignore_deferred`, `before_ignore_deferred` or
`chain_ignore_deferred` for your system/set ordering.
- You can also set `ScheduleBuildSettings::auto_insert_sync_points` to
`false` if you want to do it for the whole schedule. Note that in this
mode you can still add `apply_deferred` points manually.
- For most manual insertions of `apply_deferred` you should remove them
as they cannot be merged with the automatically inserted points and
might reduce parallelizability of the system graph.

## TODO
- [x] remove any apply_deferred used in the engine
- [x] ~~decide if we should deprecate manually using apply_deferred.~~
We'll still allow inserting manual sync points for now for whatever edge
cases users might have.
- [x] Update migration guide
- [x] rerun schedule build benchmarks

---------

Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com>
2023-12-14 16:34:01 +00:00

358 lines
12 KiB
Rust

use std::fmt::Debug;
use bevy_utils::{
petgraph::{algo::TarjanScc, graphmap::NodeTrait, prelude::*},
HashMap, HashSet,
};
use fixedbitset::FixedBitSet;
use crate::schedule::set::*;
/// Unique identifier for a system or system set stored in a [`ScheduleGraph`].
///
/// [`ScheduleGraph`]: super::ScheduleGraph
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum NodeId {
/// Identifier for a system.
System(usize),
/// Identifier for a system set.
Set(usize),
}
impl NodeId {
/// Returns the internal integer value.
pub(crate) fn index(&self) -> usize {
match self {
NodeId::System(index) | NodeId::Set(index) => *index,
}
}
/// Returns `true` if the identified node is a system.
pub const fn is_system(&self) -> bool {
matches!(self, NodeId::System(_))
}
/// Returns `true` if the identified node is a system set.
pub const fn is_set(&self) -> bool {
matches!(self, NodeId::Set(_))
}
}
/// Specifies what kind of edge should be added to the dependency graph.
#[derive(Debug, Clone, Copy, Eq, PartialEq, PartialOrd, Ord, Hash)]
pub(crate) enum DependencyKind {
/// A node that should be preceded.
Before,
/// A node that should be succeeded.
After,
/// A node that should be preceded and will **not** automatically insert an instance of `apply_deferred` on the edge.
BeforeNoSync,
/// A node that should be succeeded and will **not** automatically insert an instance of `apply_deferred` on the edge.
AfterNoSync,
}
/// An edge to be added to the dependency graph.
#[derive(Clone)]
pub(crate) struct Dependency {
pub(crate) kind: DependencyKind,
pub(crate) set: InternedSystemSet,
}
impl Dependency {
pub fn new(kind: DependencyKind, set: InternedSystemSet) -> Self {
Self { kind, set }
}
}
/// Configures ambiguity detection for a single system.
#[derive(Clone, Debug, Default)]
pub(crate) enum Ambiguity {
#[default]
Check,
/// Ignore warnings with systems in any of these system sets. May contain duplicates.
IgnoreWithSet(Vec<InternedSystemSet>),
/// Ignore all warnings.
IgnoreAll,
}
#[derive(Clone, Default)]
pub(crate) struct GraphInfo {
pub(crate) sets: Vec<InternedSystemSet>,
pub(crate) dependencies: Vec<Dependency>,
pub(crate) ambiguous_with: Ambiguity,
}
/// Converts 2D row-major pair of indices into a 1D array index.
pub(crate) fn index(row: usize, col: usize, num_cols: usize) -> usize {
debug_assert!(col < num_cols);
(row * num_cols) + col
}
/// Converts a 1D array index into a 2D row-major pair of indices.
pub(crate) fn row_col(index: usize, num_cols: usize) -> (usize, usize) {
(index / num_cols, index % num_cols)
}
/// Stores the results of the graph analysis.
pub(crate) struct CheckGraphResults<V> {
/// Boolean reachability matrix for the graph.
pub(crate) reachable: FixedBitSet,
/// Pairs of nodes that have a path connecting them.
pub(crate) connected: HashSet<(V, V)>,
/// Pairs of nodes that don't have a path connecting them.
pub(crate) disconnected: Vec<(V, V)>,
/// Edges that are redundant because a longer path exists.
pub(crate) transitive_edges: Vec<(V, V)>,
/// Variant of the graph with no transitive edges.
pub(crate) transitive_reduction: DiGraphMap<V, ()>,
/// Variant of the graph with all possible transitive edges.
// TODO: this will very likely be used by "if-needed" ordering
#[allow(dead_code)]
pub(crate) transitive_closure: DiGraphMap<V, ()>,
}
impl<V: NodeTrait + Debug> Default for CheckGraphResults<V> {
fn default() -> Self {
Self {
reachable: FixedBitSet::new(),
connected: HashSet::new(),
disconnected: Vec::new(),
transitive_edges: Vec::new(),
transitive_reduction: DiGraphMap::new(),
transitive_closure: DiGraphMap::new(),
}
}
}
/// Processes a DAG and computes its:
/// - transitive reduction (along with the set of removed edges)
/// - transitive closure
/// - reachability matrix (as a bitset)
/// - pairs of nodes connected by a path
/// - pairs of nodes not connected by a path
///
/// The algorithm implemented comes from
/// ["On the calculation of transitive reduction-closure of orders"][1] by Habib, Morvan and Rampon.
///
/// [1]: https://doi.org/10.1016/0012-365X(93)90164-O
pub(crate) fn check_graph<V>(
graph: &DiGraphMap<V, ()>,
topological_order: &[V],
) -> CheckGraphResults<V>
where
V: NodeTrait + Debug,
{
if graph.node_count() == 0 {
return CheckGraphResults::default();
}
let n = graph.node_count();
// build a copy of the graph where the nodes and edges appear in topsorted order
let mut map = HashMap::with_capacity(n);
let mut topsorted = DiGraphMap::<V, ()>::new();
// iterate nodes in topological order
for (i, &node) in topological_order.iter().enumerate() {
map.insert(node, i);
topsorted.add_node(node);
// insert nodes as successors to their predecessors
for pred in graph.neighbors_directed(node, Incoming) {
topsorted.add_edge(pred, node, ());
}
}
let mut reachable = FixedBitSet::with_capacity(n * n);
let mut connected = HashSet::new();
let mut disconnected = Vec::new();
let mut transitive_edges = Vec::new();
let mut transitive_reduction = DiGraphMap::<V, ()>::new();
let mut transitive_closure = DiGraphMap::<V, ()>::new();
let mut visited = FixedBitSet::with_capacity(n);
// iterate nodes in topological order
for node in topsorted.nodes() {
transitive_reduction.add_node(node);
transitive_closure.add_node(node);
}
// iterate nodes in reverse topological order
for a in topsorted.nodes().rev() {
let index_a = *map.get(&a).unwrap();
// iterate their successors in topological order
for b in topsorted.neighbors_directed(a, Outgoing) {
let index_b = *map.get(&b).unwrap();
debug_assert!(index_a < index_b);
if !visited[index_b] {
// edge <a, b> is not redundant
transitive_reduction.add_edge(a, b, ());
transitive_closure.add_edge(a, b, ());
reachable.insert(index(index_a, index_b, n));
let successors = transitive_closure
.neighbors_directed(b, Outgoing)
.collect::<Vec<_>>();
for c in successors {
let index_c = *map.get(&c).unwrap();
debug_assert!(index_b < index_c);
if !visited[index_c] {
visited.insert(index_c);
transitive_closure.add_edge(a, c, ());
reachable.insert(index(index_a, index_c, n));
}
}
} else {
// edge <a, b> is redundant
transitive_edges.push((a, b));
}
}
visited.clear();
}
// partition pairs of nodes into "connected by path" and "not connected by path"
for i in 0..(n - 1) {
// reachable is upper triangular because the nodes were topsorted
for index in index(i, i + 1, n)..=index(i, n - 1, n) {
let (a, b) = row_col(index, n);
let pair = (topological_order[a], topological_order[b]);
if reachable[index] {
connected.insert(pair);
} else {
disconnected.push(pair);
}
}
}
// fill diagonal (nodes reach themselves)
// for i in 0..n {
// reachable.set(index(i, i, n), true);
// }
CheckGraphResults {
reachable,
connected,
disconnected,
transitive_edges,
transitive_reduction,
transitive_closure,
}
}
/// Returns the simple cycles in a strongly-connected component of a directed graph.
///
/// The algorithm implemented comes from
/// ["Finding all the elementary circuits of a directed graph"][1] by D. B. Johnson.
///
/// [1]: https://doi.org/10.1137/0204007
pub fn simple_cycles_in_component<N>(graph: &DiGraphMap<N, ()>, scc: &[N]) -> Vec<Vec<N>>
where
N: NodeTrait + Debug,
{
let mut cycles = vec![];
let mut sccs = vec![scc.to_vec()];
while let Some(mut scc) = sccs.pop() {
// only look at nodes and edges in this strongly-connected component
let mut subgraph = DiGraphMap::new();
for &node in &scc {
subgraph.add_node(node);
}
for &node in &scc {
for successor in graph.neighbors(node) {
if subgraph.contains_node(successor) {
subgraph.add_edge(node, successor, ());
}
}
}
// path of nodes that may form a cycle
let mut path = Vec::with_capacity(subgraph.node_count());
// we mark nodes as "blocked" to avoid finding permutations of the same cycles
let mut blocked = HashSet::with_capacity(subgraph.node_count());
// connects nodes along path segments that can't be part of a cycle (given current root)
// those nodes can be unblocked at the same time
let mut unblock_together: HashMap<N, HashSet<N>> =
HashMap::with_capacity(subgraph.node_count());
// stack for unblocking nodes
let mut unblock_stack = Vec::with_capacity(subgraph.node_count());
// nodes can be involved in multiple cycles
let mut maybe_in_more_cycles: HashSet<N> = HashSet::with_capacity(subgraph.node_count());
// stack for DFS
let mut stack = Vec::with_capacity(subgraph.node_count());
// we're going to look for all cycles that begin and end at this node
let root = scc.pop().unwrap();
// start a path at the root
path.clear();
path.push(root);
// mark this node as blocked
blocked.insert(root);
// DFS
stack.clear();
stack.push((root, subgraph.neighbors(root)));
while !stack.is_empty() {
let (ref node, successors) = stack.last_mut().unwrap();
if let Some(next) = successors.next() {
if next == root {
// found a cycle
maybe_in_more_cycles.extend(path.iter());
cycles.push(path.clone());
} else if !blocked.contains(&next) {
// first time seeing `next` on this path
maybe_in_more_cycles.remove(&next);
path.push(next);
blocked.insert(next);
stack.push((next, subgraph.neighbors(next)));
continue;
} else {
// not first time seeing `next` on this path
}
}
if successors.peekable().peek().is_none() {
if maybe_in_more_cycles.contains(node) {
unblock_stack.push(*node);
// unblock this node's ancestors
while let Some(n) = unblock_stack.pop() {
if blocked.remove(&n) {
let unblock_predecessors =
unblock_together.entry(n).or_insert_with(HashSet::new);
unblock_stack.extend(unblock_predecessors.iter());
unblock_predecessors.clear();
}
}
} else {
// if its descendants can be unblocked later, this node will be too
for successor in subgraph.neighbors(*node) {
unblock_together
.entry(successor)
.or_insert_with(HashSet::new)
.insert(*node);
}
}
// remove node from path and DFS stack
path.pop();
stack.pop();
}
}
// remove node from subgraph
subgraph.remove_node(root);
// divide remainder into smaller SCCs
let mut tarjan_scc = TarjanScc::new();
tarjan_scc.run(&subgraph, |scc| {
if scc.len() > 1 {
sccs.push(scc.to_vec());
}
});
}
cycles
}