bevy/crates/bevy_ecs/src/identifier/mod.rs
Gonçalo Rica Pais da Silva e6a324a11a
Unified identifer for entities & relations (#9797)
# Objective

The purpose of this PR is to begin putting together a unified identifier
structure that can be used by entities and later components (as
entities) as well as relationship pairs for relations, to enable all of
these to be able to use the same storages. For the moment, to keep
things small and focused, only `Entity` is being changed to make use of
the new `Identifier` type, keeping `Entity`'s API and
serialization/deserialization the same. Further changes are for
follow-up PRs.

## Solution

`Identifier` is a wrapper around `u64` split into two `u32` segments
with the idea of being generalised to not impose restrictions on
variants. That is for `Entity` to do. Instead, it is a general API for
taking bits to then merge and map into a `u64` integer. It exposes
low/high methods to return the two value portions as `u32` integers,
with then the MSB masked for usage as a type flag, enabling entity kind
discrimination and future activation/deactivation semantics.

The layout in this PR for `Identifier` is described as below, going from
MSB -> LSB.

```
|F| High value                    | Low value                      |
|_|_______________________________|________________________________|
|1| 31                            | 32                             |

F = Bit Flags
```

The high component in this implementation has only 31 bits, but that
still leaves 2^31 or 2,147,483,648 values that can be stored still, more
than enough for any generation/relation kinds/etc usage. The low part is
a full 32-bit index. The flags allow for 1 bit to be used for
entity/pair discrimination, as these have different usages for the
low/high portions of the `Identifier`. More bits can be reserved for
more variants or activation/deactivation purposes, but this currently
has no use in bevy.

More bits could be reserved for future features at the cost of bits for
the high component, so how much to reserve is up for discussion. Also,
naming of the struct and methods are also subject to further
bikeshedding and feedback.

Also, because IDs can have different variants, I wonder if
`Entity::from_bits` needs to return a `Result` instead of potentially
panicking on receiving an invalid ID.

PR is provided as an early WIP to obtain feedback and notes on whether
this approach is viable.

---

## Changelog

### Added

New `Identifier` struct for unifying IDs.

### Changed

`Entity` changed to use new `Identifier`/`IdentifierMask` as the
underlying ID logic.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: vero <email@atlasdostal.com>
2024-01-13 01:09:32 +00:00

242 lines
9.7 KiB
Rust

//! A module for the unified [`Identifier`] ID struct, for use as a representation
//! of multiple types of IDs in a single, packed type. Allows for describing an [`crate::entity::Entity`],
//! or other IDs that can be packed and expressed within a `u64` sized type.
//! [`Identifier`]s cannot be created directly, only able to be converted from other
//! compatible IDs.
use self::{error::IdentifierError, kinds::IdKind, masks::IdentifierMask};
use std::{hash::Hash, num::NonZeroU32};
pub mod error;
pub(crate) mod kinds;
pub(crate) mod masks;
/// A unified identifier for all entity and similar IDs.
/// Has the same size as a `u64` integer, but the layout is split between a 32-bit low
/// segment, a 31-bit high segment, and the significant bit reserved as type flags to denote
/// entity kinds.
#[derive(Debug, Clone, Copy)]
// Alignment repr necessary to allow LLVM to better output
// optimised codegen for `to_bits`, `PartialEq` and `Ord`.
#[repr(C, align(8))]
pub struct Identifier {
// Do not reorder the fields here. The ordering is explicitly used by repr(C)
// to make this struct equivalent to a u64.
#[cfg(target_endian = "little")]
low: u32,
high: NonZeroU32,
#[cfg(target_endian = "big")]
low: u32,
}
impl Identifier {
/// Construct a new [`Identifier`]. The `high` parameter is masked with the
/// `kind` so to pack the high value and bit flags into the same field.
#[inline(always)]
pub const fn new(low: u32, high: u32, kind: IdKind) -> Result<Self, IdentifierError> {
// the high bits are masked to cut off the most significant bit
// as these are used for the type flags. This means that the high
// portion is only 31 bits, but this still provides 2^31
// values/kinds/ids that can be stored in this segment.
let masked_value = IdentifierMask::extract_value_from_high(high);
let packed_high = IdentifierMask::pack_kind_into_high(masked_value, kind);
// If the packed high component ends up being zero, that means that we tried
// to initialise an Identifier into an invalid state.
if packed_high == 0 {
Err(IdentifierError::InvalidIdentifier)
} else {
// SAFETY: The high value has been checked to ensure it is never
// zero.
unsafe {
Ok(Self {
low,
high: NonZeroU32::new_unchecked(packed_high),
})
}
}
}
/// Returns the value of the low segment of the [`Identifier`].
#[inline(always)]
pub const fn low(self) -> u32 {
self.low
}
/// Returns the value of the high segment of the [`Identifier`]. This
/// does not apply any masking.
#[inline(always)]
pub const fn high(self) -> NonZeroU32 {
self.high
}
/// Returns the masked value of the high segment of the [`Identifier`].
/// Does not include the flag bits.
#[inline(always)]
pub const fn masked_high(self) -> u32 {
IdentifierMask::extract_value_from_high(self.high.get())
}
/// Returns the kind of [`Identifier`] from the high segment.
#[inline(always)]
pub const fn kind(self) -> IdKind {
IdentifierMask::extract_kind_from_high(self.high.get())
}
/// Convert the [`Identifier`] into a `u64`.
#[inline(always)]
pub const fn to_bits(self) -> u64 {
IdentifierMask::pack_into_u64(self.low, self.high.get())
}
/// Convert a `u64` into an [`Identifier`].
///
/// # Panics
///
/// This method will likely panic if given `u64` values that did not come from [`Identifier::to_bits`].
#[inline(always)]
pub const fn from_bits(value: u64) -> Self {
let id = Self::try_from_bits(value);
match id {
Ok(id) => id,
Err(_) => panic!("Attempted to initialise invalid bits as an id"),
}
}
/// Convert a `u64` into an [`Identifier`].
///
/// This method is the fallible counterpart to [`Identifier::from_bits`].
#[inline(always)]
pub const fn try_from_bits(value: u64) -> Result<Self, IdentifierError> {
let high = NonZeroU32::new(IdentifierMask::get_high(value));
match high {
Some(high) => Ok(Self {
low: IdentifierMask::get_low(value),
high,
}),
None => Err(IdentifierError::InvalidIdentifier),
}
}
}
// By not short-circuiting in comparisons, we get better codegen.
// See <https://github.com/rust-lang/rust/issues/117800>
impl PartialEq for Identifier {
#[inline]
fn eq(&self, other: &Self) -> bool {
// By using `to_bits`, the codegen can be optimised out even
// further potentially. Relies on the correct alignment/field
// order of `Entity`.
self.to_bits() == other.to_bits()
}
}
impl Eq for Identifier {}
// The derive macro codegen output is not optimal and can't be optimised as well
// by the compiler. This impl resolves the issue of non-optimal codegen by relying
// on comparing against the bit representation of `Entity` instead of comparing
// the fields. The result is then LLVM is able to optimise the codegen for Entity
// far beyond what the derive macro can.
// See <https://github.com/rust-lang/rust/issues/106107>
impl PartialOrd for Identifier {
#[inline]
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
// Make use of our `Ord` impl to ensure optimal codegen output
Some(self.cmp(other))
}
}
// The derive macro codegen output is not optimal and can't be optimised as well
// by the compiler. This impl resolves the issue of non-optimal codegen by relying
// on comparing against the bit representation of `Entity` instead of comparing
// the fields. The result is then LLVM is able to optimise the codegen for Entity
// far beyond what the derive macro can.
// See <https://github.com/rust-lang/rust/issues/106107>
impl Ord for Identifier {
#[inline]
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
// This will result in better codegen for ordering comparisons, plus
// avoids pitfalls with regards to macro codegen relying on property
// position when we want to compare against the bit representation.
self.to_bits().cmp(&other.to_bits())
}
}
impl Hash for Identifier {
#[inline]
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.to_bits().hash(state);
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn id_construction() {
let id = Identifier::new(12, 55, IdKind::Entity).unwrap();
assert_eq!(id.low(), 12);
assert_eq!(id.high().get(), 55);
assert_eq!(
IdentifierMask::extract_kind_from_high(id.high().get()),
IdKind::Entity
);
}
#[test]
fn from_bits() {
// This high value should correspond to the max high() value
// and also Entity flag.
let high = 0x7FFFFFFF;
let low = 0xC;
let bits: u64 = high << u32::BITS | low;
let id = Identifier::try_from_bits(bits).unwrap();
assert_eq!(id.to_bits(), 0x7FFFFFFF0000000C);
assert_eq!(id.low(), low as u32);
assert_eq!(id.high().get(), 0x7FFFFFFF);
assert_eq!(
IdentifierMask::extract_kind_from_high(id.high().get()),
IdKind::Entity
);
}
#[rustfmt::skip]
#[test]
fn id_comparison() {
// This is intentionally testing `lt` and `ge` as separate functions.
#![allow(clippy::nonminimal_bool)]
assert!(Identifier::new(123, 456, IdKind::Entity).unwrap() == Identifier::new(123, 456, IdKind::Entity).unwrap());
assert!(Identifier::new(123, 456, IdKind::Placeholder).unwrap() == Identifier::new(123, 456, IdKind::Placeholder).unwrap());
assert!(Identifier::new(123, 789, IdKind::Entity).unwrap() != Identifier::new(123, 456, IdKind::Entity).unwrap());
assert!(Identifier::new(123, 456, IdKind::Entity).unwrap() != Identifier::new(123, 789, IdKind::Entity).unwrap());
assert!(Identifier::new(123, 456, IdKind::Entity).unwrap() != Identifier::new(456, 123, IdKind::Entity).unwrap());
assert!(Identifier::new(123, 456, IdKind::Entity).unwrap() != Identifier::new(123, 456, IdKind::Placeholder).unwrap());
// ordering is by flag then high then by low
assert!(Identifier::new(123, 456, IdKind::Entity).unwrap() >= Identifier::new(123, 456, IdKind::Entity).unwrap());
assert!(Identifier::new(123, 456, IdKind::Entity).unwrap() <= Identifier::new(123, 456, IdKind::Entity).unwrap());
assert!(!(Identifier::new(123, 456, IdKind::Entity).unwrap() < Identifier::new(123, 456, IdKind::Entity).unwrap()));
assert!(!(Identifier::new(123, 456, IdKind::Entity).unwrap() > Identifier::new(123, 456, IdKind::Entity).unwrap()));
assert!(Identifier::new(9, 1, IdKind::Entity).unwrap() < Identifier::new(1, 9, IdKind::Entity).unwrap());
assert!(Identifier::new(1, 9, IdKind::Entity).unwrap() > Identifier::new(9, 1, IdKind::Entity).unwrap());
assert!(Identifier::new(9, 1, IdKind::Entity).unwrap() < Identifier::new(9, 1, IdKind::Placeholder).unwrap());
assert!(Identifier::new(1, 9, IdKind::Placeholder).unwrap() > Identifier::new(1, 9, IdKind::Entity).unwrap());
assert!(Identifier::new(1, 1, IdKind::Entity).unwrap() < Identifier::new(2, 1, IdKind::Entity).unwrap());
assert!(Identifier::new(1, 1, IdKind::Entity).unwrap() <= Identifier::new(2, 1, IdKind::Entity).unwrap());
assert!(Identifier::new(2, 2, IdKind::Entity).unwrap() > Identifier::new(1, 2, IdKind::Entity).unwrap());
assert!(Identifier::new(2, 2, IdKind::Entity).unwrap() >= Identifier::new(1, 2, IdKind::Entity).unwrap());
}
}