docs(data_structures): improve docs for stack types #8356

Merged
11 changes: 7 additions & 4 deletions crates/oxc_data_structures/src/stack/mod.rs
@@ -1,8 +1,11 @@
//! Contains the following FILO data structures:
//! - [`Stack`]: A growable stack
//! - [`SparseStack`]: A stack that can have empty entries
//! - [`NonEmptyStack`]: A growable stack that can never be empty, allowing for more efficient
//! operations
//!
//! * [`Stack`]: A growable stack, equivalent to [`Vec`], but more efficient for stack usage (push/pop).
//! * [`NonEmptyStack`]: A growable stack that can never be empty, allowing for more efficient operations
//! (very fast `last` / `last_mut`).
//! * [`SparseStack`]: A growable stack of `Option`s, optimized for low memory usage when many entries in
//! the stack are empty (`None`).

mod capacity;
mod common;
mod non_empty;
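
The overview above measures every stack type against `Vec`. As a baseline, here is a minimal sketch of a plain `Vec` used as a stack, relying only on `std`. The function name `push_then_pop_all` is made up for illustration and is not part of this crate.

```rust
/// Push `items` onto a `Vec` used as a stack, then pop them all back off.
/// Returns the items in the order they were popped (last in, first out).
/// Hypothetical helper for illustration only.
pub fn push_then_pop_all(items: &[u32]) -> Vec<u32> {
    let mut stack: Vec<u32> = Vec::new();
    // A freshly created `Vec` of a non-zero-sized type has no buffer yet.
    debug_assert_eq!(stack.capacity(), 0);

    for &item in items {
        stack.push(item);
    }

    let mut popped = Vec::with_capacity(items.len());
    // `Vec::pop` is fallible: it returns `None` once the stack is empty.
    while let Some(item) = stack.pop() {
        popped.push(item);
    }
    popped
}
```

Every `pop` here returns an `Option`, which is the check that `NonEmptyStack` is documented to avoid for `last` / `last_mut`.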
59 changes: 41 additions & 18 deletions crates/oxc_data_structures/src/stack/non_empty.rs
@@ -9,44 +9,67 @@ use super::{NonNull, StackCapacity, StackCommon};

/// A stack which can never be empty.
///
/// `NonEmptyStack` is created initially with 1 entry, and `pop` does not allow removing it
/// (though that initial entry can be mutated with `last_mut`).
/// [`NonEmptyStack`] is created initially with 1 entry, and [`pop`] does not allow removing it
/// (though that initial entry can be mutated with [`last_mut`]).
///
/// The fact that the stack is never empty makes all operations except `pop` infallible.
/// `last` and `last_mut` are branchless.
/// The fact that the stack is never empty makes all operations except [`pop`] infallible.
/// [`last`] and [`last_mut`] are branchless.
///
/// The trade-off is that you cannot create a `NonEmptyStack` without allocating.
/// The trade-off is that you cannot create a [`NonEmptyStack`] without allocating,
/// and you must create an initial value for the "dummy" initial entry.
/// If that is not a good trade-off for your use case, prefer [`Stack`], which can be empty.
///
/// [`NonEmptyStack`] is usually a better choice than [`Stack`], unless either:
///
/// 1. The stack will likely never have anything pushed to it.
/// [`NonEmptyStack::new`] always allocates, whereas [`Stack::new`] does not.
/// So if the stack usually starts empty and remains empty, [`Stack`] will avoid an allocation.
/// This is the same as how [`Vec`] does not allocate until you push a value into it.
///
/// 2. The type the stack holds is large or expensive to construct, so there's a high cost in having to
/// create an initial dummy value (which [`NonEmptyStack`] requires, but [`Stack`] doesn't).
///
/// [`SparseStack`] may be preferable if the type you're storing is an `Option`.
///
/// To simplify implementation, zero-sized types are not supported (e.g. `NonEmptyStack<()>`).
///
/// ## Design
/// Designed for maximally efficient `push`, `pop`, and reading/writing the last value on stack.
/// Designed for maximally efficient [`push`], [`pop`], and reading/writing the last value on stack
/// ([`last`] / [`last_mut`]).
///
/// The alternative would likely be to use a `Vec`. But `Vec` is optimized for indexing into at
/// The alternative would likely be to use a [`Vec`]. But `Vec` is optimized for indexing into at
/// arbitrary positions, not for `push` and `pop`. `Vec` stores `len` and `capacity` as integers,
/// so requires pointer maths on every operation: `let entry_ptr = base_ptr + index * size_of::<T>();`.
///
/// In comparison, `NonEmptyStack` contains a `cursor` pointer, which always points to last entry
/// In comparison, [`NonEmptyStack`] contains a `cursor` pointer, which always points to last entry
/// on stack, so it can be read/written with a minimum of operations.
///
/// This design is similar to `std`'s slice iterator.
/// This design is similar to [`std`'s slice iterators].
///
/// Comparison to `Vec`:
/// * `last` and `last_mut` are 1 instruction, instead of `Vec`'s 4.
/// * `pop` is 1 instruction shorter than `Vec`'s equivalent.
/// * `push` is 1 instruction shorter than `Vec`'s equivalent, and uses 1 less register.
/// Comparison to [`Vec`]:
/// * [`last`] and [`last_mut`] are 1 instruction, instead of `Vec`'s 4.
/// * [`pop`] is 1 instruction shorter than `Vec`'s equivalent.
/// * [`push`] is 1 instruction shorter than `Vec`'s equivalent, and uses 1 less register.
///
/// ### Possible alternative designs
/// 1. `cursor` could point to *after* last entry, rather than *to* it. This has advantage that `pop`
/// uses 1 less register, but disadvantage that `last` and `last_mut` are 2 instructions, not 1.
/// 1. `cursor` could point to *after* last entry, rather than *to* it. This has advantage that [`pop`]
/// uses 1 less register, but disadvantage that [`last`] and [`last_mut`] are 2 instructions, not 1.
/// <https://godbolt.org/z/xnx7YP5de>
///
/// 2. Stack could grow downwards, like `bumpalo` allocator does. This would probably make `pop` use
/// 1 less register, but at the cost that the stack can never grow in place, which would incur more
/// memory copies when the stack grows.
/// 2. Stack could grow downwards, like `bumpalo` allocator does. This would probably make [`pop`] use
/// 1 less register, but at the cost that: (a) the stack can never grow in place, which would incur
/// more memory copies when the stack grows, and (b) [`as_slice`] would have the entries in
/// reverse order.
///
/// [`push`]: NonEmptyStack::push
/// [`pop`]: NonEmptyStack::pop
/// [`last`]: NonEmptyStack::last
/// [`last_mut`]: NonEmptyStack::last_mut
/// [`as_slice`]: NonEmptyStack::as_slice
/// [`Stack`]: super::Stack
/// [`Stack::new`]: super::Stack::new
/// [`SparseStack`]: super::SparseStack
/// [`std`'s slice iterators]: std::slice::Iter
pub struct NonEmptyStack<T> {
/// Pointer to last entry on stack.
/// Points *to* last entry, not *after* last entry.
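
The behaviour described in the doc comment above (created with 1 entry, `pop` cannot remove it, `last` / `last_mut` are infallible) can be sketched with a small `Vec`-backed model. This is an assumption-laden illustration: `ModelNonEmptyStack` is a hypothetical name, it does not use the `cursor` pointer layout, and its `last` still performs a bounds check, so it is not branchless like the real type. Returning `None` from `pop` on the initial entry is also a choice made for the model; the diff does not show how the real `pop` reports that case.

```rust
/// Hypothetical, `Vec`-backed model of a stack that is never empty.
/// It mirrors the documented behaviour only, not the cursor-pointer layout.
pub struct ModelNonEmptyStack<T> {
    // Invariant: always holds at least one entry.
    entries: Vec<T>,
}

impl<T> ModelNonEmptyStack<T> {
    /// Creating the stack requires an initial ("dummy") value.
    pub fn new(initial: T) -> Self {
        Self { entries: vec![initial] }
    }

    pub fn push(&mut self, value: T) {
        self.entries.push(value);
    }

    /// The only fallible operation: the initial entry can never be removed.
    /// (Model choice: report that case as `None`.)
    pub fn pop(&mut self) -> Option<T> {
        if self.entries.len() > 1 {
            self.entries.pop()
        } else {
            None
        }
    }

    /// Infallible: returns `&T`, not `Option<&T>`.
    pub fn last(&self) -> &T {
        // The invariant guarantees this index is in bounds.
        &self.entries[self.entries.len() - 1]
    }

    /// Infallible mutable access to the last entry, including the initial one.
    pub fn last_mut(&mut self) -> &mut T {
        let index = self.entries.len() - 1;
        &mut self.entries[index]
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

The point of the model is the API shape: callers never have to unwrap an `Option` to read the top of the stack, at the price of supplying an initial value up front.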
15 changes: 13 additions & 2 deletions crates/oxc_data_structures/src/stack/sparse.rs
@@ -2,12 +2,16 @@ use super::{NonEmptyStack, Stack};

/// Stack which is sparsely filled.
///
/// Functionally equivalent to a stack implemented as `Vec<Option<T>>`, but more memory-efficient
/// Functionally equivalent to [`NonEmptyStack<Option<T>>`], but more memory-efficient
/// in cases where majority of entries in the stack will be empty (`None`).
///
/// It has the same advantages as [`NonEmptyStack`] in terms of [`last`] and [`last_mut`] being
/// infallible and branchless, and with very fast lookup (without any pointer maths).
/// [`SparseStack`]'s advantage over [`NonEmptyStack`] is less memory usage for empty entries (`None`).
///
/// Stack is initialized with a single entry which can never be popped off.
/// If `Program` has an entry on the stack, can use this initial entry for it. Get value for `Program`
/// in `exit_program` visitor with `SparseStack::take_last` instead of `SparseStack::pop`.
/// in `exit_program` visitor with [`take_last`] instead of [`pop`].
///
/// The stack is stored as 2 arrays:
/// 1. `has_values` - Records whether an entry on the stack has a value or not (`Some` or `None`).
@@ -19,12 +23,19 @@ use super::{NonEmptyStack, Stack};
///
/// e.g. if `T` is 24 bytes, and 90% of stack entries have no values:
/// * `Vec<Option<T>>` is 24 bytes per entry (or 32 bytes if `T` has no niche).
/// * `NonEmptyStack<Option<T>>` is same.
/// * `SparseStack<T>` is 4 bytes per entry.
///
/// When the stack grows and reallocates, `SparseStack` has less memory to copy, which is a performance
/// win too.
///
/// To simplify implementation, zero-sized types are not supported (`SparseStack<()>`).
///
/// [`last`]: SparseStack::last
/// [`last_mut`]: SparseStack::last_mut
/// [`take_last`]: SparseStack::take_last
/// [`pop`]: SparseStack::pop
/// [`NonEmptyStack<Option<T>>`]: NonEmptyStack
pub struct SparseStack<T> {
has_values: NonEmptyStack<bool>,
values: Stack<T>,
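
The two-array layout described above can be sketched with plain `Vec`s. This is a hypothetical model (`ModelSparseStack` and `model_bytes` are illustrative names, not crate APIs), and panicking when `pop` reaches the initial entry is an assumption of the sketch rather than something the diff states.

```rust
/// Hypothetical `Vec`-backed model of the two-array sparse stack layout.
pub struct ModelSparseStack<T> {
    // One flag per entry. Always holds at least the initial entry's flag.
    has_values: Vec<bool>,
    // Only the entries that are `Some`, in stack order.
    values: Vec<T>,
}

impl<T> ModelSparseStack<T> {
    /// Starts with a single empty entry which can never be popped off.
    pub fn new() -> Self {
        Self { has_values: vec![false], values: Vec::new() }
    }

    /// An empty entry costs one flag; only `Some` entries store a `T`.
    pub fn push(&mut self, value: Option<T>) {
        self.has_values.push(value.is_some());
        if let Some(value) = value {
            self.values.push(value);
        }
    }

    /// Remove the top entry and return its value, if it had one.
    /// Model choice: panics if only the initial entry remains.
    pub fn pop(&mut self) -> Option<T> {
        assert!(self.has_values.len() > 1, "cannot pop the initial entry");
        let has_value = self.has_values.pop().unwrap_or(false);
        if has_value { self.values.pop() } else { None }
    }

    /// Take the value out of the top entry, leaving the entry in place but empty.
    /// Works on the initial entry too.
    pub fn take_last(&mut self) -> Option<T> {
        let flag = self.has_values.last_mut().expect("stack is never empty");
        if std::mem::replace(flag, false) { self.values.pop() } else { None }
    }
}

/// Bytes used by the model's two arrays, ignoring spare capacity:
/// one `bool` per entry plus `value_size` bytes per filled entry.
pub fn model_bytes(entries: usize, filled: usize, value_size: usize) -> usize {
    entries * std::mem::size_of::<bool>() + filled * value_size
}
```

Under these assumptions, 100 entries of a 24-byte `T` with 10 filled come to `model_bytes(100, 10, 24)` = 340 bytes, about 3.4 bytes per entry, which appears consistent with the roughly 4 bytes per entry quoted in the doc comment.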
30 changes: 22 additions & 8 deletions crates/oxc_data_structures/src/stack/standard.rs
@@ -12,24 +12,38 @@ use super::{NonNull, StackCapacity, StackCommon};
/// If a non-empty stack is viable for your use case, prefer [`NonEmptyStack`], which is cheaper for
/// all operations.
///
/// [`NonEmptyStack`] is usually the better choice, unless:
/// 1. You want `new()` not to allocate.
/// 2. Creating initial value for `NonEmptyStack::new()` is expensive.
/// [`NonEmptyStack`] is usually the better choice, unless either:
///
/// 1. The stack will likely never have anything pushed to it.
/// [`NonEmptyStack::new`] always allocates, whereas [`Stack::new`] does not.
/// So if the stack usually starts empty and remains empty, [`Stack`] will avoid an allocation.
/// This is the same as how [`Vec`] does not allocate until you push a value into it.
///
/// 2. The type the stack holds is large or expensive to construct, so there's a high cost in having to
/// create an initial dummy value (which [`NonEmptyStack`] requires, but [`Stack`] doesn't).
///
/// To simplify implementation, zero-sized types are not supported (`Stack<()>`).
///
/// ## Design
/// Designed for maximally efficient `push`, `pop`, and reading/writing the last value on stack
/// (although, unlike [`NonEmptyStack`], `last` and `last_mut` are fallible, and not branchless).
/// Designed for maximally efficient [`push`], [`pop`], and reading/writing the last value on stack
/// ([`last`] / [`last_mut`]). Although, unlike [`NonEmptyStack`], [`last`] and [`last_mut`] are
/// fallible, and not branchless. So [`Stack::last`] and [`Stack::last_mut`] are a bit more expensive
/// than [`NonEmptyStack`]'s equivalents.
///
/// The alternative would likely be to use a `Vec`. But `Vec` is optimized for indexing into at
/// The alternative would likely be to use a [`Vec`]. But `Vec` is optimized for indexing into at
/// arbitrary positions, not for `push` and `pop`. `Vec` stores `len` and `capacity` as integers,
/// so requires pointer maths on every operation: `let entry_ptr = base_ptr + index * size_of::<T>();`.
///
/// In comparison, `Stack` uses a `cursor` pointer, so avoids these calculations.
/// This is similar to how `std`'s slice iterators work.
/// In comparison, [`Stack`] uses a `cursor` pointer, so avoids these calculations.
/// This is similar to how [`std`'s slice iterators] work.
///
/// [`push`]: Stack::push
/// [`pop`]: Stack::pop
/// [`last`]: Stack::last
/// [`last_mut`]: Stack::last_mut
/// [`NonEmptyStack`]: super::NonEmptyStack
/// [`NonEmptyStack::new`]: super::NonEmptyStack::new
/// [`std`'s slice iterators]: std::slice::Iter
pub struct Stack<T> {
// Pointer to *after* last entry on stack.
cursor: NonNull<T>,
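
The address arithmetic that the design notes contrast can be sketched on plain integer addresses, with no `unsafe`. The three functions below are hypothetical helpers, not crate code. The `start_addr` parameter in particular is an assumption: the diff only shows the `cursor` field, so how the real `Stack` detects emptiness is not confirmed here.

```rust
use std::mem::size_of;

/// Index-based lookup, as the doc comment describes for `Vec`:
/// `entry_ptr = base_ptr + index * size_of::<T>()`.
/// Returns the address of the last entry, or `None` if `len` is 0.
pub fn last_addr_via_index<T>(base_addr: usize, len: usize) -> Option<usize> {
    if len == 0 {
        return None;
    }
    Some(base_addr + (len - 1) * size_of::<T>())
}

/// Cursor pointing *after* the last entry (the `Stack` layout):
/// one emptiness check, then step back by one entry.
/// `start_addr` is an assumed field, used here only to detect emptiness.
pub fn last_addr_via_cursor_after<T>(start_addr: usize, cursor_addr: usize) -> Option<usize> {
    if cursor_addr == start_addr {
        return None;
    }
    Some(cursor_addr - size_of::<T>())
}

/// Cursor pointing *to* the last entry (the `NonEmptyStack` layout):
/// the address is the cursor itself, with no check and no arithmetic.
pub fn last_addr_via_cursor_to(cursor_addr: usize) -> usize {
    cursor_addr
}
```

All three agree on the address of the last entry; they differ only in how much work is needed to get there, which is the trade-off the doc comments describe in terms of instruction counts.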