A stack-based allocator, for memory allocation in time-constrained programs' loops.
This allocator is a vector-like data structure, which asks n number of bytes from the heap when instantiated.
When we want to allocate memory for an object with this allocator, this structure gives a raw pointer to the current top of its stack, calculate the space needed by the object and its memory-alignment, and move the current top of its stack to this offset.
This library is nightly-only, and was meant for a very specific use case: game loops.
This library is available on crates.io
toml
[dependencies]
maskerad_stack_allocator = "0.1.0"
Usage as a single-frame allocator:
```rust extern crate maskeradstackallocator; use maskeradstackallocator::StackAllocator;
let singleframeallocator = StackAllocator::with_capacity(100); //100 bytes
while !closed { //allocator cleared every frame. singleframeallocator.reset();
//...
//allocate from the single frame allocator.
//Be sure to use the data during this frame only!
let my_object: &mut MyObject = single_frame_allocator.alloc(MyObject::new());
} ```
Usage as a double-buffered allocator:
This type of allocator allows you to use data created during frame n at frame n + 1.
```rust extern crate maskeradstackallocator; use maskeradstackallocator::DoubleBufferedAllocator;
let doublebufferedallocator = DoubleBufferedAllocator::with_capacity(100);
while !closed { //swap the active and inactive buffers of the allocator. doublebufferedallocator.swap_buffers();
//clear the newly active buffer.
double_buffered_allocator.reset_current();
//allocate with the current buffer, leaving the data in the inactive buffer intact.
//You can use this data during this frame, or the next frame.
let my_object: &mut MyObject = double_buffered_allocator.alloc(MyObject::new());
} ```
This library was made for memory allocations in game loops.
Those type of allocators are dropless: memory is never freed, it means we may override currently used memory !
Not in a game loop : - We allocate at the beginning of the loop. - We consume in the loop. - We reset at the end of the loop.
At the start of the loop n, we can be sure that the data allocated in the loop n - 1 is not longer used or needed.
It means that data allocated during frame n must only be usable during frame n, not n + 1 !
If you need to use data created at frame n for the frame n + 1, the double buffered allocator can solve your problem.
It can be faster: Allocations and frees move a pointer, that's all.
It prevents memory fragmentation: Allocation is always contiguous, memory cannot be fragmented over time.
Benchmarks have been realised with the bencher crate and the time crate.
Results with Bencher:
monster creation - heap: ~740ns/iter (+/- 15)
monster creation - stack allocator: ~3038ns/iter (+/- 63) ```rust fn monstercreationheap(bench: &mut Bencher) { bench.iter(|| { for _ in 0..1000 { //create monsters let monster1 = Box::new(Monster::default()); let monster2 = Box::new(Monster::default()); let monster3 = Box::new(Monster::default());
//Do stuff
//Monsters dropped at the end of the loop
}
})
}
fn monstercreationstackallocator(bench: &mut Bencher) { let singleframeallocator = StackAllocator::withcapacity(100); //100 bytes
bench.iter(|| {
for _ in 0..1000 {
//clear the single-frame allocator every frame
single_frame_allocator.reset();
//create monsters
let monster1 = single_frame_allocator.alloc(Monster::default());
let monster2 = single_frame_allocator.alloc(Monster::default());
let monster3 = single_frame_allocator.alloc(Monster::default());
//do stuff
//no drop -> memory overriding, but data at frame n - 1 can be overrided at frame n.
}
})
} ```
Result with Time:
Time - heap : from ~256 000ns to ~440 000ns
Time - stack allocator : from ~443 000ns to ~770 000ns ```rust fn speedcomparison() { let before = time::precisetime_ns();
for _ in 0..1000 {
let monster1 = Box::new(Monster::default());
let monster2 = Box::new(Monster::default());
let monster3 = Box::new(Monster::default());
}
let after = time::precise_time_ns();
let elapsed = after - before;
println!("Time with heap alloc: {}", elapsed);
let single_frame_alloc = StackAllocator::with_capacity(100);
let before = time::precise_time_ns();
for _ in 0..1000 {
single_frame_alloc.reset();
let monster1 = single_frame_alloc.alloc(Monster::default());
let monster2 = single_frame_alloc.alloc(Monster::default());
let monster3 = single_frame_alloc.alloc(Monster::default());
}
let after = time::precise_time_ns();
let elapsed = after - before;
println!("Time with stack alloc: {}", elapsed);
}
```
Time-constrained programs, like video-games, need to be as fast as possible.
A video-game, in its game loop, needs to : - Read the player's input at frame n. - Update the world state (AI, physics, object states, sounds...) at frame n. - Draw the scene at frame n in the back buffer. - Swap the back buffer (frame n) with the current buffer (frame n - 1).
In order to display 60 frames per second, this loop needs to be completed in 16 milliseconds (0.016 seconds).
One possible bottleneck is dynamic memory allocation (allocation on the heap). Even though Rust sometimes uses jemalloc, a fast general-purpose memory allocator (see this RFC), heap memory allocation can be a slow operation.
Moreover, memory can become fragmented over time :
Even though we have enough total memory, this memory is not contiguous so we can't
allocate anything.
Custom memory allocators can help with both problems.
We can distinguish 3 types of memory allocation : - Persistent memory allocation: data is allocated when the program is started, and freed when the program is shut down. The arena crate is perfect for that.
Dynamic memory allocation: data is allocated and freed during the lifetime of the program, but we can't predict when this data is allocated and freed. An Object Pool is a good data structure to deal with this type of memory allocation.
One-Frame memory allocation: Data is allocated, consumed and freed in a loop. This allocator deals with this type of memory allocation.
Game Engine Architecture, chapter 5.2
Stack Overflow answer about memory fragmentation
Stack Overflow answer about stack-based allocators
SwedishCoding blogpost about custom memory allocators
Game Programming Patterns, Chapter 19, about Object Pools
Wikipedia article about Object Pools
Allocations with the stack allocator is slower than heap allocation.
Licensed under either of
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.