| Crates.io | buffer_sv2 |
| lib.rs | buffer_sv2 |
| version | 2.0.0 |
| created_at | 2021-06-14 17:04:02.744542+00 |
| updated_at | 2025-03-28 12:48:16.057204+00 |
| description | buffer |
| homepage | https://stratumprotocol.org |
| repository | https://github.com/stratum-mining/stratum |
| max_upload_size | |
| id | 410001 |
| size | 166,986 |
# buffer_sv2

`buffer_sv2` handles memory management for Stratum V2 (Sv2) roles. It provides a memory-efficient
buffer pool that minimizes allocations and deallocations for high-throughput message frame
processing in Sv2 roles. Memory allocation overhead is minimized by reusing large buffers,
improving performance and reducing latency. The buffer pool tracks the usage of memory slices,
using shared state tracking to safely manage memory across multiple threads.
The crate provides two `Buffer` trait implementations (`BufferPool` and `BufferFromSystemMemory`) that include a `Write` trait to replace `std::io::Write` in `no_std` environments.

To include this crate in your project, run:

```bash
cargo add buffer_sv2
```
This crate can be built with the following feature flags:
- `debug`: Provides additional tracking for debugging memory management issues.
- `fuzz`: Enables support for fuzz testing.

There are four `unsafe` code blocks:

- `buffer_pool/mod.rs`: `fn get_writable_(&mut self, len: usize, shared_state: u8, without_check: bool) -> &mut [u8] { .. }` in the `impl<T: Buffer> BufferPool<T>`
- `slice.rs`: `unsafe impl Send for Slice {}`
- `slice.rs`: `fn as_mut(&mut self) -> &mut [u8] { .. }` in the `impl AsMut<[u8]> for Slice`
- `slice.rs`: `fn as_ref(&self) -> &[u8] { .. }` in the `impl AsRef<[u8]> for Slice`

This crate provides three examples demonstrating how the memory is managed:

- Basic Usage Example: Creates a buffer pool, writes to it, and retrieves the data from it.
- Buffer Pool Exhaustion Example: Demonstrates how data is added to a buffer pool and dynamically allocated directly on the heap once the buffer pool's capacity has been exhausted.
- Variable Sized Messages Example: Writes messages of variable sizes to the buffer pool.
## `Buffer` Trait

The `Buffer` trait is designed to work with the `codec_sv2` decoders, which operate by filling a buffer with incoming data and then passing ownership of that buffer to a `framing_sv2::framing::Frame`. To fill the buffer, the `codec_sv2` decoder must pass a reference to the buffer to a filler. To construct a `Frame`, the decoder must pass ownership of the buffer to the `Frame`.
```rust
fn get_writable(&mut self, len: usize) -> &mut [u8];
```

The `get_writable` method returns a mutable slice of the buffer starting at the current length and spanning `len` bytes, and increases the buffer's length by `len`.
```rust
fn get_data_owned(&mut self) -> Slice;
```

The `get_data_owned` method returns a `Slice` that implements `AsMut<[u8]>` and `Send`.
The Buffer trait is implemented for BufferFromSystemMemory and BufferPool. It includes a
Write trait to replace std::io::Write in no_std environments.
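As a rough illustration of the trait's contract, the sketch below is a toy, self-contained re-implementation of the two methods described above, backed by a plain `Vec<u8>` (roughly the behavior attributed to `BufferFromSystemMemory`). It is not the crate's actual code: the real `get_data_owned` returns the crate's `Slice` type, not a `Vec<u8>`.

```rust
// Toy sketch of the `Buffer` contract; the real `buffer_sv2` types differ
// in detail (e.g. `get_data_owned` returns a pooled `Slice`).
trait Buffer {
    fn get_writable(&mut self, len: usize) -> &mut [u8];
    fn get_data_owned(&mut self) -> Vec<u8>;
}

struct VecBuffer {
    inner: Vec<u8>,
}

impl Buffer for VecBuffer {
    // Grow the buffer by `len` bytes and hand back the newly added region.
    fn get_writable(&mut self, len: usize) -> &mut [u8] {
        let start = self.inner.len();
        self.inner.resize(start + len, 0);
        &mut self.inner[start..]
    }

    // Hand the filled buffer to the caller, leaving an empty one behind.
    fn get_data_owned(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.inner)
    }
}

fn main() {
    let mut buf = VecBuffer { inner: Vec::new() };
    buf.get_writable(3).copy_from_slice(&[1, 2, 3]);
    let data = buf.get_data_owned();
    assert_eq!(data, vec![1, 2, 3]);
    assert!(buf.inner.is_empty()); // buffer is ready for reuse
    println!("{:?}", data);
}
```

This mirrors the decoder flow described above: a filler writes into the region returned by `get_writable`, and ownership of the accumulated bytes is then handed off via `get_data_owned`.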
## `BufferFromSystemMemory`

`BufferFromSystemMemory` is a simple implementation of the `Buffer` trait. Each time a new buffer is needed, it creates a new `Vec<u8>`:

- `get_writable(..)` returns mutable references into the inner vector.
- `get_data_owned(..)` returns the inner vector.

## `BufferPool`

While `BufferFromSystemMemory` is sufficient for many cases, `BufferPool` offers a more efficient solution for high-performance applications, such as proxies and pools with thousands of connections.
When created, BufferPool preallocates a user-defined capacity of bytes in the heap using a
Vec<u8>. When get_data_owned(..) is called, it creates a Slice that contains a view into the
preallocated memory. BufferPool guarantees that slices never overlap and maintains unique
ownership of each Slice.
`Slice` implements the `Drop` trait, allowing the view into the preallocated memory to be reused once the `Slice` is dropped.
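The drop-to-reuse mechanism can be sketched as follows. This is a hypothetical stand-in, not the crate's `Slice`: it assumes an 8-bit occupancy mask (one bit per slot, matching the `AtomicU8` shared-state tracking the crate describes) and clears its slot's bit on drop.

```rust
use std::sync::{atomic::{AtomicU8, Ordering}, Arc};

// Toy stand-in for the pool's shared state: one bit per slot, set = in use.
struct TrackedSlice {
    slot: u8,             // which of the 8 slots this view occupies
    state: Arc<AtomicU8>, // shared with the pool
}

impl Drop for TrackedSlice {
    // Clearing the slot's bit marks the underlying memory as reusable.
    fn drop(&mut self) {
        self.state.fetch_and(!(1u8 << self.slot), Ordering::Release);
    }
}

fn main() {
    let state = Arc::new(AtomicU8::new(0));
    state.fetch_or(1, Ordering::AcqRel); // pool marks slot 0 as taken
    let s = TrackedSlice { slot: 0, state: Arc::clone(&state) };
    assert_eq!(state.load(Ordering::Relaxed), 0b0000_0001);
    drop(s); // dropping the slice frees the slot for reuse
    assert_eq!(state.load(Ordering::Relaxed), 0);
}
```

Because the state lives behind an `Arc`, the drop can happen on any thread, which is what lets slices be sent across threads and still return their memory to the pool.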
`BufferPool` is useful for working with sequentially processed buffers, such as filling a buffer, retrieving it, and then reusing it as needed. `BufferPool` optimizes for memory reuse by providing pre-allocated memory that can be used in one of three modes:

- Back mode: the default; slices are taken from the back (unused end) of the preallocated memory.
- Front mode: slices are taken from the front of the preallocated memory, once slices there have been freed.
- Alloc mode: falls back to heap allocation (via `BufferFromSystemMemory`) when both back and front sections are full, providing additional capacity but with reduced performance.

`BufferPool` can only be fragmented between the front and back and between the back and end.
BufferPool can allocate a maximum of 8 Slices (as it uses an AtomicU8 to track used and
freed slots) and up to the defined capacity in bytes. If all 8 slots are taken or there is no more
space in the preallocated memory, BufferPool falls back to BufferFromSystemMemory.
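A minimal sketch of how an `AtomicU8` can track the 8 slots is shown below. The function name `claim_slot` and the exact compare-exchange loop are assumptions for illustration, not the crate's internals; the point is that once all 8 bits are set, no slot can be claimed, which is where the fallback to `BufferFromSystemMemory` would kick in.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Try to claim the lowest free slot in an 8-bit occupancy mask.
// Returns the slot index, or None when all 8 slots are taken
// (the point where a pool would fall back to system memory).
fn claim_slot(state: &AtomicU8) -> Option<u8> {
    loop {
        let current = state.load(Ordering::Acquire);
        if current == u8::MAX {
            return None; // all 8 slots in use
        }
        let slot = (!current).trailing_zeros() as u8; // lowest clear bit
        let desired = current | (1u8 << slot);
        // Another thread may have raced us; retry on failure.
        if state
            .compare_exchange(current, desired, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            return Some(slot);
        }
    }
}

fn main() {
    let state = AtomicU8::new(0);
    for expected in 0..8 {
        assert_eq!(claim_slot(&state), Some(expected));
    }
    assert_eq!(claim_slot(&state), None); // pool exhausted: fall back
}
```

Scanning from the lowest bit matches the behavior described next: freed slots are reused starting from the beginning before any further allocation is considered.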
Typically, `BufferPool` is used to process messages sequentially (decode, respond, decode). It is optimized to check for freed slots starting from the beginning and to reuse them before considering further allocation. It also efficiently handles the cases where all slices are dropped or where the last slice is released, reducing memory fragmentation.
The following cases illustrate typical memory usage patterns within BufferPool:
Below is a graphical representation of the most optimized cases: a number means the slot is taken, and a minus symbol (`-`) means the slot is free. There are 8 slots.
Case 1: Buffer pool exhaustion

```text
-------- BACK MODE
1------- BACK MODE
12------ BACK MODE
123----- BACK MODE
1234---- BACK MODE
12345--- BACK MODE
123456-- BACK MODE
1234567- BACK MODE
12345678 BACK MODE (buffer is now full)
12345678 ALLOC MODE (new bytes being allocated in a new space in the heap)
12345678 ALLOC MODE (new bytes being allocated in a new space in the heap)
..... and so on
```
Case 2: Buffer pool reset to remain in back mode

```text
-------- BACK MODE
1------- BACK MODE
12------ BACK MODE
123----- BACK MODE
1234---- BACK MODE
12345--- BACK MODE
123456-- BACK MODE
1234567- BACK MODE
12345678 BACK MODE (buffer is now full)
-------- RESET
9------- BACK MODE
9a------ BACK MODE
```
Case 3: Buffer pool switches from back to front to back modes

```text
-------- BACK MODE
1------- BACK MODE
12------ BACK MODE
123----- BACK MODE
1234---- BACK MODE
12345--- BACK MODE
123456-- BACK MODE
1234567- BACK MODE
12345678 BACK MODE (buffer is now full)
--345678 Consume first two data bytes from the buffer
-9345678 SWITCH TO FRONT MODE
a9345678 FRONT MODE (buffer is now full)
a93456-- Consume last two data bytes from the buffer
a93456b- SWITCH TO BACK MODE
a93456bc BACK MODE (buffer is now full)
```
To run benchmarks, execute:

```bash
cargo bench --features criterion
```
`BufferPool` is benchmarked against `BufferFromSystemMemory` and two additional structures for reference: `PPool` (a hashmap-based pool) and `MaxEfficeincy` (a highly optimized but unrealistic control implementation, written only so that the benchmarks do not panic and the compiler does not complain). `BufferPool` generally provides better performance and lower latency than `PPool` and `BufferFromSystemMemory`.
Note: Both PPool and MaxEfficeincy are completely broken and are only useful as references
for the benchmarks.
## Benchmarks (from `BENCHES.md`)

`BufferPool` always outperforms the `PPool` (hashmap-based pool) and the solution without a pool.
Executed for 2,000 samples:

```text
* single thread with `BufferPool`: ---------------------------------- 7.5006 ms
* single thread with `BufferFromSystemMemory`: ---------------------- 10.274 ms
* single thread with `PPoll`: --------------------------------------- 32.593 ms
* single thread with `MaxEfficeincy`: ------------------------------- 1.2618 ms
* multi-thread with `BufferPool`: ----------------------------------- 34.660 ms
* multi-thread with `BufferFromSystemMemory`: ----------------------- 142.23 ms
* multi-thread with `PPoll`: ---------------------------------------- 49.790 ms
* multi-thread with `MaxEfficeincy`: -------------------------------- 18.201 ms
* multi-thread 2 with `BufferPool`: --------------------------------- 80.869 ms
* multi-thread 2 with `BufferFromSystemMemory`: --------------------- 192.24 ms
* multi-thread 2 with `PPoll`: -------------------------------------- 101.75 ms
* multi-thread 2 with `MaxEfficeincy`: ------------------------------ 66.972 ms
```
If the buffer is not sent to another context, `BufferPool` is 1.4 times faster than no pool, 4.3 times faster than `PPool`, and 5.7 times slower than max efficiency.
Average times for 1,000 operations:

- `BufferPool`: 7.5 ms
- `BufferFromSystemMemory`: 10.27 ms
- `PPool`: 32.59 ms
- `MaxEfficiency`: 1.26 ms

```text
for 0..1000:
    add random bytes to the buffer
    get the buffer
    add random bytes to the buffer
    get the buffer
    drop the 2 buffers
```
If the buffer is sent to other contexts, `BufferPool` is 4 times faster than no pool, 0.6 times faster than `PPool`, and 1.8 times slower than max efficiency.

- `BufferPool`: 34.66 ms
- `BufferFromSystemMemory`: 142.23 ms
- `PPool`: 49.79 ms
- `MaxEfficiency`: 18.20 ms

```text
for 0..1000:
    add random bytes to the buffer
    get the buffer
    send the buffer to another thread -> wait 1 ms and then drop it
    add random bytes to the buffer
    get the buffer
    send the buffer to another thread -> wait 1 ms and then drop it
```
The third scenario ("multi-thread 2") additionally waits for both buffers to be dropped before continuing:

```text
for 0..1000:
    add random bytes to the buffer
    get the buffer
    send the buffer to another thread -> wait 1 ms and then drop it
    add random bytes to the buffer
    get the buffer
    send the buffer to another thread -> wait 1 ms and then drop it
    wait for the 2 buffers to be dropped
```
Install `cargo-fuzz` with:

```bash
cargo install cargo-fuzz
```

Run the fuzz tests:

```bash
cd ./fuzz
cargo fuzz run slower -- -rss_limit_mb=5000000000
cargo fuzz run faster -- -rss_limit_mb=5000000000
```
The tests must be run with `-rss_limit_mb=5000000000` because they exercise `BufferPool` with capacities from 0 to 2^32 bytes.
`BufferPool` is fuzz-tested to ensure memory reliability across different scenarios, including delayed memory release and cross-thread access. The tests check that slices created by `BufferPool` still contain the same bytes they held at creation time, both after a random delay and after being sent to other threads.
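The fuzzed property can be sketched as a plain randomized check. This is not one of the crate's fuzz targets: a small deterministic xorshift PRNG stands in for fuzzer-provided input, and a `Vec<u8>` stands in for the crate's `Slice`.

```rust
use std::thread;

// Tiny deterministic PRNG standing in for fuzzer-provided input.
fn xorshift(state: &mut u32) -> u32 {
    *state ^= *state << 13;
    *state ^= *state >> 17;
    *state ^= *state << 5;
    *state
}

fn main() {
    let mut seed = 0xDEAD_BEEFu32;
    for _ in 0..100 {
        // Fill a "slice" with random bytes and keep an independent copy...
        let len = (xorshift(&mut seed) % 64 + 1) as usize;
        let bytes: Vec<u8> = (0..len).map(|_| xorshift(&mut seed) as u8).collect();
        let copy = bytes.clone();
        // ...send it to another thread and check it comes back unchanged,
        // mirroring the property the fuzz tests assert about `Slice`.
        let handle = thread::spawn(move || bytes);
        assert_eq!(handle.join().unwrap(), copy);
    }
    println!("all slices intact");
}
```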
Two main fuzz tests are provided: the first (`faster`) maps a smaller input space, while the second (`slower`) covers a larger one. Both tests have been run for several hours without crashes.