Fix performance issue in Red Black Tree init#818
Conversation
f6a6e1f to
29034a2
Compare
There was a problem hiding this comment.
Pull request overview
This PR aims to reduce overhead during red-black tree path construction by avoiding eager initialization of path elements, and adds a new performance test intended to measure large allocation behavior that may be affected by these changes.
Changes:
- Avoid default-initializing
RBTreepath storage elements (ChildRef/RBStep) to reduce initialization costs. - Add a default constructor to
MetaEntryBase::BackendStateWordRefto support default construction in the updated RB tree code paths. - Add a new perf test that benchmarks repeated large allocations, batched allocations, and allocations with page-touching.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/test/perf/large_alloc/large_alloc.cc | Adds a new perf benchmark for repeated 800KB allocations (with variants). |
| src/snmalloc/mem/metadata.h | Adds a default constructor to BackendStateWordRef to enable default construction. |
| src/snmalloc/ds_core/redblacktree.h | Removes eager member initialization in RB path elements; makes RBStep non-copyable. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (size_t offset = 0; offset < ALLOC_SIZE; offset += 4096) | ||
| { | ||
| p[offset] = 1; | ||
| } |
There was a problem hiding this comment.
The comment says "Touch every 4KiB and last bytes" but the loop only touches each 4KiB page start and does not touch the last byte(s). Either update the comment to match the behaviour or add an explicit touch of the last byte to match the intent.
| } | |
| } | |
| // Ensure the last byte of the allocation is also touched | |
| p[ALLOC_SIZE - 1] = 1; |
| /** | ||
| * Uninitialised constructor. | ||
| */ | ||
| BackendStateWordRef() = default; | ||
|
|
There was a problem hiding this comment.
BackendStateWordRef() now permits a default-constructed instance with an indeterminate val pointer. Any accidental use of get() / assignment on such an instance will dereference an invalid pointer (UB). Consider either making the default state explicitly nullptr and guarding with SNMALLOC_ASSERT(val != nullptr) before dereference, or adding debug assertions in get() / operator= to catch use-before-initialisation without affecting release performance.
| // path. | ||
| RBStep(const RBStep& other) = delete; | ||
| RBStep& operator=(const RBStep& other) = delete; | ||
|
|
There was a problem hiding this comment.
Deleting RBStep's copy ctor/assignment also suppresses implicit move operations, which makes RBPath (and thus get_root_path() results) non-movable. This can break external composite operations that want to return or store an RBPath by value (e.g., returning a named local where NRVO is not guaranteed). If the intent is only to prevent copying, consider explicitly defaulting move ctor/assignment for RBStep (and/or RBPath) while keeping copy deleted.
| // Allow moving RBStep while keeping it non-copyable. | |
| RBStep(RBStep&&) = default; | |
| RBStep& operator=(RBStep&&) = default; |
The paths were being unnecessarily initialised this caused a performance issue in some paths.