compute: hoist the flat_map decode arena out of the per-row closure#37113
Merged
frankmcsherry merged 1 commit intoJun 18, 2026
Merged
Conversation
`flat_map`/`flat_map_ok` created a fresh `RowArena` inside the per-row logic closure, so a new arena was allocated and dropped for every row processed. Hoist it to the enclosing scope and `clear()` it per row instead, reusing the arena's allocation spine across invocations. This matters more now that the arena is the decode target for compressed arrangement rows (per-column codecs), where each row actually populates it. Addresses post-merge review feedback on MaterializeInc#37110. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
frankmcsherry
added a commit
that referenced
this pull request
Jun 18, 2026
## What Reworks `RowArena` from "one `Vec<u8>` allocation per `push_bytes`, all freed on `clear`" into a bump allocator over a stack of byte regions: - `push_bytes` **copies** bytes into the active (last) region. When that region lacks spare capacity it allocates a **new, larger** region (doubling) rather than growing the current one — so a region that already holds data is never reallocated and references returned earlier stay valid. The outer `Vec` may reallocate as regions are added, but that only moves the `Vec<u8>` headers, not the heap buffers they own. - `push_bytes` now accepts any `B: Deref<Target = [u8]>` (e.g. `Vec<u8>`, `&[u8]`). Existing `Vec<u8>` callers are unchanged; borrowed sources no longer need a throwaway allocation just to hand bytes over. - `clear` retains only the **single largest region** (emptied) and drops the rest, right-sizing the arena. An arena reused across `clear` cycles — e.g. decoding arrangement rows one at a time — becomes allocation-free in steady state. - `reserve` now reserves *bytes* in the active region (was: slots in the outer vector); `with_capacity` sizes the initial region. Neither has load-bearing callers (the `reserve` users in `expr` pass size hints). ## Why Follows up the `ExtendDatums` arena work (#37110) and the per-row arena hoist (#37113). Once the arena is the decode target for compressed arrangement rows (per-column codecs, #37111), a fresh allocation per value per row is the dominant cost; a reused doubling region removes it. The pattern mirrors columnation's region allocator. The trade-off is the standard bump-allocator one: a single very large row leaves the arena holding a large region until the next `clear`. For the per-row-cleared decode/eval paths the high-water mark is just the largest single row, so it stays bounded. ## Tests Adds `mz_ore::test`s in `repr::row` covering: - references from `push_bytes` remaining valid after later pushes force new regions, - `push_unary_row` reading a row back from a non-zero region offset (confirms decoding is position/alignment independent), - reuse across `clear`. No behavior change for callers; no release note. --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
ArrangementFlavor::flat_mapandflat_map_okconstructed a freshRowArena::new()inside the per-rowlogicclosure, so an arena wasallocated and dropped on every row processed. This hoists the arena to the
enclosing scope and
clear()s it once per row instead, reusing its allocationspine across invocations.
Why
Addresses post-merge review feedback from @antiguru on #37110 (the
ExtendDatumsrefactor): the arena is the caller-provided decode target acompressed row representation decodes into. With per-column arrangement codecs
(follow-up #37111) each row actually populates the arena, so reallocating it per
row would churn; clearing-and-reusing keeps the spine.
Note that
RowArena::clear()today drops the inner per-valueVec<u8>buffers(it clears the outer
Vec<Vec<u8>>), so this reuses the outer spine but not theindividual buffers — recycling those is a separate, larger
RowArenachange.Tests
No behavior change; existing
computetests cover these paths.