Replace ToDatumIter with a push-based ExtendDatums trait#37110
Merged
frankmcsherry merged 2 commits intoJun 17, 2026
Conversation
Pre-compression prep. Reworks the datum-decoding trait so that a future compressed/entropy-coded row representation can decode into caller-provided storage, and removes the remaining places that drove `DatumSeq`'s `Iterator` impl from outside row-spine. Trait changes (src/repr/src/fixed_length.rs): - Rename `ToDatumIter` -> `ExtendDatums` and drop the iterator-based interface entirely: no more `to_datum_iter` / `DatumIter` GAT / `Sized` supertrait. The trait is now a single push-based method, completing the migration `main` already started toward `extend_datums`. - `extend_datums` now takes `&'a RowArena`. `Row` and `DatumSeq` ignore it (their datums borrow their own bytes), but a compressed representation can decode into the arena so the resulting `Datum<'a>` borrows from the arena rather than from `self`. The arena has no consumer yet; it is the enabling hook for the upcoming entropy-coding work. Stop iterating `DatumSeq` externally (src/compute/src/render/reduce.rs): - The four single-column "count" sites now decode the leading value datum via `extend_datums(arena, &mut scratch, Some(1))` and repeat the (Copy) datum, instead of `DatumSeq::next`. - The three `ReduceMinsMaxes` transpose sites decode each value row into a recycled `DatumVec` and index `decoded[r * arity + col]` (arity is uniform, guarded by `assert_eq!`), instead of advancing per-row `DatumSeq` iterators. - Adds recycled scratch `DatumVec`s alongside the existing key buffers. Dead-code cleanup (src/row-spine/src/lib.rs): - With no external caller left, the no-codec fast path in `DatumSeq::next` is production-dead and duplicated the loop in `extend_datums`; collapse `next` to delegate to `ColumnsIter`. The `Iterator` impl stays for the codec-encoded `extend_datums`/`to_row` paths and tests. Behavior is unchanged for the uncompressed path: the same datums are produced, now sourced through the arena/decode buffers. The `encode.rs` ordering test switches from `to_datum_iter` to `Row::iter`; the existing row-spine `test_round_trip` continues to cover the no-codec `DatumSeq::next` path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three more call sites iterated a trace `DatumSeq` as an `Iterator` (via `Vec::extend` / `.next()`) rather than through `extend_datums` — missed by the earlier sweep, and fatal once entropy compression is enabled (the item must be decoded first; the debug tripwire in `DatumSeq::next` fires, and a release build would read compressed bytes as datums). Surfaced by benchmarking entropy on the catalog-server cluster, which crash-looped in `build_topk_negated_stage`. - `top_k.rs`: the per-row decode buffer (`build_topk_negated_stage`), and the hash-skip + key-extend in the limit-expression branch, now decode via `extend_datums(&arena, ..)` (reading `key_datums[1..]` past the hash). - `context.rs` `as_collection` and `render.rs`'s filtered-index import closures decode `k`/`v` via `extend_datums` into a local arena before packing. No behavior change with entropy off (`extend_datums` on an uncompressed item yields the same datums as iterating it). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
antiguru
reviewed
Jun 17, 2026
Member
There was a problem hiding this comment.
Please do not edit generated files.
| ok_session: &mut Session<T, DCB>, | ||
| err_session: &mut Session<T, ECB<T>>| { | ||
| // Declared before the borrow so it outlives any datums that decode into it. | ||
| let temp_storage = RowArena::new(); |
Member
There was a problem hiding this comment.
You might be able to move this out of the closure and instead clear here.
Contributor
Author
There was a problem hiding this comment.
Will investigate in a follow-up.
| let mut datums = DatumVec::new(); | ||
| let logic = move |k: DatumSeq, v: DatumSeq, t, d, ok_session: &mut Session<T, DCB>| { | ||
| // Declared before the borrow so it outlives any datums that decode into it. | ||
| let temp_storage = RowArena::new(); |
This was referenced Jun 17, 2026
frankmcsherry
added a commit
that referenced
this pull request
Jun 18, 2026
…37113) ## What `ArrangementFlavor::flat_map` and `flat_map_ok` constructed a fresh `RowArena::new()` *inside* the per-row `logic` closure, so an arena was allocated and dropped on every row processed. This hoists the arena to the enclosing scope and `clear()`s it once per row instead, reusing its allocation spine across invocations. ## Why Addresses post-merge review feedback from @antiguru on #37110 (the `ExtendDatums` refactor): the arena is the caller-provided decode target a compressed row representation decodes into. With per-column arrangement codecs (follow-up #37111) each row actually populates the arena, so reallocating it per row would churn; clearing-and-reusing keeps the spine. Note that `RowArena::clear()` today drops the inner per-value `Vec<u8>` buffers (it clears the outer `Vec<Vec<u8>>`), so this reuses the outer spine but not the individual buffers — recycling those is a separate, larger `RowArena` change. ## Tests No behavior change; existing `compute` tests cover these paths. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
frankmcsherry
added a commit
that referenced
this pull request
Jun 18, 2026
## What Reworks `RowArena` from "one `Vec<u8>` allocation per `push_bytes`, all freed on `clear`" into a bump allocator over a stack of byte regions: - `push_bytes` **copies** bytes into the active (last) region. When that region lacks spare capacity it allocates a **new, larger** region (doubling) rather than growing the current one — so a region that already holds data is never reallocated and references returned earlier stay valid. The outer `Vec` may reallocate as regions are added, but that only moves the `Vec<u8>` headers, not the heap buffers they own. - `push_bytes` now accepts any `B: Deref<Target = [u8]>` (e.g. `Vec<u8>`, `&[u8]`). Existing `Vec<u8>` callers are unchanged; borrowed sources no longer need a throwaway allocation just to hand bytes over. - `clear` retains only the **single largest region** (emptied) and drops the rest, right-sizing the arena. An arena reused across `clear` cycles — e.g. decoding arrangement rows one at a time — becomes allocation-free in steady state. - `reserve` now reserves *bytes* in the active region (was: slots in the outer vector); `with_capacity` sizes the initial region. Neither has load-bearing callers (the `reserve` users in `expr` pass size hints). ## Why Follows up the `ExtendDatums` arena work (#37110) and the per-row arena hoist (#37113). Once the arena is the decode target for compressed arrangement rows (per-column codecs, #37111), a fresh allocation per value per row is the dominant cost; a reused doubling region removes it. The pattern mirrors columnation's region allocator. The trade-off is the standard bump-allocator one: a single very large row leaves the arena holding a large region until the next `clear`. For the per-row-cleared decode/eval paths the high-water mark is just the largest single row, so it stays bounded. ## Tests Adds `mz_ore::test`s in `repr::row` covering: - references from `push_bytes` remaining valid after later pushes force new regions, - `push_unary_row` reading a row back from a non-zero region offset (confirms decoding is position/alignment independent), - reuse across `clear`. No behavior change for callers; no release note. --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Replaces the GAT-based
ToDatumItertrait ("hand me a borrowingDatumiterator") with a push-basedExtendDatumstrait, and removes the remaining places that iterated a trace row (DatumSeq) directly.ToDatumIter→ExtendDatums, a single method:extend_datums(&self, arena: &RowArena, target: &mut Vec<Datum>, max: Option<usize>).Drops the
DatumIter<'a>GAT, theto_datum_itermethod, and theSizedsupertrait.&RowArenaparameter lets an implementor decode into caller-provided storage and returnDatums that borrow from the arena rather than from&self.Rowand row-spine'sDatumSeqignore it today; it is the hook a future compressed row representation needs — anyDatum<'a>tied to&'a selfis fundamentally incompatible with a representation whose decoded bytes don't live inself.DatumSeq's inherentIterator(reduce, delta/linear joins, peek, top_k, and render'sas_collection/filtered-index import) toextend_datums, so the only remainingDatumSeq::Iteratorusers are row-spine internals.Why
A no-op refactor that finishes the push-based datum-decode interface (
mainhad already started theextend_datumsmigration) and adds the arena hook. It is a prerequisite for compressing in-memory arrangements (per-column codecs, prototyped separately — early TPCH benchmarks show ~-65% arrangement memory); landing it on its own keeps that follow-up small and reviewable.Tests
Existing
repr,row-spine, andcomputetests cover the changed paths. Theencode.rsordering test switches fromto_datum_itertoRow::iter. No behavior change; no release note.