row-spine: per-column arrangement compression (alpha, default off) by frankmcsherry · Pull Request #37111 · MaterializeInc/materialize

frankmcsherry · 2026-06-17T19:58:42Z

Motivation

Row-spine arrangements currently compress at the whole-row level via the
just-shipped dictionary codec (enable_arrangement_dictionary_compression_alpha).
That works well when whole rows repeat, but leaves a lot on the table for wide
rows where individual columns have exploitable structure (low-cardinality enums,
clustered integers, repeated strings) but the rows as a whole are nearly unique.

This PR adds a per-column compression framework: each column in a row gets
its own codec, chosen independently from a menu, based on batch-wide statistics
collected at seal time.

What's here

A new mz_row_spine::codec module:

ColumnCodec — a self-delimiting per-column codec enum: Raw,
Constant, Huffman (boxed canonical Huffman, entropy), For
(Frame-of-Reference = subtract-min then bitpack to a fixed byte width), and
Dictionary. Because each codec is self-delimiting, they compose freely down
a row.
RowCodec — encodes/decodes a whole row by dispatching column-by-column.
RowStats / ColumnStats — accumulate per-column statistics over a batch
and pick the smallest-estimated codec per column, charging the per-codec table
storage (dictionary maps, Huffman tables) against the estimate so fixed
overhead can't lose us bytes on small or high-cardinality columns.
Canonical Huffman byte codec (huffman.rs) with forward-streaming decode.

Decode is deferred to the operator that needs datums: ReadItem carries the
compressed bytes plus a reference to the codec, and decode happens into a
thread-local scratch for comparisons or into the caller's RowArena for
extend_datums / to_row.

Relationship to dictionary compression

This is purely additive and orthogonal to the existing dictionary codec:

It makes zero changes to the old dictionary path (ColumnsCodec /
dictionary builder / row_codec module).
It uses a separate container field (column_codec, alongside the existing
codec) and a separate flag
(enable_arrangement_column_compression_alpha, default off).
The two are mutually exclusive at runtime: build_column_codec returns
None whenever a dictionary codec is installed.

So the recently-shipped dictionary feature is untouched and independently
controllable; either flag can be flipped or walked back with no effect on the
other.

Flag wiring

enable_arrangement_column_compression_alpha mirrors the dictionary flag end to
end: captured once at replica creation, threaded through
InstanceConfig/ReplicaConfig, and applied as a one-shot store into the
process-global COLUMN_COMPRESSION flag at instance creation (deliberately not
re-applied on config update, so flipping it never retroactively rewrites a live
replica's arrangements).

Testing

Unit tests in codec.rs / huffman.rs cover round-trip encode/decode for each
codec and the selector. Manual validation against a TPCH load generator showed
per-column selection reducing arrangement footprint substantially on wide tables
(e.g. ~-56% on lineitem, ~-43% across a mixed TPCH set) versus uncompressed,
competitive with or better than whole-row dictionary while decoding more cheaply.

This release will add an alpha enable_arrangement_column_compression_alpha
feature flag (default off) for per-column arrangement compression.

🤖 Generated with Claude Code

@antiguru

…37113) ## What `ArrangementFlavor::flat_map` and `flat_map_ok` constructed a fresh `RowArena::new()` *inside* the per-row `logic` closure, so an arena was allocated and dropped on every row processed. This hoists the arena to the enclosing scope and `clear()`s it once per row instead, reusing its allocation spine across invocations. ## Why Addresses post-merge review feedback from @antiguru on #37110 (the `ExtendDatums` refactor): the arena is the caller-provided decode target a compressed row representation decodes into. With per-column arrangement codecs (follow-up #37111) each row actually populates the arena, so reallocating it per row would churn; clearing-and-reusing keeps the spine. Note that `RowArena::clear()` today drops the inner per-value `Vec<u8>` buffers (it clears the outer `Vec<Vec<u8>>`), so this reuses the outer spine but not the individual buffers — recycling those is a separate, larger `RowArena` change. ## Tests No behavior change; existing `compute` tests cover these paths. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

## What Reworks `RowArena` from "one `Vec<u8>` allocation per `push_bytes`, all freed on `clear`" into a bump allocator over a stack of byte regions: - `push_bytes` **copies** bytes into the active (last) region. When that region lacks spare capacity it allocates a **new, larger** region (doubling) rather than growing the current one — so a region that already holds data is never reallocated and references returned earlier stay valid. The outer `Vec` may reallocate as regions are added, but that only moves the `Vec<u8>` headers, not the heap buffers they own. - `push_bytes` now accepts any `B: Deref<Target = [u8]>` (e.g. `Vec<u8>`, `&[u8]`). Existing `Vec<u8>` callers are unchanged; borrowed sources no longer need a throwaway allocation just to hand bytes over. - `clear` retains only the **single largest region** (emptied) and drops the rest, right-sizing the arena. An arena reused across `clear` cycles — e.g. decoding arrangement rows one at a time — becomes allocation-free in steady state. - `reserve` now reserves *bytes* in the active region (was: slots in the outer vector); `with_capacity` sizes the initial region. Neither has load-bearing callers (the `reserve` users in `expr` pass size hints). ## Why Follows up the `ExtendDatums` arena work (#37110) and the per-row arena hoist (#37113). Once the arena is the decode target for compressed arrangement rows (per-column codecs, #37111), a fresh allocation per value per row is the dominant cost; a reused doubling region removes it. The pattern mirrors columnation's region allocator. The trade-off is the standard bump-allocator one: a single very large row leaves the arena holding a large region until the next `clear`. For the per-row-cleared decode/eval paths the high-water mark is just the largest single row, so it stays bounded. ## Tests Adds `mz_ore::test`s in `repr::row` covering: - references from `push_bytes` remaining valid after later pushes force new regions, - `push_unary_row` reading a row back from a non-zero region offset (confirms decoding is position/alignment independent), - reuse across `clear`. No behavior change for callers; no release note. --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Addresses review feedback on MaterializeInc#37111: scope `enable_arrangement_column_compression_alpha` to the replica (`ParameterScope::Replica`) so it can carry per-replica / per-size-family overrides, like the other replica-local physical flags (lgalloc, persist pager, column-paged batcher). This also replaces the manual capture chain the flag previously borrowed from dictionary compression — InstanceConfig/ReplicaConfig fields, the controller's per-replica capture, and the `handle_create_instance` store — with the replica-scoped idiom: the per-replica-resolved value arrives in the replica's `worker_config`, and `apply_worker_config` mirrors it into the process-global `mz_row_spine::COLUMN_COMPRESSION` flag (next to the lgalloc-region store). The controller-level capture read only the environment value and would have missed per-replica overrides. Re-applying on each config tick is safe: every batch records its own codec, so a flip only affects batches sealed afterwards. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

antiguru · 2026-06-18T14:49:57Z

Edit: I @antiguru actually requested the review, but Claude didn't know, and got confused.

Review — row-spine per-column arrangement compression

This review was written by Claude (Claude Code), at the request of @frankmcsherry / the PR author.

TL;DR

The feature code (codec.rs, huffman.rs, the lib.rs wiring) is genuinely good: clean, well-documented, correctly self-delimiting codecs, sound mutual-exclusion with the dictionary path, and a default-off alpha flag wired exactly like its predecessor. But the PR as it stands is not mergeable, for one mechanical reason and a couple of correctness/robustness gaps worth closing before this goes in even behind a flag.

❌ Blocker 1 — 134,008 lines of unrelated junk

The final commit 6d74d99 ("compute: make per-column compression flag replica-scoped") swept in 80 unrelated working files alongside the legitimate one-line flag-scoping change:

misc/build-time/* (incl. a 55,905-line cargo-timing.html, llvm-lines dumps, logs)
misc/decoupling-scans/* (~14k lines of scan output)
play/optimizer-rebuild/*, play/mz-bridge-recipe/package-lock.json
misc/demo-data/*, .agents/skills/mz-demo-data, FEATURE_FLAG_AUDIT.md

None are referenced by the feature. The real diff is just:

src/compute-types/src/dyncfgs.rs |  13 +
src/compute/src/compute_state.rs |  12 +
src/row-spine/src/codec.rs       | 592 +
src/row-spine/src/huffman.rs     | 410 +
src/row-spine/src/lib.rs         | 320 +

plus the two Python flag-registration files (mzcompose/__init__.py, parallel_workload/action.py), which are legitimate (bin/lint-test-flags requires them).

Action: rebuild the last commit so it carries only the flag-scoping change (git reset the junk paths out + git commit --amend, or interactive-rebase 6d74d99). Until then the diff is unreviewable in a normal view and will trip bin/lint on a pile of these files.

Correctness

2. Variable / mixed arity is silent corruption in release (robustness gap).
RowStats::observe grows its columns vec to the max arity ever seen, and choose() uses each column's own count. But RowCodec::encode indexes self.columns[index] per row column — a row with more columns than the codec models panics (index OOB); a row with fewer leaves count != self.columns.len(), caught only by a debug_assert_eq! (codec.rs:218) that's gone in release. And decode_into always walks all self.columns, so a short record makes For/Dictionary::decode slice cursor[..width] off the end → panic even in release, or Raw reads into the next record → corruption.

build_column_codec already hints the author saw this (it skips empty rows when gathering stats), but that's a half-measure: if a container ever holds both an empty (arity-0) row and non-empty rows, the empty row still goes through push_into → encode emits a 0-length record, and decoding it walks N columns off the end. In practice a single key/val container has uniform arity, so I believe this is not reachable today — but it's an unenforced invariant guarded only by a debug assert. Recommendation: have RowStats track the first row's arity and build() return None if any observed row's arity differs (and don't silently skip empties — they're an arity signal). Turns a latent release-mode panic into a clean "no codec."

3. HuffmanCode::from_frequencies doc/behavior mismatch. The doc says it returns None "when fewer than two distinct bytes appear," but code_lengths returns Some for the single-symbol case (length 1). So a single-distinct-byte column does build a code (1 bit/byte). Correct and round-trips, but the comment is wrong and no test exercises single-symbol selection through ColumnCodec::Huffman. Fix the comment, or actually return None for present.len() == 1 if exclusion was intended.

4. FoR / be_uint — verified correct ✓. Checked against the row encoding (Tag::NonNegativeInt64_24 = tag byte + N significant bytes). FoR is pure big-endian byte arithmetic over the fixed-length pattern including the tag, so it round-trips exactly regardless of tag/sign, correctly gated on fixed_len && len <= 8 && max_int > min_int, and width from max-min means it can never lose to Raw. round_trip_for_numeric is valid.

5. Dictionary codes are insertion-ordered, not sorted. observe assigns code = distinct.len() on first sight. Fine for correctness (decode is a lookup), but worth a one-line comment so a future reader doesn't assume order-preservation — and it forecloses the compare optimization below.

Performance (flag is off, so non-blocking — but this is whether it can ship)

6. The comparison path fully decodes both operands. PartialEq/PartialOrd on encoded items decode both whole rows into thread-local scratch (Huffman bit-by-bit, dict lookups) then compare raw bytes. Differential merge/consolidate/seek is comparison-bound, so with the flag on this is O(full row) per comparison instead of O(distinguishing prefix). This is the single biggest risk to viability and deserves a dedicated benchmark (merge throughput, flag on vs off) before leaving alpha. The two-buffer thread-local design itself is sound. Possible win: decode column-by-column, stop at first differing column (Constant compares free); making Dictionary/For order-preserving would let those compare encoded, but Huffman isn't order-preserving so a general fast path needs the lazy column-wise decode.

7. Per-row allocations on the read/scan path. to_row does Vec::new() per call; extend_datums (the documented hot scan path) does Vec::new() and then arena.push_bytes(buf) — an allocation plus a copy per row, vs. the no-codec path's allocation-free tight loop. Both should reuse a thread-local scratch like materialized already does.

8. Huffman::decode_one is bit-at-a-time — O(max_len) branches + bounds-checked read_bit per symbol. A length-limited (MAX_BITS=24) table-driven decode would be much faster, worth it if Huffman is actually selected often (see test gap below).

9. Merge re-encodes everything (inherent, fine). merge_capacity correctly drops the codec and push_into decodes each incoming item to a Row then re-encodes under a fresh codec — unavoidable since codecs aren't reusable across merges; just another full decode + to_row alloc per row during merges.

Test coverage gaps

Huffman selection end-to-end. No test asserts ColumnCodec::Huffman is ever chosen — round_trip_text_heavy and round_trip_dictionary_lowcard land on Dictionary, round_trip_for_numeric on For. So the Huffman arm through RowCodec and the model-cost accounting in choose() are never exercised. Add a case engineered to make Huffman win (high-cardinality, skewed bytes, >65,536 distinct so Dictionary is disqualified).
The merge path (push_into with item.column_codec.is_some()) — never tested; the container test only installs a codec and pushes fresh rows.
Variable/empty-row arity (finding #2) — once the guard is added.
MAX_BITS decline path in code_lengths — uncovered.
DICT_MAX_CARDINALITY boundary — exactly 65,536 / 65,537 distinct to confirm 2-byte width and fall-off to None.

Minor nits

codec.rs:388 let _ = best_bytes; — the final best_bytes write in the Dictionary arm is dead; the discard silences the warning but reads oddly. A short comment or dropping the last assignment would help.
CMP_SCRATCH_A/B thread-locals never shrink (retain the largest row seen for the thread's life) — negligible, worth a comment.
self_code (codec.rs:192) is a one-line indirection over index.get(...).expect(...).
compute_state.rs "Safe to re-apply on every tick" is the right call and well-justified. 👍

Suggested benchmarks

Merge/consolidate throughput, flag on vs off on a wide low-cardinality arrangement (e.g. TPCH lineitem) — exercises the full-row-decode comparison path (#6). This is the gating measurement.
Arrangement scan latency (extend_datums) on/off — captures the per-row alloc regression (#7).
Memory footprint on/off across the TPCH set — reproduce the "-56% lineitem / -43% mixed" claim in a repeatable harness (Feature Benchmark / mz-benchmark entry so CI tracks it).
Seal cost on high-cardinality columns — the per-batch stats pass builds a BTreeMap up to 65k entries per column and then discards it; confirm seal stays cheap.

Net: solid, well-isolated feature behind an off-by-default flag. Strip the junk commit (blocker), add the arity guard, and fill the Huffman-selection + merge-path test gaps; treat the comparison-path decode cost as the key thing to measure before promoting past alpha.

Generated by Claude Code

First piece of the entropy-coded row work. A self-contained, tested canonical Huffman coder over the byte alphabet, seeded from batch-wide frequencies. Not yet wired into the containers (allowed unused for now); integration into the codec slot follows. - `HuffmanCode` is described entirely by its 256 per-byte code lengths; canonical codes and decode tables derive from the lengths, so a stored model costs only the lengths. Built from frequencies via standard Huffman; declines (returns `None`) when <2 symbols appear or the optimal code would exceed a 24-bit bound (caller then leaves the data uncompressed). - Decoding is forward-streaming (read bits MSB-first, emit one symbol at a time with O(1) state) so that comparisons can decode operands without materializing them — the property that lets `ReadItem::Ord` avoid a per-item allocation. - `FrequencyCounter` accumulates the model; `BitWriter`/`BitReader` do MSB-first bit packing. Tests round-trip empty, single/two-symbol, skewed, uniform (verifying exactly 8-bit codes, i.e. no expansion), and pseudo-random inputs, and check that a code rebuilt from its lengths matches the original. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Foundation for per-column compression of arrangements: each column is compressed independently with a codec chosen from batch-wide statistics, instead of one scheme for the whole row. Standalone + tested; container wiring follows. - `ColumnCodec` (enum, one variant per scheme): `Raw`, `Constant` (column is one value batch-wide -> zero bytes/row), `Huffman`. Codecs are self-delimiting in the encoded stream, so they compose: `RowCodec` decodes a row by walking the column codecs in order, each consuming exactly its bytes — no central length-prefix block. New schemes (FoR = subtract-min then bit-pack, dictionary) slot in as a variant plus a size estimate. - `RowStats` selects per column the smallest-estimated codec (including `Raw`), and `build()` returns `None` when every column is best left `Raw`. This structurally avoids the overhead trap the experiments exposed (uniform per-column Huffman and dictionary both blew up small many-column arrangements): cheap columns stay `Raw`/`Constant`, expensive tables are installed only where they pay. Built on the ExtendDatums/arena iterator cleanup; reuses the canonical Huffman coder. Tests cover mixed/constant/text-heavy round-trips and that an all-raw batch yields no codec. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Wires the per-column codec framework end-to-end behind a default-off COLUMN_COMPRESSION flag, reusing the decode-on-read pattern: a container holds an optional RowCodec, records are stored RowCodec-encoded, and DatumSeq carries the compressed bytes + codec ref and decodes on demand. - DatumContainer gains `column_codec: Option<RowCodec>` + a staging buffer, threaded through with_capacity/merge_capacity/clear/heap_size; merge drops the codec (rebuilt at next seal). push_into encodes the raw row columns per-column (implies no dictionary codec); index defers decode. - DatumSeq gains `column_codec: Option<&RowCodec>`. to_row/into_owned/clone_onto decode the record to raw row bytes; cmp/eq decode each operand into a thread-local scratch (two, non-aliasing); extend_datums decodes into the arena; next/bytes_iter keep entropy.is_none()-style tripwires. - seal() builds + installs a RowCodec per key/value container via RowStats over the raw columns (all four builders); column compression layers over raw, so it is skipped when a dictionary codec is present. Flag off => no codec built => unchanged behavior. New container round-trip test covers encode/decode, compare/seek, and extend_datums for codec-encoded items. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…lection Fixes the per-column overhead the catalog benchmark exposed: `ColumnCodec` was an enum sized to its largest variant (`Huffman`'s ~1.3 KB inline tables), so every column — even `Raw`/`Constant` — cost ~1.3 KB in the per-column `Vec`, blowing up small many-column arrangements (e.g. a 7-row catalog index 750 B -> ~9 KB). - `Huffman(Box<HuffmanCode>)`: the enum is now pointer-sized, so `Raw`/`Constant` columns cost only a discriminant + pointer; the table is heap-allocated only for columns that actually use Huffman. - The selector now charges the model's storage (its inline tables) against the Huffman estimate, so Huffman is adopted only when per-row savings outweigh the one-time table — keeping it off small columns where the table would dominate. `heap_size` accounts the boxed table. Test now uses enough rows that Huffman is selected (exercising its decode path); small batches correctly fall back to Raw. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Fills out the per-column codec menu so the selector can fit numeric and low-cardinality columns, not just text. - `For` (frame-of-reference): for fixed-width (<= 8 byte) columns, interpret the bytes as a big-endian integer and store `value - min` in the fewest bytes that hold `max - min`. It is pure arithmetic on the byte pattern (no datum-type knowledge), so it always reconstructs exactly; it simply isn't *chosen* when the byte order leaves residuals large, so it's safe to offer everywhere. - `Dictionary` (per-column): map each distinct value to a fixed-width code into a value table, for columns up to 2^16 distinct values. `ColumnStats` now also tracks fixed length, min/max integer, and distinct values (bounded); `choose` estimates FoR and dictionary sizes (table storage charged) alongside Huffman/Constant/Raw and picks the smallest. New round-trip tests cover FoR-numeric and dictionary-low-cardinality selection. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…_compression_alpha Add an `enable_arrangement_column_compression_alpha` dyncfg (default off) that controls per-column arrangement compression, mirroring the existing `enable_arrangement_dictionary_compression_alpha` wiring end to end: captured once at replica creation, threaded through InstanceConfig/ReplicaConfig, and applied as a one-shot store into the process-global COLUMN_COMPRESSION flag at instance creation. The two flags are independent and the codecs are mutually exclusive at runtime. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… lists bin/lint-test-flags requires every system flag to be known to parallel-workload's FlipFlagsAction and to the mzcompose system-parameter lists. Mirror the dictionary-compression flag's entries for the new per-column compression flag. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Addresses review feedback on MaterializeInc#37111: scope `enable_arrangement_column_compression_alpha` to the replica (`ParameterScope::Replica`) so it can carry per-replica / per-size-family overrides, like the other replica-local physical flags (lgalloc, persist pager, column-paged batcher). This also replaces the manual capture chain the flag previously borrowed from dictionary compression — InstanceConfig/ReplicaConfig fields, the controller's per-replica capture, and the `handle_create_instance` store — with the replica-scoped idiom: the per-replica-resolved value arrives in the replica's `worker_config`, and `apply_worker_config` mirrors it into the process-global `mz_row_spine::COLUMN_COMPRESSION` flag (next to the lgalloc-region store). The controller-level capture read only the environment value and would have missed per-replica overrides. Re-applying on each config tick is safe: every batch records its own codec, so a flip only affects batches sealed afterwards. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Review follow-ups on MaterializeInc#37111: - Correctness: `RowCodec::decode_into` walks a fixed column count, so a row whose arity differs from the codec's would read past its record (a panic, or silent corruption in release, caught before only by a debug assert). `RowStats` now tracks the first observed row's arity and `build()` declines on any mismatch. `build_column_codec` no longer skips empty rows — an empty (arity-0) row is a real arity signal, so a mixed-arity container correctly gets no codec. - Docs/comments: fix `HuffmanCode::from_frequencies` doc (a single distinct byte yields a 1-bit code; `None` only for empty input or codes exceeding MAX_BITS); note dictionary codes are insertion-ordered (not order-preserving); explain the intentionally-unread final `best_bytes` write and that CMP_SCRATCH buffers don't shrink. - Tests: Huffman selected end-to-end (high-cardinality skewed column), the merge re-encode path (push an encoded item into another codec'd container), mixed / empty-row arity declines, `code_lengths` MAX_BITS decline, and the `DICT_MAX_CARDINALITY` retain/drop boundary. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

DAlperin · 2026-06-18T16:24:27Z

+#[allow(dead_code)]
+mod codec;
+#[allow(dead_code)]
+mod huffman;


Not actually dead anymore I think?

DAlperin · 2026-06-18T16:25:56Z

+            if let Some(codec) = self.column_codec {
+                // `iter.data` is RowCodec-encoded; decode to raw row bytes.
+                let mut buf = Vec::new();
+                codec.decode_into(self.iter.data, &mut buf);
+                return unsafe { Row::from_bytes_unchecked(&buf) };
+            }


Probably better to decode into an arena here if one is available than allocate per row

Yup. We don't have that until #37115 lands, I think.

I should go read that then :)

DAlperin · 2026-06-18T17:13:51Z

+        let mut best = ColumnCodec::Raw;
+        let mut best_bytes = self.raw_bytes;
+
+        if let Some(code) = self.freq.build() {


Probably pretty minor in the grand scheme of things but we do the expensive check first. Calling build on the huffman code calls code_lengths which is slightly expensive (BinaryHeap over >= 256 symbols plus allocs). One could optimize this by running the cheap codecs first to lower best_bytes and then gate the huffman build behind a cheap lower bound like let huffman_floor = self.count.saturating_mul(2).saturating_add(size_of::<HuffmanCode>() + 256);

Again, pretty minor and might need some stats/profile to prove if its really a problem.

antiguru

Seems good, but I'd like to see a test/nightly run with the flag enabled. At the moment it seems mostly untested delta the unit tests (thanks for adding!)

I'll approve, don't see a blocker here.

antiguru · 2026-06-18T19:48:55Z

+        // NB: per-column compression is replica-scoped and applied in `apply_worker_config`
+        // (called just above), not captured on `InstanceConfig`.


Nit: Comment reads like a note-to-self.

antiguru · 2026-06-18T19:58:23Z

+        // off the end of an empty record at decode time.
+        match self.arity {
+            None => self.arity = Some(count),
+            Some(arity) if arity != count => self.mixed_arity = true,


In what situation would we expected mixed arity rows?

antiguru · 2026-06-18T20:01:52Z

            "true" if version >= MzVersion.parse_mz("v0.132.0-dev") else "false"
        ),
        "enable_alter_swap": "true",
+        "enable_arrangement_column_compression_alpha": "false",


I'd like to see a test/nightly run with compression enabled.

frankmcsherry requested a review from a team as a code owner June 17, 2026 19:58

frankmcsherry mentioned this pull request Jun 17, 2026

compute: hoist the flat_map decode arena out of the per-row closure #37113

Merged

frankmcsherry requested a review from a team as a code owner June 17, 2026 20:22

frankmcsherry mentioned this pull request Jun 17, 2026

repr: back RowArena with a doubling region allocator #37114

Merged

antiguru reviewed Jun 18, 2026

View reviewed changes

Comment thread src/compute-types/src/dyncfgs.rs Outdated

frankmcsherry force-pushed the pr-column-codecs branch from 6d74d99 to 5ea5db5 Compare June 18, 2026 14:41

frankmcsherry and others added 9 commits June 18, 2026 10:50

row-spine: shorten CMP_SCRATCH static decls to satisfy rustfmt max_width

db4cf16

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

frankmcsherry force-pushed the pr-column-codecs branch from 5ea5db5 to c360dd8 Compare June 18, 2026 14:52

DAlperin reviewed Jun 18, 2026

View reviewed changes

frankmcsherry force-pushed the pr-column-codecs branch from aef6edc to 7a0f6ea Compare June 18, 2026 16:43

DAlperin reviewed Jun 18, 2026

View reviewed changes

antiguru approved these changes Jun 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

row-spine: per-column arrangement compression (alpha, default off)#37111

row-spine: per-column arrangement compression (alpha, default off)#37111
frankmcsherry wants to merge 10 commits into
MaterializeInc:mainfrom
frankmcsherry:pr-column-codecs

frankmcsherry commented Jun 17, 2026

Uh oh!

Uh oh!

antiguru commented Jun 18, 2026 •

edited

Loading

Uh oh!

DAlperin Jun 18, 2026

Uh oh!

DAlperin Jun 18, 2026

Uh oh!

frankmcsherry Jun 18, 2026

Uh oh!

DAlperin Jun 18, 2026

Uh oh!

DAlperin Jun 18, 2026

Uh oh!

antiguru left a comment

Uh oh!

antiguru Jun 18, 2026

Uh oh!

antiguru Jun 18, 2026

Uh oh!

antiguru Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		// NB: per-column compression is replica-scoped and applied in `apply_worker_config`
		// (called just above), not captured on `InstanceConfig`.

Conversation

frankmcsherry commented Jun 17, 2026

Motivation

What's here

Relationship to dictionary compression

Flag wiring

Testing

Uh oh!

Uh oh!

antiguru commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review — row-spine per-column arrangement compression

TL;DR

❌ Blocker 1 — 134,008 lines of unrelated junk

Correctness

Performance (flag is off, so non-blocking — but this is whether it can ship)

Test coverage gaps

Minor nits

Suggested benchmarks

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

antiguru left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

antiguru commented Jun 18, 2026 •

edited

Loading