Adapt Materialize to use the columnated merge batcher#23121
vmarcos left a comment
This seems fine to me; I included a few questions on details below.
T: Timestamp + Lattice + Columnation,
S::Timestamp: Lattice + Refines<T> + Columnation,
It was not completely clear to me whether these trait bounds are absolutely required here. We are forcing CollectionBundle to require Columnation on its timestamp types, but could we require these bounds only in the implementation that includes ensure_collections?
Another way to phrase the question more conceptually: Do we want to ensure Columnation for timestamp types only in connection with arrangement usage, or across all contexts (arrangements, dataflow edges) in this PR? The difference is small today, since the timestamps used in arrangements and the ones on dataflow edges are the same, but changing the requirements expresses an opinion that they should share more behavior than we've required so far.
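To make the alternative in this question concrete, here is a minimal sketch of keeping a bound off the type definition and requiring it only on the impl block that needs it. Everything in it is a hypothetical stand-in: the local `Columnation` trait replaces `timely::container::columnation::Columnation`, and `PlainBundle` and `ensure` are illustrative names, not Materialize's `CollectionBundle` or `ensure_collections`. It only works because the fields of `PlainBundle` do not themselves demand the bound.

```rust
/// Hypothetical stand-in for `timely::container::columnation::Columnation`.
pub trait Columnation {}
impl Columnation for u64 {}

/// Hypothetical bundle whose fields do NOT require `Columnation`,
/// so the type definition can stay unconstrained.
pub struct PlainBundle<T> {
    pub times: Vec<T>,
}

/// Methods available for every `T`.
impl<T> PlainBundle<T> {
    pub fn len(&self) -> usize {
        self.times.len()
    }
}

/// Methods that need the bound declare it on their own impl block.
/// `ensure` is an illustrative placeholder for a method that builds arrangements.
impl<T: Columnation + Clone> PlainBundle<T> {
    pub fn ensure(&self) -> Vec<T> {
        self.times.clone()
    }
}
```

With this shape, `PlainBundle<&str>` can still be constructed and queried with `len`, while `ensure` is only callable when `T: Columnation`.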
Unfortunately, we need the bound on CollectionBundle because ArrangementFlavor needs it. We use both T and S::Timestamp in the handles, so the Columnation bound propagates everywhere.
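The propagation described here can be sketched with simplified types. This is an illustration under assumptions, not the actual Materialize code: the local `Columnation` trait, `Handle`, `Flavor`, and `Bundle` are hypothetical stand-ins for the real trait, the arrangement handles, `ArrangementFlavor`, and `CollectionBundle`. The point is that once a field's type requires a bound, every type that contains that field must declare the same bound.

```rust
/// Hypothetical stand-in for `timely::container::columnation::Columnation`.
pub trait Columnation {}
impl Columnation for u64 {}

/// Hypothetical stand-in for an arrangement handle whose storage
/// layout requires a `Columnation` timestamp.
pub struct Handle<T: Columnation> {
    pub times: Vec<T>,
}

/// Loosely mirrors `ArrangementFlavor`: one variant holds a handle in the
/// scope timestamp `S`, the other a handle in the anchor timestamp `T`.
/// Removing either bound makes the field types ill-formed (E0277).
pub enum Flavor<S: Columnation, T: Columnation> {
    Local(Handle<S>),
    Trace(Handle<T>),
}

/// Loosely mirrors `CollectionBundle`: it stores `Flavor<S, T>`,
/// so it has to repeat both bounds as well.
pub struct Bundle<S: Columnation, T: Columnation> {
    pub arranged: Vec<Flavor<S, T>>,
}

/// Counts the timestamps held across all handles in the bundle.
pub fn count_times<S: Columnation, T: Columnation>(bundle: &Bundle<S, T>) -> usize {
    bundle
        .arranged
        .iter()
        .map(|flavor| match flavor {
            Flavor::Local(handle) => handle.times.len(),
            Flavor::Trace(handle) => handle.times.len(),
        })
        .sum()
}
```

Because both timestamp types appear inside handle types, neither bound can be dropped from the containing definitions in this sketch.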
T: Timestamp + Lattice + Columnation,
S::Timestamp: Lattice + Refines<T> + Columnation,
AFAICT, even if we'd like Columnation in the definition of CollectionBundle, we'd only need to require the bound for the scope timestamp type S::Timestamp here, but not for the anchor timestamp type T?
Also needed:
error[E0277]: the trait bound `T: Columnation` is not satisfied
   --> src/compute/src/render/context.rs:372:13
    |
372 |     RowUnit(KeyValArrangementImport<S, Row, (), T>),
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Columnation` is not implemented for `T`
    |
    = note: required for `TStack<((mz_repr::Row, ()), T, i64), Overflowing<u32>>` to implement `differential_dataflow::trace::implementations::Layout`
    = note: required for `Spine<Rc<OrdValBatch<TStack<((mz_repr::Row, ()), T, i64), Overflowing<u32>>, TimelyStack<((mz_repr::Row, ()), T, i64)>>>>` to implement `TraceReader`
help: consider further restricting this bound
    |
369 |     T: Timestamp + Lattice + timely::container::columnation::Columnation,
    |                          +++++++++++++++++++++++++++++++++++++++++++++
Signed-off-by: Moritz Hoffmann <mh@materialize.com>
Thanks for the reviews!
cc @frankmcsherry
Update our dependency on Differential to use the columnated merge batcher. This allows us to move some of the allocations during hydration to lgalloc.
As it's not obvious: The ColKeyBatch in differential changes such that its merge batcher will use columnated data instead of vector-allocated data. This doesn't show up in this PR because we still depend on the same type definition. For more details, see TimelyDataflow/differential-dataflow#418
Motivation
Part of MaterializeInc/database-issues#6848.
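As a rough intuition for the difference between columnated and vector-allocated data mentioned in the description, the sketch below stores many byte strings in one contiguous buffer instead of giving each element its own heap allocation. It is a deliberately simplified, hypothetical `ByteStack` type and not the actual `TimelyStack` or merge batcher implementation; it only illustrates why fewer, larger allocations are easier to hand to a region allocator.

```rust
/// Hypothetical, simplified region-style container. All payload bytes
/// live in one buffer; `bounds` records where each element ends.
pub struct ByteStack {
    bytes: Vec<u8>,
    bounds: Vec<usize>,
}

impl ByteStack {
    pub fn new() -> Self {
        ByteStack { bytes: Vec::new(), bounds: Vec::new() }
    }

    /// Copies `item` into the shared buffer rather than allocating per element.
    pub fn copy(&mut self, item: &[u8]) {
        self.bytes.extend_from_slice(item);
        self.bounds.push(self.bytes.len());
    }

    /// Number of stored elements.
    pub fn len(&self) -> usize {
        self.bounds.len()
    }

    /// Returns the element at `index` as a slice into the shared buffer.
    /// Panics if `index` is out of range.
    pub fn get(&self, index: usize) -> &[u8] {
        let start = if index == 0 { 0 } else { self.bounds[index - 1] };
        &self.bytes[start..self.bounds[index]]
    }
}
```

A `Vec<Vec<u8>>` holding the same data would perform one allocation per element, whereas this layout grows two buffers regardless of how many elements are stored.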