WMR soft limit#18966
Conversation
1fa57d7 to
ac23eef
Compare
|
Ready for review! @MaterializeInc/surfaces, @aalexandrov, @vmarcos (rendering). |
| } | ||
| CteBlock::MutuallyRecursive(list) => { | ||
| for cte in list.iter() { | ||
| CteBlock::MutuallyRecursive(MutRecBlock { |
There was a problem hiding this comment.
As an aside (don't need to do anything in this PR): having logic in the parser is unfortunate and it'd be nice for it to not be here.
philip-stoev
left a comment
There was a problem hiding this comment.
Thank you very much for the tests -- I could not figure out any additional ones to add.
The feature held under manual experimentation and I could not get it to become wedged .
The diff looks less formidable with whitespace ignored.
|
Actually here is a test that you could push -- it confirms that the number of iterations is reset at every timestamp: |
teskje
left a comment
There was a problem hiding this comment.
General compute parts LGTM, I just have a couple stylistic comments. I'll defer to @aalexandrov for the transform changes.
| repeated mz_expr.id.ProtoLocalId ids = 1; | ||
| repeated ProtoPlan values = 2; | ||
| repeated uint64 max_iters = 4; | ||
| repeated bool max_iters_present = 5; |
There was a problem hiding this comment.
I assume there is no repeated optional uint64 that would make this second field unnecessary?
Relatedly, but not really relevant to this PR, I have been wondering why LetRec is represented with separate lists for ids and values (and now max_iters). ISTM that instead having a single bindings list with (id, value, max_iter) entries would be more convenient since it doesn't require ensuring that the lists have the same length all the time. But I'm probably missing something.
There was a problem hiding this comment.
We discussed that this might be a better representation but decided to not change it as part of the ongoing epic. I think it must be a skunkworks project or something similar, unless we find ~3-4 days of in-band work to do this as part of addressing tech debt.
There was a problem hiding this comment.
Makes sense, thanks!
Regarding max_iters_present, Alex mentioned in another comment:
One general comment: we should absolutely prohibit people setting the MAX_ITERATIONS for some binding to 0, otherwise we run the risk of all sorts of incorrect optimizations for those blocks (or unnecessary complicated transform code).
Doing that would allow us to drop max_iters_present here and just use 0 to mean "no limit set".
There was a problem hiding this comment.
Yes, I'm throwing an error for 0. (while creating the HIR from the AST)
I thought about that, but I thought the code is cleaner this way. Special values are always a little bit scary; maybe somebody decides to suddenly allow 0. But in this case the risk would be low, so I can change it if that is preferred.
There was a problem hiding this comment.
I'd prefer it, since it statically removes the possibility that the two lists might become inconsistent (e.g. have different lengths), so there is one thing less to check at runtime. But your argument about special values is valid too, so I don't oppose leaving things as they are.
vmarcos
left a comment
There was a problem hiding this comment.
Rendering changes look good to me; it would be ideal if we'd include test(s) for nested WMR blocks with different limits also in execution (test/sqllogictest/with_mutually_recursive.slt), though, not only in planning.
| Get::PassArrangements l1 | ||
| raw=true | ||
| With Mutually Recursive | ||
| cte [MaxIterations None] l1 = |
There was a problem hiding this comment.
Can we not print out [MaxIterations None] (which I assume would be the default most of the time)?
There was a problem hiding this comment.
nit: stylistically parameters of AST nodes have been printed in snake_case elsewhere and key value pairs use $key=$val, so I think
cte [max_iterations=10] l1 =
is a bit more consistent with the rest of the format.
|
One general comment: we should absolutely prohibit people setting the |
|
Thanks for the reviews! In addition to addressing the inline comments, I made the following changes: I added the test from @philip-stoev. I added the test that @vmarcos suggests. I changed the LIR EXPLAIN to not print it if None, as @aalexandrov suggested.
Yes, I'm throwing an error for 0. (while creating the HIR from the AST)
Hmm, nice! I changed the code to use this. (I didn’t properly implement |
| ids: ids.into_proto(), | ||
| max_iters: max_iters | ||
| .into_iter() | ||
| .map(|d| match d { |
There was a problem hiding this comment.
minor nit: can you add this
impl RustType<u64> for Option<NonZeroU64> {
fn into_proto(&self) -> u64 {
match self {
Some(d) => d.get(),
None => 0,
}
}
fn from_proto(proto: u64) -> Result<Self, TryFromProtoError> {
Ok(NonZeroU64::new(proto))
}
}after this line
rust_type_id![bool, f32, f64, i32, i64, String, u32, u64, Vec<u8>];and then use max_iters.into_proto() and proto.max_iters.into_rust()?? This will allow other people that want to encode Option<NonZeroU64>> without the boilerplate.
aalexandrov
left a comment
There was a problem hiding this comment.
Looks good, from my original suggestions I think the only thing missing is changing the output format of the new attribute from
[MaxIterations $x]
to
[max_iterations=$x]
ea7a9ba to
8192f22
Compare
|
I've addressed all comments, including changing the syntax based on the SQL design principles notion doc and SELECT's expected group size option. An example for the syntax: We can decide the exact keywords later, after we finalize the keywords for WITH MUTUALLY RECURSIVE itself. (One option that came up was WITH REPEATEDLY, which doesn't have RECURSIVE in it, so the word ITERATION would be ok for it.) @benesch, @ggnall, could you please take a quick look at the syntax? The surfaces code changed because of the different option parsing. @mjibson, could you please check the new parsing and planning? I also updated the design doc. |
def-
left a comment
There was a problem hiding this comment.
Some interesting spots from code coverage report marked inline: https://buildkite.com/materialize/coverage/builds/89
| tokens.insert(object.id, object_token); | ||
| // Import declared indexes into the rendering context. | ||
| for (idx_id, idx) in &dataflow.index_imports { | ||
| let export_ids = dataflow.export_ids().collect(); |
There was a problem hiding this comment.
This is indeed not covered, thanks! I'll add a test for this. (Although this PR makes only a trivial change to this part, but it's still somewhat new code. It was introduced with Frank's initial implementation of WMR.)
|
|
||
| if ctx.config.linear_chains { | ||
| writeln!(f, "{}With Mutually Recursive", ctx.indent)?; | ||
| write!(f, "{}With Mutually Recursive", ctx.indent)?; |
There was a problem hiding this comment.
This entire block is untested it seems?
There was a problem hiding this comment.
Yes, because the linear chains option currently doesn't work with WMR, see https://github.com/MaterializeInc/materialize/issues/19012.
madelynnblue
left a comment
There was a problem hiding this comment.
Surfaces parts lgtm
8b6d024 to
037dd79
Compare
|
New syntax looks much better, nice! I signal boosted in #devex (https://materializeinc.slack.com/archives/C015RHB3LDR/p1683779552919369) for additional feedback. "Iteration limit" sounds a little funny to my ear, but no strong feelings. |
Any user that is familiar with recursion in SQL will have a mental mapping for For the same reason, my preference for the iteration limit would also be for a keyword that is used in other systems, like |
True!
From my point of view, simply using
What do you think, @frankmcsherry, @benesch, @aalexandrov?
MySQL errors out when reaching MAXRECURSION, which would correspond to our hard limit. In contrast, the soft limit (this PR) simply produces the current state as the final result when reaching the limit. For this reason, I think we shouldn't align the keyword.
|
Yeah, I want to keep the door open for supporting the SQL standard's semantics for @petrosagg had the take of: why bother with the
Oh, interesting! That makes sense for debugging (you want to watch it progress), but kind of scary if you were running in production. Should we be very explicit with our keyword and include something like Just noodling: I think to avoid blocking this PR any longer we should move forward with either |
|
Another idea I had was to just use the keyword |
|
I like Petros' and Jan's simplifying ideas:
I'm not sure about the soft vs. hard limit. I think if the docs are very explicit about not erroring but simply stopping by default, then it's ok to simply stop by default. Merging now with ITERATION LIMIT, and then let's decide these things as a follow-up, to already let people use the feature (internally), and also to avoid rebasing this PR again and again. |
This PR implements the WMR soft limit in the form that we agreed on with @aalexandrov. (I'll update the design doc tomorrow.) The "soft" means that we don't error out when we reach the limit, but just stop iterating, and consider the current state as the final result.
The default limit is infinite, i.e., no limit.
We postponed the hard limit (erroring out when reaching the limit), because we now have proper dataflow cancellation for WMR queries. We can still consider a hard limit later (which would be easy to add after this PR).
We moved away from system/session variables, based on feedback on the design doc from Surfaces and Frank.
Instead of system/session variables, users can specify the limit using new SQL syntax (Edit: I'm in the process of changing this to a more standard options clause (but leave it at the same place)):
WITH MUTUALLY RECURSIVE MAXITERATIONS 42This syntax allows for separate limits for each WMR block, which is important when a query has multiple WMR blocks.
The first commit is just changing iteration numbers to use
u64instead ofusize, based on a discussion in the office hours. The second commit runscargo fmt, which I will squash into the first one, of course. (Just wanted to separate the diff for review.)The second commit is the meat.
@aalexandrov, is the EXPLAIN format ok? (HIR, MIR, LIR) Note that I'm not currently testing
linear_chains, because it doesn't seem to work for WMR (and nobody seems to be using it). We can discuss whether to fix it or deprecate it. (I'll open an issue tomorrow.)Keyword
The keyword is tentative; we can still decide that before merging, but I like
MAXITERATIONS. (@ggnall) Some possible alternatives:MAXDEPTH 42: users might think of recursive function calls, which is not what's happening here.LIMIT 42: It's not clear what are we limiting: iterations, records, time, ... Also, it could be confused with ORDER BYs LIMIT.(ITERATE 42 TIMES): This one would also be ok I guess, but I'd vote for simplicity.RECURSIONLIMIT: Might be ok as well, but I wanted to avoid (internal) confusion with ourpub const RECURSION_LIMIT, which is for something totally different.MAXRECURSION: Edit: The MySQL option with the same keyword would correspond to a hard limit, so let's not align our soft limit's keyword with that.ITERATIONS: (from Jan) We don't have to call it a limit, since running to an exactly specified number of iterations is the same as stopping when either a limit is reached or fixpoint is reached. But I'm a bit worried that some users won't make this extra mental step of realizing that it might execute less iterations if a fixpoint is reached earlier, and might get worried that it will take more steps than necessary.Motivation
Tips for reviewer
Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way) and therefore is tagged with aT-protolabel.LetRec, but I asked Jan whether we need to be backwards-compatible, and he said "Nope, we are not durably storing serialized IRs anywhere, I’m pretty sure. So no need to be backwards-compatible in the protobufs"