WMR soft limit by ggevay · Pull Request #18966 · MaterializeInc/materialize

ggevay · 2023-04-26T10:22:56Z

This PR implements the WMR soft limit in the form that we agreed on with @aalexandrov. (I'll update the design doc tomorrow.) The "soft" means that we don't error out when we reach the limit, but just stop iterating, and consider the current state as the final result.

The default limit is infinite, i.e., no limit.

We postponed the hard limit (erroring out when reaching the limit), because we now have proper dataflow cancellation for WMR queries. We can still consider a hard limit later (which would be easy to add after this PR).
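To make the soft-limit behavior concrete, here is a minimal sequential sketch. It is not Materialize's dataflow rendering; `iterate_soft` and `grow_to_ten` are hypothetical names used only for illustration. The loop stops either at a fixpoint or when the optional limit is reached, and in both cases returns the current state instead of an error.

```rust
use std::collections::BTreeSet;
use std::num::NonZeroU64;

/// Repeatedly applies `step` until the state stops changing or the optional
/// soft limit is reached. Reaching the limit is not an error: the current
/// state is returned as the final result, together with the iteration count.
fn iterate_soft<F>(
    init: BTreeSet<i64>,
    step: F,
    limit: Option<NonZeroU64>,
) -> (BTreeSet<i64>, u64)
where
    F: Fn(&BTreeSet<i64>) -> BTreeSet<i64>,
{
    let mut state = init;
    let mut iters: u64 = 0;
    loop {
        // Soft limit: stop iterating and keep whatever has been computed.
        if let Some(max) = limit {
            if iters >= max.get() {
                return (state, iters);
            }
        }
        let next = step(&state);
        iters += 1;
        if next == state {
            // Fixpoint reached before the limit.
            return (state, iters);
        }
        state = next;
    }
}

/// Hypothetical step function: keep everything and add x + 1 for each x < 10.
fn grow_to_ten(s: &BTreeSet<i64>) -> BTreeSet<i64> {
    let mut out = s.clone();
    out.extend(s.iter().filter(|x| **x < 10).map(|x| x + 1));
    out
}
```

With the seed `{1}`, a limit of 3 yields `{1, 2, 3, 4}` after 3 iterations, while `None` (the default, no limit) runs to the fixpoint `{1, ..., 10}`.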

We moved away from system/session variables, based on feedback on the design doc from Surfaces and Frank.

Instead of system/session variables, users can specify the limit using new SQL syntax (Edit: I'm in the process of changing this to a more standard options clause, while leaving it in the same place):
WITH MUTUALLY RECURSIVE MAXITERATIONS 42
This syntax allows for separate limits for each WMR block, which is important when a query has multiple WMR blocks.

The first commit is just changing iteration numbers to use u64 instead of usize, based on a discussion in the office hours. The second commit runs cargo fmt, which I will squash into the first one, of course. (Just wanted to separate the diff for review.)

The third commit is the meat.

@aalexandrov, is the EXPLAIN format ok? (HIR, MIR, LIR) Note that I'm not currently testing linear_chains, because it doesn't seem to work for WMR (and nobody seems to be using it). We can discuss whether to fix it or deprecate it. (I'll open an issue tomorrow.)

Keyword

The keyword is tentative; we can still decide that before merging, but I like MAXITERATIONS. (@ggnall) Some possible alternatives:

MAXDEPTH 42: users might think of recursive function calls, which is not what's happening here.
LIMIT 42: It's not clear what we are limiting: iterations, records, time, ... Also, it could be confused with ORDER BY's LIMIT.
(ITERATE 42 TIMES): This one would also be ok I guess, but I'd vote for simplicity.
RECURSIONLIMIT: Might be ok as well, but I wanted to avoid (internal) confusion with our pub const RECURSION_LIMIT, which is for something totally different.
MAXRECURSION: Edit: The MySQL option with the same keyword would correspond to a hard limit, so let's not align our soft limit's keyword with that.
ITERATIONS: (from Jan) We don't have to call it a limit, since running to an exactly specified number of iterations is the same as stopping when either a limit is reached or fixpoint is reached. But I'm a bit worried that some users won't make this extra mental step of realizing that it might execute fewer iterations if a fixpoint is reached earlier, and might get worried that it will take more steps than necessary.

Motivation

This PR adds a known-desirable feature: https://github.com/MaterializeInc/database-issues/issues/5409

Tips for reviewer

Checklist

This PR has adequate test coverage / QA involvement has been duly considered.
This PR has an associated design doc: WMR limits design doc #18538
- (The design doc currently doesn't reflect the move away from session/system variables, nor the reduced importance of the hard limit now that we have dataflow shutdown. I'll update the design doc a bit later.)
This PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.
- We do have a protobuf change for the LIR LetRec, but I asked Jan whether we need to be backwards-compatible, and he said "Nope, we are not durably storing serialized IRs anywhere, I’m pretty sure. So no need to be backwards-compatible in the protobufs"
If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
This PR includes the following user-facing behavior changes:

ggevay · 2023-04-26T18:58:58Z

Ready for review! @MaterializeInc/surfaces, @aalexandrov, @vmarcos (rendering).

madelynnblue · 2023-04-26T19:51:29Z

            }
-            CteBlock::MutuallyRecursive(list) => {
-                for cte in list.iter() {
+            CteBlock::MutuallyRecursive(MutRecBlock {


As an aside (don't need to do anything in this PR): having logic in the parser is unfortunate and it'd be nice for it to not be here.

philip-stoev

Thank you very much for the tests -- I could not figure out any additional ones to add.

The feature held under manual experimentation and I could not get it to become wedged.

The diff looks less formidable with whitespace ignored.

philip-stoev · 2023-04-27T09:46:10Z

Actually here is a test that you could push -- it confirms that the number of iterations is reset at every timestamp:

CREATE TABLE t1 (f1 INTEGER);
CREATE MATERIALIZED VIEW v1 AS WITH MUTUALLY RECURSIVE MAXITERATIONS 2 cnt (f1 INTEGER) AS (SELECT f1 FROM t1 UNION ALL SELECT f1+1 AS f1 FROM cnt) SELECT * FROM cnt;
INSERT INTO t1 VALUES (1);
SELECT * FROM v1;
UPDATE t1 SET f1 = 2;
SELECT * FROM v1;

teskje

General compute parts LGTM, I just have a couple stylistic comments. I'll defer to @aalexandrov for the transform changes.

teskje · 2023-04-27T09:19:30Z

        repeated mz_expr.id.ProtoLocalId ids = 1;
        repeated ProtoPlan values = 2;
+        repeated uint64 max_iters = 4;
+        repeated bool max_iters_present = 5;


I assume there is no repeated optional uint64 that would make this second field unnecessary?

Relatedly, but not really relevant to this PR, I have been wondering why LetRec is represented with separate lists for ids and values (and now max_iters). ISTM that instead having a single bindings list with (id, value, max_iter) entries would be more convenient since it doesn't require ensuring that the lists have the same length all the time. But I'm probably missing something.

We discussed that this might be a better representation but decided to not change it as part of the ongoing epic. I think it must be a skunkworks project or something similar, unless we find ~3-4 days of in-band work to do this as part of addressing tech debt.

Makes sense, thanks!
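The trade-off discussed above can be sketched as follows. This is only an illustration with hypothetical, simplified types (`LetRecLists`, `Binding`, and `String` standing in for a plan), not the actual LIR definitions: with parallel lists the length agreement has to be checked at runtime, whereas a single list of bindings makes a mismatch unrepresentable.

```rust
use std::num::NonZeroU64;

/// Parallel-list layout: callers must keep all three vectors the same length.
struct LetRecLists {
    ids: Vec<u64>,
    values: Vec<String>,
    max_iters: Vec<Option<NonZeroU64>>,
}

/// Single-list layout: one entry per binding, so lengths cannot diverge.
#[derive(Debug, PartialEq)]
struct Binding {
    id: u64,
    value: String,
    max_iter: Option<NonZeroU64>,
}

/// Converts the parallel lists into bindings, rejecting inconsistent input.
fn into_bindings(lists: LetRecLists) -> Result<Vec<Binding>, String> {
    // This runtime check is exactly what the single-list layout would remove.
    if lists.ids.len() != lists.values.len() || lists.ids.len() != lists.max_iters.len() {
        return Err(format!(
            "inconsistent LetRec lists: {} ids, {} values, {} max_iters",
            lists.ids.len(),
            lists.values.len(),
            lists.max_iters.len()
        ));
    }
    Ok(lists
        .ids
        .into_iter()
        .zip(lists.values)
        .zip(lists.max_iters)
        .map(|((id, value), max_iter)| Binding { id, value, max_iter })
        .collect())
}
```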

Regarding max_iters_present, Alex mentioned in another comment:

One general comment: we should absolutely prohibit people setting the MAX_ITERATIONS for some binding to 0, otherwise we run the risk of all sorts of incorrect optimizations for those blocks (or unnecessarily complicated transform code).

Doing that would allow us to drop max_iters_present here and just use 0 to mean "no limit set".

Yes, I'm throwing an error for 0. (while creating the HIR from the AST)

I thought about that, but I thought the code is cleaner this way. Special values are always a little bit scary; maybe somebody decides to suddenly allow 0. But in this case the risk would be low, so I can change it if that is preferred.

I'd prefer it, since it statically removes the possibility that the two lists might become inconsistent (e.g. have different lengths), so there is one thing less to check at runtime. But your argument about special values is valid too, so I don't oppose leaving things as they are.

Changed it.

vmarcos

Rendering changes look good to me; it would be ideal if we'd include test(s) for nested WMR blocks with different limits also in execution (test/sqllogictest/with_mutually_recursive.slt), though, not only in planning.

aalexandrov · 2023-04-27T11:23:26Z

+      Get::PassArrangements l1
+        raw=true
+  With Mutually Recursive
+    cte [MaxIterations None] l1 =


Can we not print out [MaxIterations None] (which I assume would be the default most of the time)?

nit: stylistically parameters of AST nodes have been printed in snake_case elsewhere and key value pairs use $key=$val, so I think

cte [max_iterations=10] l1 =

is a bit more consistent with the rest of the format.

Yes, fixed, thx!
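For illustration, the requested output format could be produced along these lines. This is a sketch with a hypothetical `CteHeader` type, not the actual EXPLAIN code: the attribute is printed as `[max_iterations=$x]` and omitted entirely when no limit is set.

```rust
use std::fmt;
use std::num::NonZeroU64;

/// Hypothetical stand-in for the header line of one recursive CTE binding.
struct CteHeader<'a> {
    name: &'a str,
    max_iterations: Option<NonZeroU64>,
}

impl fmt::Display for CteHeader<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "cte ")?;
        // Print the attribute only when a limit is set, in key=value form.
        if let Some(max) = self.max_iterations {
            write!(f, "[max_iterations={}] ", max)?;
        }
        write!(f, "{} =", self.name)
    }
}
```

A header with a limit of 10 renders as `cte [max_iterations=10] l1 =`, and one without a limit as `cte l1 =`.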

aalexandrov · 2023-04-27T11:43:50Z

One general comment: we should absolutely prohibit people setting the MAX_ITERATIONS for some binding to 0, otherwise we run the risk of all sorts of incorrect optimizations for those blocks (or unnecessarily complicated transform code). There are some NonZero~ types in https://doc.rust-lang.org/stable/std/num/index.html and I guess the most precise way to enforce this constraint is to use one of those.
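A minimal sketch of that idea, assuming the limit arrives as an optional raw `u64`; the function name `plan_iteration_limit` and the error text are made up for illustration and are not the planner's actual API. `NonZeroU64::new` returns `None` for 0, so the zero case becomes an explicit error and every later stage can rely on the type.

```rust
use std::num::NonZeroU64;

/// Validates a user-supplied limit: absent means "no limit", 0 is rejected,
/// and any other value is carried as a `NonZeroU64` from here on.
fn plan_iteration_limit(raw: Option<u64>) -> Result<Option<NonZeroU64>, String> {
    match raw {
        None => Ok(None),
        Some(n) => NonZeroU64::new(n)
            .map(Some)
            .ok_or_else(|| "iteration limit must be greater than 0".to_string()),
    }
}
```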

ggevay · 2023-04-27T16:23:22Z

Thanks for the reviews! In addition to addressing the inline comments, I made the following changes:

I added the test from @philip-stoev.

I added the test that @vmarcos suggests.

I changed the LIR EXPLAIN to not print it if None, as @aalexandrov suggested.

we should absolutely prohibit people setting the MAX_ITERATIONS for some binding to 0, otherwise we run the risk of all sorts of incorrect optimizations for those blocks (or unnecessarily complicated transform code).

Yes, I'm throwing an error for 0. (while creating the HIR from the AST)

There are some NonZero~ types

Hmm, nice! I changed the code to use this. (I didn’t properly implement MzReflect for NonZeroU64, but I hope that’s ok. We are not planning to use lowertest for WMR I guess, since we are moving away from lowertest anyway. Edit: Sorry, it's failing the tests. I'll fix it.)

aalexandrov · 2023-05-01T15:33:16Z

                        ids: ids.into_proto(),
+                        max_iters: max_iters
+                            .into_iter()
+                            .map(|d| match d {


minor nit: can you add this

impl RustType<u64> for Option<NonZeroU64> {
    fn into_proto(&self) -> u64 {
        match self {
            Some(d) => d.get(),
            None => 0,
        }
    }
    fn from_proto(proto: u64) -> Result<Self, TryFromProtoError> {
        Ok(NonZeroU64::new(proto))
    }
}

after this line

rust_type_id![bool, f32, f64, i32, i64, String, u32, u64, Vec<u8>];

and then use max_iters.into_proto() and proto.max_iters.into_rust()?? This will allow other people to encode Option<NonZeroU64> without the boilerplate.
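The encoding behind this suggestion can be shown without the `RustType` trait. The two free functions below are hypothetical stand-ins for `into_proto` and `from_proto`; the point is only that 0 on the wire means "no limit set", which round-trips losslessly because 0 is not a valid `NonZeroU64`.

```rust
use std::num::NonZeroU64;

/// Encodes an optional limit as a plain u64, using 0 for "no limit set".
fn max_iter_into_proto(limit: Option<NonZeroU64>) -> u64 {
    limit.map_or(0, |n| n.get())
}

/// Decodes the wire value; `NonZeroU64::new(0)` is `None`, so 0 maps back
/// to "no limit set" without a separate presence flag.
fn max_iter_from_proto(proto: u64) -> Option<NonZeroU64> {
    NonZeroU64::new(proto)
}
```

This is why a separate `max_iters_present` list is not needed once 0 is rejected at planning time.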

aalexandrov

Looks good, from my original suggestions I think the only thing missing is changing the output format of the new attribute from

[MaxIterations $x]

to

[max_iterations=$x]

ggevay · 2023-05-09T19:32:16Z

I've addressed all comments, including changing the syntax based on the SQL design principles notion doc and SELECT's expected group size option. An example for the syntax:

WITH MUTUALLY RECURSIVE (ITERATION LIMIT 6)
  cnt (i int) AS (
    (WITH MUTUALLY RECURSIVE (ITERATION LIMIT = 3)
       cnt (i int) AS (
         SELECT 1 AS i
         UNION
         SELECT i+1 FROM cnt)
       SELECT i FROM cnt
    )
    UNION
    SELECT i+100 FROM cnt)
SELECT i FROM cnt;

We can decide the exact keywords later, after we finalize the keywords for WITH MUTUALLY RECURSIVE itself. (One option that came up was WITH REPEATEDLY, which doesn't have RECURSIVE in it, so the word ITERATION would be ok for it.)

@benesch, @ggnall, could you please take a quick look at the syntax?

The surfaces code changed because of the different option parsing. @mjibson, could you please check the new parsing and planning?

I also updated the design doc.

def-

Some interesting spots from code coverage report marked inline: https://buildkite.com/materialize/coverage/builds/89

def- · 2023-05-09T19:58:10Z

-                        tokens.insert(object.id, object_token);
+                // Import declared indexes into the rendering context.
+                for (idx_id, idx) in &dataflow.index_imports {
+                    let export_ids = dataflow.export_ids().collect();


Interesting block to test

This is indeed not covered, thanks! I'll add a test for this. (This PR makes only a trivial change to this part, but it's still somewhat new code. It was introduced with Frank's initial implementation of WMR.)

Added a test in aebe703

def- · 2023-05-09T19:58:59Z

+
                if ctx.config.linear_chains {
-                    writeln!(f, "{}With Mutually Recursive", ctx.indent)?;
+                    write!(f, "{}With Mutually Recursive", ctx.indent)?;


This entire block is untested it seems?

Yes, because the linear chains option currently doesn't work with WMR, see https://github.com/MaterializeInc/materialize/issues/19012.

madelynnblue

Surfaces parts lgtm

Closes #18362

benesch · 2023-05-11T04:34:58Z

New syntax looks much better, nice! I signal boosted in #devex (https://materializeinc.slack.com/archives/C015RHB3LDR/p1683779552919369) for additional feedback. "Iteration limit" sounds a little funny to my ear, but no strong feelings.

morsapaes · 2023-05-11T09:26:43Z

We can decide the exact keywords later, after we finalize the keywords for WITH MUTUALLY RECURSIVE itself. (One option that came up was WITH REPEATEDLY, which doesn't have RECURSIVE in it, so the word ITERATION would be ok for it.)

Any user that is familiar with recursion in SQL will have a mental mapping for WITH RECURSIVE, so it doesn't sound right to use an entirely different term for it (like WITH REPEATEDLY). Are we forced to use the MUTUALLY keyword, or would it be possible to relax this to WITH RECURSIVE, and spell out how we depart from the SQL standard idiom?

For the same reason, my preference for the iteration limit would also be for a keyword that is used in other systems, like MAXRECURSION or MAXDEPTH. AFAIU, these refer to the maximum depth level, though (which seems to be how most databases limit recursive queries?), so if that's a fundamentally wrong way to think about it, it might be safer to use MAXITERATIONS.

ggevay · 2023-05-11T15:51:20Z

Any user that is familiar with recursion in SQL will have a mental mapping for WITH RECURSIVE, so it doesn't sound right to use an entirely different term for it (like WITH REPEATEDLY).

True!

Are we forced to use the MUTUALLY keyword, or would it be possible to relax this to WITH RECURSIVE, and spell out how we depart from the SQL standard idiom?

From my point of view, simply using WITH RECURSIVE sounds ok, but Frank said that Nikhil might not want that. The only potential issue that I can see is that we have a semantic difference, i.e., some recursive queries have different results between Postgres and Materialize. However,

These cases are a bit exotic. Hopefully not many people are relying on the weirder parts of Postgres' recursion semantics.
Besides the keyword, we also have another syntactic difference: The user has to specify the types for each recursive CTE explicitly. So the user has to pause for a moment and look at our docs when porting a recursive query from Postgres to Materialize, and then in the docs we can place a prominent warning about the semantic difference.

What do you think, @frankmcsherry, @benesch, @aalexandrov?

For the same reason, my preference for the iteration limit would also be for a keyword that is used in other systems, like MAXRECURSION or MAXDEPTH. AFAIU, these refer to the maximum depth level, though (which seems to be how most databases limit recursive queries?), so if that's a fundamentally wrong way to think about it, it might be safer to use MAXITERATIONS.

MySQL errors out when reaching MAXRECURSION, which would correspond to our hard limit. In contrast, the soft limit (this PR) simply produces the current state as the final result when reaching the limit. For this reason, I think we shouldn't align the keyword.

RECURSION LIMIT sounds ok to me, though.

benesch · 2023-05-11T22:42:38Z

From my point of view, simply using WITH RECURSIVE sounds ok, but Frank said that Nikhil might not want that. The only potential issue that I can see is that we have a semantic difference, i.e., some recursive queries have different results between Postgres and Materialize.

Yeah, I want to keep the door open for supporting the SQL standard's semantics for WITH RECURSIVE. It may be important for a customer one day.

@petrosagg had the take of: why bother with the RECURSIVE keyword at all? Just allow CTEs in a normal WITH block to refer to one another. I think that's more plausible, because that's a strict extension to the SQL spec, rather than

MySQL errors out when reaching MAXRECURSION, which would correspond to our hard limit. In contrast, the soft limit (this PR) simply produces the current state as the final result when reaching the limit. For this reason, I think we shouldn't align the keyword.

Oh, interesting! That makes sense for debugging (you want to watch it progress), but kind of scary if you were running in production. Should we be very explicit with our keyword and include something like SOFT in the name? SOFT ITERATION LIMIT, for example? Although I'm not sure it's exactly a "soft limit"—I think of a "soft" vs "hard" as "changeable upon request" vs "enforced", whereas this is about "errors or not if limit reached." Naming is hard!

Just noodling:

WITH ... (ITERATION LIMIT = 256, REQUIRE FIXPOINT = false)
WITH ... (ITERATION LIMIT = 256, REQUIRE FIXPOINT = true)
WITH ... (ITERATION LIMIT = 256) -- REQUIRE FIXPOINT defaults to true

I think to avoid blocking this PR any longer we should move forward with either ITERATION LIMIT or RECURSION LIMIT, but let's sync again on the syntax holistically once this is closer to stabilization.
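To illustrate the distinction behind the REQUIRE FIXPOINT idea, here is a small sketch. Everything in it is hypothetical (`WmrOptions`, `Outcome`, `run`), and `iterations_to_fixpoint` merely stands in for actually running the dataflow. With `require_fixpoint = false` the behavior matches the soft limit of this PR; with `true` it matches the postponed hard limit.

```rust
/// Hypothetical options for one WMR block, following the syntax floated above.
#[derive(Debug, Clone, Copy)]
struct WmrOptions {
    iteration_limit: Option<u64>,
    require_fixpoint: bool,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// The computation converged within the limit (or no limit was set).
    Fixpoint { iterations: u64 },
    /// Soft limit: iteration stopped and the current state is the result.
    StoppedAtLimit { iterations: u64 },
    /// Hard limit: reaching the limit is reported as an error.
    LimitExceeded { limit: u64 },
}

/// `iterations_to_fixpoint` is a stand-in for how long the query would run.
fn run(opts: WmrOptions, iterations_to_fixpoint: u64) -> Outcome {
    match opts.iteration_limit {
        Some(limit) if iterations_to_fixpoint > limit => {
            if opts.require_fixpoint {
                Outcome::LimitExceeded { limit }
            } else {
                Outcome::StoppedAtLimit { iterations: limit }
            }
        }
        _ => Outcome::Fixpoint { iterations: iterations_to_fixpoint },
    }
}
```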

teskje · 2023-05-12T07:02:12Z

Another idea I had was to just use the keyword ITERATIONS without calling it a limit: #18538 (comment).

ggevay · 2023-05-12T08:50:35Z

I like Petros' and Jan's simplifying ideas:

Simply WITH, no RECURSIVE is needed.
Simply ITERATIONS, no LIMIT is needed.

I'm not sure about the soft vs. hard limit. I think if the docs are very explicit about not erroring but simply stopping by default, then it's ok to simply stop by default.

Merging now with ITERATION LIMIT, and then let's decide these things as a follow-up, to already let people use the feature (internally), and also to avoid rebasing this PR again and again.

ggevay added the A-optimization (Area: query optimization and transformation) and A-compute (Area: compute) labels Apr 26, 2023

ggevay requested a review from a team April 26, 2023 10:22

ggevay requested a review from a team as a code owner April 26, 2023 10:22

ggevay marked this pull request as draft April 26, 2023 10:23

ggevay force-pushed the wmr-limit2 branch 7 times, most recently from 1fa57d7 to ac23eef April 26, 2023 15:57

madelynnblue reviewed Apr 26, 2023

Comment thread src/sql-parser/src/ast/defs/query.rs Outdated

ggevay added the T-proto (Theme: `$T ⇔ Proto$T` conversions and `*.proto` files) label Apr 26, 2023

ggevay force-pushed the wmr-limit2 branch from 74e5816 to f410778 April 26, 2023 18:51

ggevay requested a review from aalexandrov April 26, 2023 18:54

ggevay marked this pull request as ready for review April 26, 2023 18:54

ggevay requested a review from a team April 26, 2023 19:07

madelynnblue reviewed Apr 26, 2023

philip-stoev approved these changes Apr 27, 2023

teskje approved these changes Apr 27, 2023

vmarcos approved these changes Apr 27, 2023

aalexandrov reviewed Apr 27, 2023

ggevay force-pushed the wmr-limit2 branch from c0823f8 to 954a975 April 27, 2023 15:05

ggevay requested a review from benesch as a code owner April 27, 2023 16:19

ggevay force-pushed the wmr-limit2 branch from 8201815 to b423e4e April 27, 2023 16:21

aalexandrov reviewed May 1, 2023

aalexandrov self-requested a review May 1, 2023 21:18

aalexandrov approved these changes May 1, 2023

ggevay marked this pull request as draft May 5, 2023 20:14

ggevay force-pushed the wmr-limit2 branch 4 times, most recently from ea7a9ba to 8192f22 May 9, 2023 18:16

ggevay marked this pull request as ready for review May 9, 2023 19:27

ggevay force-pushed the wmr-limit2 branch from 1ced78f to ac65e1e May 9, 2023 19:28

def- reviewed May 9, 2023

madelynnblue reviewed May 9, 2023

def- mentioned this pull request May 9, 2023

coverage: Also use clusterd binary when running SLT #19193

Merged

ggevay force-pushed the wmr-limit2 branch 4 times, most recently from 8b6d024 to 037dd79 May 10, 2023 14:09

ggevay added 3 commits May 10, 2023 16:13

Use u64 instead of usize in WMR Pointstamps

3a23e9e

WMR iteration limit

4e116c5

Closes #18362

Add a test that covers index imports inside WMR

63c5ac6

ggevay force-pushed the wmr-limit2 branch from 037dd79 to 63c5ac6 May 10, 2023 14:14

ggevay merged commit f2bfd87 into MaterializeInc:main May 12, 2023

ggevay mentioned this pull request May 12, 2023

WMR limits design doc #18538

Merged
