[WMR] NormalizeLets refreshes types until fixpoint#18595
Conversation
523710a to
faf449e
Compare
|
I am on it. |
| // `refresh_types` might diverge. | ||
| // (Note that `prior` can't come from a shadowed binding, because there is no | ||
| // shadowing.) | ||
| assert!(prior.iter().all(|prior| { |
There was a problem hiding this comment.
That is quite the mouthful. I am hitting this assertion . will try to simplify the query now.
There was a problem hiding this comment.
Test case
CREATE TABLE t1 (f1 INTEGER PRIMARY KEY , f2 INTEGER NOT NULL, f3 INTEGER);
INSERT INTO t1 VALUES (1,1,1);
EXPLAIN WITH MUTUALLY RECURSIVE c1 (f1 INTEGER, f2 INTEGER, f3 INTEGER) AS (
SELECT * FROM t1
UNION ALL SELECT f3 + 3 + 1 AS f1 , f1 + 7 + 1 AS f2 , f2 + f3 + 1 AS f3 FROM c1 WHERE f3 < 100
) SELECT * FROM c1;
Please add to SLT once fixed.
There was a problem hiding this comment.
@ggevay I keep hitting that assertion no matter how much I dumb down the generated WMR queries. Could it be that it triggers on all queries that have the + operator? Unfortunately by using increment I ensure that the queries I run are halting, so it needs to work for testing correctness.
EDIT: nevermind, it happened on a query without the +
There was a problem hiding this comment.
Thanks for this test! The problem is that we run the new fixpoint loop not from Top, but from what a previous run of NormalizeLets gives us, which means that our nullability analysis would have to infer at least the same non-nullability now that it did in that earlier run of NormalizeLets, which seems almost impossible to ensure. For example, what happens in this query is that the earlier NormalizeLets operated on this:
Union
Get u1
Project (#3..=#5)
Map (((#2 + 3) + 1), ((#0 + 7) + 1), ((#1 + #2) + 1))
Filter (#2 < 100)
Get l0
where non-nullability of #2 inside the Map comes from the Filter, but in a later run of NormalizeLets we have the Filter and Map in the opposite order (due to a CanonicalizeMfp I guess):
Union
Get u1
Project (#3..=#5)
Filter (#2 < 100)
Map (((#2 + 3) + 1), ((#0 + 7) + 1), ((#1 + #2) + 1))
Get l0
where we would need to propagate non-null knowledge backwards to infer the same non-nullability.
We could fix this by always starting the fixpoint loop from scratch (from Top) instead of what we currently have in the Get types, but @aalexandrov pointed out an other issue with the pessimistic approach, so I might completely rewrite the PR to the optimistic approach instead (i.e., 2. from https://github.com/MaterializeInc/materialize/issues/18553).
|
Turns out that the optimistic approach (i.e., 2. from https://github.com/MaterializeInc/database-issues/issues/5487) would be better for several reasons:
I'll rewrite the PR. |
2a0d46f to
8944879
Compare
|
I've put rewriting this on hold. I'll do the WMR limits first, and then will get back to this. It's surprisingly tricky. |
|
Closing this for now, as it's super complicated. It would need at least several days, maybe more than a week to properly do this. I suggest getting back to this when we have
I added some thoughts on the issue about how we could approach this, but there are some open questions. |
This takes approach 1. from https://github.com/MaterializeInc/database-issues/issues/5487: Run the key inference in a pessimistic fixpoint loop in
refresh_typesinNormalizeLets. It seems to me that this covers a lot of cases.Note that the new fixpoint loop runs not just the key inference, but our entire type inference, including nullability analysis. Therefore, the nullability of types changed in many tests. This should be fine.
@philip-stoev, RQG testing would be really great to do here. I added a fixpoint loop that runs our type inference repeatedly, which is a lot of code, so anything might happen... I also added some non-trivial asserts. If possible, please include exotic scalar functions to exercise all corners of type inference.
https://m.media-amazon.com/images/I/41+f7nqnB3L.jpg
Motivation
Tips for reviewer
First commit is from #18554, so please review it there.
Before reviewing the new comment in
ColumnKnowledge, I suggest looking at the new test that I added incolumn_knowledge.sltas well as the test change innormalize_lets.slt.Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way) and therefore is tagged with aT-protolabel.