Skip to content

[WMR] NormalizeLets refreshes types until fixpoint#18595

Closed
ggevay wants to merge 8 commits into
MaterializeInc:mainfrom
ggevay:wmr-keys
Closed

[WMR] NormalizeLets refreshes types until fixpoint#18595
ggevay wants to merge 8 commits into
MaterializeInc:mainfrom
ggevay:wmr-keys

Conversation

@ggevay

@ggevay ggevay commented Apr 4, 2023

Copy link
Copy Markdown
Contributor

This takes approach 1. from https://github.com/MaterializeInc/database-issues/issues/5487: Run the key inference in a pessimistic fixpoint loop in refresh_types in NormalizeLets. It seems to me that this covers a lot of cases.

Note that the new fixpoint loop runs not just the key inference, but our entire type inference, including nullability analysis. Therefore, the nullability of types changed in many tests. This should be fine.

@philip-stoev, RQG testing would be really great to do here. I added a fixpoint loop that runs our type inference repeatedly, which is a lot of code, so anything might happen... I also added some non-trivial asserts. If possible, please include exotic scalar functions to exercise all corners of type inference.
https://m.media-amazon.com/images/I/41+f7nqnB3L.jpg

Motivation

Tips for reviewer

First commit is from #18554, so please review it there.

Before reviewing the new comment in ColumnKnowledge, I suggest looking at the new test that I added in column_knowledge.slt as well as the test change in normalize_lets.slt.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • This PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@ggevay ggevay added A-optimization Area: query optimization and transformation A-compute Area: compute labels Apr 4, 2023
@ggevay ggevay force-pushed the wmr-keys branch 8 times, most recently from 523710a to faf449e Compare April 5, 2023 09:55
@ggevay ggevay marked this pull request as ready for review April 5, 2023 09:58
@ggevay ggevay requested review from a team, aalexandrov and philip-stoev April 5, 2023 09:58
@philip-stoev

Copy link
Copy Markdown
Contributor

I am on it.

Comment thread src/transform/src/normalize_lets.rs Outdated
// `refresh_types` might diverge.
// (Note that `prior` can't come from a shadowed binding, because there is no
// shadowing.)
assert!(prior.iter().all(|prior| {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is quite the mouthful. I am hitting this assertion . will try to simplify the query now.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test case

CREATE TABLE t1 (f1 INTEGER PRIMARY KEY , f2 INTEGER NOT NULL, f3 INTEGER);
INSERT INTO t1 VALUES (1,1,1);
EXPLAIN WITH MUTUALLY RECURSIVE c1 (f1 INTEGER, f2 INTEGER, f3 INTEGER) AS (
	SELECT * FROM t1
	UNION ALL SELECT f3 + 3 + 1 AS f1 , f1 + 7 + 1 AS f2 , f2 + f3 + 1 AS f3 FROM c1 WHERE f3 < 100
) SELECT * FROM c1;

Please add to SLT once fixed.

@philip-stoev philip-stoev Apr 5, 2023

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggevay I keep hitting that assertion no matter how much I dumb down the generated WMR queries. Could it be that it triggers on all queries that have the + operator? Unfortunately by using increment I ensure that the queries I run are halting, so it needs to work for testing correctness.

EDIT: nevermind, it happened on a query without the +

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this test! The problem is that we run the new fixpoint loop not from Top, but from what a previous run of NormalizeLets gives us, which means that our nullability analysis would have to infer at least the same non-nullability now that it did in that earlier run of NormalizeLets, which seems almost impossible to ensure. For example, what happens in this query is that the earlier NormalizeLets operated on this:

Union
  Get u1
  Project (#3..=#5)
    Map (((#2 + 3) + 1), ((#0 + 7) + 1), ((#1 + #2) + 1))
      Filter (#2 < 100)
        Get l0

where non-nullability of #2 inside the Map comes from the Filter, but in a later run of NormalizeLets we have the Filter and Map in the opposite order (due to a CanonicalizeMfp I guess):

Union
  Get u1
  Project (#3..=#5)
    Filter (#2 < 100)
      Map (((#2 + 3) + 1), ((#0 + 7) + 1), ((#1 + #2) + 1))
        Get l0

where we would need to propagate non-null knowledge backwards to infer the same non-nullability.

We could fix this by always starting the fixpoint loop from scratch (from Top) instead of what we currently have in the Get types, but @aalexandrov pointed out an other issue with the pessimistic approach, so I might completely rewrite the PR to the optimistic approach instead (i.e., 2. from https://github.com/MaterializeInc/materialize/issues/18553).

@ggevay

ggevay commented Apr 5, 2023

Copy link
Copy Markdown
Contributor Author

Turns out that the optimistic approach (i.e., 2. from https://github.com/MaterializeInc/database-issues/issues/5487) would be better for several reasons:

  • (Philip's query shows that the pessimistic approach would have to start from Top, complicating the initialization.)
  • It is hard to be sure that our nullability analysis is actually monotonic.
  • As @aalexandrov pointed out, our unique key analysis might very well be non-monotonic, because in some situations it might decide to omit some keys to avoid an exponential blowup when, e.g., all subsets of the columns would be keys.
  • In general, the optimistic approach is more precise. The pessimistic approach would be kind of ok for the key inference (with accounting for non-monotonicity by unioning the sets of keys at every iteration), but we are also inferring nullability here, which is better to do optimistically, as agreed in the ColumnKnowledge PR.

I'll rewrite the PR.

@ggevay ggevay force-pushed the wmr-keys branch 8 times, most recently from 2a0d46f to 8944879 Compare April 6, 2023 17:35
@ggevay ggevay marked this pull request as draft April 6, 2023 18:12
ggevay added 4 commits April 6, 2023 20:15
This means that the key and nullability info in the types will
propagate among all bindings, even recursive ones.

Closes #18553
@ggevay

ggevay commented Apr 12, 2023

Copy link
Copy Markdown
Contributor Author

I've put rewriting this on hold. I'll do the WMR limits first, and then will get back to this. It's surprisingly tricky.

@ggevay

ggevay commented May 24, 2023

Copy link
Copy Markdown
Contributor Author

Closing this for now, as it's super complicated. It would need at least several days, maybe more than a week to properly do this. I suggest getting back to this when we have

  • more WMR users, and/or
  • more examples where the weak key inference is actually hurting some optimization.

I added some thoughts on the issue about how we could approach this, but there are some open questions.

@ggevay ggevay closed this May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-compute Area: compute A-optimization Area: query optimization and transformation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants