-
Notifications
You must be signed in to change notification settings - Fork 504
expr: fix non-monotone annotations on timestamp/date/interval functions #36702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
antiguru
wants to merge
4
commits into
main
Choose a base branch
from
claude/github-issue-9656-4DVQc
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+167
−12
Open
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it actually fail without the PR? The QA LLM review claims not, but I didn't verify:
MEDIUM —
test_date_bin_timestamp_non_monotoneis a tautology and does not actually regression-test the fixFile:
src/expr/src/interpret.rs:1844-1903The test sets up
date_bin(stride_col, 2024-01-01 12:00:00) >= 2024-01-01 00:00:00withstride_col ∈ [{0,1,0}, {0,2,0}]and asserts that the interpreter admits bothTrueandFalse. The endpoint evaluations are:date_bin({0,1,0}, 2024-01-01 12:00) = 2024-01-01 00:00→>=is True.date_bin({0,2,0}, 2024-01-01 12:00) = 2023-12-31 00:00→>=is False.Because the endpoints already produce opposite boolean answers under the (buggy)
(true, true)annotation, the interpreter's union of{False}and{True}already admits both outcomes. The test therefore passes regardless of whetherdate_bin_timestampis marked(true, true)or(false, true).Verified empirically: I reverted
src/expr/src/scalar/func.rs:2057tois_monotone = "(true, true)"while leaving the rest of the PR in place, and rancargo test --lib -p mz-expr test_date_bin_timestamp_non_monotone— the test still passes.In contrast, the companion test
test_add_timestamp_interval_non_monotoneis genuinely diagnostic — reverting theadd_timestamp_intervalannotation makes it fail withinterpreter incorrectly ruled out matching rows.The in-test comment is also self-contradictory: it asserts "both endpoint evaluations give the same boolean answer" while the lines just above explicitly list one endpoint satisfying
>=and the other not.Impact: The fix to
date_bin_timestamp/date_bin_timestamp_tzis correct, but the test that's supposed to lock it in provides false assurance. A later refactor that re-promotes the annotation would not be caught bycargo test -p mz-expr --lib(the PR description points to this command as the verification gate). Given that filter-pushdown correctness bugs are P1/test-blocker (see #9656), losing the regression coverage is a real durability concern.Suggested fix: Compare against a timestamp strictly between the two endpoint outputs so that the buggy lex-mapping rules out the matching-rows case. For example:
Under
(true, true), the lex range[2023-12-31, 2024-01-01]is entirely< 2024-01-01 12:00:00, so the interpreter would only admitFalseand themay_contain(True)assertion would fail — making it a real regression test. Under the fix's(false, true), the stride flat-map yieldsanything(), both outcomes are reachable, and the assertion passes.(Concretely, an interior stride such as
{0, 1, 43200000000}— 1 day + 12 hours — does bin2024-01-01 12:00:00to itself, so the predicate is genuinely achievable. The test doesn't need to construct that value; it just needs to assert the interpreter doesn't rule outTrue.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great find! Fixed in the latest commit.