expr: age is non-monotone in both arguments#36708
Draft
def- wants to merge 5 commits into
Draft
Conversation
`Interval` is lex-ordered by (months, days, micros), but adding an
interval to a timestamp or date adds *calendar* months with day-clamping
and then days as 24-hour periods. That arithmetic does not respect the
lex order:
t = 2024-01-31
i1 = {0 months, 31 days} → t + i1 = 2024-03-02
i2 = {1 month, 0 days} → t + i2 = 2024-02-29
In lex order `i1 < i2`, yet `t + i1 > t + i2`. For timestamps, the
day-clamping also collapses near-boundary inputs into the same date
while preserving sub-day time, so the first argument is non-monotone
too:
t1 = 2024-01-30 23:59:59, i = {1 month} → 2024-02-29 23:59:59
t2 = 2024-01-31 00:00:00, i = {1 month} → 2024-02-29 00:00:00
For dates the first argument *is* monotone (no sub-day precision means
clamping only collapses, never reverses), but the interval argument has
the same problem.
These annotations are consumed by the abstract interpreter that drives
persist filter pushdown. Marking these functions monotone meant the
interpreter computed the output range by evaluating the function only
at the endpoints of the input interval range — and a stats range like
`[{0m,31d}, {1m,0d}]` would yield the narrow output range
`[2024-02-29, 2024-03-02]` even though interior intervals
(e.g. `{0m, 60d}`) actually produce timestamps far outside that window.
Filter pushdown could then incorrectly conclude a part had no matching
rows, tripping the `persist filter pushdown correctness violation!`
audit in `persist_source.rs`.
Companion fix to b607993, which corrected the analogous annotations
for `add_time_interval`, `sub_time_interval`, `mul_interval`, and
`div_interval`.
Fixes database-issues#9656.
The override was added to suppress the audit panic from database-issues#9656; with the monotonicity annotations on timestamp/date + interval corrected, filter pushdown should be sound again. Removing this lets CI re-exercise the audit and surface any remaining latent causes.
- Drop useless .into() in the new interpret.rs regression test (clippy). - Update filter-pushdown.slt: sub_timestamp_interval is no longer pushdownable now that it's marked non-monotone, so the expected pushdown= lines for 'timestamp - INTERVAL day' queries are gone. This is the known tradeoff for soundness — even day-only intervals share the non-monotonic annotation since the abstract interpreter can't distinguish them statically.
date_bin(stride, source) = origin + floor((source - origin) / stride) * stride. For a fixed source like 2024-01-01 12:00:00, a 1-day stride bins to 2024-01-01 00:00:00 but a 2-day stride bins to 2023-12-31 00:00:00 — i.e. the lex-larger stride produces an earlier output. Same class of bug as the timestamp/date + interval monotonicity issues. Demotes date_bin_timestamp and date_bin_timestamp_tz from (true, true) to (false, true). Still monotone in source.
Postgres-style `age(a, b)` is annotated `is_monotone = "(true, true)"`,
but it is non-monotone in either argument.
First argument: the carry logic in the calendar-aware difference borrows
a whole month's worth of days when the day field goes negative, which
breaks the lex order of `Interval` (months, days, micros) at month
boundaries:
age(2024-03-31, 2024-02-15) = {1 month, 16 days}
age(2024-04-01, 2024-02-15) = {1 month, 15 days} ← lex-smaller
age(2024-05-01, 2024-02-15) = {2 months, 15 days}
Second argument: after the algorithm's final sign-revert, the same
day-borrow makes the result dip non-monotonically as `b` crosses a
month boundary:
age(2024-02-29, 2024-03-30) = {-1 month, -1 day}
age(2024-02-29, 2024-03-31) = {-1 month, -2 days} ← lex-smaller
age(2024-02-29, 2024-04-01) = {-1 month, -1 day}
Demotes both `age_timestamp` and `age_timestamp_tz` to
`(false, false)`. Adds regression tests in `interpret.rs` exercising
the abstract interpreter via a `>=` predicate, mirroring the existing
tests for the analogous `add_timestamp_interval` and
`date_bin_timestamp` bugs.
Same bug class as 5257b0d and 51e86ed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See commit, based on top of #36702
Postgres-style
age(a, b)is annotatedis_monotone = "(true, true)",but it is non-monotone in either argument.
First argument: the carry logic in the calendar-aware difference borrows
a whole month's worth of days when the day field goes negative, which
breaks the lex order of
Interval(months, days, micros) at monthboundaries:
age(2024-03-31, 2024-02-15) = {1 month, 16 days}
age(2024-04-01, 2024-02-15) = {1 month, 15 days} ← lex-smaller
age(2024-05-01, 2024-02-15) = {2 months, 15 days}
Second argument: after the algorithm's final sign-revert, the same
day-borrow makes the result dip non-monotonically as
bcrosses amonth boundary:
age(2024-02-29, 2024-03-30) = {-1 month, -1 day}
age(2024-02-29, 2024-03-31) = {-1 month, -2 days} ← lex-smaller
age(2024-02-29, 2024-04-01) = {-1 month, -1 day}
Demotes both
age_timestampandage_timestamp_tzto(false, false). Adds regression tests ininterpret.rsexercisingthe abstract interpreter via a
>=predicate, mirroring the existingtests for the analogous
add_timestamp_intervalanddate_bin_timestampbugs.Same bug class as 5257b0d and 51e86ed.