Skip to content

expr: age is non-monotone in both arguments#36708

Draft
def- wants to merge 5 commits into
MaterializeInc:mainfrom
def-:pr-age-non-monotone
Draft

expr: age is non-monotone in both arguments#36708
def- wants to merge 5 commits into
MaterializeInc:mainfrom
def-:pr-age-non-monotone

Conversation

@def-
Copy link
Copy Markdown
Contributor

@def- def- commented May 24, 2026

See commit, based on top of #36702

Postgres-style age(a, b) is annotated is_monotone = "(true, true)",
but it is non-monotone in either argument.

First argument: the carry logic in the calendar-aware difference borrows
a whole month's worth of days when the day field goes negative, which
breaks the lex order of Interval (months, days, micros) at month
boundaries:

age(2024-03-31, 2024-02-15) = {1 month, 16 days}
age(2024-04-01, 2024-02-15) = {1 month, 15 days} ← lex-smaller
age(2024-05-01, 2024-02-15) = {2 months, 15 days}

Second argument: after the algorithm's final sign-revert, the same
day-borrow makes the result dip non-monotonically as b crosses a
month boundary:

age(2024-02-29, 2024-03-30) = {-1 month, -1 day}
age(2024-02-29, 2024-03-31) = {-1 month, -2 days} ← lex-smaller
age(2024-02-29, 2024-04-01) = {-1 month, -1 day}

Demotes both age_timestamp and age_timestamp_tz to
(false, false). Adds regression tests in interpret.rs exercising
the abstract interpreter via a >= predicate, mirroring the existing
tests for the analogous add_timestamp_interval and
date_bin_timestamp bugs.

Same bug class as 5257b0d and 51e86ed.

claude and others added 5 commits May 24, 2026 02:02
`Interval` is lex-ordered by (months, days, micros), but adding an
interval to a timestamp or date adds *calendar* months with day-clamping
and then days as 24-hour periods. That arithmetic does not respect the
lex order:

  t = 2024-01-31
  i1 = {0 months, 31 days}   →  t + i1 = 2024-03-02
  i2 = {1 month,   0 days}   →  t + i2 = 2024-02-29

In lex order `i1 < i2`, yet `t + i1 > t + i2`. For timestamps, the
day-clamping also collapses near-boundary inputs into the same date
while preserving sub-day time, so the first argument is non-monotone
too:

  t1 = 2024-01-30 23:59:59,  i = {1 month}  →  2024-02-29 23:59:59
  t2 = 2024-01-31 00:00:00,  i = {1 month}  →  2024-02-29 00:00:00

For dates the first argument *is* monotone (no sub-day precision means
clamping only collapses, never reverses), but the interval argument has
the same problem.

These annotations are consumed by the abstract interpreter that drives
persist filter pushdown. Marking these functions monotone meant the
interpreter computed the output range by evaluating the function only
at the endpoints of the input interval range — and a stats range like
`[{0m,31d}, {1m,0d}]` would yield the narrow output range
`[2024-02-29, 2024-03-02]` even though interior intervals
(e.g. `{0m, 60d}`) actually produce timestamps far outside that window.
Filter pushdown could then incorrectly conclude a part had no matching
rows, tripping the `persist filter pushdown correctness violation!`
audit in `persist_source.rs`.

Companion fix to b607993, which corrected the analogous annotations
for `add_time_interval`, `sub_time_interval`, `mul_interval`, and
`div_interval`.

Fixes database-issues#9656.
The override was added to suppress the audit panic from
database-issues#9656; with the monotonicity annotations on
timestamp/date + interval corrected, filter pushdown should be sound
again. Removing this lets CI re-exercise the audit and surface any
remaining latent causes.
- Drop useless .into() in the new interpret.rs regression test (clippy).
- Update filter-pushdown.slt: sub_timestamp_interval is no longer
  pushdownable now that it's marked non-monotone, so the expected
  pushdown= lines for 'timestamp - INTERVAL day' queries are gone.
  This is the known tradeoff for soundness — even day-only intervals
  share the non-monotonic annotation since the abstract interpreter
  can't distinguish them statically.
date_bin(stride, source) = origin + floor((source - origin) / stride) * stride.
For a fixed source like 2024-01-01 12:00:00, a 1-day stride bins to
2024-01-01 00:00:00 but a 2-day stride bins to 2023-12-31 00:00:00 —
i.e. the lex-larger stride produces an earlier output. Same class of
bug as the timestamp/date + interval monotonicity issues.

Demotes date_bin_timestamp and date_bin_timestamp_tz from (true, true)
to (false, true). Still monotone in source.
Postgres-style `age(a, b)` is annotated `is_monotone = "(true, true)"`,
but it is non-monotone in either argument.

First argument: the carry logic in the calendar-aware difference borrows
a whole month's worth of days when the day field goes negative, which
breaks the lex order of `Interval` (months, days, micros) at month
boundaries:

  age(2024-03-31, 2024-02-15) = {1 month, 16 days}
  age(2024-04-01, 2024-02-15) = {1 month, 15 days}   ← lex-smaller
  age(2024-05-01, 2024-02-15) = {2 months, 15 days}

Second argument: after the algorithm's final sign-revert, the same
day-borrow makes the result dip non-monotonically as `b` crosses a
month boundary:

  age(2024-02-29, 2024-03-30) = {-1 month, -1 day}
  age(2024-02-29, 2024-03-31) = {-1 month, -2 days}  ← lex-smaller
  age(2024-02-29, 2024-04-01) = {-1 month, -1 day}

Demotes both `age_timestamp` and `age_timestamp_tz` to
`(false, false)`. Adds regression tests in `interpret.rs` exercising
the abstract interpreter via a `>=` predicate, mirroring the existing
tests for the analogous `add_timestamp_interval` and
`date_bin_timestamp` bugs.

Same bug class as 5257b0d and 51e86ed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants