
fix: use cast preimages for cast predicate rewrites #11

Open
discord9 wants to merge 11 commits into GreptimeTeam:greptimedb-53.1.0-function-signature-exec-error from discord9:discord9/cast-preimage-greptimedb-53


discord9 commented Jul 2, 2026

Which issue does this PR close?

Rationale for this change

GreptimeDB currently patches DataFusion crates from GreptimeTeam/datafusion at rev e8a127c28e8839964bb6aefdd909810dc11cd2c9, which is on branch greptimedb-53.1.0-function-signature-exec-error.

GreptimeDB can produce incorrect results for timestamp predicates when a timestamp column is compared through a lower-precision cast. For example, a row with timestamp 2026-06-02 03:50:00.195 can match a <= '2026-06-02 03:50:00' predicate.

The existing cast predicate rewrite can unwrap casts that are not one-to-one, which changes predicate semantics for timestamp precision narrowing and other lossy casts. This PR ports the upstream Apache DataFusion cast-preimage fix onto the GreptimeDB-dependent DataFusion branch so that predicate rewrites are applied only when the preimage is known to preserve semantics.
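
To illustrate why unwrapping a many-to-one cast is not semantics-preserving, here is a minimal sketch that models a millisecond-to-second timestamp cast on plain integers. The function names are hypothetical, and the floor-division behavior is an assumption made for illustration. It is not DataFusion's implementation.

```rust
/// Hypothetical model of a lossy cast from millisecond to second precision.
/// Assumption: the cast floors toward negative infinity.
fn cast_ms_to_s(ts_ms: i64) -> i64 {
    ts_ms.div_euclid(1_000)
}

/// The predicate evaluated with the cast in place: CAST(ts AS seconds) <= lit.
fn with_cast_le(ts_ms: i64, lit_s: i64) -> bool {
    cast_ms_to_s(ts_ms) <= lit_s
}

/// The predicate after naive unwrapping: the cast is dropped and the
/// literal is scaled up to milliseconds instead.
fn unwrapped_le(ts_ms: i64, lit_s: i64) -> bool {
    ts_ms <= lit_s * 1_000
}
```

Under these assumptions, a value such as 10_195 ms compared against a literal of 10 s gives different answers for `with_cast_le` and `unwrapped_le`, because every millisecond value in a one-second window collapses to the same second. Swapping one form for the other therefore changes which rows match.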

What changes are included in this PR?

  • Introduce shared cast predicate preimage logic in datafusion-expr-common.
  • Use exact preimages for safe cast predicate rewrites.
  • Use range preimages for timestamp precision narrowing instead of unsafe exact cast unwrapping.
  • Reject unsafe exact rewrites for numeric narrowing, lossy decimal casts, timestamp narrowing, and similar many-to-one casts.
  • Share the safer logic between logical optimizer and physical expression simplifier.
  • Add regression coverage for timestamp, decimal, distinctness, and edge-case cast predicate behavior.
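
The distinction between exact preimages, range preimages, and rejected rewrites listed above can be sketched roughly as follows. The `Preimage` enum and all function names are hypothetical, the values are plain `i64` integers rather than Arrow scalars, and the narrowing cast is assumed to floor. This is a simplified model of the idea, not the code added in this PR.

```rust
/// Hypothetical answer to "which source values map to this literal under the cast?"
#[derive(Debug, PartialEq)]
enum Preimage {
    /// Exactly one source value maps to the literal, so the cast can be unwrapped.
    Exact(i64),
    /// The half-open range [lo, hi) of source values maps to the literal.
    Range { lo: i64, hi: i64 },
    /// No safe preimage is known, so the original predicate must be kept.
    Unknown,
}

/// Widening i32 -> i64 is one-to-one: the preimage is exact when the
/// literal fits in i32, and conservatively unknown otherwise.
fn preimage_widen_i32_to_i64(lit: i64) -> Preimage {
    match i32::try_from(lit) {
        Ok(v) => Preimage::Exact(i64::from(v)),
        Err(_) => Preimage::Unknown,
    }
}

/// Narrowing ms -> s (assumed to floor) is many-to-one: second `lit_s`
/// covers [lit_s * 1000, (lit_s + 1) * 1000). Overflow yields Unknown.
fn preimage_narrow_ms_to_s(lit_s: i64) -> Preimage {
    let lo = lit_s.checked_mul(1_000);
    let hi = lit_s.checked_add(1).and_then(|s| s.checked_mul(1_000));
    match (lo, hi) {
        (Some(lo), Some(hi)) => Preimage::Range { lo, hi },
        _ => Preimage::Unknown,
    }
}

/// Evaluate `CAST(ts_ms AS seconds) <= lit_s` directly on `ts_ms` using the
/// range preimage. Returns None when no rewrite is safe.
fn rewrite_le(ts_ms: i64, lit_s: i64) -> Option<bool> {
    match preimage_narrow_ms_to_s(lit_s) {
        Preimage::Range { hi, .. } => Some(ts_ms < hi),
        _ => None,
    }
}
```

In this model, `rewrite_le` agrees with evaluating the cast first for every input where a range exists, and returns `None` near the `i64` limits so the caller can leave the predicate untouched.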

Are these changes tested?

In progress locally after opening this PR.

Already checked before opening:

  • cargo fmt --all was run; branch is formatted.

Planned next local checks:

  • DataFusion Rust CI-equivalent cargo checks/tests/clippy that can be run locally for this branch.

Are there any user-facing changes?

Yes. Query results for predicates involving lossy casts become more correct and conservative: unsafe cast unwrapping is avoided, and timestamp precision narrowing is rewritten using ranges that preserve predicate semantics.

discord9 added 9 commits July 2, 2026 17:57
Signed-off-by: discord9 <discord9@163.com>