Skip to content

Unsafe / incorrect unwrap_cast optimizations for narrowing casts (e.g. Decimal) #23095

Description

@alamb

Describe the bug

The unwrap_cast optimization (in both the logical ExprSimplifier and the physical simplifier) still produces incorrect query results for narrowing casts even after #22837. #22837 fixed the specific case of timestamp precision narrowing, but the same class of bug remains for other many-to-one casts, most notably decimal scale narrowing.

This ticket tracks the general problem and the remaining decimal fix

Reproducer (decimal, still broken after #22837)

CREATE TABLE t(d DECIMAL(20,4)) AS VALUES (1.2345), (1.2399), (1.2500);

-- The scale-narrowing cast rounds (half-up):
--   1.2345 -> 1.23
--   1.2399 -> 1.24
--   1.2500 -> 1.25
SELECT d, arrow_cast(d, 'Decimal128(20, 2)') AS d2 FROM t;
+--------+------+
| d      | d2   |
+--------+------+
| 1.2345 | 1.23 |
| 1.2399 | 1.24 |
| 1.2500 | 1.25 |
+--------+------+

-- Only 1.2345 rounds to 1.23, so this MUST return 1 row:
SELECT * FROM t
WHERE arrow_cast(d, 'Decimal128(20, 2)') = arrow_cast(1.23, 'Decimal128(20, 2)');

-- predicate is rewritten to `d = 1.2300` (exact) -> returns 0 rows. WRONG (should be 1 row).
+---+
| d |
+---+
+---+
0 row(s) fetched.

The original predicate CAST(d AS Decimal128(20,2)) = 1.23 means "any d that rounds to 1.23 at scale 2" — a range of source values, the half-open interval [1.225, 1.235). That range contains 1.2345 but not the exact value 1.2300, so the rewrite to d = 1.2300 wrongly returns 0 rows instead of 1.

Background / summary

The unwrap_cast optimization rewrites predicates of the form:

CAST(col AS target_type) <op> literal_target
  -->  col <op> try_cast_literal_to_type(literal_target, col_type)

This is only valid when the cast applied to the column is one-to-one. When the column cast is many-to-one, multiple distinct source values map to the same target value, so an exact-equality rewrite on the source column is semantically wrong — the correct source-domain preimage is a range, not a singleton.

This matches Spark's UnwrapCastInBinaryComparison, which has a two-layer check (see this comment for details):

  • Layer 1 — a check on the data types (is this cast direction safe to unwrap at all?)
  • Layer 2 — a check that the literal can be cast losslessly

DataFusion currently only has the equivalent of Layer 2 (try_cast_literal_to_type) and is missing the general Layer 1 type-direction check. #22837 added a narrow Layer-1-style guard for timestamp precision narrowing only; this ticket tracks the general fix.

Why this doesn't blow up more often

The optimization mostly stays correct today by accident, because it leans on try_cast_literal_to_type being conservative. That helper is a restricted, value-preserving cast of the literal that returns None when the value can't be represented exactly in the source type (out of range, decimal digits lost, etc.). For many lossy casts the literal simply fails to round-trip, try_cast_literal_to_type returns None, and the unwrap is silently skipped — so no wrong answer.

The bug surfaces precisely in the cases where:

  1. the literal does round-trip exactly into the source type (so Layer 2 passes), but
  2. the column-side cast is many-to-one, so exact equality on the source column is still wrong.

Decimal scale narrowing is exactly this trap: casting the literal 1.23 (scale 2) up to scale 4 → 1.2300 is exact and value-preserving, so Layer 2 happily passes — but CAST(d AS Decimal(20,2)) on the column is a rounding, many-to-one cast, so the rewrite is wrong. Timestamp precision narrowing (now fixed) had the same shape.

So the right fix is a Layer-1 type-direction check (or a preimage-based rewrite) rather than relying on try_cast_literal_to_type alone.

Proposed approaches (PRs from @discord9)

Acceptance criteria

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions