Skip to content

fix: Correctly compute nullability in recursive CTE schemas#22552

Open
neilconway wants to merge 1 commit into
apache:mainfrom
neilconway:neilc/fix-recursive-cte-explicit-schema
Open

fix: Correctly compute nullability in recursive CTE schemas#22552
neilconway wants to merge 1 commit into
apache:mainfrom
neilconway:neilc/fix-recursive-cte-explicit-schema

Conversation

@neilconway
Copy link
Copy Markdown
Contributor

@neilconway neilconway commented May 27, 2026

Which issue does this PR close?

Rationale for this change

Nullability analysis for recursive CTEs had two shortcomings:

  1. The output schema of the RecursiveQuery was derived solely from the static (anchor) term. This is incorrect; the recursive term might widen nullability for some of the output columns.

  2. The schema of the CTE work table was derived solely from the static term. This is only correct for the first iteration of recursive CTE evaluation; on subsequent iterations, NULLs might have been deposited into the work table, so the static term's nullability properties may not hold.

In this PR, we fix the first issue by computing the output schema of the RecursiveQuery by taking the union of the per-column nullability of the static and recursive terms. The output schema of the RecursiveQuery is stored explicitly, matching the approach taken for most logical plan nodes.

We fix the second issue by conservatively marking the work table's columns as nullable. We could compute nullability precisely by repeatedly doing nullability analysis until we reach fixed point, but for now we take the simpler and cheaper approach.

What changes are included in this PR?

  • Add explicit schema to RecursiveQuery, computed by widening the per-column nullability of the static and recursive terms
  • Conservatively mark CTE worktable columns as nullable
  • Recompute RecursiveQuery schema as part of proto deserialization, rather than attempting to serialize it
  • Add unit and SLT tests

Are these changes tested?

Yes; new unit and SLT tests added, existing tests updated.

Are there any user-facing changes?

Behavioral: nullability analysis for CTEs will be less buggy.

API: RecursiveQuery is a pub struct that has a new field (schema), and no longer derives PartialOrd.

@github-actions github-actions Bot added sql SQL Planner logical-expr Logical plan and expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) catalog Related to the catalog crate proto Related to proto crate physical-plan Changes to the physical-plan crate labels May 27, 2026
@neilconway
Copy link
Copy Markdown
Contributor Author

cc @kosiew I'll fill in the rest of the PR template tomorrow, but lmk what you think of this approach.

@neilconway neilconway force-pushed the neilc/fix-recursive-cte-explicit-schema branch from 9b087b3 to 33bcfcd Compare May 28, 2026 16:37
@github-actions
Copy link
Copy Markdown

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details
     Cloning apache/main
    Building datafusion v53.1.0 (current)
       Built [  99.219s] (current)
     Parsing datafusion v53.1.0 (current)
      Parsed [   0.036s] (current)
    Building datafusion v53.1.0 (baseline)
       Built [  99.686s] (baseline)
     Parsing datafusion v53.1.0 (baseline)
      Parsed [   0.035s] (baseline)
    Checking datafusion v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.617s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 201.157s] datafusion
    Building datafusion-catalog v53.1.0 (current)
       Built [  40.167s] (current)
     Parsing datafusion-catalog v53.1.0 (current)
      Parsed [   0.026s] (current)
    Building datafusion-catalog v53.1.0 (baseline)
       Built [  39.812s] (baseline)
     Parsing datafusion-catalog v53.1.0 (baseline)
      Parsed [   0.027s] (baseline)
    Checking datafusion-catalog v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.139s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  81.353s] datafusion-catalog
    Building datafusion-expr v53.1.0 (current)
       Built [  28.124s] (current)
     Parsing datafusion-expr v53.1.0 (current)
      Parsed [   0.074s] (current)
    Building datafusion-expr v53.1.0 (baseline)
       Built [  28.263s] (baseline)
     Parsing datafusion-expr v53.1.0 (baseline)
      Parsed [   0.074s] (baseline)
    Checking datafusion-expr v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   1.216s] 222 checks: 221 pass, 1 fail, 0 warn, 30 skip

--- failure constructible_struct_adds_field: externally-constructible struct adds field ---

Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
        ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.47.0/src/lints/constructible_struct_adds_field.ron

Failed in:
  field RecursiveQuery.schema in /home/runner/work/datafusion/datafusion/datafusion/expr/src/logical_plan/plan.rs:2279
  field RecursiveQuery.schema in /home/runner/work/datafusion/datafusion/datafusion/expr/src/logical_plan/plan.rs:2279

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  58.800s] datafusion-expr
    Building datafusion-physical-plan v53.1.0 (current)
       Built [  38.617s] (current)
     Parsing datafusion-physical-plan v53.1.0 (current)
      Parsed [   0.130s] (current)
    Building datafusion-physical-plan v53.1.0 (baseline)
       Built [  37.818s] (baseline)
     Parsing datafusion-physical-plan v53.1.0 (baseline)
      Parsed [   0.126s] (baseline)
    Checking datafusion-physical-plan v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.582s] 222 checks: 221 pass, 1 fail, 0 warn, 30 skip

--- failure method_parameter_count_changed: pub method parameter count changed ---

Description:
A publicly-visible method now takes a different number of parameters, not counting the receiver (self) parameter.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#fn-change-arity
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.47.0/src/lints/method_parameter_count_changed.ron

Failed in:
  datafusion_physical_plan::recursive_query::RecursiveQueryExec::try_new takes 4 parameters in /home/runner/work/datafusion/datafusion/target/semver-checks/git-apache_main/3cf7effc87b9920734a7bd585f7256ccb157c4b2/datafusion/physical-plan/src/recursive_query.rs:85, but now takes 5 parameters in /home/runner/work/datafusion/datafusion/datafusion/physical-plan/src/recursive_query.rs:85

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  78.688s] datafusion-physical-plan
    Building datafusion-proto v53.1.0 (current)
       Built [  59.991s] (current)
     Parsing datafusion-proto v53.1.0 (current)
      Parsed [   0.018s] (current)
    Building datafusion-proto v53.1.0 (baseline)
       Built [  60.408s] (baseline)
     Parsing datafusion-proto v53.1.0 (baseline)
      Parsed [   0.019s] (baseline)
    Checking datafusion-proto v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.272s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 122.343s] datafusion-proto
    Building datafusion-sql v53.1.0 (current)
       Built [  43.192s] (current)
     Parsing datafusion-sql v53.1.0 (current)
      Parsed [   0.030s] (current)
    Building datafusion-sql v53.1.0 (baseline)
       Built [  41.839s] (baseline)
     Parsing datafusion-sql v53.1.0 (baseline)
      Parsed [   0.032s] (baseline)
    Checking datafusion-sql v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.236s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  86.507s] datafusion-sql
    Building datafusion-sqllogictest v53.1.0 (current)
       Built [ 170.133s] (current)
     Parsing datafusion-sqllogictest v53.1.0 (current)
      Parsed [   0.022s] (current)
    Building datafusion-sqllogictest v53.1.0 (baseline)
       Built [ 177.548s] (baseline)
     Parsing datafusion-sqllogictest v53.1.0 (baseline)
      Parsed [   0.026s] (baseline)
    Checking datafusion-sqllogictest v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.106s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 350.668s] datafusion-sqllogictest

@github-actions github-actions Bot added the auto detected api change Auto detected API change label May 28, 2026
@neilconway neilconway marked this pull request as ready for review May 28, 2026 17:41
@neilconway neilconway changed the title fix: Preserve nullability in CTE recursive schema fix: Correctly compute nullability in recursive CTEs May 28, 2026
@neilconway neilconway changed the title fix: Correctly compute nullability in recursive CTEs fix: Correctly compute nullability in recursive CTE schemas May 28, 2026
@neilconway
Copy link
Copy Markdown
Contributor Author

Alright, I think this is ready for review. Personally I think this approach is simpler and more robust than #22037, but I'd love feedback either way.

Copy link
Copy Markdown
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neilconway

Looks 👍 to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto detected api change Auto detected API change catalog Related to the catalog crate core Core DataFusion crate logical-expr Logical plan and expressions physical-plan Changes to the physical-plan crate proto Related to proto crate sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Recursive CTE Nullability Handling Should Preserve Logical Schema Without Requiring SQL Rewrites

2 participants