Skip to content

[EPIC] Port PhysicalExpr implementations to use try_from_proto / try_to_proto #22418

@adriangb

Description

@adriangb

Follow-up to #21929 / #21835. This issue tracks porting the remaining built-in PhysicalExpr implementations to the new try_to_proto / try_from_proto hooks.

Why

Today, every built-in PhysicalExpr is serialized through a single ~300-line downcast_ref chain in datafusion/proto/src/physical_plan/to_proto.rs and a symmetric match on ExprType on the decode side. The full motivation — silent round-trip bugs (#19379, #22065), pub-for-proto API bloat (#21807, #22011), built-in/third-party asymmetry — is in #21835.

#21929 landed the infrastructure and migrated Column and BinaryExpr as working demos. Each remaining expression can now be moved off the central chain one PR at a time with no wire-format change.

Reference implementation

Copy the Column migration in #21929 almost verbatim. Note the asymmetry — the two hooks live in different impl blocks, and getting this wrong is the single most common review bounce (see below):

  • try_to_proto is a trait method → it goes inside impl PhysicalExpr for Column (datafusion/physical-expr/src/expressions/column.rs), feature-gated #[cfg(feature = "proto")].
  • try_from_proto is an inherent associated fn → it goes in a separate #[cfg(feature = "proto")] impl Column block, and is wired into the ExprType::Column(_) => Column::try_from_proto(proto, &decode_ctx)? arm in from_proto.rs.

Pattern (per expression)

For each impl PhysicalExpr for FooExpr:

  1. Add try_to_proto inside the impl PhysicalExpr for FooExpr block (returning Ok(Some(_))), feature-gated #[cfg(feature = "proto")].
    ⚠️ Do not put it in an inherent impl FooExpr block. The serializer calls expr.try_to_proto(&ctx) through &dyn PhysicalExpr, so an inherent method with the same name is silently never called — serialization keeps using the old downcast chain and your tests still pass against the dead path. This has been the MINOR: Set GitHub description and labels #1 review-blocker.
  2. Add FooExpr::try_from_proto(node, ctx) as an inherent associated fn and wire it into the matching ExprType::… arm in from_proto.rs.
  3. In the same PR, delete the FooExpr arm from to_proto.rs and the old ExprType::… decode arm from from_proto.rs. This is not deferrable cleanup — it is the only verification that your new hook is actually reached. If the existing roundtrip tests still pass after you delete the arm, the hook works; if you leave the arm in "as a fallback for now," the tests pass through the dead code and tell you nothing.
  4. Keep the protobuf wire format byte-for-byte identical. expr_id is None for every expression here — DynamicFilterPhysicalExpr (Port DynamicFilterPhysicalExpr to use try_to_proto / try_from_proto #22434) is the only one that sets it.
  5. Revert any pub items that existed only to let datafusion-proto reach in — DynamicFilterPhysicalExpr is the canonical case (the Inner/from_parts/inner()/original_children/remapped_children scaffolding from proto: serialize and dedupe dynamic filters v2 #21807), and that revert belongs in the same PR.

Tests (yes, add one)

The earlier "no new tests needed" guidance was wrong — reviewers have asked for new tests on every reviewed PR. Each PR should:

  • Confirm the existing roundtrip_physical_plan / roundtrip_physical_expr tests still pass: cargo test -p datafusion-proto --test proto_integration.
  • Add a direct unit test for the new hook, next to the expression, covering both directions and a bad-input case — e.g. try_from_proto on a PhysicalExprNode with the wrong expr_type variant or a missing child returns a clean error, not a panic.
  • Put #[cfg(test)] mod tests at the end of the file, or cargo clippy -D warnings fails with items-after-test-module.

Before you open the PR

  • Fill in every section of the PR template. PRs with a bare description get bounced before review.
  • cargo fmt --all and cargo clippy --all-targets --all-features -- -D warnings must both be clean.

Expressions to migrate

Currently still on the central downcast chain (per to_proto.rs):

Already migrated: Column, BinaryExpr (in #21929).

Window / aggregate expressions go through a separate code path and are out of scope for this epic; tracked separately if/when needed.

Definition of done

  • All built-in PhysicalExpr impls listed above use the new hooks.
  • The central downcast chain in to_proto.rs and the ExprType match in from_proto.rs are deleted (or reduced to a thin dispatch).
  • pub-for-proto scaffolding on DynamicFilterPhysicalExpr (and any others) is reverted.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions