You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Sub-issue of #22418. Port HashTableLookupExpr to the per-expression proto pattern from #21929. This one is special: it is encode-only.
HashTableLookupExpr holds a runtime Arc<dyn JoinHashMapType> (the build-side hash table) that cannot be serialized, so try_to_proto returns lit(true). There is no try_from_proto — decode just produces a Literal.
⚠️HashTableLookupExpr lives in datafusion-physical-plan, which does not expose the proto crates (datafusion_proto_models, datafusion_proto_common, proto_encode) in a normal build. Gate the new code with #[cfg(feature = "proto")] and make sure cargo check -p datafusion-physical-plan (no proto feature) still compiles — a previous PR broke the default build by referencing those crates unconditionally.
Add try_to_protoinside the impl PhysicalExpr for HashTableLookupExpr block (trait override, #[cfg(feature = "proto")]), returning lit(true) as a PhysicalExprNode. ⚠️ Not an inherent impl method — the serializer dispatches through &dyn PhysicalExpr, so an inherent method is silently never called and the central downcast fallback keeps doing the work.
In this same PR, delete the HashTableLookupExpr arm from to_proto.rs. (No decode arm to remove — decode already yields a Literal.) Deleting the encode arm is the only proof your hook is reached.
Fix the explanatory comment. The current "safe because lit(true) passes all rows" comment was disputed in review (refactor: add try_to_proto to HashTableLookupExpr #22451). The accurate story: a HashTableLookupExpr only exists during execution (built after the build side completes, placed in the dynamic filter). A plan serialized before execution contains none; on deserialize+execute the join builds a fresh one, so all pruning is preserved. A plan serialized after execution loses only the optimization, never correctness. State both cases.
HashTableLookupExpr is pub: add a doc comment warning that it is not serialized, and consider filing a follow-up ticket to track that.
Tests
The existing roundtrip_hash_table_lookup_expr_to_lit test must still pass; update its comment to match step 3.
cargo test -p datafusion-proto --test proto_integration stays green.
#[cfg(test)] mod tests at the end of the file (clippy items-after-test-module).
cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo check -p datafusion-physical-plan (no proto feature) must all pass.
Checklist
try_to_proto (→ lit(true)) in impl PhysicalExpr for HashTableLookupExpr (not inherent), #[cfg(feature="proto")]
Sub-issue of #22418. Port
HashTableLookupExprto the per-expression proto pattern from #21929. This one is special: it is encode-only.HashTableLookupExprholds a runtimeArc<dyn JoinHashMapType>(the build-side hash table) that cannot be serialized, sotry_to_protoreturnslit(true). There is notry_from_proto— decode just produces aLiteral.Where it lives
datafusion/physical-plan/src/joins/hash_join/partitioned_hash_eval.rs(cratedatafusion-physical-plan).Steps
try_to_protoinside theimpl PhysicalExpr for HashTableLookupExprblock (trait override,#[cfg(feature = "proto")]), returninglit(true)as aPhysicalExprNode.implmethod — the serializer dispatches through&dyn PhysicalExpr, so an inherent method is silently never called and the central downcast fallback keeps doing the work.HashTableLookupExprarm fromto_proto.rs. (No decode arm to remove — decode already yields aLiteral.) Deleting the encode arm is the only proof your hook is reached.try_to_prototoHashTableLookupExpr#22451). The accurate story: aHashTableLookupExpronly exists during execution (built after the build side completes, placed in the dynamic filter). A plan serialized before execution contains none; on deserialize+execute the join builds a fresh one, so all pruning is preserved. A plan serialized after execution loses only the optimization, never correctness. State both cases.HashTableLookupExprispub: add a doc comment warning that it is not serialized, and consider filing a follow-up ticket to track that.Tests
roundtrip_hash_table_lookup_expr_to_littest must still pass; update its comment to match step 3.cargo test -p datafusion-proto --test proto_integrationstays green.#[cfg(test)] mod testsat the end of the file (clippyitems-after-test-module).Before you push
cargo fmt --all,cargo clippy --all-targets --all-features -- -D warnings, andcargo check -p datafusion-physical-plan(no proto feature) must all pass.Checklist
try_to_proto(→lit(true)) inimpl PhysicalExpr for HashTableLookupExpr(not inherent),#[cfg(feature="proto")]to_proto.rsencode arm deleted in this PRpubtype noting it isn't serializedcargo check -p datafusion-physical-plan) still compilesfmt+clippy -D warningsclean; PR template filled in