Port NotExpr proto hooks #22463

Merged
adriangb merged 13 commits into apache:main from Herrtian:port-notexpr-proto-hooks
May 28, 2026

@Herrtian
Contributor

@Herrtian Herrtian commented May 22, 2026

Which issue does this PR close?

Rationale for this change

NotExpr still used the central physical expression protobuf downcast path. Moving it to the expression-level proto hook keeps it aligned with the newer serialization pattern and reduces the special-case branching in the shared conversion code.

What changes are included in this PR?

  • Move NotExpr protobuf serialization into its try_to_proto hook.
  • Add NotExpr::try_from_proto and route decode through it.
  • Remove the old central to_proto downcast branch for NotExpr.
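
The shape of this refactor can be sketched with a simplified model. Everything below (`Node`, `Expr`, `Column`, `Not`, `serialize`) is a hypothetical stand-in, not DataFusion's actual types or hook signatures; it only illustrates how a per-expression hook lets the shared serializer drop one downcast branch.

```rust
use std::any::Any;
use std::fmt::Debug;

/// Hypothetical stand-in for a serialized node (not the real protobuf type).
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Column(String),
    Not(Box<Node>),
}

/// Hypothetical expression trait. The default hook returns `None`,
/// which means "this expression has not been ported yet".
pub trait Expr: Debug {
    fn as_any(&self) -> &dyn Any;

    fn try_to_proto(&self) -> Option<Node> {
        None
    }
}

#[derive(Debug)]
pub struct Column(pub String);

#[derive(Debug)]
pub struct Not(pub Box<dyn Expr>);

impl Expr for Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
    // No hook override: still served by the central downcast chain below.
}

impl Expr for Not {
    fn as_any(&self) -> &dyn Any {
        self
    }

    // Ported expression: it serializes itself, recursing through the
    // shared entry point for its child.
    fn try_to_proto(&self) -> Option<Node> {
        let child = serialize(self.0.as_ref())?;
        Some(Node::Not(Box::new(child)))
    }
}

/// Shared entry point: ask the expression's own hook first, then fall
/// back to the legacy downcast chain for expressions not yet ported.
pub fn serialize(expr: &dyn Expr) -> Option<Node> {
    if let Some(node) = expr.try_to_proto() {
        return Some(node);
    }
    if let Some(col) = expr.as_any().downcast_ref::<Column>() {
        return Some(Node::Column(col.0.clone()));
    }
    None
}
```

In this model, porting an expression means adding a `try_to_proto` override and deleting its branch from `serialize`, which mirrors the first and third bullets above.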

Are these changes tested?

Yes. I ran:

  • cargo fmt --all -- --check
  • cargo check -p datafusion-physical-expr --features proto
  • cargo check -p datafusion-proto
  • cargo test -p datafusion-proto --test proto_integration roundtrip_filter_with_not
  • git diff --check

Are there any user-facing changes?

No. This is an internal proto serialization refactor and should not change query behavior or public APIs.

@github-actions github-actions bot added the physical-expr (Changes to the physical-expr crates) and proto (Related to proto crate) labels May 22, 2026
@kumarUjjawal
Contributor

@Herrtian Thank you for picking this up. Can you please update the PR body with more details? Use the PR template for this.

@Herrtian
Contributor Author

Updated the PR body to use the template.

@kumarUjjawal
Contributor

@Herrtian we should add a direct test for the new hook code itself, especially the bad-input case like a missing child expression.

@Herrtian
Contributor Author

Added a direct missing-child test for the NotExpr proto hook.

Checked with:

  • cargo test -p datafusion-physical-expr --features proto test_from_proto_missing_child
  • cargo fmt --check
  • git diff --check


let protobuf::PhysicalNot { expr } = match &node.expr_type {
Some(protobuf::physical_expr_node::ExprType::NotExpr(e)) => e.as_ref(),
_ => return internal_err!("PhysicalExprNode is not a NotExpr"),
Contributor

This error should surface both expr_id and the actual expr_type variant the decoder received.

_ => return internal_err!("PhysicalExprNode is not a NotExpr"),
};
let expr = expr.as_deref().ok_or_else(|| {
datafusion_common::DataFusionError::Internal(
Contributor

We should use internal_datafusion_err! here instead of constructing DataFusionError::Internal directly.

@Herrtian
Contributor Author

Addressed the review comments: the missing-child path now uses internal_datafusion_err! and includes the expr_id plus NotExpr type in the error.

Checked with:

  • cargo test -p datafusion-physical-expr --features proto test_from_proto_missing_child
  • cargo fmt --check
  • git diff --check

Herrtian added 4 commits May 25, 2026 16:22
Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>
Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>
@Herrtian Herrtian force-pushed the port-notexpr-proto-hooks branch from d7ad593 to 33a90db Compare May 25, 2026 14:24
) -> Result<Arc<dyn PhysicalExpr>> {
use datafusion_proto_models::protobuf;

let protobuf::PhysicalNot { expr } = match &node.expr_type {
Contributor

I think try_from_proto should use ctx.decode_required_expression(...) here instead of open-coding the missing-child check and then calling ctx.decode(...). A helper was added recently for this. Can you check? I just pushed main to this branch.
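
To illustrate the suggestion, here is a minimal sketch of a "required child" helper that folds the missing-field check and the recursive decode into one call. The types and the `decode_required` method are hypothetical and simplified (errors are plain `String`s); they are not the signature of DataFusion's `decode_required_expression`.

```rust
/// Hypothetical decoded expression.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Column(String),
    Not(Box<Expr>),
}

/// Hypothetical wire node; the child of `Not` is optional on the wire,
/// as message-typed protobuf fields are.
#[derive(Debug, Clone)]
pub enum Node {
    Column(String),
    Not(Option<Box<Node>>),
}

/// Hypothetical decode context.
pub struct DecodeCtx;

impl DecodeCtx {
    pub fn decode(&self, node: &Node) -> Result<Expr, String> {
        match node {
            Node::Column(name) => Ok(Expr::Column(name.clone())),
            Node::Not(child) => {
                // One call replaces an open-coded `ok_or_else` plus `decode`.
                let child =
                    self.decode_required(child.as_deref(), "NotExpr", "expr")?;
                Ok(Expr::Not(Box::new(child)))
            }
        }
    }

    /// Reject a missing child with a uniform message, otherwise decode it.
    pub fn decode_required(
        &self,
        child: Option<&Node>,
        owner: &str,
        field: &str,
    ) -> Result<Expr, String> {
        let child = child
            .ok_or_else(|| format!("{owner} is missing required field '{field}'"))?;
        self.decode(child)
    }
}
```

Centralizing the check keeps the error wording identical across expression types, so a test can assert on a single message format.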


let err = NotExpr::try_from_proto(&node, &ctx).unwrap_err();
assert!(matches!(err, DataFusionError::Internal(msg)
if msg.contains("NotExpr is missing required field 'expr'")
Contributor

I would include expr_id and the actual expr_type here. Let me know what you think.

Contributor

I don't think expr_id would be very helpful. It's mostly for deduplication of expressions and isn't useful for identifying them.

use arrow::{array::BooleanArray, datatypes::*};
use datafusion_physical_expr_common::physical_expr::fmt_sql;

#[cfg(feature = "proto")]
Contributor

I would also move these into a separate #[cfg(all(test, feature = "proto"))] mod proto_tests

@Herrtian
Contributor Author

Updated the NotExpr proto path to use decode_required_expression, kept the expr_id/expr_type context for the missing-child error, and moved the proto test into its own module.

Checked with:

  • cargo test -p datafusion-physical-expr --features proto test_from_proto_missing_child
  • cargo fmt --all -- --check
  • git diff --check

…hooks

Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>

# Conflicts:
#	datafusion/proto/src/physical_plan/from_proto.rs
#	datafusion/proto/src/physical_plan/to_proto.rs
@Herrtian
Contributor Author

Herrtian commented May 28, 2026

Synced with latest main and resolved the proto hook conflict with the newer NegativeExpr hook.

Checked with:

  • cargo test -p datafusion-physical-expr --features proto test_from_proto_missing_child
  • cargo fmt --all -- --check
  • git diff --check

Herrtian added 2 commits May 28, 2026 19:57
…hooks

Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>

# Conflicts:
#	datafusion/proto/src/physical_plan/to_proto.rs
Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>
@adriangb
Contributor

#22596 adds an expect_expr_variant! macro and a require_proto_field helper that together collapse the outer match and any non-expression "missing required field" ok_or_else to one line each. Could you rebase onto it and adopt them before final review?
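
As a rough illustration of how two such helpers can collapse the boilerplate, the sketch below defines a stand-in macro `expect_variant!` and a stand-in function `require_field`. Both are hypothetical: their names, arguments, and error text are assumptions made for this example and are not the actual `expect_expr_variant!` or `require_proto_field` from #22596.

```rust
/// Hypothetical wire types; children are plain strings to keep this short.
#[derive(Debug)]
pub struct NotNode {
    pub child: Option<String>,
}

#[derive(Debug)]
pub struct NegativeNode {
    pub child: Option<String>,
}

#[derive(Debug)]
pub enum ExprType {
    Not(NotNode),
    Negative(NegativeNode),
}

/// Stand-in macro: unwrap the expected enum variant or return early with
/// an error that names what was actually received.
macro_rules! expect_variant {
    ($value:expr, $variant:path, $name:literal) => {
        match $value {
            Some($variant(inner)) => inner,
            other => return Err(format!("expected {}, got {:?}", $name, other)),
        }
    };
}

/// Stand-in helper: turn a missing field into a uniform error.
pub fn require_field<'a, T>(
    value: Option<&'a T>,
    owner: &str,
    field: &str,
) -> Result<&'a T, String> {
    value.ok_or_else(|| format!("{owner} is missing required field '{field}'"))
}

/// With both helpers, the outer match and the missing-field check each
/// take a single line.
pub fn decode_not(expr_type: Option<&ExprType>) -> Result<String, String> {
    let not = expect_variant!(expr_type, ExprType::Not, "NotExpr");
    let child = require_field(not.child.as_ref(), "NotExpr", "expr")?;
    Ok(format!("NOT {child}"))
}
```

Because the macro expands to an early `return Err(...)`, it can only be used inside a function whose error type matches; that trade-off is what allows the call site to stay on one line.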

Herrtian added 2 commits May 28, 2026 20:55
Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>
@Herrtian
Contributor Author

Synced with #22596 and updated NotExpr::try_from_proto to use expect_expr_variant! while keeping the missing-child context.

Checked with:

  • cargo test -p datafusion-physical-expr --features proto test_from_proto_missing_child
  • RUST_BACKTRACE=1 cargo test -p datafusion-physical-expr --features proto test_from_proto_missing_child
  • cargo fmt --all -- --check
  • git diff --check

Contributor

Copilot AI left a comment

Pull request overview

Ports NotExpr to use the per-expression try_to_proto / try_from_proto hooks, removing it from the central downcast/decode chains in datafusion-proto to match the migration pattern established for Column, Negative, Cast, Like, and InList.

Changes:

  • Add try_to_proto override inside impl PhysicalExpr for NotExpr and an inherent NotExpr::try_from_proto decode helper.
  • Route the ExprType::NotExpr decode arm through the new helper and delete the corresponding central encode branch.
  • Add a proto_tests module containing a single missing-child error case.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File: Description

  • datafusion/physical-expr/src/expressions/not.rs: Implements the encode/decode hooks, adds an error-remapping closure around decode_required_expression, and adds a missing-child test module.
  • datafusion/proto/src/physical_plan/to_proto.rs: Removes the central NotExpr downcast branch from serialize_physical_expr_with_converter.
  • datafusion/proto/src/physical_plan/from_proto.rs: Replaces the inline ExprType::NotExpr decode with a delegation to NotExpr::try_from_proto.

Comment on lines +220 to +233
let expr = ctx
    .decode_required_expression(not_expr.expr.as_deref(), "NotExpr", "expr")
    .map_err(|err| match err {
        datafusion_common::DataFusionError::Internal(msg)
            if msg.starts_with("NotExpr is missing required field 'expr'") =>
        {
            internal_datafusion_err!(
                "NotExpr is missing required field 'expr' (expr_id: {:?}, expr_type: {:?})",
                node.expr_id,
                &node.expr_type
            )
        }
        other => other,
    })?;
Contributor

yeah I don't think this is worth it: #22463 (comment)

Comment on lines +244 to +289
#[cfg(all(test, feature = "proto"))]
mod proto_tests {
    use std::sync::Arc;

    use super::*;

    use arrow::datatypes::Schema;
    use datafusion_common::DataFusionError;
    use datafusion_physical_expr_common::physical_expr::proto_decode::{
        PhysicalExprDecode, PhysicalExprDecodeCtx,
    };
    use datafusion_proto_models::protobuf::{
        PhysicalExprNode, PhysicalNot, physical_expr_node,
    };

    struct NoopDecoder;

    impl PhysicalExprDecode for NoopDecoder {
        fn decode(
            &self,
            _node: &PhysicalExprNode,
            _schema: &Schema,
        ) -> Result<Arc<dyn PhysicalExpr>> {
            unreachable!("missing child should be rejected before decoding")
        }
    }

    #[test]
    fn test_from_proto_missing_child() {
        let node = PhysicalExprNode {
            expr_id: Some(42),
            expr_type: Some(physical_expr_node::ExprType::NotExpr(Box::new(
                PhysicalNot { expr: None },
            ))),
        };
        let schema = Schema::empty();
        let decoder = NoopDecoder;
        let ctx = PhysicalExprDecodeCtx::new(&schema, &decoder);

        let err = NotExpr::try_from_proto(&node, &ctx).unwrap_err();
        assert!(matches!(err, DataFusionError::Internal(msg)
            if msg.contains("NotExpr is missing required field 'expr'")
                && msg.contains("expr_id: Some(42)")
                && msg.contains("expr_type: Some(NotExpr")));
    }
}
Contributor

Easy enough to add another round-trip (rt) test.
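
A round-trip ("rt") test encodes an expression, decodes the result, and checks that the output equals the input. The sketch below shows the idea on hypothetical, simplified types; `Expr`, `Node`, `encode`, `decode`, and `roundtrip` are assumptions for illustration, not the DataFusion proto API or its existing test helpers.

```rust
/// Hypothetical in-memory expression.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Not(Box<Expr>),
}

/// Hypothetical wire node with an optional child, as on the protobuf side.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Column(String),
    Not(Option<Box<Node>>),
}

/// Encode never produces a missing child.
pub fn encode(expr: &Expr) -> Node {
    match expr {
        Expr::Column(name) => Node::Column(name.clone()),
        Expr::Not(child) => Node::Not(Some(Box::new(encode(child)))),
    }
}

/// Decode must still reject a missing child, since the bytes may come
/// from another producer.
pub fn decode(node: &Node) -> Result<Expr, String> {
    match node {
        Node::Column(name) => Ok(Expr::Column(name.clone())),
        Node::Not(Some(child)) => Ok(Expr::Not(Box::new(decode(child)?))),
        Node::Not(None) => {
            Err("NotExpr is missing required field 'expr'".to_string())
        }
    }
}

/// The property under test: decode(encode(e)) == e.
pub fn roundtrip(expr: &Expr) -> Result<Expr, String> {
    decode(&encode(expr))
}
```

A round-trip check covers the happy path through both hooks, while the missing-child test covers the error path; the two complement each other.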

Comment on lines +244 to +290

Contributor

Maybe the tracking issue is pedantic, but I do think new code at the bottom of the module makes the most sense.

@adriangb
Contributor

@Herrtian a couple of review items from Copilot and myself.

@adriangb
Contributor

Also, CI is failing now.

Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>
Contributor

@adriangb adriangb left a comment

Thanks!

@adriangb adriangb enabled auto-merge May 28, 2026 19:21
Signed-off-by: Herrtian <70463940+Herrtian@users.noreply.github.com>
auto-merge was automatically disabled May 28, 2026 19:58

Head branch was pushed to by a user without write access

@adriangb adriangb enabled auto-merge May 28, 2026 20:02
@adriangb adriangb added this pull request to the merge queue May 28, 2026
Merged via the queue into apache:main with commit 2fc3b1d May 28, 2026
38 checks passed

Labels

physical-expr (Changes to the physical-expr crates), proto (Related to proto crate)

Development

Successfully merging this pull request may close these issues.

Port NotExpr to use try_to_proto / try_from_proto

4 participants