Skip to content

Fix walrus truthiness on metrics bounds and identity-partition projection#3412

Open
kevinjqliu wants to merge 4 commits into
apache:mainfrom
kevinjqliu:kevinjqliu/walrus-metrics-bounds
Open

Fix walrus truthiness on metrics bounds and identity-partition projection#3412
kevinjqliu wants to merge 4 commits into
apache:mainfrom
kevinjqliu:kevinjqliu/walrus-metrics-bounds

Conversation

@kevinjqliu
Copy link
Copy Markdown
Contributor

@kevinjqliu kevinjqliu commented May 24, 2026

Inspired by the walrus issue in #3353

Summary

Several if x := dict.get(k): checks treated legitimate falsy values as missing:

  • lower_bounds.get(field_id) / upper_bounds.get(field_id) return bytes. b"" is a valid serialization of an empty string and was being skipped, causing metrics-based row filtering to return ROWS_MIGHT_MATCH when it should have been ROWS_CANNOT_MATCH (and vice versa for strict eval).
  • accessors[...].get(file.partition) can return 0 or "" for an IdentityTransform partition. The walrus dropped it, so projected partition columns were filled with null instead of the actual value.
  • inspect.py lower_bound / upper_bound rendering had the same b"" issue, showing None instead of "" in readable_metrics.

All conditions are switched to explicit is not None checks. A small _readable_bound helper deduplicates the inspect rendering.

Changes

  • pyiceberg/expressions/visitors.py_InclusiveMetricsEvaluator and _StrictMetricsEvaluator bound lookups (21 sites).
  • pyiceberg/io/pyarrow.py_get_column_projection_values and ArrowProjectionVisitor missing-field handling.
  • pyiceberg/table/inspect.py — extract _readable_bound, use it in both the entries and _get_files_from_manifest rendering paths.

Tests

  • tests/expressions/test_evaluator.py — inclusive and strict evaluators with empty-string bounds, covering lower-only, upper-only, and both-bounds branches.
  • tests/io/test_pyarrow.py — parametrized identity-transform projection with falsy partition values (0, "", None).
  • tests/table/test_inspect.py_readable_bound helper plus integration tests via tbl.inspect.entries() and tbl.inspect.files() for empty-string and null bounds.

@kevinjqliu kevinjqliu requested a review from Fokko May 24, 2026 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant