The filter implementation for BinaryArray discards the nullness of the data. Null entries in a BinaryArray (seem to) always return an empty string slice when their value is read, so whether the current filter behaviour is a bug depends on what Arrow developers and users intend.
I think we should either preserve nulls (and their count) or document this as intended behaviour.
Below is a test case that reproduces the bug.
#[test]
fn test_filter_binary_array_with_nulls() {
    let mut a: BinaryBuilder = BinaryBuilder::new(100);
    a.append_null().unwrap();
    a.append_string("a string").unwrap();
    a.append_null().unwrap();
    a.append_string("with nulls").unwrap();
    let array = a.finish();
    let b = BooleanArray::from(vec![true, true, true, true]);
    let c = filter(&array, &b).unwrap();
    let d: &BinaryArray = c.as_any().downcast_ref::<BinaryArray>().unwrap();
    // I didn't expect this behaviour
    assert_eq!("", d.get_string(0));
    // fails here
    assert!(d.is_null(0));
    assert_eq!(4, d.len());
    // fails here
    assert_eq!(2, d.null_count());
    assert_eq!("a string", d.get_string(1));
    // fails here
    assert!(d.is_null(2));
    assert_eq!("with nulls", d.get_string(3));
}
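For comparison, the null-preserving behaviour that the test above expects can be sketched without any Arrow types. This is only an illustration under stated assumptions: `filter_preserving_nulls` and `null_count` are hypothetical helpers, not Arrow functions, and `Option<&str>` stands in for a nullable binary slot.

```rust
/// Hypothetical helper (not part of Arrow): keeps the entries of `values`
/// whose corresponding mask entry is true, carrying `None` (null) through
/// unchanged instead of turning it into an empty string.
fn filter_preserving_nulls<'a>(values: &[Option<&'a str>], mask: &[bool]) -> Vec<Option<&'a str>> {
    values
        .iter()
        .zip(mask.iter())
        .filter(|(_, keep)| **keep)
        .map(|(v, _)| *v)
        .collect()
}

/// Hypothetical helper: counts the null (`None`) slots.
fn null_count(values: &[Option<&str>]) -> usize {
    values.iter().filter(|v| v.is_none()).count()
}

fn main() {
    // Same data and mask as the test case above.
    let values = [None, Some("a string"), None, Some("with nulls")];
    let mask = [true, true, true, true];
    let filtered = filter_preserving_nulls(&values, &mask);

    assert_eq!(4, filtered.len());
    assert_eq!(2, null_count(&filtered));
    assert!(filtered[0].is_none());
    assert_eq!(Some("a string"), filtered[1]);
    assert!(filtered[2].is_none());
    assert_eq!(Some("with nulls"), filtered[3]);
}
```

In this sketch nullness travels with each value, so the length, null count, and per-slot null checks all agree with the assertions marked "fails here" in the test.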
Reporter: Neville Dipale / @nevi-me
Note: This issue was originally created as ARROW-5352. Please see the migration documentation for further details.