Skip to content

Fix is_zero() for FixedSizeList and Struct scalar types#8500

Merged
connortsui20 merged 1 commit into
developfrom
claude/serene-darwin-ne73a3
Jun 19, 2026
Merged

Fix is_zero() for FixedSizeList and Struct scalar types#8500
connortsui20 merged 1 commit into
developfrom
claude/serene-darwin-ne73a3

Conversation

@connortsui20

@connortsui20 connortsui20 commented Jun 19, 2026

Copy link
Copy Markdown
Member

Summary

Closes: #7891

Fixes incorrect is_zero() implementations for FixedSizeList and Struct scalar types that were only checking the number of elements/fields rather than recursively validating that all contained values are themselves zero.

`Scalar::is_zero` previously only checked the element/field count for
`FixedSizeList` and `Struct` values, so any non-null struct or fixed-size
list was reported as zero. Recurse into the elements/fields and require
every one to be a non-null zero value, matching the documented
`Scalar::zero_value` semantics.

Add regression tests covering non-zero elements/fields, null
elements/fields, and a nested struct-of-fixed-size-list value. These fail
against the previous element-count-only behavior.

Closes: #7891

Signed-off-by: Connor Tsui <connor@spiraldb.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0139i5huJ9ixt2m3LggBoUKQ
@connortsui20 connortsui20 requested a review from a team June 19, 2026 01:22
@connortsui20 connortsui20 added the changelog/fix A bug fix label Jun 19, 2026 — with Claude
@connortsui20 connortsui20 requested review from AdamGS, joseph-isaacs and robert3005 and removed request for joseph-isaacs June 19, 2026 01:34
@connortsui20 connortsui20 enabled auto-merge (squash) June 19, 2026 01:35
@codspeed-hq

codspeed-hq Bot commented Jun 19, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 6 improved benchmarks
❌ 7 regressed benchmarks
✅ 1568 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation decompress_rd[f64, (10000, 0.01)] 108.7 µs 139 µs -21.83%
Simulation decompress_rd[f64, (10000, 0.1)] 109 µs 139.4 µs -21.81%
Simulation decompress_rd[f64, (10000, 0.0)] 108.7 µs 139 µs -21.79%
Simulation decompress_rd[f32, (100000, 0.0)] 496 µs 583.7 µs -15.03%
Simulation decompress_rd[f32, (10000, 0.1)] 78.1 µs 90.6 µs -13.8%
Simulation decompress_rd[f32, (10000, 0.01)] 78.1 µs 90.6 µs -13.79%
Simulation decompress_rd[f32, (10000, 0.0)] 78.5 µs 91.1 µs -13.75%
Simulation chunked_bool_canonical_into[(1000, 10)] 32 µs 16.6 µs +92.19%
Simulation decompress_rd[f64, (100000, 0.1)] 1,020.7 µs 842.7 µs +21.13%
Simulation decompress_rd[f64, (100000, 0.01)] 1,020.7 µs 842.7 µs +21.13%
Simulation decompress_rd[f32, (100000, 0.1)] 582.9 µs 495.3 µs +17.69%
Simulation decompress_rd[f32, (100000, 0.01)] 582.9 µs 495.3 µs +17.69%
Simulation eq_i64_constant 318 µs 288.2 µs +10.34%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing claude/serene-darwin-ne73a3 (1abc9e3) with develop (6a3fb19)

Open in CodSpeed

Comment on lines +191 to +203
// A fixed-size list is zero only if it has the expected number of elements and every
// element is itself a non-null zero value.
DType::FixedSizeList(_, list_size, _) => {
let list = self.as_list();
list.len() == *list_size as usize
&& (0..list.len())
.all(|i| list.element(i).is_some_and(|e| e.is_zero() == Some(true)))
}
// A struct is zero only if every one of its fields is itself a non-null zero value.
DType::Struct(..) => self
.as_struct()
.fields_iter()
.is_some_and(|mut fields| fields.all(|f| f.is_zero() == Some(true))),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The definitions here seem really weird to me, I think its worth documenting the "why" here.
How do other systems handle this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why they seem weird? I feel like this is the only possible correct definition for these scalars.

@AdamGS AdamGS Jun 19, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind a "zero" struct is an empty struct (as in - 0 fields) of length 0

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is a scalar, which means it's already length 1. And I would think of this as is_default almost where we do not have control over the struct fields, we just call is_default on all fields (via auto-derived Default impl of the struct).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More importantly, we have to implement the functions in vortex-array/src/scalar/scalar_value.rs somehow based on the dtype, and the is_ functions have to mirror that.

@connortsui20 connortsui20 merged commit 9814173 into develop Jun 19, 2026
114 of 117 checks passed
@connortsui20 connortsui20 deleted the claude/serene-darwin-ne73a3 branch June 19, 2026 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/fix A bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scalar::is_zero is probably not correct for FixedSizeList and Struct

3 participants