Centralize DATE_BIN source scaling and binning through shared helper#22823
Open
kosiew wants to merge 14 commits into
…e64MicrosecondArray - Introduced scalar Time64Microsecond(i64::MAX) overflow reproducer. - Introduced array Time64MicrosecondArray(i64::MAX) overflow reproducer. - Updated tests to catch current panic using catch_unwind.
…timize timestamp handling - Added `value_to_nanos(value, scale)` function. - Refactored to eliminate repeated `timestamp scalar checked_mul` blocks. - Implemented helper for timestamp array scaling. - Left TIME direct multiplies for SUB_ISSUE_03.
- Implemented value_to_nanos for TIME origin scaling. - Updated TIME scalar source scaling to use value_to_nanos. - Modified TIME array source scaling to include value_to_nanos and ArrowError::ComputeError mapping. - Revised overflow repro tests to ensure no panic occurs and handle normal errors appropriately.
…ate_bin - Renamed the overflow helper from `timestamp_scale_overflow_error` to `nanos_scale_overflow_error`. - Updated error message to be more generic: "DATE_BIN value ... cannot be represented in nanoseconds". - Added a new test helper: `invoke_time64_microsecond_date_bin(...)`. - Simplified scalar and array overflow tests by using the new helper.
…to reflect new representation limit
…dling - Restored timestamp overflow message for DATE_BIN source timestamp. - Retained generic TIME/value overflow message for DATE_BIN value. - Updated value_to_nanos to take an error constructor. - Revised timestamp paths to utilize timestamp_scale_overflow_error. - Updated TIME paths to use nanos_scale_overflow_error.
- Removed timestamp_scale_overflow_error from date_bin.rs - Updated value_to_nanos function to only use nanos_scale_overflow_error for scaling - Modified expected DATE_BIN value in date_bin_errors.slt
…anos for safety This commit updates the date_bin_impl function to use `checked_scale_to_nanos` instead of direct integer multiplication for scaling timestamps. This change ensures safer handling of potential overflow scenarios while maintaining the intended functionality for date and time operations.
…g functions - Renamed helper argument `x` to `value` for clarity - Added `checked_scale_and_bin_to_nanos` function - Unified source scaling and binning paths to share the helper flow - Ensured no direct calls to `checked_mul(scale)` remain outside the helper
…or behavior - Reverted Time64(Nanosecond) array branch to use try_unary(...) - Maintained prior error behavior - Kept out-of-scope branch unchanged semantically
…to_nanos_or_null for clarity on NULL behavior
- Reformatted match statement in `checked_scale_to_nanos` for clarity. - Split long function calls across multiple lines in `date_bin_impl` for better readability. - Enhanced indentation and formatting consistency in error handling and mapping functions.
Jefffrey (Contributor) reviewed Jun 19, 2026
Comment on lines +456 to +457:
    checked_scale_to_nanos(value, scale)
        .ok()
I don't really see the value added here, considering we ignore the error from checked_scale_to_nanos by converting it to an Option, not to mention that this new wrapper function takes 5 arguments 🤔
Which issue does this PR close?
Rationale for this change
DATE_BIN still contained duplicated source-scaling logic across timestamp and TIME scalar/array code paths. The repeated checked multiplication and binning logic made the implementation harder to maintain and increased the risk of behavior drifting between source types.
This change centralizes source scaling to nanoseconds and binning through shared helpers while preserving existing overflow and error-handling semantics.
What changes are included in this PR?
- Renamed the scaling helper parameter from `x` to `value` for clarity.
- Added a new helper, `checked_scale_and_bin_to_nanos_or_null`, which:
  - … `None` for source-value processing paths.
- Updated timestamp source handling (scalar and array paths) to use the shared helper instead of inlined scaling and binning logic.
- Updated TIME source handling (scalar and array paths) for:
  - `Time32Millisecond`
  - `Time32Second`
  - `Time64Microsecond`

  to use the shared helper instead of direct checked multiplication.
- Kept `Time64Nanosecond` on a separate path since no scaling is required, and switched the array implementation to use `try_unary` so binning errors are propagated consistently through Arrow error handling.
- Removed duplicated scaling code throughout DATE_BIN source conversion paths and centralized nanosecond scaling behavior.
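For readers who want a concrete picture of the scale-then-bin flow described above, here is a minimal std-only sketch. It is not the code from this PR: `bin_nanos` and `scale_and_bin_or_null` are hypothetical names, the signatures and the 4-argument shape are assumptions, and the error text only imitates the wording quoted in the commit messages. It also assumes that a scaling overflow maps to NULL (as the `.ok()` in the review snippet suggests) while binning errors still propagate.

```rust
/// Hypothetical stand-in for the scaling helper: converts `value` to
/// nanoseconds by multiplying with `scale`, returning an error on overflow.
fn checked_scale_to_nanos(value: i64, scale: i64) -> Result<i64, String> {
    value.checked_mul(scale).ok_or_else(|| {
        format!("DATE_BIN value {value} cannot be represented in nanoseconds")
    })
}

/// Hypothetical binning function: floors `nanos` to the start of its
/// `stride`-wide bin, measured from `origin`. All inputs are nanoseconds.
fn bin_nanos(stride: i64, nanos: i64, origin: i64) -> Result<i64, String> {
    if stride <= 0 {
        return Err("stride must be positive".to_string());
    }
    let overflow = || "binning overflowed i64".to_string();
    let diff = nanos.checked_sub(origin).ok_or_else(overflow)?;
    // rem_euclid keeps the remainder non-negative, so values before the
    // origin are floored toward the earlier bin boundary.
    nanos.checked_sub(diff.rem_euclid(stride)).ok_or_else(overflow)
}

/// Hypothetical shared helper (assumed semantics): a scaling overflow
/// becomes `None` (SQL NULL); a binning error is returned as `Err`.
fn scale_and_bin_or_null(
    value: i64,
    scale: i64,
    stride: i64,
    origin: i64,
) -> Result<Option<i64>, String> {
    match checked_scale_to_nanos(value, scale).ok() {
        Some(nanos) => bin_nanos(stride, nanos, origin).map(Some),
        None => Ok(None),
    }
}

fn main() {
    // 90 seconds, scaled by 1e9 and binned into 60-second strides from origin 0.
    let binned = scale_and_bin_or_null(90, 1_000_000_000, 60_000_000_000, 0);
    println!("{binned:?}");

    // i64::MAX microseconds cannot be scaled by 1_000 without overflowing.
    let overflowed = scale_and_bin_or_null(i64::MAX, 1_000, 60_000_000_000, 0);
    println!("{overflowed:?}");
}
```

Routing every source type through one function like this keeps the overflow policy in a single place, which is the maintainability argument made in the rationale. Whether the wrapper is worth its extra arguments is the question raised in the review comment.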
Are these changes tested?
No new tests are included in this PR.
This refactor is intended to preserve existing behavior while consolidating implementation details. Existing DATE_BIN tests should continue to validate overflow and NULL/error handling behavior, including the targeted overflow scenarios described in the issue summary.
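As a rough illustration of the kind of overflow scenario those tests target, the sketch below models an array path with plain `Vec<Option<i64>>` values instead of Arrow arrays. `micros_to_nanos` and `scale_array` are hypothetical names, and mapping each slot with an early-exit `Result` is only an analogue of Arrow's `try_unary`, not the DataFusion implementation. The point is that an `i64::MAX` microsecond input yields an ordinary error rather than a panic.

```rust
use std::panic;

/// Hypothetical stand-in for scaling a Time64(Microsecond) value to
/// nanoseconds with checked arithmetic.
fn micros_to_nanos(value: i64) -> Result<i64, String> {
    value.checked_mul(1_000).ok_or_else(|| {
        format!("DATE_BIN value {value} cannot be represented in nanoseconds")
    })
}

/// Std-only analogue of a fallible element-wise array kernel:
/// NULL slots stay NULL, and the first failing slot aborts with an error.
fn scale_array(values: &[Option<i64>]) -> Result<Vec<Option<i64>>, String> {
    values
        .iter()
        .map(|slot| slot.map(micros_to_nanos).transpose())
        .collect()
}

fn main() {
    // Normal values scale; NULLs pass through untouched.
    assert_eq!(scale_array(&[Some(1), None]), Ok(vec![Some(1_000), None]));

    // The overflow reproducer must not panic...
    let outcome = panic::catch_unwind(|| scale_array(&[Some(1), Some(i64::MAX)]));
    assert!(outcome.is_ok());

    // ...and must surface a regular error instead.
    assert!(outcome.unwrap().is_err());
}
```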
Are there any user-facing changes?
No.
This is an internal refactor intended to centralize DATE_BIN source scaling logic without changing SQL-visible behavior.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed.