Enforce strict CMOR time-axis integrity#470
Merged
Merged
Conversation
Why this matters (plain terms):\nCMOR output needs a clean timeline to avoid silent data issues downstream.\nThis change makes CMORisation fail fast when time has duplicates, is not strictly increasing, or has missing/overlapping steps.\n\nWhat changed:\n- replaced silent duplicate dropping with strict validation\n- added gap detection via contiguous time bounds when available\n- added frequency-aware fallback checks for daily/monthly/yearly\n- added unit tests for duplicate, missing-month, and valid-monthly cases\n\nValidation:\n- pixi run -e dev python -m pytest tests/unit/test_base.py -k "sort_time_dimension_raises_on_duplicate_timestamps or sort_time_dimension_raises_on_missing_month or sort_time_dimension_keeps_valid_monthly_axis" -q
- handle datetime64 deltas before numeric scalar conversion\n- skip frequency-fallback gap checks for numeric time axes (still keep duplicate/monotonic checks)\n- preserve bounds-based gap checks when available\n\nThis avoids false positives in atmosphere unit tests that use synthetic numeric time coordinates while keeping strict checks for real datetime axes.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #470 +/- ##
=======================================
+ Coverage 78.1% 78.4% +0.3%
=======================================
Files 33 33
Lines 6385 6483 +98
Branches 1197 1226 +29
=======================================
+ Hits 4984 5082 +98
Misses 1128 1128
Partials 273 273
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Improve patch coverage for new CMOR time-axis integrity logic by covering:\n- bounds-contiguity pass/fail paths\n- non-monotonic and single-timestep integrity branches\n- fallback daily/yearly gap checks\n- helper conversions for timedelta64, numeric units, and object-style day deltas\n- malformed bounds candidate filtering
Add direct tests for the last uncovered branches in the strict time-axis patch:\n- unknown frequency fallback\n- numeric-axis skip\n- missing time coordinate\n- duplicate bounds candidate name\n- malformed bounds shapes\n- total_seconds and no-units delta helpers\n\nThis is purely to raise patch coverage on the CMOR time-axis integrity change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We are doing this so CMORisation does not quietly hide bad time data. If timestamps are duplicated or months are missing, we would rather fail early with a clear message than write output that looks valid but can break analysis later. This change makes time checks strict and predictable, which is safer for downstream science workflows.