Add configurable grid spacing relative tolerance#2308
Add configurable grid spacing relative tolerance#2308rachaelesler wants to merge 8 commits intometoppv:masterfrom
Conversation
ef01dc0 to
d1cc63c
Compare
d1cc63c to
7ad9c3e
Compare
8592766 to
e99fda8
Compare
benowen-bom
left a comment
There was a problem hiding this comment.
Partial review: I'll check the remaining tests tomorrow. And sorry for the delay, I've been trying to test the best location for the constants (between other tasks).
| # Grid spacing relative tolerances | ||
| RTOL_GRID_SPACING_TIGHT = 1.0e-5 | ||
| RTOL_GRID_SPACING_DEFAULT = 4.0e-5 | ||
| RTOL_GRID_SPACING_WARNING_THRESHOLD = 0.01 | ||
|
|
There was a problem hiding this comment.
Thanks for moving this out of constants.py. I notice that including these here means that the regrid.py CLI can not import RTOL_GRID_SPACING_DEFAULT directly due to the issue with the numpy import.
A possible solution to this is to move these into regrid/_init_.py as this will enable them to be imported into the CLI without issue.
It also moves these grid-specific constants into the regrid directory, which seems like a reasonable place for them (given that is where they are primarily used).
There was a problem hiding this comment.
I don't think an __init__.py file is the right place for these constants for a few reasons. The main one is that regrid/__init__.py is currently empty and I am guessing this is by design from the MO IMPROVER team. The official Python docs indicate that __init__.py should contain package initialisation code but constants don't really fit under the umbrella of package initialisation.
I don't think it's too much of an issue that the constants aren't available in the CLI code, because this is the same pattern used in many other IMPROVER CLIs.
I'm tempted to leave these constants as is now, and get feedback from the Met Office team if they have a better suggestion on where to put them.
There was a problem hiding this comment.
No worries. I'll resolve the other issues raised under this item.
| def test_run_regrid_with_tolerance_specified(self): | ||
| """Test masked regridding with relative grid tolerance specified.""" | ||
| expected_data = 282 * np.ones((12, 12), dtype=np.float32) | ||
| test_cases = [1.0e-10, 1.0e-5, 4.0e-3, 7.0e-1] | ||
| for rtol in test_cases: | ||
| with self.subTest(rtol=rtol): | ||
| result = RegridLandSea( | ||
| regrid_mode="nearest-with-mask-2", | ||
| landmask=self.landmask, | ||
| landmask_vicinity=90000, | ||
| rtol_grid_spacing=rtol, | ||
| )(self.cube, self.target_grid.copy()) | ||
| np.testing.assert_array_almost_equal(result.data, expected_data) | ||
|
|
There was a problem hiding this comment.
Given rtol_grid_spacing is only used to run a checking function, I'm not sure there is much value in the tests run above as they don't really don't do more than check that the class runs without failing.
Maybe this test could be broken down into three cases:
- Test runs as expected (as above)
- Test runs in case the warning is raised; it should warn, but still work
- Test case where the tolerance is something bonkers which fails
The 1.0e-10 tolerance looks like a good candidate for the last one, but may require a test grid with a more exotic grid-spacing (one likely to suffer from precision issues).
There was a problem hiding this comment.
I hear that, I could probably cut this down to a single test here (testing with just one custom rtol) as I've tested the three test cases you mentioned at the function level in improver_tests/regrid/test_grid.py. What do you think?
|
@benowen-bom As discussed I will scale back the testing and add tests for the following:
|
Thanks @rachaelesler. In looking over the rest of the code changes, I think scaling back the tests is worthwhile. Given we have not changed the functionality, rather provided a means to change a value utilised in the existing functionality, I think what should be tested is:
In terms of checking the rtol is passed through for dependent functions, you could add a single test to each class that has been affected where 2values are passed in: sensible value and an excessively large one which should warn: the first case should pass without error and the second case should raise a warning. With these tests, I think we have done the due-diligence without adding more coverage than is necessary (especially given the arg is used in a check function which has no material impact on the data). |
@benowen-bom I think I have made most of the changes suggested above and this is ready for another review. Please let me know if there is additional coverage you think I should add. |
3c9b6e6 to
9fc0fec
Compare
|
Hi @bayliffe, I believe this is ready for a review from the Met Office. I'm still new to IMPROVER development, so any feedback is welcomed. In particular, I am interested in your thoughts on whether the test coverage is sufficient, and where to put the constants used in regridding. Thanks! |
9fc0fec to
0a9a45f
Compare
|
I've rebased on |
|
@rachaelesler apologies for not getting to this yet. I'm not well today, so I will try to get to it tomorrow. |
No worries @bayliffe Some of the tests are failing after a recent rebase on master, but this looks to be the case on other PR branches too. I could be wrong but I suspect the culprit for the failing tests is not this code, so it is good to review. |
This feature allows users of the `regrid` CLI to supply a relative
tolerance for grid spacing checks, e.g.,
```
improver regrid <cubes> --rtol-grid-spacing=4.0e-3
```
Previously, this relative tolerance was hard-coded into a function.
The change in this commit enables users to use the `regrid` CLI with
grids that have minor discrepancies in latitude/longitude spacing.
The new `--rtol-grid-spacing` argument is only used for certain regrid
modes ("nearest-2", "nearest-with-mask-2","bilinear-2", and
"bilinear-with-mask-2").
The new `--rtol-grid-spacing` argument is optional and it takes a
default value. The default value is the hard-coded value that existed
previously. This means that existing systems that uses the `regrid` CLI
will not be affected by this change.
This changeset includes:
* Add `--rtol-grid-spacing` argument to `regrid` CLI
* Update the classes `RegridLandSea` and `RegridWithLandSeaMask` to
have a new attribute, `rtol_grid_spacing`
* Update the `calculate_input_grid_spacing()` function in
`improver/regrid/grid.py` with a new argument, `rtol`, which is the
relative tolerance that will be used when calculating grid spacing.
* Use constants for default grid spacing rtols (rather than magic
numbers)
* Update `calculate_grid_spacing()` function in
`improver/utilities/spatial.py` with more input validation
* Unit and acceptance tests for all of the above
Closes metoppv#2307
ea49860 to
7f9163e
Compare
benowen-bom
left a comment
There was a problem hiding this comment.
Hi Rachael,
I've had another look over this PR. I think the testing is much more appropriate given the changes. I have one suggestion below, otherwise I'm happy with the changes.
| y_array = np.array( | ||
| [ | ||
| -46.8, | ||
| -46.813496399, | ||
| -46.826994705, | ||
| -46.840493965, | ||
| -46.85399418, | ||
| -46.86749058, | ||
| -46.880988884, | ||
| -46.894488144, | ||
| -46.90798836, | ||
| -46.92148476, | ||
| -46.93498306, | ||
| -46.94848232, | ||
| -46.96198254, | ||
| -46.97547894, | ||
| -46.98897724, | ||
| ], | ||
| dtype=np.float32, | ||
| ) | ||
| # Regularly spaced x points | ||
| x_array = np.linspace(100.0, 115.0, n_points, dtype=np.float32) | ||
| data = 282 * np.ones((15, 15), dtype=np.float32) | ||
| y_coord, x_coord = _construct_yx_coords_from_arrays( | ||
| y_array=y_array, x_array=x_array, spatial_grid="latlon" | ||
| ) | ||
| dim_coords = _construct_dimension_coords( | ||
| data, | ||
| y_coord=y_coord, | ||
| x_coord=x_coord, | ||
| ) | ||
| src_cube = Cube( | ||
| data.astype(np.float32), | ||
| standard_name="air_temperature", | ||
| units="K", | ||
| dim_coords_and_dims=dim_coords, | ||
| ) |
There was a problem hiding this comment.
This is fine, but I'm wondering whether you could do something similar here to what was done for test_lat_lon_with_irregular_grid in test_regrid.py? That is, set up the cube and then alter the y_coord directly?
I appreciate the values above have been chosen to give the desired behaviour, but an alternate means could be to add and subtract delta to two of the coordinate values, where delta > rtol * mean(grid_spacing). That is:
y_coord = [y0, y1, y2 + delta, y3- delta, y4, ..., yn]
```.
This would ensure `mean(grid_spacing)` is the same, but that one or more grid_spacing values exceed the rtol; additionally, the rtol used to set the delta value determines whether the function will pass or fail.
There was a problem hiding this comment.
Great idea, I've applied your suggestion now and it has made the test much more concise.
Addresses #2307
This feature allows users of the
regridCLI to supply a relativetolerance for grid spacing checks, e.g.,
Previously, this relative tolerance was hard-coded into a function.
The change in this commit enables users to use the
regridCLI withgrids that have minor discrepancies in latitude/longitude spacing.
The new
--rtol-grid-spacingargument is only used for certain regridmodes ("nearest-2", "nearest-with-mask-2","bilinear-2", and
"bilinear-with-mask-2").
The new
--rtol-grid-spacingargument is optional and it takes adefault value. The default value is the hard-coded value that existed
previously. This means that existing systems that uses the
regridCLIwill not be affected by this change.
This changeset includes:
--rtol-grid-spacingargument toregridCLIRegridLandSeaandRegridWithLandSeaMasktohave a new attribute,
rtol_grid_spacingcalculate_input_grid_spacing()function inimprover/regrid/grid.pywith a new argument,rtol, which is therelative tolerance that will be used when calculating grid spacing.
numbers)
calculate_grid_spacing()function inimprover/utilities/spatial.pywith more input validationTesting:
CLA