CMOR check generic level coordinates in CMIP6 #598
|
@sloosvel - Am I correct in understanding that these fixes will fix the problem of variables having different aliases in standard names (e.g. mrsos has moisture_content_of_soil_layer |
Not really. This pull request does not affect or modify the checking of variables' standard names; it just adds a check for generic levels in CMIP6.
Thanks for the reference. I checked the dataset and there are many things wrong with it. It seems the error is not being reported because I assumed that the coordinate would have at least a standard_name, and in that case it does not even have that. I will need to rework the code a bit to make it work. |
|
@jvegasbsc and @sloosvel - I followed the thread of an original ESMValTool issue regarding variable alias management to this one, via issue ESMValGroup/ESMValTool#950. This is with reference to the soil moisture variable mrsos having two aliases for its standard name: "moisture_content_of_soil_layer" and "mass_content_of_water_in_soil_layer". The UKESM and MRI model data for mrsos are two examples where the latter alias causes the esmvaltool preprocessor to fail. Will these be fixed? |
|
I think it's a mistake in the reference; this pull request has nothing to do with the aliases issue. Maybe you are looking for #595? |
Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl>
|
@jvegasbsc @mattiarighi Today is the deadline for anything that is included in the first version 2 release. Would you have time to review/test this today? If not, this can go into the next release. |
…to dev_check_generic_levels
|
I just tried to test this and got the following error message: Is there indeed an error in this dataset, or is the check too strict? |
|
I guess this is because of this attribute?
|
It is ok, lev is a dimension, so you don't need to specify it in the coordinates |
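A minimal sketch of the rule behind this reply, assuming the usual CF convention that the `coordinates` attribute only needs to name auxiliary and scalar coordinate variables, while a coordinate variable that shares its name with one of the data variable's dimensions is found without being listed. The function and the `aux_scalar` variable are hypothetical and are not part of ESMValCore or iris.

```python
def coordinates_attribute(data_dims, coord_dims):
    """Build the 'coordinates' attribute string for a data variable.

    data_dims:  tuple of the data variable's dimension names
    coord_dims: mapping of coordinate variable name -> its dimension names
    """
    listed = []
    for name, dims in coord_dims.items():
        # A dimension coordinate is 1-D and named after its own dimension,
        # so it does not have to appear in the 'coordinates' attribute.
        is_dimension_coordinate = dims == (name,) and name in data_dims
        if not is_dimension_coordinate:
            listed.append(name)
    return " ".join(listed)


# Hypothetical layout of a 4-D variable with one scalar auxiliary coordinate.
cl_dims = ("time", "lev", "lat", "lon")
coords = {
    "time": ("time",),
    "lev": ("lev",),
    "lat": ("lat",),
    "lon": ("lon",),
    "aux_scalar": (),  # hypothetical scalar coordinate, must be listed
}
attribute = coordinates_attribute(cl_dims, coords)
```

Here `lev` is left out of the resulting attribute because it is a dimension of the variable, which is the situation described in the reply above.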
|
@sloosvel , take a look at ESMValCore/esmvalcore/cmor/check.py Line 297 in 94568f8 Maybe |
Ahh, alright, that makes total sense. Sorry for the misunderstanding. So I guess this PR is working. However, I'm not sure if we should merge this now without providing fixes to the corresponding datasets, since this PR will introduce new errors that were not present in the old |
|
Found 3 more errors (for the "full" time range 1980-2014 but also for 1980-1980): CESM2 CESM2-WACCM CESM2-FV2 |
Good point. Let's wait with merging this until the feature freeze (@valeriupredoi and I will do that around 3 PM CEST today). If everything is working as expected here, we can get this merged afterwards and have an opportunity to implement the required fixes in the next few months, before the next release. |
I think there is an ambiguity problem with this particular variable. HadGEM or UKESM have the If the recipe only contains datasets that share the standard name, it will pass the checks without problems. For HadGEM and UKESM the level coordinate for If you run a recipe with BCC and HadGEM (different standard names), the checker will assume that the right coordinate is the one from the dataset that is checked first. So if BCC is checked first, HadGEM will fail because the alevel coordinate will be checked against So what should be done in those cases? Overwrite the CMOR information for every dataset? |
Yes, the check should be independent for each dataset. |
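A rough illustration of the order dependence discussed here, under the assumption that the resolved generic level was being stored on table information shared between datasets. The dictionary layout, function names and the two standard names are illustrative only and are not taken from the ESMValCore code.

```python
import copy

SIGMA = "atmosphere_hybrid_sigma_pressure_coordinate"
HEIGHT = "atmosphere_hybrid_height_coordinate"

# Hypothetical shared table entry for the generic level "alevel".
SHARED_TABLE = {"alevel": {"resolved": None, "candidates": [SIGMA, HEIGHT]}}


def check_shared(table, standard_name):
    """Order-dependent check: the first dataset fixes the expected name."""
    entry = table["alevel"]
    if entry["resolved"] is None:
        if standard_name not in entry["candidates"]:
            return False
        # Mutates state that later datasets will also see.
        entry["resolved"] = standard_name
    return entry["resolved"] == standard_name


def check_independent(table, standard_name):
    """Per-dataset check: resolve the level on a private copy of the table."""
    return check_shared(copy.deepcopy(table), standard_name)


table = copy.deepcopy(SHARED_TABLE)
first = check_shared(table, SIGMA)    # dataset A is checked first and passes
second = check_shared(table, HEIGHT)  # dataset B now fails, although valid alone
```

With `check_independent`, both datasets pass regardless of the order in which they are checked, because nothing resolved for one dataset leaks into the next.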
|
With the latest changes, I can run UKESM and HadGEM without failures when they are combined with other models that have a different standard name for the lev variable. CESM2 is not passing the checker, but that is expected with the new checks because there is not any I also don't have access to CAMS-CSM1-0, CAS-ESM2-0, CMCC-CM2-SR5, E3SM-1-0, FGOALS-g3, GISS-E2-1-H, MIROC-ES2L, MPI-ESM-1-2-HAM, MPI-ESM1-2-LR, NESM3, NorESM2-LM, NorESM2-MM, SAM0-UNICON and TaiESM1. But I will try to test them tomorrow. |
|
Picking up from #859: for CESM2/thetao I am getting this: loading the input data off ESGF: c = iris.load_cube("/badc/cmip6/data/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/thetao/gn/latest/thetao_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc") indeed for |
valeriupredoi left a comment
let's merge this, it's a very useful feature 👍
|
The idea, I think, was to add fixes before merging, because some of the coordinates, like those in #840, have wrong values that have gone undetected. But in some cases it's not that straightforward to fix. |
|
no no - it is beyond the scope of this PR to add fixes, plus, it'd bloat it out of proportion - this is a checker enhancement, if you can add an automated fix in |
|
Then I guess it can be merged? I think that the problem with mixing datasets above got solved as well. |
|
yep - @bouweandela and @jvegasbsc as reviewers you OK with merge? @schlunma as well, since you've noticed that issue? |
|
Yes, go ahead |
|
done, cheers Saskia and Javi! Manu man, if you still see that issue, forget about it - just kidding, open an Issue pls, bro 🍺 |
|
I think we should have waited with the merge... That PR broke at least one recipe (https://github.com/ESMValGroup/ESMValTool/blob/master/esmvaltool/recipes/recipe_ecs_constraints.yml) and maybe others that we are not aware of. |
|
@schlunma this is a chicken-egg situation - we need to fry the chicken now ie fix the data that gets into those recipes because inherently those recipes may be wrong since the data that gets in and analyzed may be wrong. Cheers for opening the issue! 🍺 But an enhanced functionality in Core should not wait on the fix(es) needed in Tool 👍 |
|
I still have this weird issue that (I think) depends on the order in which the datasets are read: The following recipe sometimes works, sometimes fails with for Each of these datasets does NOT throw an error when used alone. What is going on here??? The recipe:
# ESMValTool
---
documentation:
  authors:
    - schlund_manuel
  maintainer:
    - schlund_manuel
  references:
    - gregory04grl
  projects:
    - crescendo
diagnostics:
  diag_test:
    variables:
      cl:
        project: CMIP6
        exp: historical
        mip: Amon
        start_year: 2004
        end_year: 2004
        additional_datasets:
          - {dataset: ACCESS-ESM1-5, ensemble: r1i1p1f1, grid: gn}
          - {dataset: GISS-E2-1-G, ensemble: r1i1p1f1, grid: gn}
          - {dataset: GISS-E2-1-H, ensemble: r1i1p1f1, grid: gn}
          - {dataset: HadGEM3-GC31-LL, ensemble: r1i1p1f3, grid: gn}
          - {dataset: HadGEM3-GC31-MM, ensemble: r1i1p1f3, grid: gn}
    scripts:
      null
|
@schlunma To create more visibility for this, it might be good to open an issue instead of commenting on a closed pull request. |
|
Makes sense, see #883 |
For each generic level, all coordinates that have the same generic_level_name get loaded when reading the tables under a new CoordinateInfo attribute. The checker checks if the out_name and standard_name match any of the possible coordinates, and if so proceeds with the checks with the right coordinate.
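The matching step described above could look roughly like the following sketch. It is not the code from this pull request: the `CoordinateInfo` class here is a simplified stand-in, the attribute name `generic_lev_coords` and the function `resolve_generic_level` are assumptions made for illustration, and the table values shown should be checked against the actual CMIP6 coordinate table.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CoordinateInfo:
    """Simplified stand-in for a CMOR table coordinate entry."""
    name: str
    out_name: str = ""
    standard_name: str = ""
    # Candidates that share this entry's generic_level_name, keyed by
    # table entry name (attribute name is an assumption).
    generic_lev_coords: Dict[str, "CoordinateInfo"] = field(default_factory=dict)


def resolve_generic_level(
    generic: CoordinateInfo, out_name: str, standard_name: str
) -> Optional[CoordinateInfo]:
    """Return the candidate matching the data's out_name and standard_name."""
    for candidate in generic.generic_lev_coords.values():
        if (candidate.out_name == out_name
                and candidate.standard_name == standard_name):
            return candidate
    return None  # no candidate matches: the checker would report an error


# Illustrative candidates for the generic level "alevel".
alevel = CoordinateInfo(
    name="alevel",
    generic_lev_coords={
        "standard_hybrid_sigma": CoordinateInfo(
            "standard_hybrid_sigma", out_name="lev",
            standard_name="atmosphere_hybrid_sigma_pressure_coordinate"),
        "hybrid_height": CoordinateInfo(
            "hybrid_height", out_name="lev",
            standard_name="atmosphere_hybrid_height_coordinate"),
    },
)

match = resolve_generic_level(
    alevel, "lev", "atmosphere_hybrid_height_coordinate")
```

Once a candidate is found, the remaining coordinate checks would run against that candidate instead of the generic placeholder.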
Closes #516