Fix timerange wildcard search when deriving variables or downloading files #1562
Codecov Report
@@ Coverage Diff @@
## main #1562 +/- ##
=======================================
Coverage 91.36% 91.37%
=======================================
Files 204 204
Lines 11144 11154 +10
=======================================
+ Hits 10182 10192 +10
Misses 962 962
bouweandela
left a comment
Thanks for fixing this @sloosvel! The code looks good, but I have not tried to use the feature.
Thanks for your feedback, @bouweandela! What should we do about this, given that the code freeze is coming up?
I can try to test this briefly. Can you resolve the conflict in the meantime?
schlunma
left a comment
For downloaded data, this works well. I tested `*`, `*/YYYY`, `YYYY/*`, and `*/P1Y`.
For derived data, I still get errors:
1. Data is available locally and something like `*/1855` is used: `ERROR No input files found for variable`, even though the data is there.
2. All other cases: `ValueError: min() arg is an empty sequence`
Details
Error in 1.:
2022-06-03 15:04:24,530 UTC [1474856] ERROR No input files found for variable {'short_name': 'clwvi', 'mip': 'Amon', 'derive': True, 'force_derivation': True, 'timerange': '185001/1851', 'variable_group': 'lwp_derive_input_clwvi', 'diagnostic': 'test', 'preprocessor': 'default', 'dataset': 'CanESM5', 'project': 'CMIP6', 'exp': 'historical', 'ensemble': 'r1i1p1f1', 'grid': 'gn', 'recipe_dataset_index': 0, 'institute': ['CCCma'], 'activity': 'CMIP', 'alias': 'CanESM5', 'original_short_name': 'clwvi', 'standard_name': 'atmosphere_mass_content_of_cloud_condensed_water', 'long_name': 'Condensed Water Path', 'units': 'kg m-2', 'modeling_realm': ['atmos'], 'frequency': 'mon', 'start_year': 1850, 'end_year': 1851}
2022-06-03 15:04:24,530 UTC [1474856] ERROR Looked for files matching: /work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Amon/clwvi/gn/v20190429/clwvi_Amon_CanESM5_historical_r1i1p1f1_gn*.nc
2022-06-03 15:04:24,530 UTC [1474856] ERROR Set 'log_level' to 'debug' to get more information
2022-06-03 15:04:24,530 UTC [1474856] ERROR Could not create all tasks
2022-06-03 15:04:24,530 UTC [1474856] ERROR Missing data for preprocessor test/lwp_derive_input_clwvi:
- Missing data for CanESM5: clwvi
2022-06-03 15:04:24,531 UTC [1474856] ERROR Not all input files required to run the recipe could be found.
2022-06-03 15:04:24,531 UTC [1474856] ERROR If the files are available locally, please check your `rootpath` and `drs` settings in your user configuration file /home/b/b309141/config/dkrz.yml
2022-06-03 15:04:24,531 UTC [1474856] ERROR To automatically download the required files to `download_dir: /work/bd0854/DATA/ESMValTool2/download`, set `offline: false` in /home/b/b309141/config/dkrz.yml or run the recipe with the extra command line argument --offline=False
2022-06-03 15:04:24,531 UTC [1474856] INFO Note that automatic download is only available for files that are hosted on the ESGF, i.e. for projects: CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs
2022-06-03 15:04:25,491 UTC [1474856] INFO Maximum memory used (estimate): 0.0 GB
2022-06-03 15:04:25,491 UTC [1474856] INFO Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2022-06-03 15:04:25,493 UTC [1474856] ERROR Could not create all tasks
even though the data is here: /work/ik1017/CMIP6/data/CMIP6/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Amon/clwvi/gn/v20190429/clwvi_Amon_CanESM5_historical_r1i1p1f1_gn_185001-201412.nc
Error in 2.:
Traceback (most recent call last):
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 499, in run
fire.Fire(ESMValTool())
File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 443, in run
process_recipe(recipe_file=recipe, config_user=cfg)
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 121, in process_recipe
recipe = read_recipe_file(recipe_file, config_user)
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 74, in read_recipe_file
return Recipe(raw_recipe,
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1269, in __init__
self.tasks = self.initialize_tasks() if initialize_tasks else None
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1878, in initialize_tasks
tasks = self._create_tasks()
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1853, in _create_tasks
new_tasks, failed = self._create_preprocessor_tasks(
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1813, in _create_preprocessor_tasks
task = _get_preprocessor_task(
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1228, in _get_preprocessor_task
task = _get_single_preprocessor_task(
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1069, in _get_single_preprocessor_task
products = _get_preprocessor_products(
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 927, in _get_preprocessor_products
_update_timerange(variable, config_user)
File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 851, in _update_timerange
min_date = min(interval[0] for interval in intervals)
ValueError: min() arg is an empty sequence
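The `ValueError` at the end of this traceback is what Python raises when `min()` receives an empty iterable, which here presumably means that no time intervals were collected because no files were matched. The following is a minimal sketch of that failure mode and one possible guard. The helper `earliest_start` and the example interval tuples are hypothetical and are not taken from the ESMValCore code.

```python
def earliest_start(intervals):
    """Return the smallest start date among (start, end) tuples.

    Hypothetical helper: fails with a more descriptive message than the
    bare min() call when no intervals are available.
    """
    if not intervals:
        raise ValueError(
            "No time intervals found; were any input files matched?"
        )
    return min(interval[0] for interval in intervals)


# Reproduce the failure seen in the traceback: min() over an empty
# generator raises ValueError.
try:
    min(interval[0] for interval in [])
except ValueError as exc:
    print(f"bare min() failed: {exc}")

# With at least one interval the helper behaves like the bare min() call.
print(earliest_start([("185001", "201412"), ("201501", "210012")]))
```

Checking for the empty case before calling `min()` does not fix a file search that finds nothing, but it makes the underlying cause visible to the user.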
I think it should work now for derived variables.
schlunma
left a comment
Just tested this with downloaded data, derived variables, and the combination of the two. Works just like expected 🎉
Thanks Saskia!!
Thank you for reviewing!
Description
This pull request should allow the use of wildcards in the timerange when deriving variables and when downloading files from the ESGF.
I still have to find a way to add a test for the line that is not covered, which should check that the timeranges of the variables needed to `derive` are the same, but feel free to test the current changes in the meantime.
Closes #1516
Closes #1550
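To illustrate what resolving a wildcard timerange involves, here is a minimal sketch that replaces `*` with the earliest start or latest end date found in the available file names. It is not the ESMValCore implementation: the functions `file_interval` and `resolve_timerange` are hypothetical, file names are assumed to end in `_<start>-<end>.nc`, all dates are assumed to have the same number of digits so that string comparison orders them correctly, and ISO 8601 periods such as `P1Y` are not handled.

```python
import re


def file_interval(filename):
    """Extract (start, end) from a name ending in '_<start>-<end>.nc'.

    Hypothetical helper; returns None if the name has no date range.
    """
    match = re.search(r"_(\d{4,8})-(\d{4,8})\.nc$", filename)
    if match is None:
        return None
    return match.group(1), match.group(2)


def resolve_timerange(timerange, filenames):
    """Replace '*' in 'start/end' using the dates found in the file names."""
    if "/" in timerange:
        start, end = timerange.split("/")
    else:
        # A single '*' means that both ends are taken from the files.
        start = end = timerange
    intervals = [
        interval
        for interval in map(file_interval, filenames)
        if interval is not None
    ]
    if "*" in (start, end) and not intervals:
        raise ValueError("Cannot resolve wildcard: no dated files found")
    if start == "*":
        start = min(interval[0] for interval in intervals)
    if end == "*":
        end = max(interval[1] for interval in intervals)
    return f"{start}/{end}"


files = ["clwvi_Amon_CanESM5_historical_r1i1p1f1_gn_185001-201412.nc"]
print(resolve_timerange("*/1855", files))
print(resolve_timerange("*", files))
```

For a derived variable, the same resolution would have to be applied to each input variable, and the resulting timeranges compared, which is the check the uncovered line mentioned above is meant to perform.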