Fix timerange wildcard search when deriving variables or downloading files #1562

Merged
sloosvel merged 14 commits into main from dev_fix_timerange_wildcard_search on Jun 7, 2022

Conversation

@sloosvel

@sloosvel sloosvel commented Apr 19, 2022

Contributor

Description

This pull request should make it possible to use wildcards in the timerange when deriving variables and when downloading files from the ESGF.

I still have to find a way to add a test for the one line that is not covered, which checks that the timeranges of the input variables needed for a derivation are the same. If you want to start testing the current changes in the meantime, feel free to do so.

Closes #1516
Closes #1550

Link to documentation:


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number of pull requests:

@sloosvel sloosvel added the "bug" (Something isn't working) label Apr 19, 2022
@sloosvel sloosvel added this to the v2.6.0 milestone Apr 19, 2022
@codecov

codecov Bot commented Apr 19, 2022


Codecov Report

Merging #1562 (094a6d2) into main (ead491a) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1562   +/-   ##
=======================================
  Coverage   91.36%   91.37%           
=======================================
  Files         204      204           
  Lines       11144    11154   +10     
=======================================
+ Hits        10182    10192   +10     
  Misses        962      962           
Impacted Files Coverage Δ
esmvalcore/_recipe.py 95.90% <100.00%> (+0.04%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ead491a...094a6d2.

@bouweandela bouweandela left a comment

Member


Thanks for fixing this @sloosvel! The code looks good, but I have not tried to use the feature.

@sloosvel

sloosvel commented Jun 3, 2022

Contributor Author

Thanks for your feedback, @bouweandela! What should we do about this, given that the code freeze is coming up?

@schlunma

schlunma commented Jun 3, 2022

Contributor

I can try to test this briefly. Can you resolve the conflict in the meantime?

Comment thread esmvalcore/_recipe.py Outdated

@schlunma schlunma left a comment

Contributor


For downloaded data, this works well. I tested the timeranges *, */YYYY, YYYY/*, and */P1Y.
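
As a rough illustration of what resolving such wildcards involves, here is a minimal sketch. It assumes the available data has already been reduced to a list of (start_year, end_year) pairs. The function name `resolve_timerange` is hypothetical and this is not ESMValCore's implementation; ISO 8601 periods such as P1Y are deliberately left out.

```python
def resolve_timerange(timerange, intervals):
    """Replace '*' in a 'start/end' timerange using the available data.

    Hypothetical helper for illustration only (not ESMValCore code).
    `intervals` is a list of (start_year, end_year) integer pairs.
    Periods such as 'P1Y' are not handled in this sketch.
    """
    if not intervals:
        raise ValueError(
            f"No data available to resolve timerange {timerange!r}")

    # Earliest start and latest end over all available files.
    first = min(start for start, _ in intervals)
    last = max(end for _, end in intervals)

    if timerange == "*":
        return f"{first}/{last}"

    start, end = timerange.split("/")
    if start == "*":
        start = str(first)
    if end == "*":
        end = str(last)
    return f"{start}/{end}"
```

With data covering 1850 to 2014, `resolve_timerange("*/1855", [(1850, 2014)])` returns `"1850/1855"`.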

For derived data, I still get errors:

  1. When the data is available locally and a timerange like */1855 is used: ERROR No input files found for variable, even though the data is there
  2. In all other cases: ValueError: min() arg is an empty sequence
Details

Error in 1.:

2022-06-03 15:04:24,530 UTC [1474856] ERROR   No input files found for variable {'short_name': 'clwvi', 'mip': 'Amon', 'derive': True, 'force_derivation': True, 'timerange': '185001/1851', 'variable_group': 'lwp_derive_input_clwvi', 'diagnostic': 'test', 'preprocessor': 'default', 'dataset': 'CanESM5', 'project': 'CMIP6', 'exp': 'historical', 'ensemble': 'r1i1p1f1', 'grid': 'gn', 'recipe_dataset_index': 0, 'institute': ['CCCma'], 'activity': 'CMIP', 'alias': 'CanESM5', 'original_short_name': 'clwvi', 'standard_name': 'atmosphere_mass_content_of_cloud_condensed_water', 'long_name': 'Condensed Water Path', 'units': 'kg m-2', 'modeling_realm': ['atmos'], 'frequency': 'mon', 'start_year': 1850, 'end_year': 1851}
2022-06-03 15:04:24,530 UTC [1474856] ERROR   Looked for files matching: /work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Amon/clwvi/gn/v20190429/clwvi_Amon_CanESM5_historical_r1i1p1f1_gn*.nc
2022-06-03 15:04:24,530 UTC [1474856] ERROR   Set 'log_level' to 'debug' to get more information
2022-06-03 15:04:24,530 UTC [1474856] ERROR   Could not create all tasks
2022-06-03 15:04:24,530 UTC [1474856] ERROR   Missing data for preprocessor test/lwp_derive_input_clwvi:
- Missing data for CanESM5: clwvi
2022-06-03 15:04:24,531 UTC [1474856] ERROR   Not all input files required to run the recipe could be found.
2022-06-03 15:04:24,531 UTC [1474856] ERROR   If the files are available locally, please check your `rootpath` and `drs` settings in your user configuration file /home/b/b309141/config/dkrz.yml
2022-06-03 15:04:24,531 UTC [1474856] ERROR   To automatically download the required files to `download_dir: /work/bd0854/DATA/ESMValTool2/download`, set `offline: false` in /home/b/b309141/config/dkrz.yml or run the recipe with the extra command line argument --offline=False
2022-06-03 15:04:24,531 UTC [1474856] INFO    Note that automatic download is only available for files that are hosted on the ESGF, i.e. for projects: CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs
2022-06-03 15:04:25,491 UTC [1474856] INFO    Maximum memory used (estimate): 0.0 GB
2022-06-03 15:04:25,491 UTC [1474856] INFO    Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2022-06-03 15:04:25,493 UTC [1474856] ERROR   Could not create all tasks

even though the data is here: /work/ik1017/CMIP6/data/CMIP6/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Amon/clwvi/gn/v20190429/clwvi_Amon_CanESM5_historical_r1i1p1f1_gn_185001-201412.nc
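
The file name above ends in the covered period (185001-201412), which is the information a wildcard has to be resolved against. A minimal sketch of reading that period from a file name, assuming the `_<start>-<end>.nc` suffix seen in this example; the helper name `interval_from_filename` is hypothetical and not part of ESMValCore:

```python
import re

# Matches a trailing "_<start>-<end>.nc" with 4 to 8 digits per date,
# e.g. "_185001-201412.nc" (assumed naming pattern, as in the file above).
_DATE_RANGE = re.compile(r"_(\d{4,8})-(\d{4,8})\.nc$")


def interval_from_filename(filename):
    """Return (start, end) as strings, or None if no date range is found."""
    match = _DATE_RANGE.search(filename)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Files without a date range in their name (for example time-invariant fields) yield `None`, so callers need to handle that case explicitly.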

Error in 2.:

  Traceback (most recent call last):
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 499, in run
    fire.Fire(ESMValTool())
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 443, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 121, in process_recipe
    recipe = read_recipe_file(recipe_file, config_user)
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 74, in read_recipe_file
    return Recipe(raw_recipe,
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1269, in __init__
    self.tasks = self.initialize_tasks() if initialize_tasks else None
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1878, in initialize_tasks
    tasks = self._create_tasks()
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1853, in _create_tasks
    new_tasks, failed = self._create_preprocessor_tasks(
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1813, in _create_preprocessor_tasks
    task = _get_preprocessor_task(
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1228, in _get_preprocessor_task
    task = _get_single_preprocessor_task(
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 1069, in _get_single_preprocessor_task
    products = _get_preprocessor_products(
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 927, in _get_preprocessor_products
    _update_timerange(variable, config_user)
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe.py", line 851, in _update_timerange
    min_date = min(interval[0] for interval in intervals)
ValueError: min() arg is an empty sequence
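
The traceback ends in `min()` being called on a generator that yields nothing, which happens when no intervals were found for the variable. A minimal sketch of one way to make that case explicit, using the `default` argument of the built-in `min()`; the function name `earliest_start` is hypothetical and this is not necessarily how the pull request fixed it:

```python
def earliest_start(intervals):
    """Return the earliest start date, or None when no intervals were found.

    Illustrative helper only; `intervals` is a sequence of (start, end) pairs.
    """
    # Without `default`, min() raises ValueError on an empty iterable.
    return min((interval[0] for interval in intervals), default=None)
```

A caller can then check for `None` and report missing input data with a clearer message instead of failing inside `min()`.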

@sloosvel

sloosvel commented Jun 7, 2022

Contributor Author

I think it should work now for derived variables.

@schlunma schlunma left a comment

Contributor


Just tested this with downloaded data, derived variables, and the combination of the two. Works just like expected 🎉

Thanks Saskia!!

@sloosvel

sloosvel commented Jun 7, 2022

Contributor Author

Thanks to you for reviewing!

@sloosvel sloosvel merged commit 4a840fb into main Jun 7, 2022
@sloosvel sloosvel deleted the dev_fix_timerange_wildcard_search branch June 7, 2022 08:06

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

Specifying wildcards for timerange does not work for derived variables
Specifying wildcards for timerange does not work with automatic downloads

3 participants