General data wrangling #57

Description

@pp-mo

Although strictly excluded as a goal for the initial release,
I still think the 'secondary' usage of ncdata will be useful :

  • for modifying data before loading, or after saving, with an analysis package
  • or just to adjust data and save to another file

For this there is real scope for some convenience and sugar.
Some ideas :

  • ds.is_valid(error_when_not=False) : checking the consistencies not ensured by the free-and-easy design
    • ideas
      • all elements are filed under their own name (e.g. ds.variables['x'].name == 'x')
      • dims used by variables all exist
      • variables all have data
      • variable data shapes all match the dims
    • ( delivered : Save errors util #64 )
  • make it easy to add items by name : el.variables[var.name] = var --> el.variables.add(var)
  • make it easy to rename content, e.g. ds.variables.rename('x', 'y')
  • make it easy to construct containers (variables, attributes) from lists of element specifications
    e.g. NcData(dimensions=nc_dims(x=3, y=5, t=(2, True)), variables=nc_vars(x=(['x'], int), y=(['y'], int), data=(['t', 'y', 'x'], float)))
    (or something !)
  • special convenience handling for attrs : e.g.
    el.ncd_setatt(name, value) ~= el.attributes[name] = NcAttribute(name, value)
    el.ncd_getatt(name) ~= el.attributes.get(name, NcAttribute('', None)).as_python_value()
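
The consistency checks listed under is_valid above could be sketched roughly as below. This is only an illustration, not the ncdata implementation: Dim, Var, Dataset and find_problems are hypothetical stand-ins invented for this example, and a variable's data is reduced to just its shape.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Dim:  # hypothetical stand-in for a dimension element
    name: str
    size: int


@dataclass
class Var:  # hypothetical stand-in for a variable element
    name: str
    dimensions: Tuple[str, ...] = ()
    shape: Optional[Tuple[int, ...]] = None  # None means "no data attached"


@dataclass
class Dataset:  # hypothetical stand-in for a dataset container
    dimensions: Dict[str, Dim] = field(default_factory=dict)
    variables: Dict[str, Var] = field(default_factory=dict)


def find_problems(ds: Dataset) -> List[str]:
    """Return one message per inconsistency found; an empty list means valid."""
    problems = []
    for key, dim in ds.dimensions.items():
        # Every element should be filed under its own name.
        if dim.name != key:
            problems.append(f"dimension filed under {key!r} is named {dim.name!r}")
    for key, var in ds.variables.items():
        if var.name != key:
            problems.append(f"variable filed under {key!r} is named {var.name!r}")
        # Dims used by variables must all exist.
        missing = [d for d in var.dimensions if d not in ds.dimensions]
        if missing:
            problems.append(f"variable {key!r} uses missing dimensions {missing}")
        # Variables must all have data, with a shape matching their dims.
        if var.shape is None:
            problems.append(f"variable {key!r} has no data")
        elif not missing:
            expected = tuple(ds.dimensions[d].size for d in var.dimensions)
            if var.shape != expected:
                problems.append(
                    f"variable {key!r} has shape {var.shape}, expected {expected}"
                )
    return problems


def is_valid(ds: Dataset, error_when_not: bool = False) -> bool:
    """Boolean check, optionally raising with all the problems found."""
    problems = find_problems(ds)
    if problems and error_when_not:
        raise ValueError("; ".join(problems))
    return not problems
```

Collecting messages first, and only then deciding whether to raise, lets one routine serve both the quiet boolean check and the error_when_not=True form.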

Update:

v0.1.1 delivered most of this :


For instance, these are the actions I needed to adjust a file output by xarray, so that Iris can correctly interpret the coord-system ...

>>> import ncdata.netcdf4
>>> from ncdata import NcAttribute
>>> from ncdata.iris import to_iris
>>> ds = ncdata.netcdf4.from_nc4(filepath)
>>> ds.variables['x'].attributes['standard_name'] = NcAttribute('standard_name', 'projection_x_coordinate')
>>> ds.variables['y'].attributes['standard_name'] = NcAttribute('standard_name', 'projection_y_coordinate')
>>> ds.variables['x'].attributes['units'] = NcAttribute('units', 'm')
>>> ds.variables['y'].attributes['units'] = NcAttribute('units', 'm')
>>> del ds.variables['spatial_ref'].attributes['spatial_ref']
>>> del ds.variables['spatial_ref'].attributes['crs_wkt']
>>> del ds.variables['spatial_ref'].attributes['horizontal_datum_name'] 
>>> cube, = to_iris(ds)
>>> print(cube.coord_system)
<bound method Cube.coord_system of <iris 'Cube' of band_data / (unknown) (band: 5; projection_y_coordinate: 6400; projection_x_coordinate: 7600)>>
>>> print(cube.coord_system())
TransverseMercator(latitude_of_projection_origin=53.5, longitude_of_central_meridian=-8.0, false_easting=200000.0, false_northing=250000.0, scale_factor_at_central_meridian=1.000035, ellipsoid=GeogCS(semi_major_axis=6377340.189, semi_minor_axis=6356034.447938534))

So, how about

ds.variables['x'].attributes.update(NameMap(
    NcAttribute,  # type of contents
    ('standard_name', 'projection_x_coordinate'),  # *args are init arglists
    ('units', 'm')
))
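
A rough sketch of how such a helper might behave, to make the proposal concrete. NameMap above is only a suggestion, so nothing here is ncdata API: Attr is a hypothetical stand-in for NcAttribute, named is a hypothetical helper, and a plain dict stands in for the attributes container.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Attr:  # hypothetical stand-in for NcAttribute(name, value)
    name: str
    value: Any


def named(element_type, *arglists):
    """Build a {name: element} dict, creating each element from an init arglist."""
    elements = [element_type(*args) for args in arglists]
    # Key each element by its own name, so the two can never disagree.
    return {element.name: element for element in elements}


# A plain dict stands in for the 'attributes' container of a variable.
attributes: Dict[str, Attr] = {}
attributes.update(
    named(
        Attr,  # type of contents
        ("standard_name", "projection_x_coordinate"),  # *args are init arglists
        ("units", "m"),
    )
)
```

Because the key is always taken from the element itself, each name is written only once, which is the repetition that the longhand assignments in the session above suffer from.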
