Fix/1120 #1538
…oords during dataset construction
Failures here appear to be related to dask distributed.
def assert_valid_explicit_coords(variables, explicit_coords):
    '''raise a MergeError if an explicit coord shares a name with a dimension
    but is comprised of arbitrary dimensions'''
If you care: pep8 style is """ for docstrings, with the opening and closing quotes on their own lines.
    for name, var in variables.items():
        if name not in explicit_coords:
            var_dims.extend(var.dims)
    for coord_name in explicit_coords:
I think you're missing this as a function argument
explicit_coords is a function argument.
I was misreading something else, never mind.
            var_dims.extend(var.dims)
    for coord_name in explicit_coords:
        if coord_name in var_dims and not all(
                [d in var_dims for d in variables[coord_name].dims]):
Rather than not all(...), I think this condition should be just variables[coord_name].dims != (coord_name,)
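To illustrate how the two conditions can disagree, here is a minimal sketch. `Var` is a hypothetical stand-in that only carries a `dims` tuple (it is not the real xarray class), and both helper function names are made up for this example.

```python
from collections import namedtuple

# Hypothetical stand-in for a variable: only the `dims` tuple matters here.
Var = namedtuple("Var", ["dims"])

variables = {
    "data": Var(dims=("x", "y")),
    "x": Var(dims=("x", "y")),  # explicit coord named like a dimension, but 2D
}
explicit_coords = ["x"]

# Dimensions used by the variables that are not explicit coords.
var_dims = []
for name, var in variables.items():
    if name not in explicit_coords:
        var_dims.extend(var.dims)


def flagged_by_not_all(coord_name):
    # Condition as written in the diff under review.
    return coord_name in var_dims and not all(
        d in var_dims for d in variables[coord_name].dims)


def flagged_by_dims_check(coord_name):
    # Suggested condition: the coord must be 1D along its own name.
    return (coord_name in var_dims
            and variables[coord_name].dims != (coord_name,))


print(flagged_by_not_all("x"))     # False: both dims appear in var_dims
print(flagged_by_dims_check("x"))  # True: dims are ("x", "y"), not ("x",)
```

In this case the `not all(...)` form lets a 2D coordinate named `x` through, while the direct comparison against `(coord_name,)` rejects it.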
    Raise a MergeError if an explicit coord shares a name with a dimension
    but is comprised of arbitrary dimensions.
    """
    var_dims = []
I don't follow. Are you just suggesting we cast var_dims to a set after populating it, or are you looking for different logic to determine the dimensions of the non-explicit coord variables?
Just that a set() is a better data structure for contains lookups than a list.
But actually, you should use the result of calculate_dimensions() here, instead of calculating dimensions twice.
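A rough sketch of that idea, for illustration only. `collect_dims` is a hypothetical helper standing in for the result of calculate_dimensions(), whose real signature and return type are not shown in this thread, and `Var` is again a made-up stand-in for a variable.

```python
from collections import namedtuple

# Hypothetical stand-in for a variable with named dimensions and a shape.
Var = namedtuple("Var", ["dims", "shape"])


def collect_dims(variables):
    """Hypothetical helper: map each dimension name to its size, once."""
    dims = {}
    for var in variables.values():
        for dim, size in zip(var.dims, var.shape):
            dims.setdefault(dim, size)
    return dims


variables = {
    "data": Var(dims=("x", "y"), shape=(2, 3)),
    "x": Var(dims=("x",), shape=(2,)),
}

# Compute the dimensions a single time and reuse the result for every check.
dims = collect_dims(variables)

# Membership tests on a dict (or set) are hash lookups, not a list scan.
print("x" in dims)  # True
print("z" in dims)  # False
```

The point is only that the dimension mapping is built once and then reused for contains lookups, rather than rebuilding a list of dimensions inside the validation function.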
@shoyer - thanks for the review. I iterated on this a few times and landed on something that was more complex than necessary. Your suggestions have been incorporated.
This is ready for a final review. Tests are passing now.
        if coord_name in dims and variables[coord_name].dims != (coord_name,):
            raise MergeError(
                'coordinate %s shares a name with a dimension but '
                'includes at least one arbitrary dimensions' % coord_name)
"includes at least one arbitrary dimensions" is a little confusing to me.
Let's try to be totally explicit here, even if the error message needs to be longer. Maybe:
coordinate X shares a name with a dataset dimension, but is not a 1D variable along that dimension. This is disallowed by the xarray data model.
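Putting the simplified check and the proposed wording together might look like the sketch below. This is not the merged code: `MergeError` and `Var` are local stand-ins defined only so the example runs on its own, and the three-argument signature (with `dims` passed in) is an assumption based on the `dims` name used in the diff.

```python
from collections import namedtuple


class MergeError(ValueError):
    """Local stand-in for the MergeError raised in the diff."""


# Hypothetical stand-in for a variable: only the `dims` tuple is needed.
Var = namedtuple("Var", ["dims"])


def assert_valid_explicit_coords(variables, dims, explicit_coords):
    """
    Raise a MergeError if an explicit coord shares a name with a dimension
    but is not a 1D variable along that dimension.
    """
    for coord_name in explicit_coords:
        if coord_name in dims and variables[coord_name].dims != (coord_name,):
            raise MergeError(
                'coordinate %s shares a name with a dataset dimension, but '
                'is not a 1D variable along that dimension. This is '
                'disallowed by the xarray data model.' % coord_name)


# A 2D coordinate named "x" while "x" is also a dimension: rejected.
bad = {"data": Var(dims=("x", "y")), "x": Var(dims=("x", "y"))}
try:
    assert_valid_explicit_coords(bad, {"x": 2, "y": 3}, ["x"])
except MergeError as err:
    print(err)

# A 1D coordinate along its own dimension passes silently.
ok = {"data": Var(dims=("x",)), "x": Var(dims=("x",))}
assert_valid_explicit_coords(ok, {"x": 2}, ["x"])
```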
Dataset with a coordinate given by a DataArray may create an invalid dataset #1120
git diff upstream/master | flake8 --diff
whats-new.rst for all changes and api.rst for new API