Regression: "ValueError: cannot unstack dimensions that do not have a MultiIndex" when unstacking a MultiIndex #5384

Description

@dranjan

I'm not sure whether this is a bug or whether I'm using xarray incorrectly, but the example below used to run without raising. The behavior appears to have changed somewhere between 0.16.2 and 0.18.2.

What happened:

Traceback (most recent call last):
  File "scripts/repro.py", line 12, in <module>
    ds = ds.unstack(['c'])
  File "/home/darsh/src/notebooks/build/venv/lib/python3.8/site-packages/xarray/core/dataset.py", line 4024, in unstack
    raise ValueError(
ValueError: cannot unstack dimensions that do not have a MultiIndex: ['c']

What you expected to happen:

The code runs without raising the ValueError, and c is unstacked back into the b dimension.

Minimal Complete Verifiable Example:

from xarray import DataArray, Dataset


a = DataArray([0], dims=['a'])
b = a.stack(b=('a',)).reset_index('b')
c = b.stack({'c': ['b']})

ds = Dataset({'d': DataArray(c.data, dims=['c'])}, coords=c.coords)
print('\nBefore:')
print(ds)

ds = ds.unstack(['c'])
print('\nAfter:')
print(ds)

Anything else we need to know?:

Here's the full output from the example on 0.18.2:


Before:
<xarray.Dataset>
Dimensions:  (c: 1)
Coordinates:
  * c        (c) MultiIndex
  - b        (c) int64 0
    a        (c) int64 0
Data variables:
    d        (c) int64 0
Traceback (most recent call last):
  File "scripts/repro.py", line 12, in <module>
    ds = ds.unstack(['c'])
  File "/home/darsh/src/notebooks/build/venv/lib/python3.8/site-packages/xarray/core/dataset.py", line 4024, in unstack
    raise ValueError(
ValueError: cannot unstack dimensions that do not have a MultiIndex: ['c']

What confuses me is that the repr shows the c dimension as a MultiIndex, yet unstack complains that c does not have one. Unstacking ds.d directly, rather than the dataset itself, fails with the same exception.
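One way to narrow down the mismatch between the repr and the error is to look at the index object that xarray actually stores for c, rather than at the printed summary. The helper below is a minimal sketch: `index_kind` is a hypothetical name, not part of xarray, and it only assumes that Dataset and DataArray objects expose a mapping-like `indexes` property. Its output on either version is not recorded in this report.

```python
def index_kind(obj, dim):
    """Return the class name of the index stored for `dim`, or None.

    `obj` is anything with a mapping-like `indexes` attribute, such as an
    xarray Dataset or DataArray. Hypothetical helper, not an xarray API.
    """
    index = obj.indexes.get(dim)  # None when no index is stored under `dim`
    return None if index is None else type(index).__name__


# Intended use with the example above, just before the failing call:
#     print(index_kind(ds, 'c'), index_kind(ds.d, 'c'))
```

Comparing the result for the failing dataset with the one built through assign_coords could show whether the two constructions store different index objects for c, even though their reprs are identical.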

Oddly, it seems to work if I assign the coordinates after constructing the dataset:

diff --git a/scripts/repro.py b/scripts/repro.py
index ed2ae7c..d5bd6a3 100644
--- a/scripts/repro.py
+++ b/scripts/repro.py
@@ -5,7 +5,7 @@ a = DataArray([0], dims=['a'])
 b = a.stack(b=('a',)).reset_index('b')
 c = b.stack({'c': ['b']})
 
-ds = Dataset({'d': DataArray(c.data, dims=['c'])}, coords=c.coords)
+ds = Dataset({'d': DataArray(c.data, dims=['c'])}).assign_coords(c.coords)
 print('\nBefore:')
 print(ds)
 

With that workaround, or by downgrading to 0.16.2, the example doesn't crash:


Before:
<xarray.Dataset>
Dimensions:  (c: 1)
Coordinates:
  * c        (c) MultiIndex
  - b        (c) int64 0
    a        (c) int64 0
Data variables:
    d        (c) int64 0

After:
<xarray.Dataset>
Dimensions:  (b: 1)
Coordinates:
    a        (b) int64 0
  * b        (b) int64 0
Data variables:
    d        (b) int64 0

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.0 (default, Feb 25 2021, 22:10:10)
[GCC 8.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.05.0
distributed: None
matplotlib: 3.4.2
cartopy: None
seaborn: None
numbagg: None
pint: 0.17
setuptools: 39.0.1
pip: 21.1.1
conda: None
pytest: 6.2.4
IPython: 7.23.1
sphinx: None
None
