Not really an issue, just wondering how straightforward it would be to support passing imperfect DataFrames in set_data() assignments.
Currently flopy is tolerant of recarrays being passed with excess data fields:

    if isinstance(data, np.recarray):
        # verify data shape of data (recarray)
        if len(data) == 0:
            # create empty dataset
            data = pandas.DataFrame(columns=self._header_names)
        elif len(data[0]) != len(self._header_names):
            if len(data[0]) == len(self._data_item_names):
                # data most likely being stored with cellids as tuples,
                # create a dataframe and untuple the cellids
                # In pandas 3+, DataFrame() with recarray requires columns to match
                # field names, so create without columns param then rename if needed
                data = pandas.DataFrame(data)
                if list(data.columns) != self._data_item_names:
                    data.columns = self._data_item_names
                data = self._untuple_cellids(data)[0]
                # make sure columns are still in correct order
                data = pandas.DataFrame(data, columns=self._header_names)

However, if data is already passed as a DataFrame, it is expected to be perfect: no additional columns, and exactly the correct cellid column breakdown. As an aside, the error message below may never be shown, because the len(data[0]) inside the f-string raises a KeyError on a DataFrame first (should it be iloc[0]?).

    elif isinstance(data, pandas.DataFrame):
        if len(data.columns) != len(self._header_names):
            message = (
                f"ERROR: Data list {self._data_name} supplied the "
                f"wrong number of columns of data, expected "
                f"{len(self._data_item_names)} got {len(data[0])}.\n"
                f"Data columns supplied: {data.columns}\n"
                f"Data columns expected: {self._header_names}"
            )

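
To illustrate the aside about data[0], here is a small standalone check that uses plain NumPy and pandas rather than flopy itself. The field names (k, i, j, q) and values are made up for the example. It shows that indexing with [0] returns the first record on a recarray but is a column lookup on a DataFrame, so it fails when no column is named 0.

```python
import numpy as np
import pandas as pd

# Made-up list data: three cellid indices plus one value column.
rec = np.rec.fromrecords(
    [(0, 1, 2, -50.0), (0, 3, 4, -75.0)],
    names=["k", "i", "j", "q"],
)
df = pd.DataFrame.from_records(rec)

# On a recarray, [0] is the first record, so len() is the field count.
n_fields_rec = len(rec[0])

# On a DataFrame, [0] looks up a column labelled 0, which does not exist here.
try:
    len(df[0])
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Two ways to get the column count from a DataFrame instead.
n_fields_iloc = len(df.iloc[0])
n_fields_columns = len(df.columns)
```

Either df.iloc[0] or df.columns would give the count the message is trying to report; which one fits the existing code best is a maintainer call.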
Could the self._untuple_cellids(data)[0] step and extraction of self._header_names happen before the if len(data.columns) != len(self._header_names): check for DataFrame instances?
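
As a rough sketch of what that reordering could look like, the helper below expands a tuple cellid column and selects the expected columns before any column-count check would run. It is plain pandas, not flopy code: normalize_dataframe, its parameters, and the column names (layer, row, column, q, note) are all hypothetical, and the real _untuple_cellids may behave differently.

```python
import pandas as pd


def normalize_dataframe(df, header_names, cellid_names, cellid_col="cellid"):
    """Hypothetical helper: coerce an 'imperfect' DataFrame to header_names."""
    out = df.copy()
    if cellid_col in out.columns:
        # Expand tuple cellids into one column per index.
        expanded = pd.DataFrame(
            out[cellid_col].tolist(), columns=cellid_names, index=out.index
        )
        out = pd.concat([expanded, out.drop(columns=cellid_col)], axis=1)
    # Only genuinely missing columns are an error; extras are dropped below.
    missing = [name for name in header_names if name not in out.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    # Select and order the expected columns.
    return out[header_names]


# Made-up input with a tuple cellid and one extra column.
raw = pd.DataFrame(
    {
        "cellid": [(0, 1, 2), (0, 3, 4)],
        "q": [-50.0, -75.0],
        "note": ["well a", "well b"],
    }
)
clean = normalize_dataframe(
    raw,
    header_names=["layer", "row", "column", "q"],
    cellid_names=["layer", "row", "column"],
)
```

With this ordering the length check would only need to fire when required columns are actually missing, rather than whenever the column count differs.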
EDIT:
Maybe additional fields aren't supported even when using recarrays (though that might be nice)? But the recarray path in set_data() does allow for a generic cellid field via that self._untuple_cellids(data)[0] step.
Code referenced above: flopy/flopy/mf6/data/mfdataplist.py, lines 670 to 686 and lines 755 to 763 in 9bc7aac.