Description
In #145 RCall was changed to create NullableArray and CategoricalArray objects rather than DataArray and PooledDataArray. This was consistent with the DataFrames plans at the time.
This decision probably needs revisiting now that the plans have changed: DataFrames will keep using DataArray and PooledDataArray, while the new DataTables package uses NullableArray and CategoricalArray. The current state is inconsistent, since DataFrame is not designed to work well with NullableArray and CategoricalArray. So there are two options:
- keep creating DataFrame objects by default, and go back to creating DataArray and PooledDataArray columns
- switch to DataTable and keep creating NullableArray and CategoricalArray columns
I'm really torn about which solution is better. We don't yet have all the high-level query APIs needed to make working with DataTables as nice as it should be. Yet going back to DataArray could be annoying for users who have already had to adapt their code. Maybe we should do it nevertheless until the Nullable API stabilizes, so that our data frame/table ecosystem is not inconsistent or painful to use for several months.