DataFrames and DataTables support #174

@nalimilan

In #145 RCall was changed to create NullableArray and CategoricalArray objects rather than DataArray and PooledDataArray. This was consistent with the DataFrames plans at the time.

This decision probably needs revisiting now that the plans have changed: DataFrames will keep using DataArray and PooledDataArray, while the new DataTables package uses NullableArray and CategoricalArray. The current state is inconsistent, since DataFrame is not designed to work well with NullableArray and CategoricalArray columns. So there are two options:

  • keep creating DataFrame objects by default, and go back to creating DataArray and PooledDataArray columns
  • switch to DataTable and keep creating NullableArray and CategoricalArray columns

I'm really torn about which solution is better. We don't yet have all the high-level query APIs needed to make working with DataTables as nice as it should be. Yet going back to DataArray could be annoying for users who have already had to adapt their code. Maybe we should do it nevertheless until the Nullable API stabilizes, to avoid leaving our data frame/table ecosystem inconsistent or painful to use for several months.
