With ARROW-14500 and ARROW-15545 (#14106), casting "storage_type" -> "extension<storage_type>" is now allowed (the cast in the other direction already worked).
Initially, that PR allowed casting from any type to "extension<storage_type>", as long as the input type could be cast to the storage type (deferring to the "any" -> "storage_type" cast). However, because whether a given cast makes sense depends on the semantics of the extension type, it was restricted to inputs whose type exactly matches storage_type.
One idea is to still allow the other casts behind a cast option flag, such as allow_non_storage_extension_casts (or a better name), so the user can explicitly opt in to casting to/from any type (as long as the cast from/to the storage type works).
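A minimal sketch of how such an opt-in flag might gate the check. Only the flag name comes from the proposal above; the `CastOptions` dataclass here is a hypothetical stand-in (not Arrow's actual class), and `can_cast_to_extension` and its `castable_to_storage` argument are made up for illustration.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class CastOptions:
    # Hypothetical option; off by default so the strict behavior is preserved.
    allow_non_storage_extension_casts: bool = False


def can_cast_to_extension(input_type: str, storage_type: str,
                          castable_to_storage: Set[str],
                          options: CastOptions) -> bool:
    # An exact storage type match is always allowed.
    if input_type == storage_type:
        return True
    # Any other input type needs the explicit opt-in, and the
    # "any" -> "storage_type" cast must itself be possible.
    return (options.allow_non_storage_extension_casts
            and input_type in castable_to_storage)


strict = CastOptions()
relaxed = CastOptions(allow_non_storage_extension_casts=True)
castable = {"int32", "int64"}  # assumed inputs castable to an int64 storage
print(can_cast_to_extension("int32", "int64", castable, strict))
print(can_cast_to_extension("int32", "int64", castable, relaxed))
```

With the default options only the exact storage type is accepted; with the flag set, the decision defers to whether the storage cast exists.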
That would help users, but for certain casts the ExtensionType might also want to control how the cast is done. For example, when casting to/from a string type (useful for reading/writing CSV files, or for repr), you will typically want to do something different from casting the storage array to string.
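To illustrate the string case, consider a UUID-like extension type stored as 16 raw bytes. This is an assumed example, not one taken from the issue, and it uses only Python's standard library rather than Arrow itself: stringifying the storage value gives something quite different from the text form a user of the type would expect.

```python
import uuid

# Storage value: the 16 raw bytes a fixed-size binary storage array would hold.
storage_value = uuid.UUID("12345678-1234-5678-1234-567812345678").bytes

# Naive approach: stringify the storage value directly (here as plain hex).
storage_as_string = storage_value.hex()

# Extension-aware approach: use the type's own canonical text form.
extension_as_string = str(uuid.UUID(bytes=storage_value))

print(storage_as_string)
print(extension_as_string)
```

The second form is what one would want in a CSV file or a repr, and only the extension type knows how to produce and parse it.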
A more general solution would thus be a mechanism that lets the ExtensionType implement a cast kernel itself and register it with the C++ cast dispatch.
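The idea can be sketched as a small registry, written here in Python for brevity rather than in C++. All names (`register_extension_cast`, `cast_extension`, the `"uuid"` extension) are hypothetical and do not correspond to an existing Arrow API; the point is only the dispatch order: a kernel registered by the extension type wins, and otherwise only the exact storage type is accepted.

```python
import uuid
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: (extension name, target type name) -> cast kernel.
CastKernel = Callable[[List[bytes]], list]
_EXTENSION_CASTS: Dict[Tuple[str, str], CastKernel] = {}


def register_extension_cast(ext_name: str, target: str,
                            kernel: CastKernel) -> None:
    _EXTENSION_CASTS[(ext_name, target)] = kernel


def cast_extension(ext_name: str, storage_type: str,
                   values: List[bytes], target: str) -> list:
    # 1. Prefer a kernel the extension type registered for this target.
    kernel = _EXTENSION_CASTS.get((ext_name, target))
    if kernel is not None:
        return kernel(values)
    # 2. Otherwise only the exact storage type is accepted.
    if target == storage_type:
        return list(values)
    raise TypeError(
        f"no cast registered from extension<{ext_name}> to {target}")


# The extension type supplies its own extension -> string kernel.
register_extension_cast(
    "uuid", "string",
    lambda values: [str(uuid.UUID(bytes=v)) for v in values])

raw = [uuid.UUID(int=1).bytes]
as_text = cast_extension("uuid", "fixed_size_binary[16]", raw, "string")
as_storage = cast_extension(
    "uuid", "fixed_size_binary[16]", raw, "fixed_size_binary[16]")
print(as_text)
```

A real implementation would also need the reverse direction (for example string -> extension) and a way to pass cast options through to the kernel.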
Reporter: Joris Van den Bossche / @jorisvandenbossche
Note: This issue was originally created as ARROW-17890. Please see the migration documentation for further details.