Skip to content

ARROW-7085: Custom column builder for CSV reader#5880

Closed
fexolm wants to merge 1 commit into
apache:masterfrom
fexolm:column-builder
Closed

ARROW-7085: Custom column builder for CSV reader#5880
fexolm wants to merge 1 commit into
apache:masterfrom
fexolm:column-builder

Conversation

@fexolm

@fexolm fexolm commented Nov 21, 2019

Copy link
Copy Markdown
Contributor

As discussed in ARROW-7085, I've implemented one of the ways to provide custom column builder to a column.
However, it doesn't look like final solution for me. Hope to hear your thought about possible implementation.

@github-actions

Copy link
Copy Markdown

@fexolm

fexolm commented Nov 25, 2019

Copy link
Copy Markdown
Contributor Author

@pitrou @wesm PTAL

@pitrou

pitrou commented Nov 25, 2019

Copy link
Copy Markdown
Member

@fexolm The Parser, ColumnBuilder.. APIs are currently internal, and slightly messy. We'll have to think about cleaning them up before making them public.

@fexolm

fexolm commented Nov 25, 2019

Copy link
Copy Markdown
Contributor Author

@pitrou the other option is to define a class inherited from ExtentionType with an additional method for creating column builder for that type.

This solution wouldn't require changing stable API.
I can make a sample implementation if it sounds better for you.

@pitrou

pitrou commented Nov 25, 2019

Copy link
Copy Markdown
Member

No, this would have the same problem since ColumnBuilder takes a BlockParser.

@anton-malakhov

anton-malakhov commented Nov 25, 2019

Copy link
Copy Markdown

@pitrou We would not mind if this ExtensionType based approach is implemented in internal namespace as well, so it will not appear as a stable API right away.
[updated] I mean ExtensionType is already defined. What we need is additional flexibility in of its usage which will not touch public API but will allow to define our custom converter even if it is internal or uses internal inputs at the moment,

@wesm

wesm commented May 24, 2020

Copy link
Copy Markdown
Member

Closing this PR since it has grown stale. I think it would be worth defining a public API for this, but we're going to have to take time to propose clean, well-documented interfaces rather than poking through internal APIs into the public API without any changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants