ARROW-11243: [C++] Recognize time types in CSV files #10782
pitrou
commented
Jul 22, 2021
- Allow reading CSV columns as time32 and time64
- Automatically infer "hh:mm" and "hh:mm:ss" as time32[s]
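To illustrate the inference rule above, here is a minimal standalone sketch of how "hh:mm" and "hh:mm:ss" strings could map to seconds since midnight, which is what a time32[s] value stores. The `sketch` namespace, `TwoDigits`, and `ParseTimeSeconds` are hypothetical names for illustration, not Arrow's actual parser, and the accepted ranges (hours 0-23, minutes and seconds 0-59) are an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helpers for illustration only; not Arrow's implementation.
namespace sketch {

// Parse exactly two ASCII digits starting at s[pos].
inline std::optional<int> TwoDigits(std::string_view s, std::size_t pos) {
  if (pos + 2 > s.size()) return std::nullopt;
  const char a = s[pos];
  const char b = s[pos + 1];
  if (a < '0' || a > '9' || b < '0' || b > '9') return std::nullopt;
  return (a - '0') * 10 + (b - '0');
}

// Return seconds since midnight for "hh:mm" or "hh:mm:ss".
// Anything else (including fractional seconds) yields nullopt, so a
// caller doing type inference would fall back to another type.
inline std::optional<std::int32_t> ParseTimeSeconds(std::string_view s) {
  if (s.size() != 5 && s.size() != 8) return std::nullopt;
  if (s[2] != ':') return std::nullopt;
  const auto hh = TwoDigits(s, 0);
  const auto mm = TwoDigits(s, 3);
  if (!hh || !mm || *hh > 23 || *mm > 59) return std::nullopt;
  int ss = 0;
  if (s.size() == 8) {
    if (s[5] != ':') return std::nullopt;
    const auto parsed = TwoDigits(s, 6);
    if (!parsed || *parsed > 59) return std::nullopt;
    ss = *parsed;
  }
  return static_cast<std::int32_t>(*hh * 3600 + *mm * 60 + ss);
}

}  // namespace sketch
```

A column would only be inferred as a time type if every non-null value parses this way; a single value such as "12:34:56.789" fails the check in this sketch.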
@ursabot please benchmark
Benchmark runs are scheduled for baseline = 169b249 and contender = 25b5c1e445f04b53a0beb40afabb3e8012c31f50. Results will be available as each benchmark for each run completes.
bkietz
left a comment
This looks good, I'd just like the unit tests for inference to be expanded a bit
nealrichardson
left a comment
Seems reasonable to me, just one question about time64
Should there also be an inferring test that results in time64?
Currently, there is no inference to time64. Should there be one (for times with nanosecond precision perhaps)?
Ah I see, I was looking at the wrong case statement, you're right. Maybe there should be one to catch any times with fractional seconds, though I don't feel strongly about it since you can explicitly declare the type/schema you want.
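For context on the fractional-seconds idea, here is a minimal standalone sketch of how a value like "hh:mm:ss.fff" could map to nanoseconds since midnight, which is what a time64[ns] value stores. This is not part of the PR and not Arrow's parser: `sketch_frac`, `IsDigit`, and `ParseTimeNanos` are hypothetical names, and the limit of 1 to 9 fractional digits is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helpers for illustration only; not Arrow's implementation.
namespace sketch_frac {

inline bool IsDigit(char c) { return c >= '0' && c <= '9'; }

// Return nanoseconds since midnight for "hh:mm:ss", optionally followed
// by "." and 1 to 9 fractional digits. Anything else yields nullopt.
inline std::optional<std::int64_t> ParseTimeNanos(std::string_view s) {
  if (s.size() < 8 || s[2] != ':' || s[5] != ':') return std::nullopt;
  const std::size_t digit_positions[] = {0, 1, 3, 4, 6, 7};
  for (std::size_t p : digit_positions) {
    if (!IsDigit(s[p])) return std::nullopt;
  }
  const int hh = (s[0] - '0') * 10 + (s[1] - '0');
  const int mm = (s[3] - '0') * 10 + (s[4] - '0');
  const int ss = (s[6] - '0') * 10 + (s[7] - '0');
  if (hh > 23 || mm > 59 || ss > 59) return std::nullopt;

  std::int64_t nanos = (hh * 3600LL + mm * 60 + ss) * 1000000000LL;
  if (s.size() == 8) return nanos;  // no fractional part

  // Require ".", at least one digit, and at most nine digits.
  if (s[8] != '.' || s.size() == 9 || s.size() > 18) return std::nullopt;
  std::int64_t scale = 100000000;  // weight of the first fractional digit
  for (std::size_t i = 9; i < s.size(); ++i) {
    if (!IsDigit(s[i])) return std::nullopt;
    nanos += (s[i] - '0') * scale;
    scale /= 10;
  }
  return nanos;
}

}  // namespace sketch_frac
```

As noted in the comment above, inference is optional here because the column type can be declared explicitly when reading.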
25b5c1e to
301c1c1
(rebased)
This could be a future JIRA or "wait until it's asked for", but technically ISO-8601 allows omitting the colons if you add a leading T. I've never seen it in practice, though.
301c1c1 to
b103dd8
@westonpace I'd rather wait for someone to request it.