Fix SortExec discards field metadata on the output schema #1477
    None,
)?;

let schema = Arc::new(Schema::new(
The fix is to delete this code; the rest of the PR is a test.
This is just to correct the bug, correct? Will we deal with carrying over metadata properly in another PR?
Yes -- I think DataFusion does carry over metadata now in many cases, and this PR corrects a case where the metadata used to be carried over but no longer is.
@alamb If I recall, the reason this change was made was that in our case the schema was stripped of the timezone information but the underlying record batches preserved it.
I am sorry if I broke something for you @maxburke -- if you help me understand / write a reproducer for what you want to do, I am happy to help try and make it work.
Which issue does this PR close?
Closes #1476
Rationale for this change
This is a regression introduced in #1455
The following code in Sort skips the field-level metadata when it constructs the output batch.
Thus the output schema is not correct.
It also seems to be unnecessary (at least all the tests pass without it, so maybe @maxburke can comment on what problem it was solving).
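To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the arrow-rs or DataFusion APIs: `Field`, `Schema`, `rebuild_schema_lossy`, `reuse_schema`, and `sample_schema` are hypothetical stand-ins, and it assumes the removed code rebuilt the output schema from column names and types alone. It only illustrates why a schema rebuilt that way differs from the input schema once fields carry metadata.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an Arrow field (hypothetical, not the arrow-rs type):
/// a name, a type name, and optional key/value metadata.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    metadata: Option<BTreeMap<String, String>>,
}

/// Simplified stand-in for an Arrow schema (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

/// Assumed shape of the bug: every field is rebuilt from its name and type
/// only, so any field-level metadata is silently dropped.
fn rebuild_schema_lossy(input: &Schema) -> Schema {
    Schema {
        fields: input
            .fields
            .iter()
            .map(|f| Field {
                name: f.name.clone(),
                data_type: f.data_type.clone(),
                metadata: None, // metadata is not copied over
            })
            .collect(),
    }
}

/// Shape of the fix: reuse the input schema unchanged, since sorting only
/// reorders rows and does not change the columns.
fn reuse_schema(input: &Schema) -> Schema {
    input.clone()
}

/// A one-column schema whose field carries metadata (illustrative values).
fn sample_schema() -> Schema {
    let mut metadata = BTreeMap::new();
    metadata.insert("foo".to_string(), "bar".to_string());
    Schema {
        fields: vec![Field {
            name: "field_name".to_string(),
            data_type: "UInt64".to_string(),
            metadata: Some(metadata),
        }],
    }
}

fn main() {
    let input = sample_schema();

    // The rebuilt schema no longer equals the input schema.
    assert_ne!(rebuild_schema_lossy(&input), input);

    // Reusing the input schema keeps the metadata intact.
    assert_eq!(reuse_schema(&input), input);

    println!("lossy:  {:?}", rebuild_schema_lossy(&input));
    println!("reused: {:?}", reuse_schema(&input));
}
```

A regression test for the real operator would presumably follow the same pattern: build an input whose fields carry metadata, run the sort, and compare the output schema against the input schema.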
@hntd187 has also been working on apache/arrow-rs#1033, a PR that adds some additional functions for creating fields, and I will work on this as well.
What changes are included in this PR?
Are there any user-facing changes?
Bug fix