ARROW-9777: [Rust] [IPC] write custom metadata [WIP] #9011
Codecov Report
@@ Coverage Diff @@
## master #9011 +/- ##
==========================================
- Coverage 82.87% 82.82% -0.05%
==========================================
Files 201 201
Lines 49739 49781 +42
==========================================
+ Hits 41220 41232 +12
- Misses 8519 8549 +30
cc11d56 to 8dfa2f5
I just marked this PR as a draft, because I think some tests are still missing. @nevi-me, since this PR may overlap with your existing work, please let me know whether I should close it.
nevi-me left a comment
Hey @mqy, I'm on vacation, so I haven't been checking the project much.
ARROW-10299 is for implementing ipc::MetadataVersion::V5 on the write side (reads are mostly fine)
ARROW-10258 depends on ARROW-10259 that you're working on.
    }

    /// Creates the FileWriter.
    pub fn new(writer: W, schema: &Schema) -> Self {
The major downside of a new() that doesn't write the header is that it burdens end users with remembering to call write_header_schema(). I would prefer to stick with try_new() and try_new_with_options() (happy to rename the latter), and amend the latter to also take the other parameters.
I don't mind us using a builder pattern, but we shouldn't change the try_new in the process.
Agreed. So should the metadata be added to IpcWriteOptions?
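For illustration only, here is a minimal sketch of the shape discussed above: the constructor writes the header eagerly, and custom metadata travels inside an options struct so the try_new signature does not change. WriteOptions, SketchWriter, with_custom_metadata and the text header are hypothetical stand-ins, not arrow's IpcWriteOptions, FileWriter or the IPC format.

```rust
use std::collections::BTreeMap;
use std::io::{self, Write};

/// Hypothetical stand-in for the writer options (not arrow's `IpcWriteOptions`).
#[derive(Debug, Clone, Default)]
pub struct WriteOptions {
    /// Custom key/value metadata to emit together with the header.
    pub custom_metadata: BTreeMap<String, String>,
}

impl WriteOptions {
    /// Builder-style setter: consumes and returns `self` so calls can be chained.
    pub fn with_custom_metadata(mut self, key: &str, value: &str) -> Self {
        self.custom_metadata
            .insert(key.to_string(), value.to_string());
        self
    }
}

/// Hypothetical writer that emits its header inside the constructor,
/// so callers cannot forget a separate "write header" step.
pub struct SketchWriter<W: Write> {
    writer: W,
}

impl<W: Write> SketchWriter<W> {
    /// Uses default options; this signature stays stable as options grow.
    pub fn try_new(writer: W, schema_name: &str) -> io::Result<Self> {
        Self::try_new_with_options(writer, schema_name, WriteOptions::default())
    }

    /// Writes a placeholder text header (assumption: NOT the Arrow IPC layout).
    pub fn try_new_with_options(
        mut writer: W,
        schema_name: &str,
        options: WriteOptions,
    ) -> io::Result<Self> {
        writeln!(writer, "schema={}", schema_name)?;
        // BTreeMap iterates in key order, so the header is deterministic.
        for (key, value) in &options.custom_metadata {
            writeln!(writer, "meta:{}={}", key, value)?;
        }
        Ok(Self { writer })
    }

    /// Hands back the underlying writer, e.g. to inspect the bytes written.
    pub fn into_inner(self) -> W {
        self.writer
    }
}
```

With this shape, a caller would chain something like WriteOptions::default().with_custom_metadata("k", "v") and pass the result to try_new_with_options; new parameters then extend the options struct rather than the constructor names.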
@nevi-me I saw that your status is set to busy. The main reason I closed this PR is that I don't want it sitting in the pull request list for too long, even in draft mode.
I think this PR is a subtask of issue ARROW-9777.
Sorry, I had not seen this issue before I finished this PR.
This PR is extracted from one of my dev branches, which has more changes for custom metadata. You may want to have a look at its diff against arrow master.
Two design choices:
Added a new type
CustomMetaData (an alias to BTreeMap), in datatypes.rs.
Why BTreeMap instead of HashMap? Because Field implements the traits Hash, PartialOrd, and Ord, but HashMap implements none of these, which breaks that behavior.
Additional refactoring
Adding more arguments tends to get tedious, for example:
try_new, try_new_with_options, try_new_with_options_and_custom_metadata.
So I refactored several functions to follow the builder pattern. Sorry, this refactoring may cause some inconvenience.
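As a rough illustration of the first design choice (BTreeMap rather than HashMap), the sketch below uses a simplified stand-in for Field. SketchField and the field helper are hypothetical names, and the real Field has more members; the point is only that the derives compile with a BTreeMap member.

```rust
use std::collections::BTreeMap;

/// Alias in the spirit of the `CustomMetaData` type described above
/// (assumption: string keys and values).
pub type CustomMetaData = BTreeMap<String, String>;

/// Simplified, hypothetical stand-in for `Field`.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct SketchField {
    pub name: String,
    pub metadata: CustomMetaData,
}

// If the alias were `HashMap<String, String>`, the derive above would fail to
// compile: `HashMap` implements none of `Hash`, `PartialOrd`, or `Ord`.

/// Hypothetical helper that builds a field from key/value pairs.
pub fn field(name: &str, pairs: &[(&str, &str)]) -> SketchField {
    SketchField {
        name: name.to_string(),
        metadata: pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}
```

Because BTreeMap keeps its entries in key order, two fields built from the same pairs in a different insertion order compare equal and hash identically.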