ARROW-6519: [Java] Use IPC continuation prefix as part of 8-byte EOS #5345
I'd prefer to keep this static because it is being used here to hide the EOS identifier https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala#L67
Also apply this change to ArrowFileWriter#endInternal, and probably remove MessageSerializer#writeLongLitterEndian, since it is no longer used?
Good catch, I forgot that it also writes EOS. Done.
tianchen92 left a comment
LGTM, thanks @BryanCutler
Merging this. https://github.com/apache/arrow/tree/ARROW-6313-flatbuffer-alignment has a rebase conflict on apache/master, I'm going to try to fix
Write IPC continuation token to file format EOS
This changes the 8-byte EOS for the non-legacy stream format to use {0xFFFFFFFF, 0x00000000} instead of all zeros. With all zeros, the reader does not know to read the last 4 bytes; with the 4-byte continuation token, all bytes written to a channel can be read.
Closes #5345 from BryanCutler/java-ipc-cont-for-EOS-ARROW-6519 and squashes the following commits:
8fb38a0 <Bryan Cutler> Use IPC continuation token to write 8-byte EOS
Authored-by: Bryan Cutler <cutlerb@gmail.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
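For readers following along, here is a minimal sketch of why the continuation token matters for the EOS marker. It is not the code from this PR: the class `EosSketch` and its methods are hypothetical names, and the reader logic assumes (as the description above implies) that a reader treats a leading 4-byte zero as a legacy EOS and stops there, while a leading 0xFFFFFFFF tells it to read a further 4-byte length.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the Arrow Java implementation.
public class EosSketch {
  // 4-byte IPC continuation token, 0xFFFFFFFF (equal to -1 as a signed int).
  public static final int CONTINUATION_TOKEN = 0xFFFFFFFF;

  // New 8-byte EOS: continuation token followed by a zero length.
  public static byte[] eosWithContinuation() {
    ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(CONTINUATION_TOKEN);
    buf.putInt(0);
    return buf.array();
  }

  // Old 8-byte EOS: all zeros.
  public static byte[] eosAllZeros() {
    return new byte[8];
  }

  // Returns how many bytes an assumed reader consumes before deciding
  // it has reached end-of-stream.
  public static int bytesConsumedAtEos(byte[] tail) {
    ByteBuffer buf = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);
    int first = buf.getInt();
    if (first == CONTINUATION_TOKEN) {
      // Token seen: the next 4 bytes hold the length; zero means EOS.
      int length = buf.getInt();
      if (length != 0) {
        throw new IllegalArgumentException("not an EOS marker");
      }
    } else if (first != 0) {
      throw new IllegalArgumentException("not an EOS marker");
    }
    // A leading zero is taken as a legacy 4-byte EOS, so reading stops here.
    return buf.position();
  }

  public static void main(String[] args) {
    System.out.println(bytesConsumedAtEos(eosAllZeros()));         // prints 4
    System.out.println(bytesConsumedAtEos(eosWithContinuation())); // prints 8
  }
}
```

Under these assumptions, the all-zeros marker leaves 4 unread bytes on the channel, whereas the token-prefixed marker is consumed in full.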
Thanks @wesm and @tianchen92!