ARROW-6519: [Java] Use IPC continuation prefix as part of 8-byte EOS #5345

Closed
BryanCutler wants to merge 1 commit into apache:ARROW-6313-flatbuffer-alignment from BryanCutler:java-ipc-cont-for-EOS-ARROW-6519

Conversation

@BryanCutler

Member

This changes the 8-byte EOS for the non-legacy stream format to use {0xFFFFFFFF, 0x00000000} instead of all zeros. When the EOS is all zeros, the reader does not know to read the last 4 bytes, but with the 4-byte continuation token in front, all bytes written to a channel can be read.
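As a rough illustration of the description above, here is a minimal sketch of the two EOS layouts and how a reader would consume them. The class and method names (`EosSketch`, `eosBytes`, `bytesConsumedAtEos`) are hypothetical and are not Arrow's actual `MessageSerializer` API; the reader logic is an assumption about how a reader that accepts both the legacy 4-byte and the new 8-byte marker might behave.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch, not the Arrow Java implementation.
class EosSketch {
  // The 4-byte IPC continuation token (all bits set, i.e. -1 as an int).
  static final int IPC_CONTINUATION_TOKEN = 0xFFFFFFFF;

  // Build the 8-byte EOS: continuation token followed by a zero length.
  static byte[] eosBytes() {
    ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(IPC_CONTINUATION_TOKEN);
    buf.putInt(0);
    return buf.array();
  }

  // Consume an EOS marker and report how many bytes were read.
  // Assumption: a zero in the first 4 bytes is treated as a legacy 4-byte EOS.
  static int bytesConsumedAtEos(ByteBuffer in) {
    in.order(ByteOrder.LITTLE_ENDIAN);
    int start = in.position();
    int first = in.getInt();
    if (first == IPC_CONTINUATION_TOKEN) {
      // Non-legacy framing: the token tells the reader a 4-byte length follows.
      int length = in.getInt();
      if (length != 0) {
        throw new IllegalStateException("not an EOS marker");
      }
    } else if (first != 0) {
      throw new IllegalStateException("not an EOS marker");
    }
    return in.position() - start;
  }

  public static void main(String[] args) {
    // With the continuation token, all 8 bytes are consumed.
    System.out.println(bytesConsumedAtEos(ByteBuffer.wrap(eosBytes())));
    // With 8 zero bytes, the reader stops after the first 4.
    System.out.println(bytesConsumedAtEos(ByteBuffer.wrap(new byte[8])));
  }
}
```

In this sketch, an all-zero marker leaves 4 unread bytes behind on the channel, which is the situation the change is meant to avoid.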

Member Author

I'd prefer to keep this static because it is being used here to hide the EOS identifier https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala#L67

@BryanCutler

Member Author

cc @emkornfield @tianchen92

BryanCutler changed the title from "ARROW-6519: [Java] Use IPC continuation token to write 8-byte EOS" to "ARROW-6519: [Java] Use IPC continuation prefix plus zero for 8-byte EOS" on Sep 10, 2019
BryanCutler changed the title from "ARROW-6519: [Java] Use IPC continuation prefix plus zero for 8-byte EOS" to "ARROW-6519: [Java] Use IPC continuation prefix as part of 8-byte EOS" on Sep 10, 2019

Contributor

Should this change also be applied to ArrowFileWriter#endInternal, and MessageSerializer#writeLongLitterEndian probably removed, since it is no longer used?

@BryanCutler BryanCutler Sep 11, 2019

Member Author

Good catch, I forgot that it also writes EOS. Done.

@tianchen92 tianchen92 left a comment

Contributor

LGTM, thanks @BryanCutler

@wesm

wesm commented Sep 11, 2019

Member

Merging this. https://github.com/apache/arrow/tree/ARROW-6313-flatbuffer-alignment has a rebase conflict on apache/master; I'm going to try to fix it.

@wesm wesm force-pushed the ARROW-6313-flatbuffer-alignment branch from 4f9b887 to 0352456 Compare September 11, 2019 22:07
Write IPC continuation token to file format EOS
@wesm wesm force-pushed the java-ipc-cont-for-EOS-ARROW-6519 branch from edce1cd to 8fb38a0 Compare September 11, 2019 22:08
wesm pushed a commit that referenced this pull request Sep 11, 2019
This changes the 8-byte EOS for the non-legacy stream format to use {0xFFFFFFFF, 0x00000000} instead of all zeros. When the EOS is all zeros, the reader does not know to read the last 4 bytes, but with the 4-byte continuation token in front, all bytes written to a channel can be read.

Closes #5345 from BryanCutler/java-ipc-cont-for-EOS-ARROW-6519 and squashes the following commits:

8fb38a0 <Bryan Cutler> Use IPC continuation token to write 8-byte EOS

Authored-by: Bryan Cutler <cutlerb@gmail.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
@wesm wesm closed this Sep 11, 2019
@BryanCutler BryanCutler deleted the java-ipc-cont-for-EOS-ARROW-6519 branch September 12, 2019 05:45
@BryanCutler

BryanCutler commented Sep 12, 2019

Member Author

Thanks @wesm and @tianchen92 !

wesm pushed a commit that referenced this pull request Sep 13, 2019
pprudhvi pushed a commit to pprudhvi/arrow that referenced this pull request Sep 16, 2019
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
3 participants