Skip to content

ARROW-6314: [C++] Implement IPC message format alignment changes, provide backwards compatibility and "legacy" option to emit old message format#5211

Closed
wesm wants to merge 7 commits into
apache:ARROW-6313-flatbuffer-alignmentfrom
wesm:ARROW-6314
Closed

ARROW-6314: [C++] Implement IPC message format alignment changes, provide backwards compatibility and "legacy" option to emit old message format#5211
wesm wants to merge 7 commits into
apache:ARROW-6313-flatbuffer-alignmentfrom
wesm:ARROW-6314

Conversation

@wesm

@wesm wesm commented Aug 28, 2019

Copy link
Copy Markdown
Member

This also moves the alignment multiple to IpcOptions and adds the IpcOptions argument to more functions.

@wesm wesm requested review from emkornfield and pitrou August 28, 2019 02:17
@wesm wesm changed the base branch from master to ARROW-6313-flatbuffer-alignment August 28, 2019 02:19
@wesm

wesm commented Aug 28, 2019

Copy link
Copy Markdown
Member Author

Will fix the CI issues and arrow_cuda failure (don't currently have CUDA installed on my laptop)

Comment thread cpp/src/arrow/ipc/message.cc Outdated
Comment thread cpp/src/arrow/ipc/writer.cc Outdated
Comment thread cpp/src/arrow/ipc/writer.cc Outdated
Comment thread cpp/src/arrow/ipc/message.cc Outdated
@emkornfield

Copy link
Copy Markdown
Contributor

Took a quick scan through and seems reasonable. One question about if we want to memcpy for pre-0.15.0 version.

@pitrou pitrou left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall.

Comment thread cpp/src/arrow/ipc/message.cc Outdated
Comment thread cpp/src/arrow/ipc/message.cc Outdated
Comment thread cpp/src/arrow/ipc/message.cc Outdated
Comment thread cpp/src/arrow/ipc/message.cc Outdated
@pitrou

pitrou commented Aug 28, 2019

Copy link
Copy Markdown
Member

I pushed a fix for the CUDA issue.

@wesm

wesm commented Aug 28, 2019

Copy link
Copy Markdown
Member Author

@kszucs not sure what's up with the Buildbot builds

-- Could NOT find LLVM (missing: LLVM_DIR)
CMake Error at cmake_modules/FindLLVM.cmake:34 (find_package):
  Could not find a package configuration file provided by "LLVM" (requested
  version 7.1) with any of the following names:
    LLVMConfig.cmake
    llvm-config.cmake
  Add the installation prefix of "LLVM" to CMAKE_PREFIX_PATH or set
  "LLVM_DIR" to a directory containing one of the above files.  If "LLVM"
  provides a separate development package or SDK, be sure it has been
  installed.
Call Stack (most recent call first):
  src/gandiva/CMakeLists.txt:31 (find_package)

@kszucs

kszucs commented Aug 28, 2019

Copy link
Copy Markdown
Member

@ursabot build

@wesm

wesm commented Aug 28, 2019

Copy link
Copy Markdown
Member Author

@emkornfield @tianchen92 note that Java is failing in a really ugly way when sent this new message format

Incompatible files
null
16:27:42.478 [main] ERROR org.apache.arrow.tools.Integration - Incompatible files
java.lang.NullPointerException: null
	at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeDictionaryBatch(MessageSerializer.java:504)
	at org.apache.arrow.vector.ipc.ArrowFileReader.readDictionaryBatch(ArrowFileReader.java:178)
	at org.apache.arrow.vector.ipc.ArrowFileReader.readDictionary(ArrowFileReader.java:127)
	at org.apache.arrow.vector.ipc.ArrowFileReader.initialize(ArrowFileReader.java:110)
	at org.apache.arrow.vector.ipc.ArrowReader.ensureInitialized(ArrowReader.java:160)
	at org.apache.arrow.vector.ipc.ArrowReader.getVectorSchemaRoot(ArrowReader.java:62)
	at org.apache.arrow.tools.Integration$Command$3.execute(Integration.java:191)
	at org.apache.arrow.tools.Integration.run(Integration.java:118)
	at org.apache.arrow.tools.Integration.main(Integration.java:69)
	Suppressed: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (1024)
Allocator(ROOT) 0/1024/33428/2147483647 (res/actual/peak/limit)
		at org.apache.arrow.memory.BaseAllocator.close(BaseAllocator.java:454)
		at org.apache.arrow.tools.Integration$Command$3.execute(Integration.java:227)
		... 2 common frames omitted

I would guess that there are inadequate checks happening. For example it is only checked whether the metadata length is not equal to 0

https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/ipc/message/MessageSerializer.java#L590

@wesm

wesm commented Aug 28, 2019

Copy link
Copy Markdown
Member Author

Travis CI succeeds except for integration tests, as expected: https://travis-ci.org/wesm/arrow/builds/577931299

Appveyor looking good: https://ci.appveyor.com/project/wesm/arrow/builds/27023793

Going to go ahead and merge this into the integration branch.

wesm added a commit that referenced this pull request Aug 28, 2019
…vide backwards compatibility and "legacy" option to emit old message format

This also moves the alignment multiple to `IpcOptions` and adds the `IpcOptions` argument to more functions.

Closes #5211 from wesm/ARROW-6314 and squashes the following commits:

df3b910 <Wes McKinney> Fix MSVC narrowing warning
62758b6 <Wes McKinney> Code review comments. Copy metadata always in prefix_length==4 legacy case
857a571 <Antoine Pitrou> Fix CUDA IPC
71b2fad <Wes McKinney> Add tests exercising backwards compatibility write and read path
4777d2b <Wes McKinney> Implement backwards compatibility and compatibility mode, pass IpcOptions in more APIs
1a38432 <Wes McKinney> Revert changes to submodule
69883cf <Micah Kornfield> verify 8 bytes alignment fixes ubsan for ipc

Lead-authored-by: Wes McKinney <wesm+git@apache.org>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Micah Kornfield <emkornfield@gmail.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
@wesm

wesm commented Aug 28, 2019

Copy link
Copy Markdown
Member Author

Merged into the integration branch

@wesm wesm closed this Aug 28, 2019
@wesm wesm deleted the ARROW-6314 branch August 28, 2019 17:00
wesm added a commit that referenced this pull request Sep 11, 2019
…vide backwards compatibility and "legacy" option to emit old message format

This also moves the alignment multiple to `IpcOptions` and adds the `IpcOptions` argument to more functions.

Closes #5211 from wesm/ARROW-6314 and squashes the following commits:

df3b910 <Wes McKinney> Fix MSVC narrowing warning
62758b6 <Wes McKinney> Code review comments. Copy metadata always in prefix_length==4 legacy case
857a571 <Antoine Pitrou> Fix CUDA IPC
71b2fad <Wes McKinney> Add tests exercising backwards compatibility write and read path
4777d2b <Wes McKinney> Implement backwards compatibility and compatibility mode, pass IpcOptions in more APIs
1a38432 <Wes McKinney> Revert changes to submodule
69883cf <Micah Kornfield> verify 8 bytes alignment fixes ubsan for ipc

Lead-authored-by: Wes McKinney <wesm+git@apache.org>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Micah Kornfield <emkornfield@gmail.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
wesm added a commit that referenced this pull request Sep 13, 2019
…vide backwards compatibility and "legacy" option to emit old message format

This also moves the alignment multiple to `IpcOptions` and adds the `IpcOptions` argument to more functions.

Closes #5211 from wesm/ARROW-6314 and squashes the following commits:

df3b910 <Wes McKinney> Fix MSVC narrowing warning
62758b6 <Wes McKinney> Code review comments. Copy metadata always in prefix_length==4 legacy case
857a571 <Antoine Pitrou> Fix CUDA IPC
71b2fad <Wes McKinney> Add tests exercising backwards compatibility write and read path
4777d2b <Wes McKinney> Implement backwards compatibility and compatibility mode, pass IpcOptions in more APIs
1a38432 <Wes McKinney> Revert changes to submodule
69883cf <Micah Kornfield> verify 8 bytes alignment fixes ubsan for ipc

Lead-authored-by: Wes McKinney <wesm+git@apache.org>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Micah Kornfield <emkornfield@gmail.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
pprudhvi pushed a commit to pprudhvi/arrow that referenced this pull request Sep 16, 2019
…vide backwards compatibility and "legacy" option to emit old message format

This also moves the alignment multiple to `IpcOptions` and adds the `IpcOptions` argument to more functions.

Closes apache#5211 from wesm/ARROW-6314 and squashes the following commits:

df3b910 <Wes McKinney> Fix MSVC narrowing warning
62758b6 <Wes McKinney> Code review comments. Copy metadata always in prefix_length==4 legacy case
857a571 <Antoine Pitrou> Fix CUDA IPC
71b2fad <Wes McKinney> Add tests exercising backwards compatibility write and read path
4777d2b <Wes McKinney> Implement backwards compatibility and compatibility mode, pass IpcOptions in more APIs
1a38432 <Wes McKinney> Revert changes to submodule
69883cf <Micah Kornfield> verify 8 bytes alignment fixes ubsan for ipc

Lead-authored-by: Wes McKinney <wesm+git@apache.org>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Micah Kornfield <emkornfield@gmail.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants