Skip to content

GH-45225: [C++] Upgrade ORC to 2.1.0#45226

Merged
wgtmac merged 13 commits into
apache:mainfrom
wgtmac:orc_2.1.0
Jan 14, 2025
Merged

GH-45225: [C++] Upgrade ORC to 2.1.0#45226
wgtmac merged 13 commits into
apache:mainfrom
wgtmac:orc_2.1.0

Conversation

@wgtmac

@wgtmac wgtmac commented Jan 11, 2025

Copy link
Copy Markdown
Member

Rationale for this change

Apache ORC has just released 2.1.0: https://orc.apache.org/news/2025/01/09/ORC-2.1.0/

We need to upgrade it to avoid occasional download failures of orc-format.

What changes are included in this PR?

Bump Apache ORC to its latest version 2.1.0.

Are these changes tested?

Pass CIs.

Are there any user-facing changes?

No.

@github-actions

Copy link
Copy Markdown

⚠️ GitHub issue #45225 has been automatically assigned in GitHub to PR creator.

@kou

kou commented Jan 12, 2025

Copy link
Copy Markdown
Member

Hmm. There are some CI failures with ORC 2.1.0...
Could you take a look at them?

Could you remove a workaround for ORC 2.0.2?

# We can remove this with ORC 2.0.2 or later.
list(PREPEND CMAKE_MODULE_PATH
${CMAKE_CURRENT_BINARY_DIR}/_deps/orc-src/cmake_modules)

@wgtmac

wgtmac commented Jan 12, 2025

Copy link
Copy Markdown
Member Author

Yes, I will take a look and fix it.

Could you remove a workaround for ORC 2.0.2?

Just to confirm that we can drop the support of ORC versions prior to 2.1.0, right?

@wgtmac

wgtmac commented Jan 12, 2025

Copy link
Copy Markdown
Member Author

My bad! I found an issue here: https://github.com/apache/orc/blob/5bbafbb847f6e23b5a25d83c4d817741d36d9cc8/c%2B%2B/src/CMakeLists.txt#L221-L222

target_include_directories (orc
  INTERFACE
    $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}>
  PUBLIC
    $<BUILD_INTERFACE:${CMAKE_SOURCE_DIR}/c++/include>
    $<BUILD_INTERFACE:${CMAKE_BINARY_DIR}/c++/include>
  PRIVATE
    ${CMAKE_CURRENT_SOURCE_DIR}
    ${CMAKE_CURRENT_BINARY_DIR}
    ${LIBHDFSPP_INCLUDE_DIR}
)

I should use PROJECT_SOURCE_DIR and PROJECT_BINARY_DIR instead.

@wgtmac wgtmac marked this pull request as draft January 12, 2025 16:16
@kou

kou commented Jan 13, 2025

Copy link
Copy Markdown
Member

Just to confirm that we can drop the support of ORC versions prior to 2.1.0, right?

Right. But it's only for bundled ORC. We keep old ORC support for system ORC (not bundled ORC).
The old ORC workaround code is used only with bundled ORC.

@wgtmac

wgtmac commented Jan 13, 2025

Copy link
Copy Markdown
Member Author

@kou Is it fine to disable ORC in C++ / ARM64 Ubuntu 20.04 C++? The patch command is unavailable there. Otherwise we need to install patch command on the docker images IMO.

@kou

kou commented Jan 13, 2025

Copy link
Copy Markdown
Member

We can install patch in Ubuntu 20.04 image:

diff --git a/ci/docker/ubuntu-20.04-cpp.dockerfile b/ci/docker/ubuntu-20.04-cpp.dockerfile
index 8dc778d544..259c5fb77f 100644
--- a/ci/docker/ubuntu-20.04-cpp.dockerfile
+++ b/ci/docker/ubuntu-20.04-cpp.dockerfile
@@ -106,6 +106,7 @@ RUN apt-get update -y -q && \
         ninja-build \
         nlohmann-json3-dev \
         npm \
+        patch \
         pkg-config \
         protobuf-compiler \
         python3-dev \

@wgtmac wgtmac marked this pull request as ready for review January 13, 2025 07:06
@wgtmac

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@wgtmac

wgtmac commented Jan 13, 2025

Copy link
Copy Markdown
Member Author

@github-actions crossbow submit -g cpp

@github-actions

Copy link
Copy Markdown

Revision: 2b0a4ab

Submitted crossbow builds: ursacomputing/crossbow @ actions-2893fb5e45

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp-ubuntu-20.04-cuda-11.2.2 GitHub Actions
test-cuda-cpp-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-39-cpp GitHub Actions
test-ubuntu-20.04-cpp GitHub Actions
test-ubuntu-20.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

@wgtmac

wgtmac commented Jan 13, 2025

Copy link
Copy Markdown
Member Author

I think this is ready for review @kou

@kou

kou commented Jan 13, 2025

Copy link
Copy Markdown
Member

@github-actions crossbow submit -g linux

@kou

kou commented Jan 13, 2025

Copy link
Copy Markdown
Member

@github-actions crossbow submit -g wheel -g r

Comment thread cpp/src/arrow/adapters/orc/adapter_test.cc
@github-actions github-actions Bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Jan 13, 2025
@github-actions

Copy link
Copy Markdown

Revision: 2b0a4ab

Submitted crossbow builds: ursacomputing/crossbow @ actions-6d7e7577fe

Task Status
almalinux-8-amd64 GitHub Actions
almalinux-8-arm64 GitHub Actions
almalinux-9-amd64 GitHub Actions
almalinux-9-arm64 GitHub Actions
amazon-linux-2023-amd64 GitHub Actions
amazon-linux-2023-arm64 GitHub Actions
centos-7-amd64 GitHub Actions
centos-8-stream-amd64 GitHub Actions
centos-8-stream-arm64 GitHub Actions
centos-9-stream-amd64 GitHub Actions
centos-9-stream-arm64 GitHub Actions
debian-bookworm-amd64 GitHub Actions
debian-bookworm-arm64 GitHub Actions
debian-trixie-amd64 GitHub Actions
debian-trixie-arm64 GitHub Actions
ubuntu-focal-amd64 GitHub Actions
ubuntu-focal-arm64 GitHub Actions
ubuntu-jammy-amd64 GitHub Actions
ubuntu-jammy-arm64 GitHub Actions
ubuntu-noble-amd64 GitHub Actions
ubuntu-noble-arm64 GitHub Actions

@github-actions

Copy link
Copy Markdown

Revision: 2b0a4ab

Submitted crossbow builds: ursacomputing/crossbow @ actions-852243f72c

Task Status
python-sdist GitHub Actions
r-binary-packages GitHub Actions
r-recheck-most GitHub Actions
test-r-arrow-backwards-compatibility GitHub Actions
test-r-clang-sanitizer GitHub Actions
test-r-depsource-bundled Azure
test-r-depsource-system GitHub Actions
test-r-dev-duckdb GitHub Actions
test-r-devdocs GitHub Actions
test-r-extra-packages GitHub Actions
test-r-gcc-11 GitHub Actions
test-r-gcc-12 GitHub Actions
test-r-install-local GitHub Actions
test-r-install-local-minsizerel GitHub Actions
test-r-linux-as-cran GitHub Actions
test-r-linux-rchk GitHub Actions
test-r-linux-valgrind GitHub Actions
test-r-macos-as-cran GitHub Actions
test-r-minimal-build Azure
test-r-offline-maximal GitHub Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-release-latest Azure
test-r-rocker-r-ver-latest Azure
test-r-rstudio-r-base-4.1-opensuse155 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 GitHub Actions
test-r-versions GitHub Actions
test-ubuntu-r-sanitizer GitHub Actions
wheel-macos-monterey-cp310-cp310-amd64 GitHub Actions
wheel-macos-monterey-cp310-cp310-arm64 GitHub Actions
wheel-macos-monterey-cp311-cp311-amd64 GitHub Actions
wheel-macos-monterey-cp311-cp311-arm64 GitHub Actions
wheel-macos-monterey-cp312-cp312-amd64 GitHub Actions
wheel-macos-monterey-cp312-cp312-arm64 GitHub Actions
wheel-macos-monterey-cp313-cp313-amd64 GitHub Actions
wheel-macos-monterey-cp313-cp313-arm64 GitHub Actions
wheel-macos-monterey-cp313-cp313t-amd64 GitHub Actions
wheel-macos-monterey-cp313-cp313t-arm64 GitHub Actions
wheel-macos-monterey-cp39-cp39-amd64 GitHub Actions
wheel-macos-monterey-cp39-cp39-arm64 GitHub Actions
wheel-manylinux-2-28-cp310-cp310-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-cp310-arm64 GitHub Actions
wheel-manylinux-2-28-cp311-cp311-amd64 GitHub Actions
wheel-manylinux-2-28-cp311-cp311-arm64 GitHub Actions
wheel-manylinux-2-28-cp312-cp312-amd64 GitHub Actions
wheel-manylinux-2-28-cp312-cp312-arm64 GitHub Actions
wheel-manylinux-2-28-cp313-cp313-amd64 GitHub Actions
wheel-manylinux-2-28-cp313-cp313-arm64 GitHub Actions
wheel-manylinux-2-28-cp313-cp313t-amd64 GitHub Actions
wheel-manylinux-2-28-cp313-cp313t-arm64 GitHub Actions
wheel-manylinux-2-28-cp39-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp39-cp39-arm64 GitHub Actions
wheel-manylinux-2014-cp310-cp310-amd64 GitHub Actions
wheel-manylinux-2014-cp310-cp310-arm64 GitHub Actions
wheel-manylinux-2014-cp311-cp311-amd64 GitHub Actions
wheel-manylinux-2014-cp311-cp311-arm64 GitHub Actions
wheel-manylinux-2014-cp312-cp312-amd64 GitHub Actions
wheel-manylinux-2014-cp312-cp312-arm64 GitHub Actions
wheel-manylinux-2014-cp313-cp313-amd64 GitHub Actions
wheel-manylinux-2014-cp313-cp313-arm64 GitHub Actions
wheel-manylinux-2014-cp313-cp313t-amd64 GitHub Actions
wheel-manylinux-2014-cp313-cp313t-arm64 GitHub Actions
wheel-manylinux-2014-cp39-cp39-amd64 GitHub Actions
wheel-manylinux-2014-cp39-cp39-arm64 GitHub Actions
wheel-windows-cp310-cp310-amd64 GitHub Actions
wheel-windows-cp311-cp311-amd64 GitHub Actions
wheel-windows-cp312-cp312-amd64 GitHub Actions
wheel-windows-cp313-cp313-amd64 GitHub Actions
wheel-windows-cp313-cp313t-amd64 GitHub Actions
wheel-windows-cp39-cp39-amd64 GitHub Actions

wgtmac added a commit to apache/orc that referenced this pull request Jan 13, 2025
### What changes were proposed in this pull request?

The change to add support for exporting CMake config and target has introduced some minor issues when the ORC C++ library is used inside a larger CMake project. Mainly there are three issues:

- We should prepend (not append) our CMake modules so we always use our own modules when there are naming conflict.
- Do not use CMAKE_SOURCE_DIR and CMAKE_BINARY_DIR because they are tied to the root project and it is no longer ORC project when we are inside another project.
- Add orc_ prefix to our CMake functions to avoid potential conflict.

See: apache/arrow#45226

### Why are the changes needed?

We had a problem with upgrading Apache ORC to 2.1.0 in the Apache Arrow. See: apache/arrow#45226

### How was this patch tested?

- Pass all CIs.
- Manually integrated it with Apache Arrow.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #2108 from wgtmac/fix_cmake.

Authored-by: Gang Wu <ustcwg@gmail.com>
Signed-off-by: Gang Wu <ustcwg@gmail.com>
wgtmac added a commit to apache/orc that referenced this pull request Jan 13, 2025
### What changes were proposed in this pull request?

The change to add support for exporting CMake config and target has introduced some minor issues when the ORC C++ library is used inside a larger CMake project. Mainly there are three issues:

- We should prepend (not append) our CMake modules so we always use our own modules when there are naming conflict.
- Do not use CMAKE_SOURCE_DIR and CMAKE_BINARY_DIR because they are tied to the root project and it is no longer ORC project when we are inside another project.
- Add orc_ prefix to our CMake functions to avoid potential conflict.

See: apache/arrow#45226

### Why are the changes needed?

We had a problem with upgrading Apache ORC to 2.1.0 in the Apache Arrow. See: apache/arrow#45226

### How was this patch tested?

- Pass all CIs.
- Manually integrated it with Apache Arrow.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #2108 from wgtmac/fix_cmake.

Authored-by: Gang Wu <ustcwg@gmail.com>
Signed-off-by: Gang Wu <ustcwg@gmail.com>
(cherry picked from commit 3eb423a)
Signed-off-by: Gang Wu <ustcwg@gmail.com>

@kou kou left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment thread cpp/cmake_modules/ThirdpartyToolchain.cmake Outdated
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
@github-actions github-actions Bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jan 14, 2025
@wgtmac wgtmac merged commit 7fc8222 into apache:main Jan 14, 2025
@wgtmac wgtmac removed the awaiting change review Awaiting change review label Jan 14, 2025
@conbench-apache-arrow

Copy link
Copy Markdown

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 7fc8222.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 12 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants