ci(package-c): build pytorch plugin without bundled runtime by njzjz · Pull Request #5706 · deepmodeling/deepmd-kit

njzjz · 2026-07-01T22:11:58Z

Summary

- build the C package in the manylinux_2_28 CUDA 12.9 image and install TensorFlow/PyTorch from dependency groups
- package the PyTorch backend plugin while excluding libtorch/CUDA runtime libraries from the tarball
- add package README/download_libtorch.sh guidance and update C package tests for external PyTorch runtime
- avoid CUDA 12.9 CCCL failures from -arch=all by using all-major as the CUDA 12.9 default

Tests

- ruff check .
- ruff format .
- git diff --check
- sh -n source/install/package_c.sh
- bash -n source/install/docker_package_c.sh
- bash -n source/install/docker_test_package_c.sh
- fork CI on bbcd58d: Build C library, Build C++, Build/upload to PyPI, CodeQL, Test C++, and Test Python all passed

Fork CI: https://github.com/njzjz/deepmd-kit/actions/runs/28547373402

Summary by CodeRabbit

New Features
- Expanded C library packaging and installation support for PyTorch alongside TensorFlow and JAX.
- Added packaged runtime guidance so users can download and use the matching PyTorch runtime more easily.
Bug Fixes
- Improved packaged library validation to catch missing shared libraries before release.
- Reduced the chance of bundling incompatible runtime components.
Documentation
- Updated install instructions with clearer PyTorch compatibility and runtime setup steps.


coderabbitai · 2026-07-01T22:28:30Z

📝 Walkthrough

Walkthrough

This PR adds PyTorch backend support to the C library packaging pipeline: CMake filters/bundles runtime dependencies via configurable exclude regexes, packaging and Docker scripts generate a libtorch downloader when PyTorch is enabled, CI validates the packaged tarball and tests PyTorch runtime loading, docs are updated, and CUDA architecture defaults are refined.

Changes

PyTorch runtime packaging

Layer / File(s)	Summary
CMake runtime dependency exclusion and install logic `source/api_c/CMakeLists.txt`	Adds cached pre/post-exclude regex variables, rebuilds runtime library list with backend plugin targets, injects regexes into `file(GET_RUNTIME_DEPENDENCIES)`, and adjusts install targets for libname, LIB_DEEPMD_OP, and pt/pd backends.
package_c.sh PyTorch-aware build and libtorch downloader `source/install/package_c.sh`	Updates path handling, adds Python/CUDA nvrtc detection, parameterizes CMake with PyTorch/exclude-regex options, and generates `download_libtorch.sh`, README, and `libtorch_env.sh` when PyTorch is enabled.
Docker packaging container migration `source/install/docker_package_c.sh`	Switches from a tensorflow/build image to a configurable manylinux_cuda image, adds Python/PyTorch group environment defaults, and installs dependencies via `uv` before invoking `package_c.sh`.
Docker test container PyTorch runtime validation `source/install/docker_test_package_c.sh`	Rewritten to use manylinux 2.28, compiles/runs C examples, and conditionally validates PyTorch runtime loading via `ldd` and `ctypes.CDLL` when `CHECK_PYTORCH_RUNTIME` is set.
CI workflow tarball validation and testing `.github/workflows/package_c.yml`	Adds matrix exclude regexes, enables PyTorch packaging, verifies tarball contents to prevent bundling PyTorch/CUDA runtime libs, writes a step summary with the download URL, and enables PyTorch runtime test checks.
Installation documentation update `doc/install/install-from-c-library.md`	Adds PyTorch to supported backends and documents libtorch version matching, `LD_LIBRARY_PATH` setup, and use of `download_libtorch.sh`/`libtorch_env.sh`.
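
As a rough illustration of the tarball content check described in the workflow row above, here is a minimal Python sketch. The `FORBIDDEN` pattern and the `bundled_runtime_libs` helper are hypothetical; the actual exclude regexes used in `.github/workflows/package_c.yml` are not shown in this walkthrough.

```python
import io
import re
import tarfile

# Hypothetical pattern for runtime libraries that should stay out of the tarball.
# The real workflow defines its own exclude regexes, which may differ.
FORBIDDEN = re.compile(
    r"(^|/)(libtorch[^/]*|libc10[^/]*|libcudart|libcublas|libcudnn)\.so"
)


def bundled_runtime_libs(tar):
    """Return member names of an open TarFile that match the forbidden pattern."""
    return [name for name in tar.getnames() if FORBIDDEN.search(name)]


if __name__ == "__main__":
    # Build a tiny in-memory tarball so the sketch runs without a real package.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as out:
        for name in ("libdeepmd_c/lib/libdeepmd_c.so", "libdeepmd_c/lib/libtorch_cpu.so"):
            out.addfile(tarfile.TarInfo(name))  # zero-length placeholder entries
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r:gz") as tar:
        print(bundled_runtime_libs(tar))
```

A CI step built on this idea would fail the job when the returned list is non-empty.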

CUDA architecture default adjustment

Layer / File(s)	Summary
CUDA toolkit version-based architecture default `source/lib/src/gpu/CMakeLists.txt`	Moves `find_package(CUDAToolkit REQUIRED)` before architecture defaulting and sets `CMAKE_CUDA_ARCHITECTURES` to `all-major` for CUDA toolkit versions from 12.9 up to, but not including, 13.0; otherwise `all`.

Estimated code review effort: 4 (Complex) | ~60 minutes

Sequence Diagram(s)

sequenceDiagram
    participant Workflow as CI Workflow
    participant DockerTest as docker_test_package_c.sh
    participant Container as manylinux Container
    participant PythonCheck as ctypes.CDLL check
    Workflow->>DockerTest: run with CHECK_PYTORCH_RUNTIME=1
    DockerTest->>Container: extract libdeepmd_c.tar.gz, compile examples
    Container->>Container: install uv, install PyTorch deps
    Container->>Container: set LD_LIBRARY_PATH, run ldd checks
    Container->>PythonCheck: load shared objects via ctypes.CDLL
    PythonCheck-->>DockerTest: pass/fail result
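
The final step of the diagram, loading shared objects through `ctypes.CDLL`, can be sketched roughly as follows. The `check_loadable` helper and its glob pattern are assumptions made for illustration; the actual check in `docker_test_package_c.sh` may be structured differently.

```python
import ctypes
from pathlib import Path


def check_loadable(lib_dir, pattern="*.so*"):
    """Try to dlopen every matching file; return {file name: error message}."""
    failures = {}
    for path in sorted(Path(lib_dir).glob(pattern)):
        try:
            # CDLL raises OSError when the file or one of its dependencies
            # cannot be resolved by the dynamic linker.
            ctypes.CDLL(str(path))
        except OSError as exc:
            failures[path.name] = str(exc)
    return failures


if __name__ == "__main__":
    import sys

    # Usage: python check_loadable.py <lib directory>; exit status 1 on failures.
    problems = check_loadable(sys.argv[1] if len(sys.argv) > 1 else ".")
    for name, reason in problems.items():
        print(f"{name}: {reason}", file=sys.stderr)
    sys.exit(1 if problems else 0)
```

Running it with `LD_LIBRARY_PATH` pointing at the external libtorch `lib` directory would surface unresolved dependencies as load errors.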

Suggested labels: enhancement

Suggested reviewers: wanghan-iapcm, iProzd

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly describes the main change: building the package-c PyTorch plugin without bundling its runtime.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@source/install/package_c.sh`:
- Around line 119-215: The PyTorch runtime URL derivation in package_c.sh can
fail silently when ENABLE_PYTORCH is true but PYTHON_BIN is missing or the torch
version heuristic does not produce a supported URL. Add a clear warning or
diagnostic in the URL-generation block before the README and
download_libtorch.sh generation is skipped, using the existing
PYTORCH_RUNTIME_DOWNLOAD_URL, ENABLE_PYTORCH, and PYTHON_BIN checks to explain
why no PyTorch runtime package was emitted. Keep the warning tied to the same
install-script flow so local/manual runs surface the missing libtorch packaging
immediately.
- Around line 122-138: The libtorch URL generation in the torch version handling
logic is stripping nightly `.devYYYYMMDD` data by splitting on `.dev`, so fix
the version parsing in the package_c script to preserve the full nightly version
string when building the download URL. Update the `version`/`variant` extraction
path around the `torch.__version__` and `torch.version.cuda` handling so nightly
builds keep their `.dev` suffix while still producing the correct `cpu` or `cu*`
libtorch archive name.

In `@source/lib/src/gpu/CMakeLists.txt`:
- Around line 9-17: The CUDA architecture fallback in CMakeLists.txt is too
narrow because the `all-major` branch in the `find_package(CUDAToolkit)` /
`CMAKE_CUDA_ARCHITECTURES` logic stops at CUDA 12.x. Update the version check so
the `all-major` fallback also applies to CUDA 13.x, keeping the existing
`CUDAToolkit_VERSION` guard and the `set(CMAKE_CUDA_ARCHITECTURES ...)` logic in
place while extending the upper bound appropriately.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3872c764-c654-40a6-9a12-928f465e34ab

📥 Commits

Reviewing files that changed from the base of the PR and between 42de67e and bbcd58d.

📒 Files selected for processing (7)

.github/workflows/package_c.yml
doc/install/install-from-c-library.md
source/api_c/CMakeLists.txt
source/install/docker_package_c.sh
source/install/docker_test_package_c.sh
source/install/package_c.sh
source/lib/src/gpu/CMakeLists.txt

coderabbitai · 2026-07-01T22:28:33Z

+if [ -z "${PYTORCH_RUNTIME_DOWNLOAD_URL:-}" ] && [ "${ENABLE_PYTORCH:-FALSE}" = "TRUE" ] && [ -n "${PYTHON_BIN:-}" ]; then
+	PYTORCH_RUNTIME_DOWNLOAD_URL=$(
+		"${PYTHON_BIN}" - <<'PY'
+from urllib.parse import quote
+
+import torch
+
+version = torch.__version__.split(".dev", 1)[0]
+if "+" in version:
+    variant = version.split("+", 1)[1]
+else:
+    cuda_version = torch.version.cuda
+    variant = "cu" + cuda_version.replace(".", "") if cuda_version else "cpu"
+    version = f"{version}+{variant}"
+
+if variant == "cpu" or variant.startswith("cu"):
+    print(
+        f"https://download.pytorch.org/libtorch/{variant}/"
+        f"libtorch-shared-with-deps-{quote(version, safe='')}.zip"
+    )
+PY
+	)
+fi
+
+if [ -n "${PYTORCH_RUNTIME_DOWNLOAD_URL:-}" ]; then
+	cat >"${BUILD_TMP_DIR}/libdeepmd_c/README.md" <<EOF
+This DeePMD-kit C package was built with PyTorch support, but PyTorch runtime libraries are not bundled.
+
+To use the PyTorch C/C++ backend, install a libtorch runtime that exactly matches the PyTorch version used at build time:
+
+${PYTORCH_RUNTIME_DOWNLOAD_URL}
+
+The PyTorch version must match exactly. The CUDA variant may be omitted only when the target runtime is compatible with the models and hardware you use.
+Make the libtorch lib directory discoverable by the dynamic linker, for example by adding it to LD_LIBRARY_PATH.
+Run ./download_libtorch.sh from this directory to download and unpack the matching libtorch runtime.
+EOF
+	cat >"${BUILD_TMP_DIR}/libdeepmd_c/download_libtorch.sh" <<'EOF'
+#!/bin/sh
+
+set -eu
+
+LIBTORCH_DOWNLOAD_URL="__PYTORCH_RUNTIME_DOWNLOAD_URL__"
+SCRIPT_DIR=$(CDPATH= cd -- "$(dirname -- "$0")" && pwd -P)
+DEST_DIR=${1:-"${SCRIPT_DIR}"}
+ARCHIVE_PATH=${LIBTORCH_ARCHIVE:-"${SCRIPT_DIR}/libtorch.zip"}
+
+mkdir -p "${DEST_DIR}"
+
+if [ -d "${DEST_DIR}/libtorch" ]; then
+	echo "libtorch already exists at ${DEST_DIR}/libtorch"
+else
+	echo "Downloading ${LIBTORCH_DOWNLOAD_URL}"
+	if command -v curl >/dev/null 2>&1; then
+		curl -L --fail --retry 3 -o "${ARCHIVE_PATH}" "${LIBTORCH_DOWNLOAD_URL}"
+	elif command -v wget >/dev/null 2>&1; then
+		wget -O "${ARCHIVE_PATH}" "${LIBTORCH_DOWNLOAD_URL}"
+	else
+		echo "curl or wget is required to download libtorch." >&2
+		exit 1
+	fi
+
+	echo "Extracting ${ARCHIVE_PATH} to ${DEST_DIR}"
+	if command -v unzip >/dev/null 2>&1; then
+		unzip -q -o "${ARCHIVE_PATH}" -d "${DEST_DIR}"
+	elif command -v python3 >/dev/null 2>&1; then
+		ZIP_ARCHIVE="${ARCHIVE_PATH}" ZIP_DEST_DIR="${DEST_DIR}" python3 - <<'PY'
+import os
+import zipfile
+
+with zipfile.ZipFile(os.environ["ZIP_ARCHIVE"]) as zip_file:
+    zip_file.extractall(os.environ["ZIP_DEST_DIR"])
+PY
+	elif command -v python >/dev/null 2>&1; then
+		ZIP_ARCHIVE="${ARCHIVE_PATH}" ZIP_DEST_DIR="${DEST_DIR}" python - <<'PY'
+import os
+import zipfile
+
+with zipfile.ZipFile(os.environ["ZIP_ARCHIVE"]) as zip_file:
+    zip_file.extractall(os.environ["ZIP_DEST_DIR"])
+PY
+	else
+		echo "unzip or python is required to extract libtorch." >&2
+		exit 1
+	fi
+fi
+
+cat >"${SCRIPT_DIR}/libtorch_env.sh" <<EOF_ENV
+export LD_LIBRARY_PATH="${DEST_DIR}/libtorch/lib:\${LD_LIBRARY_PATH:-}"
+EOF_ENV
+
+echo "libtorch is available at ${DEST_DIR}/libtorch"
+echo "Run this before using the PyTorch C/C++ backend:"
+echo "  . ${SCRIPT_DIR}/libtorch_env.sh"
+EOF
+	sed -i "s#__PYTORCH_RUNTIME_DOWNLOAD_URL__#${PYTORCH_RUNTIME_DOWNLOAD_URL}#g" "${BUILD_TMP_DIR}/libdeepmd_c/download_libtorch.sh"
+	chmod +x "${BUILD_TMP_DIR}/libdeepmd_c/download_libtorch.sh"
+fi


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Silent skip when PyTorch download URL can't be derived.

If ENABLE_PYTORCH=TRUE but PYTHON_BIN is unset, or the version/variant heuristic yields no URL (e.g., unsupported variant), PYTORCH_RUNTIME_DOWNLOAD_URL stays empty and the README/download_libtorch.sh generation is silently skipped with no diagnostic. CI happens to catch this via the tarball content check, but local/manual invocations get no explanation.

💬 Suggested diagnostic warning

if [ -z "${PYTORCH_RUNTIME_DOWNLOAD_URL:-}" ] && [ "${ENABLE_PYTORCH:-FALSE}" = "TRUE" ] && [ -n "${PYTHON_BIN:-}" ]; then
	PYTORCH_RUNTIME_DOWNLOAD_URL=$(
		"${PYTHON_BIN}" - <<'PY'
...
PY
	)
fi
+
+if [ "${ENABLE_PYTORCH:-FALSE}" = "TRUE" ] && [ -z "${PYTORCH_RUNTIME_DOWNLOAD_URL:-}" ]; then
+	echo "WARNING: could not derive a PyTorch runtime download URL; README/download_libtorch.sh will not be generated." >&2
+fi



coderabbitai · 2026-07-01T22:28:33Z

+from urllib.parse import quote
+
+import torch
+
+version = torch.__version__.split(".dev", 1)[0]
+if "+" in version:
+    variant = version.split("+", 1)[1]
+else:
+    cuda_version = torch.version.cuda
+    variant = "cu" + cuda_version.replace(".", "") if cuda_version else "cpu"
+    version = f"{version}+{variant}"
+
+if variant == "cpu" or variant.startswith("cu"):
+    print(
+        f"https://download.pytorch.org/libtorch/{variant}/"
+        f"libtorch-shared-with-deps-{quote(version, safe='')}.zip"
+    )


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

What is the exact filename format for PyTorch nightly libtorch downloads on download.pytorch.org?

💡 Result:

The filename format for LibTorch nightly downloads hosted on download.pytorch.org generally follows a pattern based on the build variant, platform, and version [1][2].

For the latest nightly build, the filename typically uses the suffix -latest.zip [3][4][2]. Common patterns include:
- Generic/CPU: libtorch-shared-with-deps-latest.zip [3][2]
- CUDA-enabled: libtorch-shared-with-deps-latest.zip (located under the specific cuXXX subdirectory) [4][5][2]
- Windows: libtorch-win-shared-with-deps-latest.zip [1]
- macOS (ARM64): libtorch-macos-arm64-latest.zip [1]

For specific versioned nightly builds, the format incorporates the package version, development date, and sometimes CUDA compute architecture [5][6][7][2]:
- Pattern: {prefix}-{version}.dev{date}+{cuda_arch}.zip [5][7]
- Example: libtorch-shared-with-deps-2.8.0.dev20250507+cu128.zip [5][7]

The {prefix} (e.g., libtorch-shared-with-deps or libtorch-win-shared-with-deps) is determined by the platform and build requirements [1].

Note that as of June 2026, requests for these .zip files on download.pytorch.org issue a 301 redirect to download-r2.pytorch.org, so users should ensure their download tools support following redirects (e.g., using curl -L) [8].

Additionally, the use of the cxx11-abi tag in filenames has been largely deprecated in favor of standard binaries, as modern PyTorch builds have shifted to the C++11 ABI by default [5][7].

Citations:

1: https://github.com/pytorch/pytorch/blob/915982a4/.ci/libtorch/extract_libtorch_from_wheel.py

2: Versioned nightly URL for LibTorch with CUDA 10.0 doesn't work pytorch/pytorch#23039

3: https://github.com/pytorch/pytorch/blob/main/docs/cpp/source/installing.rst

4: https://discuss.pytorch.org/t/libtorch-nightly-versions/219838/4

5: https://discuss.pytorch.org/t/libtorch-nightly-versions/219838/1

6: https://dev-discuss.pytorch.org/t/confusion-about-libtorch-binary-naming-and-abi-on-get-started-page/3220

7: https://discuss.pytorch.org/t/libtorch-nightly-versions/219838

8: https://dev-discuss.pytorch.org/t/heads-up-libtorch-downloads-now-redirect-to-download-r2-pytorch-org-by-default-use-curl-l/3395

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the target file around the cited lines.
sed -n '1,220p' source/install/package_c.sh

# Find where this libtorch URL logic is used and whether nightly versions are handled elsewhere.
rg -n "libtorch|\\.dev|torch.__version__|download_libtorch|quote\\(" source/install source || true

Repository: deepmodeling/deepmd-kit

Length of output: 50380

🌐 Web query:

PyTorch libtorch nightly filename format official docs dev date suffix download.pytorch.org libtorch-shared-with-deps-2.8.0.dev20250507+cu128.zip

💡 Result:

The filename format for PyTorch LibTorch nightly builds typically follows a structured pattern used to identify the version, variant, and build date [1][2]. A standard nightly filename generally adheres to the following convention:

libtorch-{variant}-{version}.dev{YYYYMMDD}+{cuda_version}.zip [3]

Breaking down your example, libtorch-shared-with-deps-2.8.0.dev20250507+cu128.zip:
1. Prefix (libtorch-shared-with-deps): This indicates the LibTorch distribution type. Common variants include shared-with-deps and shared-without-deps [1][4].
2. Version (2.8.0): The base PyTorch release version under development [3].
3. Dev Date Suffix (.devYYYYMMDD): Represents the development snapshot date (e.g., dev20250507 corresponds to May 7, 2025) [3].
4. CUDA Suffix (+cu128): Specifies the CUDA toolkit version (e.g., cu128 for CUDA 12.8) [3].

Key points regarding these files:
- Latest Builds: PyTorch provides "latest" symlink versions (e.g., libtorch-shared-with-deps-latest.zip) to always point to the most recent daily build, which is recommended for most users [5][3].
- ABI Conventions: In the past, binaries explicitly tagged with cxx11-abi were common. However, as PyTorch has standardized its C++ ABI, these tags are often omitted in newer nightly builds [3].
- Official Source: All official binaries are hosted at https://download.pytorch.org/libtorch/ [5][6]. Specific nightly builds are categorized by their CUDA or CPU configuration under the /nightly/ directory (e.g., download.pytorch.org/libtorch/nightly/cu128/) [5][4][3].

For the most reliable access, it is recommended to use the "latest" link provided on the official PyTorch Get Started page, as specific versioned filenames are snapshots that change daily [5][3].

Citations:

1: https://github.com/pytorch/pytorch/blob/e9ebbd3b/.ci/libtorch/extract_libtorch_from_wheel.py

2: Fix Windows libtorch x86_64 and arm64 packages overwriting each other pytorch/pytorch#187837

3: https://discuss.pytorch.org/t/libtorch-nightly-versions/219838

4: Libtorch binaries compiled with flag _GLIBCXX_USE_CXX11_ABI = 1 pytorch/pytorch#17492

5: https://github.com/pytorch/pytorch/blob/main/docs/cpp/source/installing.rst

6: https://dev-discuss.pytorch.org/t/confusion-about-libtorch-binary-naming-and-abi-on-get-started-page/3220

Preserve the nightly .devYYYYMMDD suffix in the libtorch URL
torch.__version__.split(".dev", 1)[0] turns nightly versions like 2.8.0.dev20250507+cu128 into 2.8.0+cu128, which generates a stable-release filename instead of the nightly archive. source/install/package_c.sh:126-137
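
One way to keep the nightly suffix is to split only on the local version label (`+...`) and leave `.devYYYYMMDD` in place. The sketch below is illustrative, not the fix adopted in this PR: `libtorch_url` is a hypothetical helper that takes the version strings as arguments instead of importing `torch`, and the `nightly/` path segment is an assumption based on the web query result above, so the resulting URL should be verified before relying on it.

```python
from urllib.parse import quote


def libtorch_url(torch_version, cuda_version=None):
    """Return a candidate libtorch URL, or None for unsupported variants."""
    # Split off the local label ("+cu128") but keep any ".devYYYYMMDD" suffix.
    public, _, variant = torch_version.partition("+")
    if not variant:
        # No local label: derive the variant from the CUDA version, if any.
        variant = "cu" + cuda_version.replace(".", "") if cuda_version else "cpu"
    if variant != "cpu" and not variant.startswith("cu"):
        return None
    # Assumption: nightly archives live under a "nightly/" subdirectory.
    channel = "nightly/" if ".dev" in public else ""
    version = quote(f"{public}+{variant}", safe="")
    return (
        f"https://download.pytorch.org/libtorch/{channel}{variant}/"
        f"libtorch-shared-with-deps-{version}.zip"
    )


if __name__ == "__main__":
    # In package_c.sh these values would come from torch.__version__ and
    # torch.version.cuda; here they are passed in directly.
    print(libtorch_url("2.8.0.dev20250507+cu128"))
    print(libtorch_url("2.7.1", "12.8"))
```

Returning `None` for unsupported variants mirrors the existing behavior of printing nothing, which leaves `PYTORCH_RUNTIME_DOWNLOAD_URL` empty.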


coderabbitai · 2026-07-01T22:28:33Z

+  find_package(CUDAToolkit REQUIRED)
  if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
-    set(CMAKE_CUDA_ARCHITECTURES all)
+    # CUDA 12.9 CCCL fails to compile CUB/Thrust with -arch=all.
+    if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "12.9" AND CUDAToolkit_VERSION
+                                                            VERSION_LESS "13.0")
+      set(CMAKE_CUDA_ARCHITECTURES all-major)
+    else()
+      set(CMAKE_CUDA_ARCHITECTURES all)
+    endif()
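
To make the version guard in this hunk easier to reason about, here is a small Python sketch of the same half-open range test. `parse_version` and `default_cuda_architectures` are hypothetical names used only for illustration; the real logic lives in CMake and relies on `VERSION_GREATER_EQUAL` and `VERSION_LESS`.

```python
def parse_version(text, width=4):
    """Parse "12.9.86" into a fixed-width tuple, padding missing parts with 0."""
    parts = [int(p) for p in text.split(".")[:width]]
    return tuple(parts + [0] * (width - len(parts)))


def default_cuda_architectures(toolkit_version, lower="12.9", upper="13.0"):
    """Mirror the hunk: all-major for lower <= version < upper, otherwise all."""
    version = parse_version(toolkit_version)
    if parse_version(lower) <= version < parse_version(upper):
        return "all-major"
    return "all"


if __name__ == "__main__":
    for candidate in ("12.8.1", "12.9.86", "13.0.0"):
        print(candidate, default_cuda_architectures(candidate))
    # Raising the upper bound (14.0 is an arbitrary illustrative value)
    # extends the all-major default to 13.x toolkits.
    print("13.0.0", default_cuda_architectures("13.0.0", upper="14.0"))
```

With the bounds as written in the hunk, a 13.x toolkit falls through to `all`, which is the case the review comment below questions.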


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does CUDA 13.0 or later still fail to compile CUB/Thrust with -arch=all (CCCL issue)?

💡 Result:

Yes, compilation issues related to the use of -arch=all with CUDA C++ Core Compute Libraries (CCCL), which includes Thrust and CUB, have been reported in the CUDA 13.x timeframe [1][2][3]. Specifically, issues have stemmed from the preprocessor reaching token limits or encountering macro expansion errors when generating the architecture-specific namespace strings required by CCCL headers when -arch=all is used [1]. For instance, a known bug reported in 2025 involved an error where the _CCCL_PP_SPLICE_WITH_IMPL macro failed because it was passed more arguments than it was designed to handle when processing the extensive list of architectures enabled by -arch=all [1].

While fixes have been implemented for such specific preprocessor limitations (e.g., in CCCL PR #4972), users may continue to encounter various compilation challenges when using -arch=all due to the sheer complexity of the generated headers or other edge-case interactions within the CCCL preprocessor machinery [1][3].

Additionally, users should be aware that the general approach of using -arch=all to create universal binaries has been discouraged or has faced functional limitations in recent CUDA versions, as the resulting binaries may not be truly universal due to PTX versioning, driver compatibility requirements, and the removal of support for older architectures (e.g., Maxwell, Pascal, and Volta support was removed in CUDA 13.0) [4][5]. It is generally recommended to explicitly target the required architectures for a specific deployment environment rather than relying on -arch=all [4].

Citations:

1: [BUG]: nvcc -arch=all fails with cub/cub.cuh included due to CCCL_PP_SPLICE_WITH issue NVIDIA/cccl#4967

2: [BUG]: CUDA 13, nvcc compilation error when including <thrust/execution_policy.h> with THRUST_DEVICE_SYSTEM_OMP in cpp file NVIDIA/cccl#5747

3: [BUG]: Compile failure when -DDEFAULT is set in nvcc NVIDIA/cccl#9406

4: https://forums.developer.nvidia.com/t/cuda-12-13-arch-flag-no-longer-produces-universal-binaries/342983/9

5: https://docs.nvidia.com/cuda/archive/13.0.3/cuda-toolkit-release-notes/index.html

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== File context ==\n'
git ls-files 'source/lib/src/gpu/CMakeLists.txt'
wc -l source/lib/src/gpu/CMakeLists.txt
sed -n '1,80p' source/lib/src/gpu/CMakeLists.txt

printf '\n== Search for CUDA architecture handling ==\n'
rg -n "CMAKE_CUDA_ARCHITECTURES|all-major|all" source/lib/src/gpu source/lib -g '!**/build/**' || true

Repository: deepmodeling/deepmd-kit

Length of output: 50379

🌐 Web query:

CUDA 13.0 release notes support for older architectures removed CUB Thrust -arch=all CCCL preprocessor issue

💡 Result:

In CUDA Toolkit 13.0, NVIDIA removed support for several older GPU architectures and introduced significant changes to the CUDA Core Compute Libraries (CCCL).

GPU Architecture Support

CUDA 13.0 removed offline compilation and library support for GPU architectures prior to the Turing generation [1][2][3]. This includes Maxwell, Pascal, and Volta architectures (compute capabilities earlier than 7.5) [2][4][3]. These architectures are now considered feature-complete, and developers must use CUDA Toolkit 12.x series to build applications targeting them [1][5].

CCCL and Preprocessor Issues

With the introduction of CCCL 3.0 in CUDA 13.0, several changes and known issues occurred:
1. Header Reorganization: CCCL headers were moved to a new top-level directory structure (${CTK_ROOT}/include/cccl/...) to prevent conflicts [6][7]. For projects not using CMake with the CCCL::CCCL target, developers must manually add the new include path to their build systems [6][7].
2. CCCL Preprocessor Issue with -arch=all: A bug was identified where using nvcc with the -arch=all flag while including CUB or Thrust headers caused compilation errors [8]. The issue stemmed from an overflow in the CCCL preprocessor macro _CCCL_PP_SPLICE_WITH_IMPLx, which could not handle the growing number of GPU architectures [8]. A fix was implemented in the CCCL repository (PR #4972) to increase the argument capacity of this macro [8].
3. Other Compatibility Issues: Additional issues were reported when compiling with CUDA 13.0 nvcc in host-compiled files (.cpp) when using specific system definitions, such as THRUST_DEVICE_SYSTEM_OMP [9]. These were generally addressed by patching guard clauses in the affected PTX helper headers within the CCCL library [9].

Developers are encouraged to consult the official CCCL 3.0 migration guide for detailed instructions on updating build paths and handling these breaking changes [6][7].

Citations:

1: https://docs.nvidia.com/cuda/archive/13.0.0/cuda-toolkit-release-notes/

2: https://docs.nvidia.com/cuda/archive/13.0.3/cuda-toolkit-release-notes/index.html

3: https://developer.nvidia.com/blog/whats-new-and-important-in-cuda-toolkit-13-0/

4: https://developer.nvidia.com/blog/navigating-gpu-architecture-support-a-guide-for-nvidia-cuda-developers/

5: https://docs.nvidia.com/cuda/archive/13.0.1/cuda-toolkit-release-notes/index.html

6: https://nvidia.github.io/cccl/unstable/cccl/3.0_migration_guide.html

7: https://github.com/NVIDIA/cccl/releases/tag/v3.0.0

8: [BUG]: nvcc -arch=all fails with cub/cub.cuh included due to CCCL_PP_SPLICE_WITH issue NVIDIA/cccl#4967

9: [BUG]: CUDA 13, nvcc compilation error when including <thrust/execution_policy.h> with THRUST_DEVICE_SYSTEM_OMP in cpp file NVIDIA/cccl#5747

🌐 Web query:

CUDA 13.2 release notes CCCL -arch=all compiler flag issue Thrust CUB

💡 Result:

There is no documented "issue" specifically named the "-arch=all compiler flag issue" in the official CUDA 13.2 release notes for CCCL, Thrust, or CUB [1]. However, the use of -arch=all has been associated with preprocessor and macro-related compilation errors in previous versions of the CUDA Core Compute Libraries (CCCL) [2]. Specifically, users have reported that when compiling with nvcc using -arch=all while including CUB headers (e.g., cub/cub.cuh), the compiler may fail due to limitations in internal CCCL preprocessor macros (such as _CCCL_PP_SPLICE_WITH) that are responsible for handling the increasingly large list of CUDA architectures [2].

If you are encountering compilation errors with -arch=all in CUDA 13.2, they are likely related to:

1. Macro Limitations: The preprocessor may struggle to build the architecture string if the number of supported architectures exceeds the capacity of specific internal macros (a known issue that has been patched in previous CCCL versions) [2].
2. Environment-Specific Errors: While not specifically tied to -arch=all, users have reported that CUB code (e.g., device_transform.cuh) can encounter issues with specific host compilers (like GCC on ARM/SBSA) in CUDA 13.2 due to non-conforming C++ syntax that nvcc may silently accept but host compilers treat as errors [3].

To resolve or investigate such issues, it is recommended to:

- Check the official NVIDIA CCCL GitHub repository for current open issues and discussions, as these are often tracked there rather than in the general CUDA Toolkit release notes [4][2][3].
- If you are building for a specific set of hardware, replace -arch=all with more explicit architecture flags (e.g., -arch=sm_80) to reduce the complexity of the architecture string handled by the preprocessor [5].
- Ensure you are using the latest patch release for the CUDA Toolkit, as patch releases often contain fixes for library-specific compilation issues [1].

Citations:

1: https://docs.nvidia.com/cuda/archive/13.2.1/pdf/CUDA_Toolkit_Release_Notes.pdf

2: [BUG]: nvcc -arch=all fails with cub/cub.cuh included due to CCCL_PP_SPLICE_WITH issue NVIDIA/cccl#4967

3: CUB: ill-formed class-head name in device_transform.cuh breaks GCC (CUDA 13.2) NVIDIA/cccl#8833

4: https://github.com/NVIDIA/cccl

5: https://docs.nvidia.com/cuda/archive/13.2.0/cuda-features-archive/index.html

Extend the all-major fallback to CUDA 13.x

The `< 13.0` cutoff is too narrow. CUDA 13.x still has reported CCCL/Thrust/CUB compile failures with `-arch=all`, so falling back to `all` for CUDA 13.x can reintroduce the build breakage this guard is meant to avoid.
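
A minimal sketch of what the widened guard could look like. The actual contents of lines 9 - 17 are not reproduced in this comment, so the overall structure, the `NOT DEFINED` check, and the 12.9 lower bound are assumptions based on the PR summary. The `all-major` and `all` values for `CMAKE_CUDA_ARCHITECTURES` require CMake 3.23 or newer.

```cmake
# Sketch only: the real guard in source/lib/src/gpu/CMakeLists.txt may differ.
find_package(CUDAToolkit REQUIRED)

# Respect an architecture list supplied by the user or a toolchain file.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  # Assumed lower bound: the PR summary reports -arch=all failures with CUDA 12.9.
  # There is no VERSION_LESS "13.0" upper bound, so CUDA 13.x takes this branch too.
  if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "12.9")
    set(CMAKE_CUDA_ARCHITECTURES all-major)
  else()
    set(CMAKE_CUDA_ARCHITECTURES all)
  endif()
endif()
```

Dropping the upper bound, rather than raising it to 14.0, keeps the fallback in effect for later toolkits as well. Whether that is desirable depends on when a fixed CCCL ships with the toolkit, which this comment does not establish.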

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/lib/src/gpu/CMakeLists.txt` around lines 9 - 17, The CUDA architecture fallback in CMakeLists.txt is too narrow because the `all-major` branch in the `find_package(CUDAToolkit)` / `CMAKE_CUDA_ARCHITECTURES` logic stops at CUDA 12.x. Update the version check so the `all-major` fallback also applies to CUDA 13.x, keeping the existing `CUDAToolkit_VERSION` guard and the `set(CMAKE_CUDA_ARCHITECTURES ...)` logic in place while extending the upper bound appropriately.

codecov · 2026-07-01T23:16:16Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.08%. Comparing base (ee30956) to head (bbcd58d).
⚠️ Report is 6 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5706      +/-   ##
==========================================
- Coverage   81.20%   81.08%   -0.12%     
==========================================
  Files         978      980       +2     
  Lines      108858   109350     +492     
  Branches     4139     4207      +68     
==========================================
+ Hits        88393    88669     +276     
- Misses      18946    19156     +210     
- Partials     1519     1525       +6


ci(package-c): build pytorch plugin without bundled runtime

bbcd58d

Copilot AI review requested due to automatic review settings July 1, 2026 22:11

dosubot Bot added the build label Jul 1, 2026

Copilot AI reviewed Jul 1, 2026

github-actions Bot added Core CUDA ROCM Docs C labels Jul 1, 2026

coderabbitai Bot reviewed Jul 1, 2026

njzjz marked this pull request as draft July 2, 2026 05:38

njzjz wants to merge 1 commit into deepmodeling:master from njzjz:ci/package-c-pytorch-runtime-external
