23 changes: 23 additions & 0 deletions .github/workflows/ci.yml
@@ -25,6 +25,25 @@ env:
NIGHTLY_TOOLCHAIN: nightly-2026-02-05

jobs:
duckdb-mirror:
name: "Mirror DuckDB to R2"
if: github.event_name == 'pull_request'
uses: ./.github/workflows/duckdb-r2.yml
secrets: inherit

duckdb-ready:

Contributor

Do we really need this? Or, asked differently, can we encode the dependency?

name: "DuckDB libraries available in R2"
needs: duckdb-mirror
if: ${{ !cancelled() }}
runs-on: ubuntu-latest
timeout-minutes: 5

Contributor

What's the rationale for the timeout duration here?

Contributor Author

Chosen randomly

steps:
- name: Verify DuckDB mirror
if: ${{ needs.duckdb-mirror.result == 'failure' }}

Contributor

There are more result types than success and failure. We could consider if: ${{ needs.duckdb-mirror.result != 'success' }} to signal a failure here.

@0ax1, Jun 22, 2026
Contributor

Apparently there's also: needs.check.outputs.any_missing == 'true'.

run: |
echo "DuckDB mirror failed; downstream builds would 404"
exit 1
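
Note on the result-type thread above: GitHub Actions reports one of four values for needs.<job>.result, namely success, failure, cancelled, or skipped. Because duckdb-mirror only runs on pull_request events, it would report skipped on other triggers, so a bare != 'success' check would likely fail those runs as well. The following is a minimal sketch of one possible policy, written as a shell function so it can be tried locally. The name mirror_gate and the choice to accept skipped and block on cancelled are assumptions for illustration, not part of this PR.

```shell
# Hypothetical helper: map the value of needs.duckdb-mirror.result
# to a gate decision for downstream jobs.
# Assumed policy: "skipped" is acceptable (the mirror job did not apply);
# every other non-success value blocks.
mirror_gate() {
  case "$1" in
    success|skipped)
      echo "proceed"
      ;;
    failure|cancelled)
      echo "block"
      ;;
    *)
      # Unknown value: block and report it on stderr.
      echo "block"
      echo "unexpected result: $1" >&2
      ;;
  esac
}

mirror_gate "skipped"    # prints: proceed
mirror_gate "cancelled"  # prints: block
```

In workflow terms, the same policy could be expressed as a condition that fails the step unless the result is success or skipped.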

lint-toml:
runs-on: ubuntu-latest
timeout-minutes: 10
@@ -115,6 +134,7 @@ jobs:

rust-docs:
name: "Rust (docs)"
needs: duckdb-ready
timeout-minutes: 30
runs-on: >-
${{ github.repository == 'vortex-data/vortex'
@@ -204,6 +224,7 @@ jobs:

rust-lint:
name: "Rust (lint)"
needs: duckdb-ready
timeout-minutes: 30
runs-on: >-
${{ github.repository == 'vortex-data/vortex'
@@ -301,6 +322,7 @@ jobs:

rust-test-other:
name: "Rust tests (${{ matrix.os }})"
needs: duckdb-ready
timeout-minutes: 30
strategy:
fail-fast: false
@@ -422,6 +444,7 @@ jobs:

sqllogic-test:
name: "SQL logic tests"
needs: duckdb-ready
runs-on: >-
${{ github.repository == 'vortex-data/vortex'
&& format('runs-on={0}/runner=amd64-medium/image=ubuntu24-full-x64-pre-v2/tag=sql-logic-test', github.run_id)
197 changes: 197 additions & 0 deletions .github/workflows/duckdb-r2.yml
@@ -0,0 +1,197 @@
name: DuckDB R2 mirror

# Mirror DuckDB libraries referenced by vortex-duckdb/build.rs to R2 when they

Contributor

What does mirror mean exactly? Should we extend the text here a bit on how the whole setup works with R2 and caching?

Contributor Author

+1

# are not present yet. Download tagged archives or build commits from source.
on:
workflow_call: { }

concurrency:
group: duckdb-r2-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: false

permissions:
contents: read

env:
PUBLIC_BASE_URL: "https://ci-builds.vortex.dev"
R2_BUCKET: "duckdb-builds"
R2_ENDPOINT_URL: "https://52bdeab5651e1584747feefd051fd566.r2.cloudflarestorage.com"

jobs:
check:
name: "Resolve DuckDB version and check R2"
runs-on: ubuntu-latest
timeout-minutes: 10
outputs:
version: ${{ steps.resolve.outputs.version }}

Contributor

Do we support individual commits? (By default, DuckDB sets something like version 0.0.0, right?)

Contributor Author

Yes, we support commits.

ref_dir: ${{ steps.resolve.outputs.ref_dir }}
release: ${{ steps.resolve.outputs.release }}
matrix: ${{ steps.resolve.outputs.matrix }}
any_missing: ${{ steps.resolve.outputs.any_missing }}
steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6
- name: Resolve version and check R2
id: resolve
run: |
set -Eeuo pipefail

Contributor

It's a bit complex and long to inline this shell script into the GitHub Action. What do you think?

Contributor Author

+1

version=$(grep -oP 'DEFAULT_DUCKDB_VERSION:\s*&str\s*=\s*"\K[^"]+' \
vortex-duckdb/build.rs)
# Same as in vortex-duckdb/build.rs: >=2 dot-separated numeric
# components is a tagged release (ref dir "vX.Y.Z"), anything
# else is a commit.
ref="${version#v}"
if [[ "$ref" =~ ^[0-9]+(\.[0-9]+)+$ ]]; then
release=true
ref_dir="v$ref"
else
release=false
ref_dir="$ref"
fi
echo "DuckDB $version release=$release"
entries=()
for archive in \
libduckdb-linux-amd64.zip \
libduckdb-linux-arm64.zip \
libduckdb-osx-universal.zip; do
url="${PUBLIC_BASE_URL}/${ref_dir}/${archive}"
code=$(curl -o /dev/null -s -w '%{http_code}' --head "$url" || echo 000)
if [ "$code" = "200" ]; then
echo "present in R2: $archive"
continue
fi
echo "missing in R2 (HTTP $code): $archive"
case "$archive" in
*linux-amd64*) runner="ubuntu-latest"; os="linux"; arch="amd64" ;;
*linux-arm64*) runner="ubuntu-24.04-arm"; os="linux"; arch="arm64" ;;
*osx-universal*) runner="macos-14"; os="osx"; arch="universal" ;;
esac
entries+=("$(jq -nc \
--arg archive "$archive" \
--arg runner "$runner" \
--arg os "$os" \
--arg arch "$arch" \
'{archive: $archive, runner: $runner, os: $os, arch: $arch}')")
done
if [ "${#entries[@]}" -eq 0 ]; then
matrix='{"include":[]}'
any_missing=false
else
include=$(printf '%s\n' "${entries[@]}" | jq -sc '.')

Contributor

If the runners have Python preinstalled, we could consider using it here.

matrix=$(jq -nc --argjson include "$include" '{include: $include}')
any_missing=true
fi
echo "any_missing=$any_missing"
{
echo "version=$version"
echo "ref_dir=$ref_dir"
echo "release=$release"
echo "matrix=$matrix"
echo "any_missing=$any_missing"
} >> "$GITHUB_OUTPUT"
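
Following up on the review suggestion to move the inline script out of the workflow: the version classification and the archive-to-runner mapping could be factored into small functions that are easy to test outside CI. The sketch below assumes the same rule and the same mapping as the script above. The function names classify_ref and runner_for are hypothetical, and the explicit fallback branch for unknown archives is an addition that the PR does not contain.

```shell
# classify_ref VERSION
# Prints "<release> <ref_dir>" using the rule from the workflow:
# two or more dot-separated numeric components mean a tagged release
# (ref dir "vX.Y.Z"); anything else is treated as a commit.
classify_ref() {
  ref="${1#v}"  # drop an optional leading "v"
  if printf '%s\n' "$ref" | grep -Eq '^[0-9]+(\.[0-9]+)+$'; then
    printf 'true v%s\n' "$ref"
  else
    printf 'false %s\n' "$ref"
  fi
}

# runner_for ARCHIVE
# Prints "<runner> <os> <arch>" for a known archive name.
# Returns non-zero for an unknown archive instead of silently
# reusing values from a previous loop iteration.
runner_for() {
  case "$1" in
    *linux-amd64*)   echo "ubuntu-latest linux amd64" ;;
    *linux-arm64*)   echo "ubuntu-24.04-arm linux arm64" ;;
    *osx-universal*) echo "macos-14 osx universal" ;;
    *)
      echo "unknown archive: $1" >&2
      return 1
      ;;
  esac
}

classify_ref "v1.4.0"                      # prints: true v1.4.0
runner_for "libduckdb-linux-arm64.zip"     # prints: ubuntu-24.04-arm linux arm64
```

One consequence of the rule worth noting: a pre-release tag such as v1.4.0-rc1 does not match the numeric pattern, so it would be handled as a commit with ref dir 1.4.0-rc1. The version strings used here are example values only.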
mirror:
name: "Mirror DuckDB ${{ matrix.archive }} to R2"
needs: check
if: >-
needs.check.outputs.any_missing == 'true' &&
github.repository == 'vortex-data/vortex' &&
github.event.pull_request.head.repo.full_name == github.repository
environment: duckdb-build
timeout-minutes: 120
strategy:
fail-fast: false
matrix: ${{ fromJSON(needs.check.outputs.matrix) }}
runs-on: ${{ matrix.runner }}
steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6

- name: Install build dependencies (Linux)
if: needs.check.outputs.release != 'true' && runner.os == 'Linux'
run: |
sudo apt-get update
sudo apt-get install -y ninja-build libcurl4-openssl-dev zip unzip
# MacOS already has ninja and p7zip

- name: Prepare ${{ matrix.archive }}
env:
ARCHIVE: ${{ matrix.archive }}
REF_DIR: ${{ needs.check.outputs.ref_dir }}
RELEASE: ${{ needs.check.outputs.release }}
PLATFORM_OS: ${{ matrix.os }}
run: |
set -Eeuo pipefail
if [ "$RELEASE" = "true" ]; then
echo "Mirroring DuckDB release ${REF_DIR}/${ARCHIVE}"
curl -fSL --retry 3 -o "$ARCHIVE" \
"https://github.com/duckdb/duckdb/releases/download/${REF_DIR}/${ARCHIVE}"
else
echo "Building DuckDB commit ${REF_DIR} from source"
curl -fSL --retry 3 -o duckdb-src.zip \
"https://github.com/duckdb/duckdb/archive/${REF_DIR}.zip"
# macos zip extract error: cannot create
# <...>/issue2628_������.csv Illegal byte sequence
if [ "$PLATFORM_OS" = "osx" ]; then
7z x duckdb-src.zip
else
unzip -q duckdb-src.zip
fi
src_dir="duckdb-${REF_DIR}"
extra=""
if [ "$PLATFORM_OS" = "osx" ]; then
extra="OSX_BUILD_UNIVERSAL=1"
fi
make -C "$src_dir" \
GEN=ninja \
DISABLE_SANITIZER=1 \
THREADSAN=0 \
BUILD_SHELL=false \
BUILD_UNITTESTS=false \
ENABLE_UNITTEST_CPP_TESTS=false \
BUILD_EXTENSIONS="parquet;tpch;tpcds" \
$extra
lib_dir="${src_dir}/build/release/src"
stage="stage"
rm -rf "$stage"

Contributor

How come we need to clear the dir here? Isn't this empty for each new runner?

mkdir -p "$stage"
cp -a "${lib_dir}/libduckdb.so" "$stage/" 2>/dev/null || true
cp -a "${lib_dir}/libduckdb.dylib" "$stage/" 2>/dev/null || true
cp -a "${lib_dir}/libduckdb_static.a" "$stage/"
cp -a "${src_dir}/src/include/duckdb.h" "$stage/" 2>/dev/null || true
cp -a "${src_dir}/src/include/duckdb.hpp" "$stage/" 2>/dev/null || true
( cd "$stage" && zip -r "../${ARCHIVE}" . )
fi
ls -la "$ARCHIVE"
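
A side note on the build step above: the optional macOS flag is passed through an unquoted $extra variable, which works because the value contains no whitespace. The sketch below shows one alternative that collects the arguments in the positional parameters, so each argument stays a single word regardless of its content. The function name duckdb_make_args is hypothetical; the flag values are copied from the step above.

```shell
# duckdb_make_args PLATFORM_OS
# Prints the make arguments for a DuckDB source build, one per line.
duckdb_make_args() {
  platform_os="$1"
  # Reuse the positional parameters as a portable argument list.
  set -- \
    GEN=ninja \
    DISABLE_SANITIZER=1 \
    THREADSAN=0 \
    BUILD_SHELL=false \
    BUILD_UNITTESTS=false \
    ENABLE_UNITTEST_CPP_TESTS=false \
    'BUILD_EXTENSIONS=parquet;tpch;tpcds'
  if [ "$platform_os" = "osx" ]; then
    # Universal binary only on macOS, as in the workflow.
    set -- "$@" OSX_BUILD_UNIVERSAL=1
  fi
  printf '%s\n' "$@"
}
```

In an actual step, the final printf would be replaced by make -C "$src_dir" "$@", so the list is passed to make without further word splitting.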
- name: Upload to R2
env:
AWS_ACCESS_KEY_ID: ${{ secrets.DUCKDB_R2_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.DUCKDB_R2_SECRET_ACCESS_KEY }}
AWS_REGION: "us-east-1"
AWS_ENDPOINT_URL: ${{ env.R2_ENDPOINT_URL }}
run: |
set -Eeuo pipefail
python3 scripts/s3-upload.py \
--bucket "$R2_BUCKET" \
--key "${{ needs.check.outputs.ref_dir }}/${{ matrix.archive }}" \
--body "${{ matrix.archive }}" \
--checksum-algorithm CRC32
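
On the earlier question of what "mirror" means here: judging from this workflow, the check job probes ${PUBLIC_BASE_URL}/${ref_dir}/${archive} over HTTP, and the upload step writes the object key ${ref_dir}/${archive} to the bucket. The public URL is therefore the base URL followed by the object key. The sketch below spells out that correspondence. The helper names r2_key and public_url are hypothetical, the version string is an example value, and the sketch assumes the public host serves the bucket root, which the PR implies but does not state.

```shell
# Base URL copied from the workflow's env block.
PUBLIC_BASE_URL="https://ci-builds.vortex.dev"

# r2_key REF_DIR ARCHIVE
# Object key used by the upload step.
r2_key() {
  printf '%s/%s\n' "$1" "$2"
}

# public_url REF_DIR ARCHIVE
# URL probed by the check job; assumed to map 1:1 onto the object key.
public_url() {
  printf '%s/%s\n' "$PUBLIC_BASE_URL" "$(r2_key "$1" "$2")"
}

r2_key "v1.4.0" "libduckdb-linux-amd64.zip"
public_url "v1.4.0" "libduckdb-linux-amd64.zip"
```

If that assumption holds, an archive uploaded under a given key becomes visible to the next check run at the matching URL, and the mirror job is skipped for it from then on.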
20 changes: 20 additions & 0 deletions .github/workflows/rust-instrumented.yml
@@ -22,8 +22,28 @@ env:
NIGHTLY_TOOLCHAIN: nightly-2026-02-05

jobs:
duckdb-mirror:
name: "Mirror DuckDB to R2"
if: github.event_name == 'pull_request'
uses: ./.github/workflows/duckdb-r2.yml
secrets: inherit

duckdb-ready:
name: "DuckDB libraries available in R2"
needs: duckdb-mirror
if: ${{ !cancelled() }}
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Verify DuckDB mirror
if: ${{ needs.duckdb-mirror.result == 'failure' }}
run: |
echo "DuckDB mirror failed"
exit 1

rust-coverage:
name: "Rust tests (coverage) (${{ matrix.suite }})"
needs: duckdb-ready
timeout-minutes: 30
permissions:
id-token: write