Summary
Add a GitHub Actions workflow that detects performance regressions in the exact-arithmetic benchmark suite on PRs and pushes to main.
Current State
- Exact-arithmetic benchmarks exist in benches/exact.rs (D=2–5, near-singular 3x3)
- scripts/bench_compare.py can compare Criterion baselines and generate markdown tables
- Baselines are saved locally via just bench-save-baseline <TAG>, but there is no CI integration
- The delaunay project has a working implementation (.github/workflows/benchmarks.yml + scripts/benchmark_utils.py) that can serve as a reference
Proposed Changes
GitHub Actions workflow (.github/workflows/benchmarks.yml)
- Trigger: on push to main and on PRs
- Baseline strategy: save a Criterion baseline on main merges as a GitHub Actions artifact; download it on PR runs for comparison
- Regression detection: run cargo bench --features bench,exact --bench exact, compare against the main baseline, and flag regressions above a configurable threshold (e.g., 7.5%)
- Reporting: post a summary comment on the PR with the comparison table (reuse bench_compare.py output or Criterion's built-in comparison)
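The regression-detection and reporting steps above can be sketched in a few lines of Python. This is only an illustration, not the logic of scripts/bench_compare.py: it assumes the benchmark results have already been reduced to a mapping from benchmark name to mean time in nanoseconds, and the names Comparison, compare, regressions, and to_markdown are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    name: str
    baseline_ns: float  # mean time on main (assumed input shape)
    current_ns: float   # mean time on the PR

    @property
    def change_pct(self) -> float:
        # Positive means the PR is slower than the main baseline.
        return (self.current_ns - self.baseline_ns) / self.baseline_ns * 100.0


def compare(baseline: dict[str, float], current: dict[str, float]) -> list[Comparison]:
    """Pair up benchmarks present in both runs, sorted by name."""
    shared = sorted(baseline.keys() & current.keys())
    return [Comparison(n, baseline[n], current[n]) for n in shared]


def regressions(rows: list[Comparison], threshold_pct: float = 7.5) -> list[Comparison]:
    """Return the rows whose slowdown exceeds the threshold."""
    return [r for r in rows if r.change_pct > threshold_pct]


def to_markdown(rows: list[Comparison], threshold_pct: float = 7.5) -> str:
    """Render a comparison table suitable for a PR comment."""
    lines = [
        "| Benchmark | main (ns) | PR (ns) | Change | Status |",
        "|---|---:|---:|---:|---|",
    ]
    for r in rows:
        status = "regression" if r.change_pct > threshold_pct else "ok"
        lines.append(
            f"| {r.name} | {r.baseline_ns:.1f} | {r.current_ns:.1f} "
            f"| {r.change_pct:+.1f}% | {status} |"
        )
    return "\n".join(lines)
```

In a workflow, the script would exit non-zero when regressions(rows) is non-empty, so the job fails, and the output of to_markdown(rows) would become the body of the PR comment.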
Baseline artifact management
- Save baseline artifacts on main branch pushes (retain for ~30 days)
- PRs download the latest main baseline for comparison
- Optionally save tagged-release baselines with longer retention (90 days)
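Picking "the latest main baseline" could look roughly like the sketch below. It assumes the artifact list has already been fetched and that each entry carries the fields name, expired, created_at, and workflow_run.head_branch, as in the GitHub REST API artifact listing. The artifact name criterion-baseline-main and the function names are hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: str) -> datetime:
    # Timestamps are assumed to be ISO 8601 with a trailing "Z"; older
    # Python versions need an explicit UTC offset for fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_main_baseline(
    artifacts: list[dict], name: str = "criterion-baseline-main"
) -> Optional[dict]:
    """Pick the newest unexpired artifact with the given name built from main."""
    candidates = [
        a for a in artifacts
        if a["name"] == name
        and not a["expired"]
        and a["workflow_run"]["head_branch"] == "main"
    ]
    if not candidates:
        return None  # first run, or every baseline has aged out
    return max(candidates, key=lambda a: parse_ts(a["created_at"]))
```

When the function returns None, for example after the ~30-day retention window has lapsed with no pushes to main, the PR run should probably skip the comparison with a notice rather than fail.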
Considerations
- Hardware consistency: GitHub Actions runners have variable performance, so CI may need a higher noise threshold than local benchmark runs
- Runtime: exact benchmarks currently take ~30s with --quick; full run ~2–3 min, acceptable for CI
- False positives: the threshold should be tuned after a few weeks of data; start conservative (e.g., 10%) and tighten later
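One way to tune the threshold once a few weeks of main-branch data exist is to measure how much identical code varies between runs and set the threshold above that noise. The sketch below is a heuristic assumed for illustration, not something the proposal specifies: the input is a mapping from benchmark name to mean times recorded on main, and the margin and floor_pct parameters are hypothetical knobs.

```python
from statistics import median


def relative_deviations(samples_ns: list[float]) -> list[float]:
    """Percent deviation of each sample from the median of all samples."""
    mid = median(samples_ns)
    return [abs(s - mid) / mid * 100.0 for s in samples_ns]


def suggest_threshold(
    history: dict[str, list[float]], margin: float = 1.5, floor_pct: float = 5.0
) -> float:
    """Suggest a threshold: worst observed run-to-run noise times a safety margin."""
    worst = max(
        (
            max(relative_deviations(samples))
            for samples in history.values()
            if len(samples) >= 2  # a single sample says nothing about noise
        ),
        default=0.0,
    )
    # Never go below the floor, even if the recorded runs look very stable.
    return max(floor_pct, worst * margin)
```

If the suggested value stays well below the initial 10% for several weeks, that supports tightening the threshold. If it exceeds 10%, the runners are probably too noisy for that threshold to be meaningful.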
Benefits
- Catch performance regressions before merge
- Automated tracking — no manual benchmark runs needed for regression detection
- Complements the existing local just bench-compare workflow
Implementation Notes
- Reference: ../delaunay/.github/workflows/benchmarks.yml and ../delaunay/scripts/benchmark_utils.py
- Criterion --baseline comparison