
fix(r-pipeline): drop broken renv.lock; clean up python_version in artifact YAML #6950

Merged
MarkusNeusinger merged 3 commits into main from fix/r-install-and-drop-python-version
May 16, 2026

@MarkusNeusinger
Owner

Two related cleanups to the R/ggplot2 pipeline so it actually completes end-to-end.

1. setup-r: install via install.packages(..., dependencies = TRUE) instead of renv::restore()

Symptom

impl-repair for scatter-basic / ggplot2 (run 25973551110) failed at the smoke-test step:

Error: package or namespace load failed for 'ggplot2' in loadNamespace(...): there is no package called 'munsell'

Root cause

The hand-written renv.lock listed only 15 top-level packages (ggplot2, scales, tibble, …). renv::restore() does not auto-resolve transitive dependencies that aren't pinned in the lockfile, so munsell (required by scales), plus gtable, colorspace, R6, RColorBrewer, isoband, farver, labeling, MASS, mgcv, … were never installed into the renv-managed library.

Earlier runs only succeeded because the runner image's pre-baked R library was also on .libPaths(). When the cache-restoration path scoped .libPaths() to the renv-managed library (/home/runner/work/_temp/Library), library(ggplot2) could no longer find munsell and the action failed.

Adding individual missing entries to renv.lock is a treadmill — every new R lib means more transitive deps to hunt down by hand. The setup-r action's existing comment already hinted at this being a transitional approach ("Once a successful CI run populates hashes via renv::snapshot(), this flag can be dropped").
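
The failure mode can be illustrated with a toy model. The sketch below is Python rather than R and does not call renv; the dependency edges are a simplified, hand-picked subset chosen for illustration (not read from the packages' DESCRIPTION files), and `transitive_closure` and `missing_from_lockfile` are hypothetical helpers. It computes which packages a pinned set needs at load time but never lists.

```python
from collections import deque

# Simplified, illustrative hard-dependency edges (an assumption for this
# sketch, not authoritative package metadata).
HARD_DEPS = {
    "ggplot2": ["scales", "gtable", "isoband"],
    "scales": ["munsell", "farver", "labeling", "R6", "RColorBrewer"],
    "munsell": ["colorspace"],
    "tibble": [],
}


def transitive_closure(roots, deps):
    """Return every package reachable from roots via hard dependencies."""
    seen = set()
    queue = deque(roots)
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        queue.extend(deps.get(pkg, []))
    return seen


def missing_from_lockfile(pinned, deps):
    """Packages needed at load time that the lockfile never pinned."""
    return sorted(transitive_closure(pinned, deps) - set(pinned))


# A lockfile that pins only top-level packages, as in the hand-written one.
lockfile = ["ggplot2", "scales", "tibble"]
print(missing_from_lockfile(lockfile, HARD_DEPS))
```

In this model the gap only closes once every package in the closure is pinned, which is the manual bookkeeping the paragraph above calls a treadmill.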

Fix

Replace r-lib/actions/setup-renv@v2 with a direct install.packages(c(...), dependencies = TRUE) call. Base R's installer resolves the transitive dep tree automatically; reproducibility comes from pinning to a dated Posit Package Manager snapshot (noble/2025-01-15) instead of a per-package hash table.

  • ✗ Removed: setup-renv@v2 step + RENV_CONFIG_INSTALL_HASHES=FALSE workaround
  • ✗ Removed: renv.lock (no tooling outside the workflows referenced it)
  • ✗ Removed: working-directory input (no longer needed)
  • ✓ Added: install.packages step with dependencies = TRUE + RSPM snapshot URL
  • ✓ Kept: smoke-test (ggsave of geom_point() to a PNG) for fast-fail detection

RENV_CONFIG_STARTUP_QUIET=TRUE in impl-generate.yml's version probes stays as defensive insurance in case anyone reintroduces a .Rprofile later — without renv it's a no-op.

2. impl-generate: stop writing python_version to new artifact YAMLs

python_version: 3.13.13 on an R YAML reads like a claim the artifact runs on Python — it's actually pipeline-audit metadata (the workflow runner's Python). The DB has had language_version since migration 3a7e1b5c0c4f (backfilled from python_version for legacy rows); the frontend already prefers it (PlotOfTheDay.tsx:267). New YAMLs drop python_version entirely; workflow_run is sufficient pipeline audit. sync_to_postgres.py handles the missing field via its existing fallback chain.
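
A fallback of this kind might look roughly like the following. This is a sketch only: `resolve_language_version` and the dictionary shape are hypothetical, the order is modeled on the frontend's stated preference (language_version first, python_version as fallback), and the actual chain in sync_to_postgres.py may have more steps or differ in detail.

```python
from typing import Optional


def resolve_language_version(metadata: dict) -> Optional[str]:
    """Prefer language_version; fall back to the legacy python_version."""
    for key in ("language_version", "python_version"):
        value = metadata.get(key)
        if value:  # skips missing keys, None, and empty strings
            return str(value)
    return None


# New R artifact YAML: no python_version field at all.
new_r_yaml = {"language_version": "4.4.1", "library_version": "3.5.1"}
# Legacy YAML written before language_version existed.
legacy_yaml = {"python_version": "3.13.13"}

print(resolve_language_version(new_r_yaml))
print(resolve_language_version(legacy_yaml))
```

Under this assumption, dropping python_version from new YAMLs changes nothing for legacy rows, and new rows never reach the fallback.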

Test plan

  • CI on this PR (lint + tests) green
  • After merge: re-trigger bulk-generate.yml -f specification_id=scatter-basic -f library=ggplot2
  • impl-generate succeeds, smoke-test confirms ok: ... bytes
  • Resulting YAML has language_version: 4.4.1, library_version: 3.5.1, no python_version
  • If impl-review requests a repair, impl-repair also passes library(ggplot2) cleanly

🤖 Generated with Claude Code

python_version was always intended as pipeline-audit metadata (the
Python interpreter that ran the workflow), but on an R artifact it
reads like a claim that the artifact runs on Python 3.13. With the
language_version column from migration 3a7e1b5c0c4f now carrying the
artifact's actual runtime, the python_version field has no place in
artifact YAML — workflow_run is sufficient for pipeline audit.

Existing rows keep their python_version in the DB (the migration
backfilled language_version from python_version, and frontend already
prefers language_version with python_version as fallback).
sync_to_postgres.py handles missing python_version in YAML via the
existing fallback chain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 16, 2026 21:45
The hand-written renv.lock listed only 15 top-level packages
(ggplot2, scales, tibble, ...). renv::restore() does not auto-resolve
transitive dependencies that aren't pinned in the lockfile, so
munsell, gtable, colorspace, R6, RColorBrewer, isoband, farver,
labeling, ... were never installed in the renv-managed library.
Earlier runs only worked because the runner image's system R library
also lived on .libPaths(); when the cache restoration scoped libpaths
to the renv library, library(ggplot2) failed with "no package called
munsell" during impl-repair (run 25973551110).

Replace setup-renv@v2 with a direct install.packages step. Base R's
installer resolves transitive deps automatically; pinning to a dated
Posit Package Manager snapshot (noble/2025-01-15) keeps versions
reproducible without needing a fully-populated lockfile. renv.lock
is deleted — it was misleading rather than helpful, and no tooling
outside the workflows referenced it.

The renv-stdout-quieting in impl-generate.yml from PR #6948 stays as
defensive insurance in case someone reintroduces a .Rprofile later;
without renv it is a no-op.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR cleans up the R/ggplot2 generation pipeline by replacing the incomplete renv.lock restore path with direct R package installation from a pinned Posit Package Manager snapshot, and stops writing pipeline python_version metadata into newly generated artifact YAML.

Changes:

  • Removes the repository-level renv.lock.
  • Updates the shared setup-r composite action to install ggplot2-related packages directly and smoke-test rendering.
  • Updates impl-generate.yml metadata generation to emit language_version without python_version.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| renv.lock | Removes the incomplete R lockfile that was used by setup-renv. |
| .github/workflows/impl-generate.yml | Drops python_version from newly generated implementation metadata YAML. |
| .github/actions/setup-r/action.yml | Replaces setup-renv with direct install.packages() installation from a pinned RSPM snapshot. |

Comment thread .github/actions/setup-r/action.yml Outdated
c("ggplot2", "ragg", "tidyr", "dplyr", "viridis",
"palmerpenguins", "gapminder", "tibble", "scales",
"systemfonts", "textshaping"),
dependencies = TRUE
Apply Copilot review feedback on #6950. `dependencies = TRUE` also
pulls Suggests, which for ggplot2/dplyr/tidyr includes sf (needs
libgeos/proj/gdal that we don't apt-install), Hmisc, mapproj, maps,
quantreg, and other optional packages unused by the implementations.
Restrict to c("Depends", "Imports", "LinkingTo") so transitive runtime
deps (munsell, gtable, colorspace, etc.) are still resolved while
skipping the optional stack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
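
The effect of that restriction can be sketched with a toy resolver. This is Python, not R; it only mimics the documented `install.packages()` rule that `dependencies = TRUE` follows Suggests for the requested packages but only Depends/Imports/LinkingTo for packages pulled in as dependencies. The tiny graph and the `resolve` helper are assumptions for illustration, not real package metadata.

```python
# Toy metadata: "hard" stands for Depends/Imports/LinkingTo, "soft" for
# Suggests. The edges are a simplified illustration only.
PACKAGES = {
    "ggplot2": {"hard": ["scales", "gtable"], "soft": ["sf", "maps"]},
    "scales": {"hard": ["munsell"], "soft": []},
    "munsell": {"hard": ["colorspace"], "soft": []},
    "sf": {"hard": ["units"], "soft": ["ggplot2"]},
}


def resolve(requested, include_suggests):
    """Return the set of packages that would be installed.

    Suggests are followed only for the requested packages themselves;
    anything discovered afterwards contributes hard dependencies only.
    """
    install = set(requested)
    stack = []
    for pkg in requested:
        meta = PACKAGES.get(pkg, {})
        stack.extend(meta.get("hard", []))
        if include_suggests:
            stack.extend(meta.get("soft", []))
    while stack:
        pkg = stack.pop()
        if pkg in install:
            continue
        install.add(pkg)
        stack.extend(PACKAGES.get(pkg, {}).get("hard", []))
    return install


with_suggests = resolve(["ggplot2"], include_suggests=True)
hard_only = resolve(["ggplot2"], include_suggests=False)

# Packages that only the Suggests-following mode pulls in.
print(sorted(with_suggests - hard_only))
```

In the model, the hard-only install still reaches munsell and colorspace, while sf and its own hard dependencies appear only when Suggests are followed.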
@MarkusNeusinger MarkusNeusinger merged commit 4f833ff into main May 16, 2026
7 checks passed
@MarkusNeusinger MarkusNeusinger deleted the fix/r-install-and-drop-python-version branch May 16, 2026 21:52