ci(mutants): drop --jobs 4 → --jobs 2 to fit 32G lean-mem cgroup #301

Merged: avrabe merged 1 commit into main from ci/mutants-jobs-bound on May 18, 2026
@avrabe avrabe commented May 18, 2026

Summary

Smithy operator's 2026-05-18 status showed lean-mem runners approaching the 32G cgroup ceiling again, ~16 hours after their 24G→32G bump:

```
runner3 31.0G/32G (996MB headroom) swap 2.1G peak 2.8G
runner4 30.4G/32G (1.5G headroom) swap 2.1G peak 2.3G
```

Same death-spiral pattern that triggered the prior emergency bump. Their note: "the 32G bump is also approaching its ceiling — the next escalation tier is the workflow-side fix".

What changes

cargo-mutants --jobs 4 → --jobs 2 in the mutants step.

Each cargo-mutants worker compiles a fresh target dir and runs the full test suite. At 4-way parallelism on a 32G cgroup, that leaves ~8G per worker, which is at-or-above rivet-core's compile peak and triggers the same swap death spiral the cgroup bump was meant to break. 2-way gives ~16G per worker, with comfortable headroom.
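The per-worker figures above come from splitting the cgroup limit evenly across workers. A minimal sketch of that arithmetic follows. It assumes an even split, uses 8G as a stand-in for rivet-core's compile peak (the PR gives no exact figure), and borrows the 4G headroom target from the test plan. The function names are illustrative and not part of the workflow.

```python
def per_worker_budget_gib(cgroup_gib: float, jobs: int) -> float:
    """Memory each cargo-mutants worker gets if the cgroup is split evenly."""
    if jobs < 1:
        raise ValueError("jobs must be >= 1")
    return cgroup_gib / jobs


def max_safe_jobs(cgroup_gib: float, peak_gib: float, headroom_gib: float) -> int:
    """Largest --jobs value that leaves headroom_gib above the per-worker peak."""
    return max(1, int(cgroup_gib // (peak_gib + headroom_gib)))


CGROUP_GIB = 32.0            # lean-mem cgroup ceiling, from the PR
ASSUMED_PEAK_GIB = 8.0       # assumption: stand-in for rivet-core's compile peak
ASSUMED_HEADROOM_GIB = 4.0   # assumption: headroom target taken from the test plan

for jobs in (4, 2):
    budget = per_worker_budget_gib(CGROUP_GIB, jobs)
    print(f"--jobs {jobs}: {budget:.0f}G per worker")

print("max safe jobs:", max_safe_jobs(CGROUP_GIB, ASSUMED_PEAK_GIB, ASSUMED_HEADROOM_GIB))
```

Under these assumptions the sketch reports 8G per worker at `--jobs 4`, 16G at `--jobs 2`, and 2 as the largest job count that keeps the assumed headroom.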

Trade-off: each shard takes ~2× as long (was 12-20 min, now 20-40 min). Net effect on the lean-mem queue is similar because we stop OOM-cycling mutants that previously had to be retried. Smithy operator confirmed: "the durable answer is on the workflow side: lower cargo-mutants --jobs concurrency, or split mutation testing into smaller shards."
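The "similar net effect" claim can be sanity-checked with a simple retry model. The sketch below is not a measurement: the PR reports no OOM rate. It assumes that a failed attempt costs a full shard duration, that attempts fail independently, and that `--jobs 2` never OOMs. The 16 and 30 minute inputs are the midpoints of the ranges quoted above.

```python
def expected_minutes(shard_minutes: float, oom_probability: float) -> float:
    """Expected wall time per shard when each attempt OOMs with the given probability.

    Assumes independent attempts that each cost the full shard duration,
    so the expected number of attempts is 1 / (1 - p).
    """
    if not 0.0 <= oom_probability < 1.0:
        raise ValueError("oom_probability must be in [0, 1)")
    return shard_minutes / (1.0 - oom_probability)


def break_even_oom_probability(fast_minutes: float, slow_minutes: float) -> float:
    """OOM rate at which the fast-but-flaky setup costs as much as the slow, stable one."""
    return 1.0 - fast_minutes / slow_minutes


JOBS4_MINUTES = 16.0  # midpoint of the quoted 12-20 min range
JOBS2_MINUTES = 30.0  # midpoint of the quoted 20-40 min range

p = break_even_oom_probability(JOBS4_MINUTES, JOBS2_MINUTES)
print(f"break-even OOM rate at --jobs 4: {p:.2f}")
print(f"expected minutes at that rate: {expected_minutes(JOBS4_MINUTES, p):.1f}")
```

In this model the two settings cost the same queue time once roughly half of the 4-way attempts are lost to OOM. Below that rate the 2-way setting is slower per shard but more predictable.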

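Composes with

Issue #299 (decoupling playwright/kani/rocq from `needs: [test]`): this change complements rather than replaces it.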

Test plan

  • Next mutants run lands under the lean-mem 32G ceiling with >4G headroom per worker.
  • Shard completion times move from 12-20 min to 20-40 min as predicted.
  • No regression in caught/missed mutant counts (concurrency change doesn't affect mutation semantics).
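The headroom item in the plan could be checked against operator status lines in the format quoted in the Summary. The sketch below is hypothetical: the parser, its regular expression, and the threshold handling are illustrative and not part of the workflow or the operator's tooling. Headroom is computed from the rounded used/limit figures, so it can differ slightly from the headroom the status line itself reports.

```python
import re

# Matches the leading "runnerN used/limit" part of a status line such as
# "runner3 31.0G/32G (996MB headroom) swap 2.1G peak 2.8G".
STATUS_RE = re.compile(r"^(?P<runner>\S+)\s+(?P<used>[\d.]+)G/(?P<limit>[\d.]+)G")


def headroom_gib(status_line: str) -> tuple[str, float]:
    """Return (runner name, limit minus used in G) for one status line."""
    match = STATUS_RE.match(status_line.strip())
    if match is None:
        raise ValueError(f"unrecognised status line: {status_line!r}")
    used = float(match["used"])
    limit = float(match["limit"])
    return match["runner"], round(limit - used, 1)


def meets_headroom(status_line: str, min_headroom_gib: float) -> bool:
    """True when the runner has more than min_headroom_gib left under its ceiling."""
    _, headroom = headroom_gib(status_line)
    return headroom > min_headroom_gib


for line in (
    "runner3 31.0G/32G (996MB headroom) swap 2.1G peak 2.8G",
    "runner4 30.4G/32G (1.5G headroom) swap 2.1G peak 2.3G",
):
    runner, headroom = headroom_gib(line)
    print(runner, headroom, meets_headroom(line, 4.0))
```

Both pre-change status lines fail a 4G threshold, which is the state the `--jobs 2` change is meant to fix.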

🤖 Generated with Claude Code

Smithy operator status (2026-05-18) showed lean-mem runners
approaching the 32G cgroup ceiling again, just 16 hours after the
24G→32G cgroup bump:

  runner3   31.0G/32G  (996MB headroom)  swap 2.1G  peak 2.8G
  runner4   30.4G/32G  (1.5G headroom)   swap 2.1G  peak 2.3G

Same death-spiral pattern that triggered the prior bump. The cgroup
fix bought time, not a durable answer.

Workflow-side lever: cargo-mutants runs `--jobs N` parallel mutants
per shard. Each worker compiles a fresh target dir and runs the full
test suite — at 4-way that's ~8G/worker on a 32G runner, at or above
rivet-core's compile peak. Halving to `--jobs 2` gives ~16G/worker
with comfortable headroom for the test suite.

Trade-off: each shard ~2× longer (was 12-20 min, now 20-40 min). Net
effect on the lean-mem queue is similar because we stop OOM-cycling
mutants that previously had to be re-tried.

This addresses smithy's flagged escalation tier ("the next fix is on
the workflow side"). It complements rather than replaces issue #299
(decoupling playwright/kani/rocq from `needs: [test]`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@github-actions github-actions Bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Rivet Criterion Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

| Benchmark suite | Current: aed75ae | Previous: fc44838 | Ratio |
|---|---|---|---|
| traceability_matrix/1000 | 58543 ns/iter (± 1264) | 46078 ns/iter (± 279) | 1.27 |
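The reported ratio can be reproduced from the two timings. This is a sketch of the arithmetic only, assuming the ratio is current divided by previous for a smaller-is-better metric. It does not claim to mirror the internals of github-action-benchmark.

```python
def regression_ratio(current_ns: float, previous_ns: float) -> float:
    """Ratio of current to previous time per iteration (higher means slower)."""
    if previous_ns <= 0:
        raise ValueError("previous_ns must be positive")
    return current_ns / previous_ns


ALERT_THRESHOLD = 1.20  # threshold quoted in the alert

ratio = regression_ratio(58543, 46078)  # traceability_matrix/1000 timings from the alert
print(f"ratio: {ratio:.2f}, exceeds threshold: {ratio > ALERT_THRESHOLD}")
```

The timings give a ratio of about 1.27, which is above the 1.20 threshold and matches the alert.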

This comment was automatically generated by workflow using github-action-benchmark.

@avrabe avrabe merged commit 36b98fe into main May 18, 2026
14 checks passed
@avrabe avrabe deleted the ci/mutants-jobs-bound branch May 18, 2026 06:38
codecov Bot commented May 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

avrabe added a commit that referenced this pull request May 19, 2026
…es (#303)

Workspace version 0.10.0 → 0.10.1. Patch-shaped: every change is
additive (new fields/subcommands/heuristics; no breaking schema/CLI).

Highlights (full notes in CHANGELOG.md):
- Added: rivet audit (#297), rivet check ai-defects-open (#295), dpia
  artifact type (#295), variant-aware validate (#298), JUnit importer
  marker join (#302).
- Fixed: JUnit import overwrite + dropped linkage (#302, user-reported);
  salsa build_store not memoized (#295); dossier scope overstated
  (#295).
- Changed: sigstore keyless signing of SHA256SUMS (#296),
  build-test-evidence non-blocking (#294), cargo-mutants --jobs 4→2
  (#301).

Also bumps vscode-rivet/package.json to 0.10.1 and allowlists "0.10.0"
in rivet.yaml docs-check (historical references in dossier §0 and
schemas/common.yaml).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>