
vitest worker SIGBUS on /dev/shm-constrained hosts with task cache enabled #1453

@TheHolyWaffle

Describe the bug

On any host with a small /dev/shm (e.g. the 64 MiB default on both Docker and GitLab Kubernetes runners), vp run ... test reproducibly crashes a vitest forked worker with SIGBUS as soon as the task's file-access tracking has written enough path records to fill /dev/shm:

Bus error (core dumped)
⎯⎯⎯⎯⎯⎯ Unhandled Errors ⎯⎯⎯⎯⎯⎯
Error: [vitest-pool]: Worker forks emitted error.
Caused by: Error: Worker exited unexpectedly

I probed the child worker env mid-run and /dev/shm usage climbs fast:

SHM usage: Filesystem      Size  Used Avail Use% Mounted on
shm              64M   60M  4.6M  93% /dev/shm

This matches vite-task's 4 GiB shared-memory IPC mapping (fspy::ipc::SHM_CAPACITY = 4 * 1024 * 1024 * 1024, https://github.com/voidzero-dev/vite-task/blob/main/crates/fspy/src/ipc.rs): once a child touches pages that can't be backed by /dev/shm, the mapping faults with SIGBUS.

Setting cache: false on the task resolves to UserCacheConfig::Disabled in vite_task_graph::config, which skips the fspy IPC channel entirely, and the same workload then completes cleanly.
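
For anyone who wants to watch /dev/shm fill up from inside a Node process rather than via df, here is a minimal sketch built on Node's fs.statfsSync. The helper names (toUsage, shmUsage) are made up for illustration and are not part of vite-plus or vitest. Rounding Use% up is an assumption that appears to match the df figure quoted above.

```typescript
import { statfsSync } from "node:fs";

/** Usage numbers for one mounted filesystem, in bytes. */
interface FsUsage {
  size: number;
  used: number;
  avail: number;
  usePercent: number;
}

// Convert raw statfs block counts to bytes.
// Kept pure so it can be checked without a real mount.
function toUsage(bsize: number, blocks: number, bfree: number, bavail: number): FsUsage {
  const size = bsize * blocks;
  const used = bsize * (blocks - bfree);
  const avail = bsize * bavail;
  // Assumption: percentage is used / (used + avail), rounded up.
  const total = used + avail;
  const usePercent = total === 0 ? 0 : Math.ceil((used / total) * 100);
  return { size, used, avail, usePercent };
}

// Read the current usage of a mount point (defaults to /dev/shm).
// Throws if the path does not exist, e.g. on hosts without /dev/shm.
function shmUsage(path: string = "/dev/shm"): FsUsage {
  const s = statfsSync(path);
  return toUsage(s.bsize, s.blocks, s.bfree, s.bavail);
}
```

Logging shmUsage() on a timer from a test setup file should show avail shrinking toward zero shortly before the worker dies, if the diagnosis above is right.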

I don't intend to submit a PR but I'm happy to test patches.

Reproduction

https://github.com/TheHolyWaffle/vite-plus-sigbus-repro

Steps to reproduce

git clone https://github.com/TheHolyWaffle/vite-plus-sigbus-repro
cd vite-plus-sigbus-repro
npm ci

# Fails — "Bus error (core dumped)" + "Worker exited unexpectedly" after ~10s
docker run --rm --shm-size=64m -v "$PWD":/work -w /work node:24.15.0 \
  bash -c 'npm ci --prefer-offline && npx vp run --filter "*" test --no-cache'

Then change the test task in vite.config.ts from { command: 'vp test --run' } to { command: 'vp test --run', cache: false } and re-run — passes cleanly (~49 s).
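
Spelled out, the change is a single extra property on the task entry. The sketch below only shows the two object literals from the step above. The TaskEntry type is a hypothetical local stand-in, and where the entry sits inside vite.config.ts is not shown because it depends on the repo's config.

```typescript
// Hypothetical local type: only `command` and `cache` come from the report.
interface TaskEntry {
  command: string;
  cache?: boolean;
}

// Before: cache left at its default, so file-access tracking runs
// and the worker hits SIGBUS on a 64 MiB /dev/shm.
const testTaskBefore: TaskEntry = { command: "vp test --run" };

// After: cache disabled in the task config, which is reported to skip
// the fspy IPC channel entirely.
const testTaskAfter: TaskEntry = { command: "vp test --run", cache: false };
```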

The repo also has a .github/workflows/repro.yml that runs both cases on ubuntu-latest under docker.

System Info

$ npx vp --version
VITE+ - The Unified Toolchain for the Web

vp v0.0.0

Local vite-plus:
  vite-plus  v0.1.19

Tools:
  vite             v8.0.8
  rolldown         v1.0.0-rc.16
  vitest           v4.1.4
  oxfmt            v0.45.0
  oxlint           v1.60.0
  oxlint-tsgolint  v0.21.1
  tsdown           v0.21.9

$ npx vp env current
error: Command 'env' not found
# `vp env current` isn't recognized as a subcommand on 0.1.19 — not sure what
# the template expects here; listing node + uname below instead.

$ node --version
v24.15.0

$ uname -a
Linux 6.6.119-0-virt #1-Alpine SMP PREEMPT_DYNAMIC 2025-12-10 08:04:08 aarch64 GNU/Linux

# Host: node:24.15.0 container, glibc (Debian base), aarch64 on Rancher Desktop.
# Also reproduced on x86_64 GitLab K8s runners with node:24.15.0.

Versions tested: vite-plus 0.1.16 and 0.1.19 — both affected.
Pre-fspy versions (0.1.14) are not affected on the same workload.

Failing run (cache enabled, default):

=== RUN ===
 RUN  /work

⎯⎯⎯⎯⎯⎯ Unhandled Errors ⎯⎯⎯⎯⎯⎯

Vitest caught 1 unhandled error during the test run.

⎯⎯⎯⎯⎯⎯ Unhandled Error ⎯⎯⎯⎯⎯⎯⎯
Error: [vitest-pool]: Worker forks emitted error.
 ❯ EventEmitter.<anonymous> node_modules/@voidzero-dev/vite-plus-test/dist/chunks/cli-api.lDy4N9kC.js:3444:27
 ❯ EventEmitter.emit node:events:509:28
 ❯ ChildProcess.emitUnexpectedExit node_modules/@voidzero-dev/vite-plus-test/dist/chunks/cli-api.lDy4N9kC.js:3011:24
 ❯ ChildProcess.emit node:events:509:28
 ❯ Process.ChildProcess._handle.onexit node:internal/child_process:295:12

Caused by: Error: Worker exited unexpectedly
 ❯ ChildProcess.emitUnexpectedExit node_modules/@voidzero-dev/vite-plus-test/dist/chunks/cli-api.lDy4N9kC.js:3010:35
 ❯ ChildProcess.emit node:events:509:28
 ❯ Process.ChildProcess._handle.onexit node:internal/child_process:295:12

 Test Files  (1)
      Tests  (1)
     Errors  1 error
   Duration  10.13s

Bus error (core dumped)

---

Passing run with `cache: false`:

=== RUN (cache: false) ===
$ vp test --run --no-cache ⊘ cache disabled
 RUN  /work

 ✓ packages/heavy-io-test/src/heavy.spec.ts (1 test) 48896ms
 Test Files  1 passed (1)
      Tests  1 passed (1)
   Duration  49.02s

---

Possible fixes:

1. Probe /dev/shm size at startup and either reduce SHM_CAPACITY to fit, or fall back to a file/pipe-backed IPC.
2. When the first write to the shared mapping fails, degrade gracefully with a warning instead of producing SIGBUS in the test worker.
3. The CLI flag `--no-cache` disables cache hit/store but still resolves the task to UserCacheConfig::Enabled, so fspy tracking still runs. Either make `--no-cache` imply no tracking (what users typically expect) or document the distinction so that `cache: false` is the obvious knob for CI.
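
To make option 1 concrete, here is a rough sketch of the sizing decision. The real code lives in Rust in fspy, so this is only the policy logic, written in TypeScript for illustration. planIpc, the headroom and minimum-size thresholds, and the "fallback" variant are all hypothetical. Only the 4 GiB figure comes from the SHM_CAPACITY value quoted earlier.

```typescript
const MiB = 1024 * 1024;
const GiB = 1024 * MiB;

// Mirrors the 4 GiB quoted above for fspy::ipc::SHM_CAPACITY.
const DESIRED_CAPACITY = 4 * GiB;

type IpcPlan =
  | { kind: "shm"; capacity: number }
  | { kind: "fallback"; reason: string };

// Hypothetical policy: leave some headroom for other users of /dev/shm,
// clamp the mapping to what is left, and fall back to another transport
// (file or pipe) when too little remains to be useful.
function planIpc(
  availBytes: number,
  desired: number = DESIRED_CAPACITY,
  headroom: number = 8 * MiB,
  minUseful: number = 16 * MiB,
): IpcPlan {
  const usable = Math.max(0, availBytes - headroom);
  if (usable < minUseful) {
    return { kind: "fallback", reason: `only ${usable} usable bytes in /dev/shm` };
  }
  return { kind: "shm", capacity: Math.min(desired, usable) };
}
```

With a fresh 64 MiB /dev/shm this would plan a 56 MiB mapping rather than 4 GiB. Clamping on its own presumably still needs the writer to handle a full buffer, which is where option 2 comes in.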

Used Package Manager

npm
