
feat(opt): island-model parallel optimization (v1.0.4 #71, Track D)#128

Merged: avrabe merged 1 commit into main from release/v1.0.4-pr-islands on May 17, 2026

@avrabe avrabe commented May 17, 2026

Adds `loom-core/src/islands.rs` (~580 LOC) and a CLI `--islands N` flag. The harness runs N `IslandConfig`s concurrently via rayon, and each island independently passes Z3 and stack validation. The winner is picked with `min_by_key(encoded_size)`, with ties broken deterministically by lexicographic order on island name.

Measurement on gale: all 4 default islands converge to 1846 bytes, a fixed point after the v0.7.0+ pipeline hardening. The regression-detection safety net is in place even though current pipelines do not diverge. N=4 takes about 1.4× the wall time of N=1 while doing 4× the serial work, which confirms that rayon distributes the islands across cores.

🤖 Generated with Claude Code

Implements #71: run multiple pass orderings concurrently and pick the
smallest verified result. The v0.6.0 -> v0.7.0 gale CSE-cost regression
(commit afc9318) is exactly the failure mode this guards against -
different orderings produce different sizes, the smaller verified result
wins, and the regression cannot ship.

New module `loom-core/src/islands.rs` with `IslandConfig` and
`optimize_module_islands(module, configs) -> Result<Module>`. Ships 4
default configs (baseline, inline-late, cse-early, aggressive-inline).
Each island clones the input, runs its pass sequence (every pass still
invokes its per-function Z3 + stack `verify_or_revert` gate internally),
encodes the result, and validates it via `wasmparser::validate`. The
valid island with the smallest encoded size is selected. Ties are broken
by lex order on island name for deterministic results.
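
The selection rule can be sketched as follows. This is a minimal illustration, not the code in `islands.rs`: `IslandResult` and `select_winner` are hypothetical names, and a rejected island is modeled simply as `encoded_size: None`.

```rust
/// Hypothetical stand-in for one island's outcome. The real type in
/// `loom-core/src/islands.rs` is not shown in this PR description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IslandResult {
    pub name: String,
    /// `Some(size)` if the island's output validated, `None` if it was rejected.
    pub encoded_size: Option<usize>,
}

/// Pick the smallest validated result. Ties go to the lexicographically
/// smallest island name, so the choice does not depend on completion order.
pub fn select_winner(results: &[IslandResult]) -> Option<&IslandResult> {
    results
        .iter()
        // Drop rejected islands and keep the size next to the result.
        .filter_map(|r| r.encoded_size.map(|size| (size, r)))
        // Compare by (size, name): size first, name only on equal sizes.
        .min_by_key(|&(size, r)| (size, r.name.as_str()))
        .map(|(_, r)| r)
}
```

Keying on the `(size, name)` tuple makes the tie-break part of the ordering itself, so the winner is the same regardless of the order in which results arrive. `select_winner` returns `None` when the input is empty or every island was rejected, which roughly corresponds to the empty-configs and all-failed error cases listed under Tests.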

Parallelism via `rayon::scope`. Soundness: Z3's context is thread-local
(z3 crate's `with_z3_config` sets a per-thread context), so each rayon
worker creates its own Z3 state when verification passes run inside the
optimization passes themselves. No shared mutable Z3 state across
islands.
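
The clone-per-island structure can be sketched with the standard library alone. Note the substitution: this sketch uses `std::thread::scope` in place of `rayon::scope` so that it has no external dependencies, and it models a module as a byte buffer and a pass as a plain function. `Pass`, `Island`, `run_islands`, and the two toy passes are hypothetical names for illustration only; no Z3 or Wasm validation is performed here.

```rust
use std::thread;

/// Hypothetical pass type: a plain function that rewrites a byte buffer.
pub type Pass = fn(&mut Vec<u8>);

/// Hypothetical island description: a name plus an ordered pass list.
pub struct Island {
    pub name: &'static str,
    pub passes: Vec<Pass>,
}

/// Toy pass: remove trailing zero bytes.
pub fn strip_trailing_zeros(module: &mut Vec<u8>) {
    while module.last() == Some(&0) {
        module.pop();
    }
}

/// Toy pass: collapse runs of equal adjacent bytes.
pub fn dedup_adjacent(module: &mut Vec<u8>) {
    module.dedup();
}

/// Run every island on its own clone of `input`, one scoped thread each,
/// and return (name, size) pairs sorted by name so the output order never
/// depends on thread scheduling.
pub fn run_islands(input: &[u8], islands: &[Island]) -> Vec<(&'static str, usize)> {
    let mut sizes: Vec<(&'static str, usize)> = thread::scope(|s| {
        let handles: Vec<_> = islands
            .iter()
            .map(|island| {
                s.spawn(move || {
                    // Each island owns a private copy; nothing mutable is shared.
                    let mut module = input.to_vec();
                    for pass in &island.passes {
                        pass(&mut module);
                    }
                    (island.name, module.len())
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("island thread panicked"))
            .collect()
    });
    sizes.sort();
    sizes
}
```

Because each thread works on its own clone and only returns a value, the input is never mutated and there is no shared mutable state between islands, which mirrors the soundness argument above for per-thread Z3 contexts.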

CLI: new `--islands N` flag (default 1 preserves current serial path).
N>1 dispatches to the parallel harness. Per-island encoded sizes are
emitted to stderr so users can see how each ordering shaped the output -
this is exactly the diagnostic that would have caught the gale
regression.

Tests: 6 unit tests covering N=1 byte-identity with serial,
smallest-wins selection, invalid-island rejection, all-failed error,
deterministic tie-break across rayon thread interleavings, empty-configs
error.

Dependency: `rayon = "1"` added to `loom-core` only - the CLI does not
take a direct rayon dep, so the boundary stays tight.

Measurement on gale_in_baseline.wasm: all 4 islands produce 1846 bytes
(converged fixed point on this small fixture). N=1 takes 303ms, N=4
takes 430ms - parallelism is working (4x sequential work in ~1.4x wall
time). On larger / less-converged modules the harness will surface the
size deltas that would otherwise have shipped silently.
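
The timing claim can be checked with a short calculation. It assumes, as the message does, that running the 4 islands serially would cost about 4 times the N=1 time. The symbols S (speedup over that serial estimate) and E (parallel efficiency on 4 workers) are introduced here for illustration.

```latex
\[
\frac{T_{N=4}}{T_{N=1}} = \frac{430\,\text{ms}}{303\,\text{ms}} \approx 1.42,
\qquad
S = \frac{4 \cdot T_{N=1}}{T_{N=4}} = \frac{1212\,\text{ms}}{430\,\text{ms}} \approx 2.82,
\qquad
E = \frac{S}{4} \approx 0.70
\]
```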

Implements #71
Refs: docs/research/v1.0.3/issue-roadmap.md (#71 section)
@avrabe avrabe merged commit d8317ec into main May 17, 2026
9 of 19 checks passed
@avrabe avrabe deleted the release/v1.0.4-pr-islands branch May 17, 2026 14:43