
feat(opt): #68 Tier-1.1 scalar adapter inlining + Tier-2.2 body dedup (v1.0.5 Track 3) #132

Merged: avrabe merged 1 commit into main from release/v1.0.5-pr-68-fused on May 19, 2026
@avrabe commented on May 19, 2026

Extends fused_optimizer.rs with two passes (about 510 LOC plus 6 tests, all passing). The passes are byte-neutral on the current corpus because it has no real meld-fused fixture yet; they are infrastructure for #68's broader cross-component story.

🤖 Generated with Claude Code

… (v1.0.5 Track 3)

Adds two new passes to loom-core/src/fused_optimizer.rs, slotted between
devirtualize_adapters and eliminate_dead_functions.

## Pass 1.1 — inline_scalar_adapters (Tier-1.1)

Filters AdapterInfo list to scalar (i32/i64/f32/f64) adapters whose body
matches one of two canonical shapes:

  - Pure pass-through: 'local.get 0; ...; local.get N-1; call target;
    end' with no memory ops. The body is already in the desired shape,
    so the rewrite is a no-op, but it is still counted so that the
    optimisation surface is visible.
  - Cross-memory scalar copy: load(callee_mem) + load(caller_mem) +
    call target + store(caller_mem) + store(callee_mem), where the two
    memory indices differ. Rewritten to the pass-through shape so the
    standard inline_functions pass picks them up downstream.

Any other body shape is classified as Unrecognised and skipped. The
start function is also skipped.
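
The two shapes above can be illustrated with a minimal sketch. It does not reproduce the code in fused_optimizer.rs: the `Instr` enum, `AdapterShape`, `classify`, and `pass_through_body` are hypothetical names, the instruction set is reduced to five variants, and address operands (`local.get`) are ignored when matching the cross-memory shape.

```rust
// Hypothetical, simplified instruction set for illustration only;
// loom's real Instruction enum is larger and carries full memargs.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    LocalGet(u32),
    Load { mem: u32 },
    Store { mem: u32 },
    Call(u32),
    End,
}

// Hypothetical classification result, named after the shapes in the text.
#[derive(Debug, PartialEq)]
enum AdapterShape {
    PassThrough { target: u32 },
    CrossMemoryScalarCopy { target: u32 },
    Unrecognised,
}

// Classify an adapter body against the two canonical shapes.
fn classify(body: &[Instr], param_count: u32) -> AdapterShape {
    let n = param_count as usize;

    // Shape 1: local.get 0 .. local.get N-1; call target; end
    if body.len() == n + 2 {
        let gets_in_order = body[..n]
            .iter()
            .enumerate()
            .all(|(i, ins)| *ins == Instr::LocalGet(i as u32));
        if gets_in_order {
            if let (Instr::Call(target), Instr::End) = (&body[n], &body[n + 1]) {
                return AdapterShape::PassThrough { target: *target };
            }
        }
    }

    // Shape 2 (assumption: operands are elided, only memory ops and the
    // call are matched): load(callee) load(caller) call store(caller)
    // store(callee), where the two memory indices differ.
    let core: Vec<&Instr> = body
        .iter()
        .filter(|ins| !matches!(ins, Instr::LocalGet(_)))
        .collect();
    if let [Instr::Load { mem: callee }, Instr::Load { mem: caller }, Instr::Call(target), Instr::Store { mem: s1 }, Instr::Store { mem: s2 }, Instr::End] =
        core.as_slice()
    {
        if callee != caller && s1 == caller && s2 == callee {
            return AdapterShape::CrossMemoryScalarCopy { target: *target };
        }
    }

    AdapterShape::Unrecognised
}

// Build the pass-through body that a cross-memory adapter is rewritten to.
fn pass_through_body(param_count: u32, target: u32) -> Vec<Instr> {
    let mut body: Vec<Instr> = (0..param_count).map(Instr::LocalGet).collect();
    body.push(Instr::Call(target));
    body.push(Instr::End);
    body
}
```

In this sketch a body classified as CrossMemoryScalarCopy would be replaced with `pass_through_body(param_count, target)`, while an Unrecognised body is left untouched.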

## Pass 1.2 — dedupe_function_bodies (Tier-2.2)

Hashes each local function's (signature, locals, instructions) tuple
using DefaultHasher over Debug-format strings (Instruction derives
PartialEq but not Hash, so we use a serialisable representation).
Groups by hash, verifies byte-equality on multi-entry groups (defends
against hash collisions), and redirects all calls to the duplicates to
a single representative (lowest function index) via the existing
rewrite_calls helper. Redirected functions become dead and are removed
by the subsequent eliminate_dead_functions pass.

Conservative skips:

  - instructions.len() <= 4 (overhead exceeds savings on tiny bodies)
  - Exported functions (observable name on public surface)
  - Start function (same)
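
The hash-then-verify scheme and the conservative skips can be sketched as follows. This is an illustration under stated assumptions, not the code in fused_optimizer.rs: `Func`, `Instr`, `body_hash`, `find_duplicates`, and `redirect_calls` are hypothetical stand-ins (the real pass uses the existing rewrite_calls helper), and signatures and locals are modelled as plain strings.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical instruction type: derives PartialEq and Debug but not Hash,
// mirroring the constraint described in the text.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    I32Const(i32),
    LocalGet(u32),
    I32Add,
    Call(u32),
    End,
}

// Hypothetical function record; the real types in loom-core differ.
#[derive(Debug, Clone, PartialEq)]
struct Func {
    signature: String,
    locals: Vec<String>,
    instructions: Vec<Instr>,
    exported: bool,
}

// Hash the Debug rendering of the (signature, locals, instructions) tuple.
fn body_hash(f: &Func) -> u64 {
    let mut hasher = DefaultHasher::new();
    format!("{:?}|{:?}|{:?}", f.signature, f.locals, f.instructions).hash(&mut hasher);
    hasher.finish()
}

// Full equality check used to guard against hash collisions.
fn same_body(a: &Func, b: &Func) -> bool {
    a.signature == b.signature && a.locals == b.locals && a.instructions == b.instructions
}

// Returns a map from duplicate function index to its representative
// (the lowest index with an equal body).
fn find_duplicates(funcs: &[Func], start: Option<usize>) -> HashMap<usize, usize> {
    let mut groups: HashMap<u64, Vec<usize>> = HashMap::new();
    for (idx, f) in funcs.iter().enumerate() {
        // Conservative skips: tiny bodies, exported functions, start function.
        if f.instructions.len() <= 4 || f.exported || start == Some(idx) {
            continue;
        }
        groups.entry(body_hash(f)).or_default().push(idx);
    }

    let mut redirects = HashMap::new();
    for indices in groups.values() {
        // Indices were pushed in ascending order, so the first member of
        // each equal-body class is the lowest index.
        let mut representatives: Vec<usize> = Vec::new();
        for &idx in indices {
            let found = representatives
                .iter()
                .copied()
                .find(|&rep| same_body(&funcs[rep], &funcs[idx]));
            match found {
                Some(rep) => {
                    redirects.insert(idx, rep);
                }
                None => representatives.push(idx),
            }
        }
    }
    redirects
}

// Point every call to a duplicate at its representative instead.
fn redirect_calls(funcs: &mut [Func], redirects: &HashMap<usize, usize>) {
    for f in funcs.iter_mut() {
        for ins in f.instructions.iter_mut() {
            if let Instr::Call(target) = ins {
                if let Some(&rep) = redirects.get(&(*target as usize)) {
                    *target = rep as u32;
                }
            }
        }
    }
}
```

After `redirect_calls`, the duplicates have no remaining callers, which is what lets a later dead-function pass remove them. The sketch leaves that removal out.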

## Wiring

Both passes are added as Pass 1.1 and Pass 1.2 in optimize_fused_module.
New stats fields 'scalar_adapters_inlined' and
'function_bodies_deduplicated' surface counts.

## Correctness

Both passes are scoped narrowly enough that soundness follows from a
short argument:

  - Tier-1.1: pass-through and cross-memory shapes are both
    semantically equivalent to a direct call to the underlying target.
    Pass-through is identity; cross-memory relies on callee-memory
    loads/stores being redundant once the underlying function is called
    directly with the caller's scalar value.
  - Tier-2.2: redirecting a call site from func A to func B is sound
    when A and B have byte-identical (signature, locals, instructions).

## Tests (6 new, all passing)

  - test_inline_scalar_adapter_pass_through
  - test_inline_scalar_adapter_skips_non_scalar (Unrecognised body shape)
  - test_dedupe_finds_identical_bodies
  - test_dedupe_skips_exported_function
  - test_dedupe_skips_tiny_bodies (<=4 instr threshold)
  - test_inline_and_dedupe_via_full_pipeline (end-to-end through
    optimize_fused_module, checks both stats counters fire)

Full fused_optimizer test suite: 64 passed / 0 failed.

## Measurement (loom optimize --attestation false)

  gale_in_baseline.wasm:  1941 -> 1846  (-95B, -4.9%)
  httparse.wasm:          4766 -> 4668  (-98B, -2.1%)
  json_lite.wasm:         3510 -> 3377  (-133B, -3.8%)
  state_machine.wasm:     1655 -> 1558  (-97B, -5.9%)
  calc.component.wasm:     442 ->  392  (-50B, -11.3%)

All of the deltas above come from the broader optimization pipeline,
not from the two new passes. The new fused-side passes are byte-neutral
on non-fused inputs (as designed) but unblock further wins on real
meld-fused multi-component modules once those are added to the corpus.
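
For reference, the byte and percentage columns in the table follow from the before and after sizes alone. The helper below is a hypothetical sketch of that arithmetic and is not part of loom or scripts/measure_corpus.sh; the names `size_delta` and `format_delta` are assumptions.

```rust
// Hypothetical helper: absolute and relative size change, in bytes and percent.
fn size_delta(before: u64, after: u64) -> (i64, f64) {
    let delta = after as i64 - before as i64;
    let percent = delta as f64 / before as f64 * 100.0;
    (delta, percent)
}

// Render one row in the same style as the measurement table.
fn format_delta(before: u64, after: u64) -> String {
    let (delta, percent) = size_delta(before, after);
    format!("{} -> {} ({:+}B, {:+.1}%)", before, after, delta, percent)
}
```

For example, `format_delta(1941, 1846)` yields `1941 -> 1846 (-95B, -4.9%)`, matching the gale_in_baseline.wasm row.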

Trace: REQ-3, REQ-11
Refs: #68

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@avrabe merged commit 27c0a01 into main on May 19, 2026
@avrabe deleted the release/v1.0.5-pr-68-fused branch on May 19, 2026 05:28
@avrabe mentioned this pull request on May 19, 2026
avrabe added a commit that referenced this pull request May 19, 2026
…#68 fused + Rocq prep) (#133)

Four v1.0.4 infrastructure pieces grew real consumers:

  #130  Track 1+4  ægraph pipeline integration + Rocq Roundtrip prep doc
  #131  Track 2    #70 six-pass chain composition + task.return shim peephole
  #132  Track 3    #68 Tier-1.1 scalar adapter inlining + Tier-2.2 body dedup

+18 tests, 400+ total. Code-section bytes unchanged on the current
corpus — Track 3 agent flagged 4 pre-existing observations worth
tracking for v1.0.6+.
avrabe added a commit that referenced this pull request May 19, 2026
….1.0 Track E) (#136)

Adds the `meld_fused` workload entry to `scripts/measure_corpus.sh` and a
`tests/corpus/MELD_FUSED_README.md` placeholder documenting exactly what
the v1.0.5 Track 3 byte-wins surface needs: a real `meld fuse` of two
**dependent** components (one importing the other) so the fused core
contains a cross-memory scalar adapter and at least one dedup target.

This is a doc-only PR because the authoring agent could not invoke
`meld` (environment permission restriction). Per CLAUDE.md
"Honesty > shipping something fake": the harness now lists the workload
(it reports `n/a` until the fixture lands), and the README spells out the
toolchain, input requirements, and verification commands so the next
operator can complete the work in minutes.

The existing single-component fixtures (`simple.component.wasm`,
`calc.component.wasm`) are independent (neither imports the other) and
therefore cannot produce the cross-memory adapter shape Track 3 Tier-1.1
targets — that's why Track 3 fired byte-neutral on the v1.0.5 corpus.

Implements: TASK-corpus-meld-fixture
Refs: #132 (Track 3 passes), 0424b7e (component-aware harness)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>