Observation
When synth lowers a wasm i32.load from a constant address that's known to be in .data/.rodata, it emits an absolute-address load rather than a base-relative one:
2f5ee: f240 1c00 movw ip, #256 ; build absolute address
2f5f2: f2c2 0c00 movt ip, #8192 ; = 0x20000100
2f5f8: f8dc 3000 ldr.w r3, [ip] ; load from absolute
That's three 32-bit Thumb-2 instructions (12 bytes) per load. The usual ARM convention is base-register-relative addressing:
ldr r3, [r5, #imm] ; one instruction, 2-4 bytes
assuming a base register (often r9 for .data, pc for .rodata literal pools) is reserved.
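To make the disassembly above easier to check, here is a minimal Python sketch of how a 32-bit address splits into the movw/movt immediates and how those two instructions encode. The helper names `split_movw_movt` and `encode_mov16` are hypothetical and are not part of synth. The bit layout follows the Thumb-2 MOVW (T3) and MOVT (T1) encodings.

```python
def split_movw_movt(addr: int) -> tuple[int, int]:
    """Return (movw immediate, movt immediate) for a 32-bit address."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    return addr & 0xFFFF, addr >> 16


def encode_mov16(top: bool, rd: int, imm16: int) -> tuple[int, int]:
    """Encode Thumb-2 MOVW (top=False) or MOVT (top=True) as two halfwords.

    imm16 is scattered across the encoding as imm4:i:imm3:imm8.
    """
    imm4 = (imm16 >> 12) & 0xF
    i = (imm16 >> 11) & 0x1
    imm3 = (imm16 >> 8) & 0x7
    imm8 = imm16 & 0xFF
    hw1 = (0xF2C0 if top else 0xF240) | (i << 10) | imm4
    hw2 = (imm3 << 12) | (rd << 8) | imm8
    return hw1, hw2


IP = 12  # ip is an alias for r12

lo, hi = split_movw_movt(0x20000100)
movw = encode_mov16(False, IP, lo)
movt = encode_mov16(True, IP, hi)
print(f"movw ip, #{lo}  -> {movw[0]:04x} {movw[1]:04x}")
print(f"movt ip, #{hi} -> {movt[0]:04x} {movt[1]:04x}")
```

For 0x20000100 this reproduces the `f240 1c00` and `f2c2 0c00` halfword pairs shown in the listing. Each is a 4-byte encoding, as is the `ldr.w` that follows.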
Impact
In the gale-ffi spike, the bench's hot-path z_impl_k_sem_give uses linear-memory loads to read sem->count and sem->limit. Under synth, each one becomes a movw + movt + ldr triplet, adding ~6 bytes per load vs the conventional ldrd r2, r1, [r0, #8]. Cumulatively, this accounts for ~20% of the size delta vs LLVM-LTO.
Recommendation
Reserve a base register for the wasm linear memory at function entry (or rely on the existing convention if there is one) and lower constant-address i32.load/i32.store to base+offset addressing. A literal pool could also be used for far constants.
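The selection rule this recommendation implies could be sketched as follows. This is an illustration only, not synth's actual lowering code: the `Lowering` type, the `lower_const_load` function, the choice of r9 as base register, and the example base addresses are all assumptions. It models only the 32-bit `ldr.w` form with an unsigned 12-bit offset (0..4095) and falls back to the current movw/movt sequence otherwise. Negative offsets, 16-bit encodings, and literal pools are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lowering:
    asm: list[str]
    size_bytes: int


def lower_const_load(addr: int, base_addr: int, rt: str = "r3",
                     base_reg: str = "r9", scratch: str = "ip") -> Lowering:
    """Pick a lowering for a load from a compile-time-constant address.

    base_addr is the (assumed) value held in base_reg at run time.
    """
    offset = addr - base_addr
    if 0 <= offset <= 4095:
        # Fits the ldr.w immediate form: one 4-byte instruction.
        return Lowering([f"ldr.w {rt}, [{base_reg}, #{offset}]"], 4)
    # Out of range: build the absolute address, as synth does today.
    lo, hi = addr & 0xFFFF, (addr >> 16) & 0xFFFF
    return Lowering(
        [
            f"movw {scratch}, #{lo}",
            f"movt {scratch}, #{hi}",
            f"ldr.w {rt}, [{scratch}]",
        ],
        12,
    )


near = lower_const_load(0x20000100, base_addr=0x20000000)
far = lower_const_load(0x20000100, base_addr=0x10000000)
print(near.asm, near.size_bytes)
print(far.asm, far.size_bytes)
```

With the base register pointing at 0x20000000, the load from 0x20000100 becomes a single `ldr.w r3, [r9, #256]` (4 bytes) instead of the 12-byte triplet. The far case is where a literal pool would be the alternative to movw/movt.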
Cross-references