docs: Audit crate boundaries and responsibilities#32
Merged
Conversation
Created comprehensive crate structure documentation: docs/architecture/CRATE_STRUCTURE.md includes: - Summary table of all 14 crates with line counts and dependencies - ASCII dependency graph showing crate relationships - Compilation pipeline diagram - Detailed breakdown of each crate's purpose - Answered key questions: - synth-opt vs synth-synthesis: optimization vs instruction selection - synth-codegen vs synth-backend: assembly generation vs binary emission - Refactoring recommendations (conclusion: structure is already good) Key finding: Crate structure is well-designed with clear separation. Priority should be wiring the pipeline, not restructuring crates. Closes #9
avrabe
added a commit
that referenced
this pull request
Mar 1, 2026
…ault Shift operations (SHL, SHR_S, SHR_U, ROTL, ROTR) had `shift: 0` hardcoded, using immediate-shift ARM instructions. In WASM, the shift amount is always dynamic (popped from stack), so these must use register-based forms (LslReg, AsrReg, LsrReg, RorReg). ROTL is implemented as RSB tmp, rm, #32 + ROR rd, rn, tmp (rotate left by N = rotate right by 32-N). Also: CLI now compiles all exported functions by default when no --func-index or --func-name is specified, fixing the "only first function compiled" issue (#35). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
avrabe
added a commit
that referenced
this pull request
Mar 2, 2026
…aching 188 Qed / 52 Admitted Phase 4: Align Rocq model with Rust instruction selector for shift/rotate ops. ARM model changes: - Add LSL_reg, LSR_reg, ASR_reg, ROR_reg (register-based shifts) to ArmInstructions.v - Add RSB (reverse subtract) instruction for rotl compilation - Add semantics for all 5 new instructions in ArmSemantics.v Compilation alignment: - I32/I64 shifts now use register-based instructions (LSL_reg R0 R0 R1, etc.) - I32/I64 rotl now uses RSB R2 R1 #32; ROR_reg R0 R0 R2 (matching Rust) - I32/I64 rotr now uses ROR_reg R0 R0 R1 Proofs closed: - i32_shl_correct, i32_shru_correct, i32_shrs_correct (via synth_binop_proof) - i32_rotr_correct (via synth_binop_proof) - i32_rotl_correct (explicit witness with set/subst + get_set_reg lemmas) - Added I32.rotl_rotr_sub axiom connecting rotl to rotr via complement T1 result-correspondence: 34 → 39 (+5 shift/rotate) CorrectnessI32.v: 29 Qed / 0 Admitted (all i32 ops now have T1 proofs) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
avrabe
added a commit
that referenced
this pull request
May 10, 2026
…x-M (#93) Pre-fix, `optimizer_bridge::wasm_to_ir` had no handler for `WasmOp::I64ExtendI32U` / `I64ExtendI32S` / `I32WrapI64`. They fell through the catch-all `_ => Opcode::Nop`, advancing `inst_id` by 1 but never registering the produced i64-pair vregs in `vreg_to_arm`. When a downstream `Opcode::I64ShrU` / `I64Shl` looked up its shift-count vreg via `get_arm_reg`, the lookup missed the map and silently fell back to `Reg::R0` (`optimizer_bridge.rs:1333`). The ARM emitter for the variable i64 shift writes to both `rm_lo` and `rm_hi` as scratch (`AND.W rm_lo, rm_lo, #63; SUBS.W rm_hi, rm_lo, #32; ...`) — so any function that did `i64.shr_u` of an `i64.extend_i32_u`-ed shift count clobbered the AAPCS first-param register R0 inside the shift expansion. For `compiler_builtins::memset`, R0 is the destination pointer, so the byte loop's pointer was destroyed every iteration → non-terminating loop on silicon at `memset+0x4c`. This patch: - Adds `Opcode::I64ExtendI32U`, `Opcode::I64ExtendI32S`, and `Opcode::I32WrapI64` to `synth-opt`. - Handles those WasmOps in `wasm_to_ir`, with slot accounting that keeps `inst_id.saturating_sub(K)` arithmetic correct for downstream i64 ops (extend reuses the consumed i32 slot for `dest_lo`; wrap reuses the i64-lo slot and decrements `inst_id` so the natural +1 cancels with the -1 net slot delta). - Lowers them in `ir_to_arm`. Critically, the extend lowerings allocate a callee-saved consecutive pair via `alloc_i64_pair` and Mov the i32 source into `dest_lo` even when the source already lives in a non-param register. This ensures the downstream i64-shift's `rm_lo`/`rm_hi` (which the emitter treats as clobbered scratch) are never AAPCS param registers. - Updates `analyze_i64_local_gets` to skip the i32 LocalGet that feeds an `I64ExtendI32U`/`I64ExtendI32S` so the analyzer doesn't mistakenly mark it as i64. Regression test: `crates/synth-synthesis/tests/issue_93_memset_i64_codegen.rs` exercises the bug pattern in the optimized path (4-param function shifting an `i64.const` by `i64.extend_i32_u(local.get $n)`) and asserts the emitted i64-shift's `rm_lo`/`rm_hi` are not in R0..R3. Pre-fix: 3 of 5 tests fail (one per shift variant: shr_u, shl, shr_s). Post-fix: 5/5 pass. The no-optimize path was always correct and is exercised by a sanity test in the same file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
avrabe
added a commit
that referenced
this pull request
May 11, 2026
…x-M (#93) Pre-fix, `optimizer_bridge::wasm_to_ir` had no handler for `WasmOp::I64ExtendI32U` / `I64ExtendI32S` / `I32WrapI64`. They fell through the catch-all `_ => Opcode::Nop`, advancing `inst_id` by 1 but never registering the produced i64-pair vregs in `vreg_to_arm`. When a downstream `Opcode::I64ShrU` / `I64Shl` looked up its shift-count vreg via `get_arm_reg`, the lookup missed the map and silently fell back to `Reg::R0` (`optimizer_bridge.rs:1333`). The ARM emitter for the variable i64 shift writes to both `rm_lo` and `rm_hi` as scratch (`AND.W rm_lo, rm_lo, #63; SUBS.W rm_hi, rm_lo, #32; ...`) — so any function that did `i64.shr_u` of an `i64.extend_i32_u`-ed shift count clobbered the AAPCS first-param register R0 inside the shift expansion. For `compiler_builtins::memset`, R0 is the destination pointer, so the byte loop's pointer was destroyed every iteration → non-terminating loop on silicon at `memset+0x4c`. This patch: - Adds `Opcode::I64ExtendI32U`, `Opcode::I64ExtendI32S`, and `Opcode::I32WrapI64` to `synth-opt`. - Handles those WasmOps in `wasm_to_ir`, with slot accounting that keeps `inst_id.saturating_sub(K)` arithmetic correct for downstream i64 ops (extend reuses the consumed i32 slot for `dest_lo`; wrap reuses the i64-lo slot and decrements `inst_id` so the natural +1 cancels with the -1 net slot delta). - Lowers them in `ir_to_arm`. Critically, the extend lowerings allocate a callee-saved consecutive pair via `alloc_i64_pair` and Mov the i32 source into `dest_lo` even when the source already lives in a non-param register. This ensures the downstream i64-shift's `rm_lo`/`rm_hi` (which the emitter treats as clobbered scratch) are never AAPCS param registers. - Updates `analyze_i64_local_gets` to skip the i32 LocalGet that feeds an `I64ExtendI32U`/`I64ExtendI32S` so the analyzer doesn't mistakenly mark it as i64. Regression test: `crates/synth-synthesis/tests/issue_93_memset_i64_codegen.rs` exercises the bug pattern in the optimized path (4-param function shifting an `i64.const` by `i64.extend_i32_u(local.get $n)`) and asserts the emitted i64-shift's `rm_lo`/`rm_hi` are not in R0..R3. Pre-fix: 3 of 5 tests fail (one per shift variant: shr_u, shl, shr_s). Post-fix: 5/5 pass. The no-optimize path was always correct and is exercised by a sanity test in the same file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Comprehensive audit of the 14 Synth crates to document responsibilities, identify overlaps, and propose improvements.
Deliverable
Created
docs/architecture/CRATE_STRUCTURE.mdwith:Key Finding
The pipeline flows cleanly:
Recommendation
Focus on wiring the pipeline together rather than restructuring crates.
Issues #10 and #11 (merge/split crates) can be closed as "won't fix" since the analysis shows the current structure is already appropriate.
Closes #9