wave-36d: t27c add gen-axi-lite-slave for BitNet host CSR interface (R-BN-4, Closes #766) #767

Merged: gHashTag merged 1 commit into master from wave-36d/axi-lite-slave on May 23, 2026

@gHashTag

Closes #766

Wave 36d lands the BitNet HLS host interface (R-BN-4): an AMBA AXI4-Lite slave with a 16-entry control/status register aperture. With this module, the previously emitted compute and buffering blocks (W36a-W36c) become host-controllable: a CPU can program DDR base addresses, layer geometry, the activation threshold, and IRQ enables; kick off inference via the CTRL register; and read retired-cycle telemetry.
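
A minimal host-side sketch of that programming sequence, using the offsets from the CSR map further down. The `Mmio` trait, `MockBus`, and the helper names are hypothetical and not part of t27c, and the value written to CTRL is an assumption, since the PR only describes CTRL as "enable / start".

```rust
use std::collections::BTreeMap;

/// Hypothetical MMIO abstraction (not part of t27c); a real driver would
/// perform volatile accesses to the mapped aperture instead.
trait Mmio {
    fn write32(&mut self, offset: u32, value: u32);
    fn read32(&self, offset: u32) -> u32;
}

/// In-memory stand-in for the 64-byte register aperture, for illustration only.
#[derive(Default)]
struct MockBus {
    regs: BTreeMap<u32, u32>,
}

impl Mmio for MockBus {
    fn write32(&mut self, offset: u32, value: u32) {
        self.regs.insert(offset, value);
    }
    fn read32(&self, offset: u32) -> u32 {
        self.regs.get(&offset).copied().unwrap_or(0)
    }
}

/// Write one 64-bit DDR base address as a LO/HI register pair.
fn write_addr64<B: Mmio>(bus: &mut B, lo_offset: u32, addr: u64) {
    bus.write32(lo_offset, addr as u32);
    bus.write32(lo_offset + 4, (addr >> 32) as u32);
}

/// Program geometry and DDR bases, then write CTRL last.
/// ASSUMPTION: writing 1 to CTRL starts the engine; the PR only says "enable / start".
fn start_inference<B: Mmio>(bus: &mut B, layers: u32, weights: u64, input: u64, output: u64) {
    bus.write32(0x10, layers); // NUM_LAYERS
    write_addr64(bus, 0x20, weights); // WEIGHT_ADDR_LO / _HI
    write_addr64(bus, 0x28, input); // INPUT_ADDR_LO / _HI
    write_addr64(bus, 0x30, output); // OUTPUT_ADDR_LO / _HI
    bus.write32(0x00, 1); // CTRL
}

/// Combine CYCLES_LO and CYCLES_HI into one 64-bit counter value.
fn read_cycles<B: Mmio>(bus: &B) -> u64 {
    let lo = u64::from(bus.read32(0x38));
    let hi = u64::from(bus.read32(0x3C));
    (hi << 32) | lo
}
```

Writing CTRL last is a design choice of this sketch, so the engine never observes a half-programmed configuration. Reading the cycle counter as two 32-bit halves is not atomic; a real driver may need to re-read the high half to detect a carry between the two accesses.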

W36d intentionally ships a single AXI module to keep the L4 test surface manageable. The DMA controller and IRQ controller land in W36e and W36f respectively.

What changed

  • bootstrap/src/bitnet_axi.rs (NEW, ~410 lines) — build_axi_lite_slave(module_name, addr_width, data_width) + Verilog-identifier validator + param clamp + 15 inline unit tests.
  • bootstrap/src/main.rs — registers mod bitnet_axi;, adds Commands::GenAxiLiteSlave + run_gen_axi_lite_slave() dispatched from both match arms via the shared write_verilog_to_output(...) helper.
  • bootstrap/tests/bitnet_axi.rs (NEW) — 18 integration tests covering module-name overrides, identifier rejection, parameter clamps, full write/read AXI port lists, full CSR port list, complete write/read case maps (16 readable offsets, 12 writable offsets), 32'hDEADBEEF default read, BRESP/RRESP OKAY semantics, handshake dropbacks, reset, ASCII, and help.
  • docs/NOW.md — Wave 36d section prepended.
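
For illustration, a Verilog-identifier validator of the kind mentioned above could look like the sketch below. The function name and the reserved-word list are assumptions; the actual rules in bootstrap/src/bitnet_axi.rs are not shown in this PR and may differ.

```rust
/// Hypothetical helper: accept only simple (non-escaped) ASCII Verilog
/// identifiers, i.e. a letter or underscore followed by letters, digits,
/// underscores, or '$'.
fn is_valid_verilog_identifier(name: &str) -> bool {
    // Deliberately incomplete reserved-word list, for illustration only.
    const RESERVED: [&str; 8] = [
        "module", "endmodule", "input", "output", "wire", "reg", "always", "begin",
    ];
    let mut chars = name.chars();
    let first_ok = match chars.next() {
        Some(c) => c.is_ascii_alphabetic() || c == '_',
        None => false, // the empty string is never a valid identifier
    };
    first_ok
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$')
        && !RESERVED.contains(&name)
}
```

Under these assumed rules, `axi_lite_slave` is accepted, while an empty name, `9lives`, `bad-name`, and `module` are rejected.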

CLI surface

t27c gen-axi-lite-slave [--module-name <name>] [--addr-width <N>] [--data-width <N>] [--output <path>]

Defaults: ADDR_WIDTH=8 (clamped 1..=16), DATA_WIDTH=32 (clamped 1..=64).
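
The defaults and clamps above can be sketched as follows. `clamp_params` and the constant names are hypothetical; only the default values and ranges come from this PR.

```rust
/// Defaults documented for the CLI.
const ADDR_WIDTH_DEFAULT: u32 = 8;
const DATA_WIDTH_DEFAULT: u32 = 32;

/// Hypothetical mirror of the documented behavior: missing values fall back
/// to the defaults, and out-of-range values are clamped rather than rejected.
fn clamp_params(addr_width: Option<u32>, data_width: Option<u32>) -> (u32, u32) {
    let addr = addr_width.unwrap_or(ADDR_WIDTH_DEFAULT).clamp(1, 16);
    let data = data_width.unwrap_or(DATA_WIDTH_DEFAULT).clamp(1, 64);
    (addr, data)
}
```

With this sketch, `clamp_params(None, None)` yields `(8, 32)` and `clamp_params(Some(0), Some(128))` yields `(1, 64)`. Note that the sample output decodes `s_axi_awaddr[5:2]`, so reaching all 16 registers presumably needs an ADDR_WIDTH of at least 6; whether the emitter guards against narrower widths is not stated here.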

CSR map (16 regs × 4 bytes = 64-byte aperture)

Offset  Name            Dir  Notes
0x00    CTRL            RW   enable / start
0x04    STATUS          RO   engine status
0x08    IRQ_EN          RW   per-cause interrupt enable
0x0C    IRQ_STAT        RO   latched interrupt causes
0x10    NUM_LAYERS      RW   inference depth
0x14    NEURONS         RW   per-layer width
0x18    CHUNKS          RW   chunks per neuron
0x1C    THRESHOLD       RW   ternary activation threshold
0x20    WEIGHT_ADDR_LO  RW   DDR weight base (low 32)
0x24    WEIGHT_ADDR_HI  RW   DDR weight base (high 32)
0x28    INPUT_ADDR_LO   RW   DDR input base (low)
0x2C    INPUT_ADDR_HI   RW   DDR input base (high)
0x30    OUTPUT_ADDR_LO  RW   DDR output base (low)
0x34    OUTPUT_ADDR_HI  RW   DDR output base (high)
0x38    CYCLES_LO       RO   retired cycle counter (low)
0x3C    CYCLES_HI       RO   retired cycle counter (high)

Reads to unmapped offsets return 32'hDEADBEEF. All responses OKAY.
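
As a rough model of the map above, the sketch below encodes the register order, the read-only set, and the word-index decode. All names are hypothetical, and the decode mirrors the sample output's use of address bits [5:2], so higher address bits are ignored here.

```rust
/// Register names in offset order; entry i lives at byte offset 4 * i.
const CSR_NAMES: [&str; 16] = [
    "CTRL", "STATUS", "IRQ_EN", "IRQ_STAT",
    "NUM_LAYERS", "NEURONS", "CHUNKS", "THRESHOLD",
    "WEIGHT_ADDR_LO", "WEIGHT_ADDR_HI", "INPUT_ADDR_LO", "INPUT_ADDR_HI",
    "OUTPUT_ADDR_LO", "OUTPUT_ADDR_HI", "CYCLES_LO", "CYCLES_HI",
];

/// Indices of the read-only registers: STATUS, IRQ_STAT, CYCLES_LO, CYCLES_HI.
const READ_ONLY: [usize; 4] = [1, 3, 14, 15];

/// Word index selected by address bits [5:2].
fn csr_index(byte_addr: u32) -> usize {
    ((byte_addr >> 2) & 0xF) as usize
}

/// True when the register at this byte address is marked RW in the map.
fn is_writable(byte_addr: u32) -> bool {
    !READ_ONLY.contains(&csr_index(byte_addr))
}

/// Split a 64-bit DDR base address into its (LO, HI) register values.
fn split_addr(addr: u64) -> (u32, u32) {
    (addr as u32, (addr >> 32) as u32)
}
```

Counting the writable entries in this model gives 12, which matches the 12 writable offsets covered by the integration tests.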

Sample output

t27c gen-axi-lite-slave --module-name axi_lite_slave:

module axi_lite_slave #(
    parameter ADDR_WIDTH = 8,
    parameter DATA_WIDTH = 32
) (
    input  wire                    clk,
    input  wire                    rst_n,
    // AXI-Lite Write Address channel
    input  wire [ADDR_WIDTH-1:0]   s_axi_awaddr,
    input  wire                    s_axi_awvalid,
    output reg                     s_axi_awready,
    // ... full 5-channel AXI-Lite port list ...
    // CSR outputs / inputs
    output reg  [31:0]             reg_ctrl,
    input  wire [31:0]             reg_status,
    // ... 12 more CSR ports ...
);
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            s_axi_awready <= 1'b1; s_axi_wready <= 1'b1; s_axi_bvalid <= 1'b0;
            // ... reset all regs ...
        end else begin
            // ---- Write channel ----
            if (s_axi_awvalid && s_axi_wvalid && s_axi_awready && s_axi_wready) begin
                case (s_axi_awaddr[5:2])
                    4'h0: reg_ctrl                 <= s_axi_wdata[31:0];
                    4'h8: reg_weight_addr[31:0]    <= s_axi_wdata[31:0];
                    4'h9: reg_weight_addr[63:32]   <= s_axi_wdata[31:0];
                    // ...
                endcase
                s_axi_bvalid <= 1'b1; s_axi_bresp <= 2'b00;
            end
            // ---- Read channel ----
            if (s_axi_arvalid && s_axi_arready) begin
                case (s_axi_araddr[5:2])
                    4'hE: s_axi_rdata <= reg_cycles[31:0];
                    4'hF: s_axi_rdata <= reg_cycles[63:32];
                    default: s_axi_rdata <= 32'hDEADBEEF;
                endcase
                s_axi_rvalid <= 1'b1; s_axi_rresp <= 2'b00;
            end
        end
    end
endmodule
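
To give a sense of how an emitter might produce the write-case arms seen in the sample, here is a small sketch that renders them from a table. `emit_write_arms` is a hypothetical name, and this is not the code in bootstrap/src/bitnet_axi.rs, which may build its output differently (the sample, for instance, column-aligns the assignments).

```rust
/// Hypothetical emitter fragment: render one Verilog case arm per
/// (word index, assignment target) pair.
fn emit_write_arms(arms: &[(u8, &str)]) -> String {
    let mut out = String::new();
    for (idx, target) in arms {
        // {:X} prints the index as an uppercase hex digit, e.g. 4'hE.
        out.push_str(&format!("4'h{:X}: {} <= s_axi_wdata[31:0];\n", idx, target));
    }
    out
}
```

Calling `emit_write_arms(&[(0, "reg_ctrl"), (8, "reg_weight_addr[31:0]")])` produces two arms, one per line, in the same form as the sample's case body minus the column alignment.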

Constitutional gates

  • L1 TRACEABILITY: Closes #766 in title, body, and commit message.
  • L2 SCOPE: Touches only bootstrap/ and docs/NOW.md. No edits to gen/, coq/, trios-coq/, proofs/, specs/, conformance/, architecture/, rings/, or root Cargo.toml.
  • L3 ASCII: Source + doc-comments are ASCII English-only.
  • L4 TESTS: 18 new integration tests added; all green.
  • L5 KERNEL UNTOUCHED: Numeric kernel and trinity invariant phi^2 + 1/phi^2 = 3 unmodified — emitter is control-plane only.
  • L6 SPEC FROZEN: Zero spec/kernel changes — pure additive codegen.
  • L7 NO NEW .sh: No shell scripts introduced.

Tests

  • W36d integration: 18/18 pass (cargo test -p t27c --release --test bitnet_axi).
  • Cross-wave regression: 98/98 pass (bitnet_buffers 22, bitnet_pipeline 20, weight_bram 13, phi_selfcheck 11, behavior_sva 8, trit_stdlib 14, verilog_* 5×2 = 10).
  • Total: 116/116.

Source attribution

Ported from gHashTag/vibee-lang src/vibeec/verilog_codegen.zig lines ~1344-1450 (writeAxiLiteSlave). Original author: Dmitrii Vasilev (@gHashTag).

Known orthogonal CI failures

The following jobs are inherited failures pre-dating Wave 31 and remain out of scope for this PR:

  • fpga-formal — formal engine harness.
  • fpga-synthesis-arty — Arty-board synthesis.
  • fpga-bitstream — bitstream build (always-pending → fail ~3 min).

Required gates (check, coverage, fpga-conformance, fpga-smoke, fpga-lint, fpga-synthesis, phi-loop-check, L1 traceability, plus the unit/integration suite) are expected to be green.

Roadmap

  • W36e: dma_controller (vibee-lang ~1452-1548) — streams activations between DDR and the on-chip activation BRAMs in parallel with compute.
  • W36f: interrupt_controller (~1550-1590) + bitnet_engine_top (~1667-1725) integration. Closes BitNet HLS at 9/9 components — end-to-end synthesizable.
  • W37: richer behavior-DSL (multi-clause antecedents, temporal ##N / s_eventually).
  • W38+: wire stdlib + behavior emitter into existing gen_verilog_* spec emits.

After W36d merges: BitNet HLS pipeline 6/9 components complete (weight_bram, pipeline_stage2_compute, layer_sequencer, double_buffer_ctrl, weight_prefetch_ctrl, axi_lite_slave).

@github-actions

📓 NotebookLM Notebook linked to this PR

This notebook contains session context, decisions, and artifacts for this work.

@github-actions

PR Dashboard

Generated at: 2026-05-23 12:12:56 UTC

Summary

Status                     Count
Total Open PRs             22
PRs with Failing Checks    20
PRs with All Checks Green  2
READY                      1
FAILING                    20
PENDING                    0

@gHashTag gHashTag merged commit 951b7c4 into master May 23, 2026
21 of 24 checks passed
@gHashTag gHashTag deleted the wave-36d/axi-lite-slave branch May 23, 2026 12:20