wave-36b: t27c add gen-pipeline-stage2 + gen-layer-sequencer for BitNet SIMD compute + FSM (R-BN-2, Closes #762) #763
Merged
…et SIMD compute + FSM (R-BN-2)
- new module bootstrap/src/bitnet_pipeline.rs: two pure string emitters
for the next two BitNet HLS pipeline primitives.
* build_pipeline_stage2(module_name) emits the SIMD compute stage:
consumes 54-bit input/weight chunks, instantiates trit27_dot_product
(from trit_stdlib, W33), accumulates into a signed 16-bit
accumulator gated by first_chunk, strobes valid_out and result_final
on last_chunk, resets on negedge rst_n.
* build_layer_sequencer(module_name) emits the three-state FSM
(IDLE/RUN/DONE_ST) walking (neuron_id, chunk_id) across the
neuron-chunk grid and driving valid / first_chunk / last_chunk /
done strobes consumed by pipeline_stage2_compute.
Includes the same Verilog-identifier validator pattern as W36a so
invalid module names safely fall back to the canonical defaults.
21 inline unit tests cover module headers, port lists, instantiation
hookup, accumulator semantics, FSM transitions, reset behaviour, and
the ASCII-only invariant.
- main.rs: register mod bitnet_pipeline; add Commands::GenPipelineStage2
and Commands::GenLayerSequencer; dispatch in both HTTP-server and CLI
match arms via run_gen_pipeline_stage2 / run_gen_layer_sequencer.
Extracted a shared write_verilog_to_output(verilog, output, label)
helper from the existing per-subcommand boilerplate so future
emitters reuse one well-tested I/O path.
- bootstrap/tests/bitnet_pipeline.rs: 20 integration tests shelling out
to the two new subcommands, covering module header (default + custom +
invalid-name fallback), trit27_dot_product instantiation hookup,
54-bit chunk ports, signed-16 accumulator, negedge-rst_n reset,
first_chunk gating, last_chunk strobes, FSM three-state declaration,
port-list neuron/chunk counters, first/last-chunk strobes, IDLE arm
on start, DONE_ST -> IDLE transition, reset path, ASCII-only stdout,
and --output file writes for both subcommands.
- docs/NOW.md: prepend Wave 36b section.
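The identifier-validator fallback mentioned above can be sketched as follows. This is a minimal illustration, not the code in bootstrap/src/bitnet_pipeline.rs: the function names are hypothetical, and the rule assumed here is the simple (non-escaped) Verilog identifier form (a letter or underscore, followed by letters, digits, underscores, or `$`). It does not reject reserved words such as `module`, which the real validator may or may not do.

```rust
/// Returns true if `name` looks like a simple (non-escaped) Verilog identifier.
/// Assumed rule: first character is an ASCII letter or `_`; the rest are
/// ASCII letters, digits, `_`, or `$`. Reserved words are NOT checked here.
fn is_simple_verilog_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false, // empty string or bad first character
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$')
}

/// Hypothetical helper: keep the requested module name when it is valid,
/// otherwise fall back to the canonical default.
fn module_name_or_default<'a>(requested: &'a str, default: &'a str) -> &'a str {
    if is_simple_verilog_identifier(requested) {
        requested
    } else {
        default
    }
}

fn main() {
    let default = "pipeline_stage2_compute";
    for candidate in ["my_stage2", "2fast", "bad name", ""] {
        println!("{:?} -> {}", candidate, module_name_or_default(candidate, default));
    }
}
```

Falling back rather than returning an error keeps the emitters pure string functions: every call yields a well-formed module header, which is the behaviour the invalid-name integration tests are described as checking.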
Constitution: L1 traceability, L2 bootstrap-only (no edits under gen/,
coq/, trios-coq/, proofs/, specs/, conformance/, architecture/, rings/,
root Cargo.toml), L3 ASCII source + English doc-comments, L4 new tests
added and green (20 integration + 21 inline unit), L5 numeric kernel
untouched (emitters wire together existing primitives only), L6 zero
spec/kernel changes, L7 no new *.sh scripts.
Algorithms ported from gHashTag/vibee-lang src/vibeec/verilog_codegen.zig
lines ~1100-1145 (writePipelineStage2) and lines ~1147-1190
(writeLayerSequencer). Second slice of the W36 BitNet HLS pipeline port;
W36c will add double_buffer_ctrl + AXI-Lite / DMA / IRQ scaffolding.
BitNet HLS pipeline progress: 3/6 modules.
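As a reading aid for the ported writePipelineStage2 behaviour, the accumulator semantics can be modelled in software. The sketch below is an assumption-laden model, not the emitted Verilog: `Stage2Model`, `step`, and `reset` are hypothetical names, `dot` stands in for the trit27_dot_product output, overflow is modelled as wrapping, and result_final is assumed to include the last chunk's contribution.

```rust
/// Software model of the compute stage's accumulator (hypothetical names).
#[derive(Debug, Default)]
struct Stage2Model {
    acc: i16, // models the signed 16-bit accumulator
}

impl Stage2Model {
    /// Models one clock edge. `dot` stands in for the trit27_dot_product result.
    /// Returns Some(result_final) on the cycle where valid_out would strobe.
    fn step(&mut self, valid: bool, first_chunk: bool, last_chunk: bool, dot: i16) -> Option<i16> {
        if !valid {
            return None; // nothing happens without a valid chunk
        }
        // first_chunk gating: start from `dot` instead of the stale accumulator.
        // Wrapping on overflow is an assumption of this sketch.
        let next = if first_chunk { dot } else { self.acc.wrapping_add(dot) };
        self.acc = next;
        if last_chunk { Some(next) } else { None }
    }

    /// Models the asynchronous reset (rst_n low): the accumulator is cleared.
    fn reset(&mut self) {
        self.acc = 0;
    }
}

fn main() {
    let mut stage = Stage2Model::default();
    let dots = [5i16, -3, 12];
    let mut result = None;
    for (i, &dot) in dots.iter().enumerate() {
        result = stage.step(true, i == 0, i + 1 == dots.len(), dot);
    }
    println!("neuron 0: {:?}", result); // Some(14)
    // A new neuron starts with first_chunk set, so the old sum is discarded.
    println!("neuron 1: {:?}", stage.step(true, true, true, 7)); // Some(7)
    stage.reset();
    println!("after reset: {}", stage.acc); // 0
}
```

Because first_chunk overwrites rather than adds, the accumulator never needs an explicit clear between neurons; the reset path only matters at power-up or on error recovery.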
Closes #762
📓 NotebookLM Notebook linked to this PR
This notebook contains session context, decisions, and artifacts for this work.
Closes #762
Wave 36b lands the second BitNet HLS pipeline module (R-BN-2): the SIMD compute stage with signed-16 accumulator (pipeline_stage2_compute) and the neuron/chunk-grid FSM (layer_sequencer) that drives it. Both emitters are exposed as fresh t27c subcommands so downstream FPGA flows can pin exact Verilog modules. The compute stage consumes the trit27_dot_product primitive emitted by gen-trit-stdlib (Wave 33), wiring the BitNet stack together end-to-end.
What changed
- bootstrap/src/bitnet_pipeline.rs (NEW, ~330 lines): build_pipeline_stage2(module_name) + build_layer_sequencer(module_name) + Verilog-identifier validator + 21 inline unit tests.
- bootstrap/src/main.rs: registers mod bitnet_pipeline;, adds Commands::GenPipelineStage2 + Commands::GenLayerSequencer, plus run_gen_pipeline_stage2() / run_gen_layer_sequencer() dispatched from both match arms. Boilerplate de-duplicated into a new write_verilog_to_output(verilog, output, label) helper.
- bootstrap/tests/bitnet_pipeline.rs (NEW): 20 integration tests covering module-name overrides, identifier rejection, port lists, FSM states, accumulator semantics, output writing, and trit27_dot_product wiring.
- docs/NOW.md: Wave 36b section prepended.
CLI surface
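Both subcommands accept --output, and the shared write_verilog_to_output(verilog, output, label) helper is what handles it. Only the helper's name and argument order come from this PR; the sketch below guesses the rest. In particular, the parameter types, the stdout fallback when no path is given, and the status message on stderr are all assumptions.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Sketch of a shared output helper (types and messages are assumptions).
/// With `Some(path)` the Verilog is written to that file; with `None` it is
/// printed to stdout so the subcommand can be piped.
fn write_verilog_to_output(verilog: &str, output: Option<&Path>, label: &str) -> io::Result<()> {
    match output {
        Some(path) => {
            fs::write(path, verilog)?;
            // Status goes to stderr so stdout stays clean for piping.
            eprintln!("{}: wrote {} bytes to {}", label, verilog.len(), path.display());
        }
        None => {
            let stdout = io::stdout();
            let mut handle = stdout.lock();
            handle.write_all(verilog.as_bytes())?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Placeholder text, not the real emitter output.
    let verilog = "module pipeline_stage2_compute();\nendmodule\n";

    // No --output: emit to stdout.
    write_verilog_to_output(verilog, None, "gen-pipeline-stage2")?;

    // With --output: write to a file, then read it back to confirm.
    let path = std::env::temp_dir().join("pipeline_stage2_compute_sketch.v");
    write_verilog_to_output(verilog, Some(&path), "gen-pipeline-stage2")?;
    assert_eq!(fs::read_to_string(&path)?, verilog);
    fs::remove_file(&path)
}
```

Routing every emitter through one function means the file-versus-stdout decision and its error handling are tested once instead of per subcommand.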
Sample output
t27c gen-pipeline-stage2 --module-name pipeline_stage2_compute:
t27c gen-layer-sequencer --module-name layer_sequencer:
Constitutional gates
- Closes #762 in title, body, and commit message.
- bootstrap/ and docs/NOW.md. No edits to gen/, coq/, trios-coq/, proofs/, specs/, conformance/, architecture/, rings/, or root Cargo.toml.
- phi^2 + 1/phi^2 = 3 unmodified.
Tests
- cargo test -p t27c --release --test bitnet_pipeline).
- weight_bram 13, phi_selfcheck 11, behavior_sva 8, trit_stdlib 14, verilog_* 5×2 = 10).
Source attribution
Ported from gHashTag/vibee-lang src/vibeec/verilog_codegen.zig:
- writePipelineStage2: lines ~1100-1145.
- writeLayerSequencer: lines ~1147-1190.
Original author: Dmitrii Vasilev (@gHashTag).
Known orthogonal CI failures
The following jobs are inherited failures pre-dating Wave 31 and remain out of scope for this PR:
- fpga-formal: formal engine harness.
- fpga-synthesis-arty: Arty-board synthesis.
- fpga-bitstream: bitstream build (always-pending → fail ~3 min).
Required gates (check, coverage, fpga-conformance, plus the unit/integration suite) are expected to be green.
Roadmap
- double_buffer_ctrl (ping-pong activation buffers, vibee-lang ~lines 1192-1225) + AXI-Lite / DMA / IRQ scaffolding: closes the BitNet HLS pipeline (6/6 modules).
- ##N/s_eventually): extension of Wave 34.
- gen_verilog_* spec emits: first wave that needs L2/L6 reconsideration.
After W36b merges: BitNet HLS pipeline 3/6 modules complete (weight_bram, pipeline_stage2_compute, layer_sequencer).
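For readers who want to reason about the layer_sequencer walk without a simulator, the three-state FSM can be modelled in software. The following is a sketch under stated assumptions rather than a transcription of the emitted Verilog: the grid dimensions `num_neurons` and `num_chunks` are hypothetical parameters, chunk_id is assumed to be the inner counter, and done is assumed to strobe for the single cycle spent in DONE_ST before returning to IDLE.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Idle,
    Run,
    DoneSt,
}

/// Outputs of one modelled clock cycle.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Outputs {
    neuron_id: usize,
    chunk_id: usize,
    valid: bool,
    first_chunk: bool,
    last_chunk: bool,
    done: bool,
}

struct Sequencer {
    state: State,
    neuron_id: usize,
    chunk_id: usize,
    num_neurons: usize, // hypothetical grid height
    num_chunks: usize,  // hypothetical grid width
}

impl Sequencer {
    fn new(num_neurons: usize, num_chunks: usize) -> Self {
        assert!(num_neurons > 0 && num_chunks > 0, "grid must be non-empty");
        Sequencer { state: State::Idle, neuron_id: 0, chunk_id: 0, num_neurons, num_chunks }
    }

    /// Advances the model by one clock and returns that cycle's outputs.
    fn tick(&mut self, start: bool) -> Outputs {
        match self.state {
            State::Idle => {
                if start {
                    // Arm on start: rewind both counters and begin the walk.
                    self.neuron_id = 0;
                    self.chunk_id = 0;
                    self.state = State::Run;
                }
                Outputs::default()
            }
            State::Run => {
                let out = Outputs {
                    neuron_id: self.neuron_id,
                    chunk_id: self.chunk_id,
                    valid: true,
                    first_chunk: self.chunk_id == 0,
                    last_chunk: self.chunk_id + 1 == self.num_chunks,
                    done: false,
                };
                if self.chunk_id + 1 < self.num_chunks {
                    self.chunk_id += 1; // next chunk of the same neuron
                } else if self.neuron_id + 1 < self.num_neurons {
                    self.chunk_id = 0; // wrap to the next neuron
                    self.neuron_id += 1;
                } else {
                    self.state = State::DoneSt; // grid exhausted
                }
                out
            }
            State::DoneSt => {
                self.state = State::Idle; // DONE_ST -> IDLE
                Outputs { done: true, ..Outputs::default() }
            }
        }
    }
}

/// Runs one full pass and returns the (neuron_id, chunk_id) pairs seen with valid high.
fn walk_grid(num_neurons: usize, num_chunks: usize) -> Vec<(usize, usize)> {
    let mut seq = Sequencer::new(num_neurons, num_chunks);
    let mut visited = Vec::new();
    let mut out = seq.tick(true);
    while !out.done {
        if out.valid {
            visited.push((out.neuron_id, out.chunk_id));
        }
        out = seq.tick(false);
    }
    visited
}

fn main() {
    // Prints [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
    println!("{:?}", walk_grid(2, 3));
}
```

Feeding the valid / first_chunk / last_chunk outputs of this model into a software model of the compute stage gives a quick end-to-end check of the handshake before running the generated RTL.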