Commit 4d00a6c

12-op ISA and micro sequencer.
1 parent 5b8a7df commit 4d00a6c

47 files changed

Lines changed: 4931 additions & 2029 deletions

docs/command_protocol.md

Lines changed: 255 additions & 0 deletions
@@ -0,0 +1,255 @@
# Command Protocol

This document describes how to drive the live `tt_um_crockpotveggies_neuron` wrapper through the TinyTapeout pins.

The current protocol is a request/ready interface at the wrapper boundary. The wrapper synchronizes the external pins, de-duplicates the input request, decodes a `tt_cmd_t`, and forwards that command to the programmable neuron core.

## 1. Pin Map

### Inputs

- `ui_in[7:0]`: primary command payload byte
- `uio_in[0]`: input request / command valid
- `uio_in[1]`: output acknowledge
- `uio_in[7:2]`: sideband payload

### Outputs

- `uo_out[7:0]`: held output beat from the neuron core
- `uio_out[0]`: wrapper ready for the next command
- `uio_out[1]`: output valid
- `uio_out[7:2]`: always `0`
- `uio_oe[1:0] = 1`, `uio_oe[7:2] = 0`

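
The bit positions in the pin map can be captured in two small host-side helpers. This is a minimal sketch: the function names `pack_uio_in` and `unpack_uio_out` are hypothetical and not part of the repository.

```python
def pack_uio_in(req: bool, ack: bool, sideband: int) -> int:
    """Build the 8-bit uio_in value from its three documented fields."""
    if not 0 <= sideband < 64:
        raise ValueError("sideband payload is 6 bits wide")
    # uio_in[7:2] = sideband, uio_in[1] = output acknowledge, uio_in[0] = request
    return (sideband << 2) | (int(ack) << 1) | int(req)


def unpack_uio_out(value: int) -> dict:
    """Split uio_out into the two status bits the wrapper drives."""
    return {
        "ready": bool(value & 0b01),      # uio_out[0]
        "out_valid": bool(value & 0b10),  # uio_out[1]
    }
```
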
## 2. Input Handshake Rules

The live wrapper accepts exactly one command per assertion of `uio_in[0]`.

Host-side sequence:

1. Drive `ui_in` and `uio_in[7:2]` with the desired command payload.
2. Assert `uio_in[0]`.
3. Hold the payload stable until `uio_out[0]` is observed high.
4. Deassert `uio_in[0]`.
5. Only then begin the next command.

Important behavior:

- The wrapper internally synchronizes the incoming pins before latching a command.
- `in_req_seen` blocks duplicate acceptance while `uio_in[0]` remains high.
- Holding `uio_in[0]` high after acceptance will not enqueue repeated commands.
- A new command requires a fresh low-to-high transition of `uio_in[0]`.

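
The host-side sequence above might be sketched as follows. `FakePins` is a hypothetical stand-in for whatever pin access the host really has; its timing (ready rises a few polls after the request edge and drops when the request is released) is an assumption made only so the example runs, not a description of the RTL.

```python
class FakePins:
    """Hypothetical device stand-in: accepts once per rising edge of uio_in[0]."""

    def __init__(self, latency=3):
        self.ui_in = 0
        self.uio_in = 0
        self.accepted = []        # (ui_in, uio_in[7:2]) pairs the fake latched
        self._latency = latency
        self._countdown = None    # None means no request is pending

    def write(self, ui_in, uio_in):
        rising = bool(uio_in & 1) and not (self.uio_in & 1)
        self.ui_in, self.uio_in = ui_in & 0xFF, uio_in & 0xFF
        if rising:
            self._countdown = self._latency
        elif not (uio_in & 1):
            self._countdown = None

    def read_uio_out(self):
        if self._countdown is None:
            return 0
        if self._countdown > 0:
            self._countdown -= 1
            if self._countdown == 0:
                self.accepted.append((self.ui_in, self.uio_in >> 2))
        return 1 if self._countdown == 0 else 0


def send_command(pins, ui_in, sideband, max_polls=1000):
    """Run steps 1-4 of the input handshake for one command."""
    ack = pins.uio_in & 0b10                       # leave the output-ack bit alone
    payload = ((sideband & 0x3F) << 2) | ack
    pins.write(ui_in, payload)                     # step 1: payload, request low
    pins.write(ui_in, payload | 0b01)              # step 2: assert uio_in[0]
    try:
        for _ in range(max_polls):                 # step 3: hold until ready
            if pins.read_uio_out() & 0b01:
                return
        raise TimeoutError("uio_out[0] never went high")
    finally:
        pins.write(ui_in, payload)                 # step 4: deassert uio_in[0]
```

Releasing the request in a `finally` block keeps the next call starting from a low `uio_in[0]`, so each command gets the fresh low-to-high transition the wrapper requires.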
## 3. Output Handshake Rules

The core emits one held output beat at a time.

Host-side sequence:

1. Wait for `uio_out[1] = 1`.
2. Read `uo_out[7:0]`.
3. Assert `uio_in[1]` to acknowledge the beat.

The core will keep `uo_out[7:0]` stable until it sees `uio_in[1]` through the synchronized frontend.

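
A host-side reader for the three steps above could look like the sketch below. `FakeOutput` is a hypothetical device stand-in, and releasing `uio_in[1]` after the acknowledge is an assumption of this sketch; the text above only specifies asserting it.

```python
class FakeOutput:
    """Hypothetical device stand-in: holds one beat until acknowledged."""

    def __init__(self, beats):
        self._beats = list(beats)
        self.uio_in = 0

    def read_uio_out(self):
        return 0b10 if self._beats else 0          # uio_out[1] = output valid

    def read_uo_out(self):
        return self._beats[0] if self._beats else 0

    def write_uio_in(self, value):
        rising_ack = bool(value & 0b10) and not (self.uio_in & 0b10)
        self.uio_in = value & 0xFF
        if rising_ack and self._beats:
            self._beats.pop(0)                     # beat retired on acknowledge


def read_beat(pins, max_polls=1000):
    """Wait for a held beat, read it, then acknowledge it."""
    for _ in range(max_polls):                     # step 1: wait for uio_out[1]
        if pins.read_uio_out() & 0b10:
            break
    else:
        raise TimeoutError("uio_out[1] never went high")
    beat = pins.read_uo_out()                      # step 2: read uo_out[7:0]
    pins.write_uio_in(pins.uio_in | 0b10)          # step 3: assert uio_in[1]
    pins.write_uio_in(pins.uio_in & 0xFD)          # assumption: release the ack
    return beat
```
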
## 4. Command Classes

`tt_event_decode.sv` always projects the raw pin payload into all `tt_cmd_t` fields. The active meaning depends on `cmd.kind = ui_in[7:6]`.

### `CMD_CSR = 2'b00`

Writes one 8-bit CSR value.

Encoding:

- `ui_in[5:2] = csr_addr`
- `ui_in[1:0] = data[1:0]`
- `uio_in[7:2] = data[7:2]`

Reconstructed byte:

- `cmd.data = {uio_in[7:2], ui_in[1:0]}`

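
As a worked example of the split data byte, the sketch below encodes a CSR write into the two pin fields and reassembles it the way `cmd.data` is described above. The function names are hypothetical.

```python
CMD_CSR = 0b00


def encode_csr(csr_addr: int, data: int):
    """Return (ui_in, uio_in[7:2]) for one CMD_CSR write."""
    if not 0 <= csr_addr < 16 or not 0 <= data < 256:
        raise ValueError("csr_addr is 4 bits, data is 8 bits")
    ui_in = (CMD_CSR << 6) | (csr_addr << 2) | (data & 0b11)
    sideband = data >> 2                     # data[7:2] travels on uio_in[7:2]
    return ui_in, sideband


def decode_csr_data(ui_in: int, sideband: int) -> int:
    """Rebuild cmd.data = {uio_in[7:2], ui_in[1:0]}."""
    return ((sideband & 0x3F) << 2) | (ui_in & 0b11)
```

For instance, writing `0x0F` to CSR address `0x2` gives `ui_in = 0x0B` and a sideband value of `0x03`.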
### `CMD_WEIGHT = 2'b01`

Writes one host-supplied ternary weight.

Encoding:

- `ui_in[5:2] = synapse_id`
- `ui_in[1:0] = weight_code`

Weight codes:

- `2'b00` => `0`
- `2'b01` => `+1`
- `2'b11` => `-1`
- `2'b10` => treated as `0`

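
The weight-code table maps directly onto a pair of lookups. A minimal sketch, with hypothetical function names:

```python
CMD_WEIGHT = 0b01

# Host-side encoding of the three legal ternary weights.
WEIGHT_TO_CODE = {0: 0b00, +1: 0b01, -1: 0b11}
# Decode side: the unused code 2'b10 is treated as 0.
CODE_TO_WEIGHT = {0b00: 0, 0b01: +1, 0b11: -1, 0b10: 0}


def encode_weight(synapse_id: int, weight: int) -> int:
    """Return the ui_in byte for one CMD_WEIGHT write."""
    if not 0 <= synapse_id < 16:
        raise ValueError("synapse_id is 4 bits")
    if weight not in WEIGHT_TO_CODE:
        raise ValueError("weight must be -1, 0, or +1")
    return (CMD_WEIGHT << 6) | (synapse_id << 2) | WEIGHT_TO_CODE[weight]


def decode_weight_code(code: int) -> int:
    """Map a 2-bit weight code back to its ternary value."""
    return CODE_TO_WEIGHT[code & 0b11]
```
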
### `CMD_UCODE = 2'b10`

Streams one microcode byte into the 16x16 microcode store.

Encoding:

- `ui_in[1:0] = data[1:0]`
- `uio_in[7:2] = data[7:2]`

The destination byte address comes from `ucode_ptr_r`. After acceptance:

- `ucode_prog_we` pulses for one cycle
- `ucode_prog_addr = ucode_ptr_r`
- `ucode_prog_data = cmd.data`
- `ucode_ptr_r` auto-increments

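
The auto-incrementing pointer can be illustrated with a small behavioral model. This is a sketch under assumptions: the 32-byte size is inferred from the 16x16 store and the 5-bit pointer field in `CSR_UCODE_PTR`, the wrap-around at the end of the store is a guess, and driving `ui_in[5:2]` as zero is a choice made here because the encoding above leaves those bits unspecified.

```python
CMD_UCODE = 0b10
UCODE_BYTES = 32  # assumption: 16 entries x 16 bits, addressed as bytes


def encode_ucode_byte(byte: int):
    """Return (ui_in, uio_in[7:2]) for one CMD_UCODE byte."""
    if not 0 <= byte < 256:
        raise ValueError("microcode data is one byte")
    return (CMD_UCODE << 6) | (byte & 0b11), byte >> 2


class UcodeStoreModel:
    """Host-side model of where streamed bytes land."""

    def __init__(self):
        self.mem = bytearray(UCODE_BYTES)
        self.ptr = 0                       # mirrors ucode_ptr_r

    def set_ptr(self, value: int):
        self.ptr = value & 0x1F            # CSR_UCODE_PTR uses cmd.data[4:0]

    def accept(self, ui_in: int, sideband: int):
        data = ((sideband & 0x3F) << 2) | (ui_in & 0b11)
        self.mem[self.ptr] = data          # write at the current pointer
        self.ptr = (self.ptr + 1) % UCODE_BYTES  # assumption: wraps at the end
```
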
### `CMD_EVENT = 2'b11`

Queues one inbound event in the 2-entry FIFO.

Encoding:

- `ui_in[5:2] = sid`
- `ui_in[1:0] = tag`
- `uio_in[7:2] = event_time`

Queued event payload:

- `sid[3:0]`
- `tag[1:0]`
- `event_time[5:0]`

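
A short encode/decode pair makes the event field layout concrete. The function names are hypothetical.

```python
CMD_EVENT = 0b11


def encode_event(sid: int, tag: int, event_time: int):
    """Return (ui_in, uio_in[7:2]) for one CMD_EVENT."""
    if not (0 <= sid < 16 and 0 <= tag < 4 and 0 <= event_time < 64):
        raise ValueError("sid is 4 bits, tag is 2 bits, event_time is 6 bits")
    ui_in = (CMD_EVENT << 6) | (sid << 2) | tag
    return ui_in, event_time


def decode_event(ui_in: int, sideband: int):
    """Recover (sid, tag, event_time) from the raw pin fields."""
    return (ui_in >> 2) & 0xF, ui_in & 0b11, sideband & 0x3F
```

For example, `sid = 9`, `tag = 2`, `event_time = 37` encodes to `ui_in = 0xE6` with `37` on the sideband bits.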
## 5. `cmd_ready` Behavior

The neuron core computes readiness from the decoded command class.

### Non-event commands

`CMD_CSR`, `CMD_WEIGHT`, and `CMD_UCODE` are accepted whenever:

- `ena = 1`
- `rst_n = 1`
- no event is currently in flight (`busy_r = 0`)

They do not depend on FIFO occupancy or a held output beat, but they are intentionally blocked while the core is mid-event so an in-flight event sees a stable program image and weight configuration.

### Event commands

`CMD_EVENT` is accepted only when:

- `ena = 1`
- `rst_n = 1`
- no event is currently in flight (`busy_r = 0`)
- no output beat is currently held
- the registered FIFO level is not `2`

The live implementation uses a state-only fullness check (`fifo_level != 2`) to avoid a combinational loop through FIFO ready/pop logic, so a full FIFO will not accept a same-cycle replacement push.

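
The two readiness rules reduce to a small predicate. The sketch below is a host-side restatement of the conditions listed above, not the RTL; the parameter names `busy`, `have_out`, and `fifo_level` are chosen here to mirror the signals the text mentions.

```python
CMD_CSR, CMD_WEIGHT, CMD_UCODE, CMD_EVENT = 0b00, 0b01, 0b10, 0b11
FIFO_DEPTH = 2


def cmd_ready(kind, ena, rst_n, busy, have_out, fifo_level) -> bool:
    """Return True when a command of the given class would be accepted."""
    # Conditions shared by every command class.
    base = bool(ena) and bool(rst_n) and not busy
    if kind != CMD_EVENT:
        return base                       # FIFO and held beat are ignored
    # Events also need a free output slot and a non-full registered level.
    return base and not have_out and fifo_level != FIFO_DEPTH
```
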
## 6. CSR Map

The core uses a compact wrapper-owned CSR bank.

### `0x0` `CSR_CTRL`

Pulse bits:

- `bit0`: soft runtime reset
- `bit1`: clear held output beat
- `bit2`: clear event FIFO

### `0x1` `CSR_UCODE_PTR`

- `cmd.data[4:0]` sets the byte pointer used by `CMD_UCODE`

### `0x2` `CSR_UCODE_LEN`

- `cmd.data[3:0]` sets the last active microcode step index
- `0` means one active instruction
- `15` means sixteen active instructions

### `0x3` `CSR_VEC_BASE_01`

- `cmd.data[3:0]` => vector base for `tag 0`
- `cmd.data[7:4]` => vector base for `tag 1`

### `0x4` `CSR_VEC_BASE_23`

- `cmd.data[3:0]` => vector base for `tag 2`
- `cmd.data[7:4]` => vector base for `tag 3`

### `0x5` `CSR_INIT_VI`

- `cmd.data[3:0]` => reset/init value for `R0 = V`
- `cmd.data[7:4]` => reset/init value for `R1 = I`

### `0x6` `CSR_INIT_TR`

- `cmd.data[3:0]` => reset/init value for `R2 = TH`
- `cmd.data[7:4]` => reset/init value for `R3 = R`

### `0x7` `CSR_INIT_T01`

- `cmd.data[3:0]` => reset/init value for `R4 = T0`
- `cmd.data[7:4]` => reset/init value for `R5 = T1`

### `0x8` `CSR_INIT_WAUX`

- `cmd.data[3:0]` => reset/init value for `R6 = W`
- `cmd.data[7:4]` => reset/init value for `R7 = AUX`

This only changes the RF reset image for `W` and `AUX`.

It does not preload the persistent 16-entry synapse weight bank. Use `CMD_WEIGHT` for that.

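
On the host side the CSR map can be mirrored as constants plus one nibble-packing helper, since most registers carry two 4-bit fields. The addresses and bit positions come from the map above; the Python names `CTRL_*` and `pack_nibbles` are hypothetical.

```python
# CSR addresses from the map above.
CSR_CTRL = 0x0
CSR_UCODE_PTR = 0x1
CSR_UCODE_LEN = 0x2
CSR_VEC_BASE_01 = 0x3
CSR_VEC_BASE_23 = 0x4
CSR_INIT_VI = 0x5
CSR_INIT_TR = 0x6
CSR_INIT_T01 = 0x7
CSR_INIT_WAUX = 0x8

# CSR_CTRL pulse bits.
CTRL_SOFT_RESET = 1 << 0
CTRL_CLEAR_OUT = 1 << 1
CTRL_CLEAR_FIFO = 1 << 2


def pack_nibbles(low: int, high: int) -> int:
    """Place `low` in cmd.data[3:0] and `high` in cmd.data[7:4]."""
    if not (0 <= low < 16 and 0 <= high < 16):
        raise ValueError("each field is 4 bits")
    return (high << 4) | low


# Example: vector base 3 for tag 0 and vector base 10 for tag 1.
VEC_BASE_01_EXAMPLE = pack_nibbles(3, 10)
```
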
## 7. FIFO Semantics

The ingress FIFO is two entries deep and stores only event commands.

Properties:

- in-order delivery
- simultaneous push and pop supported
- `level` is `0`, `1`, or `2`
- `out_valid` mirrors whether slot 0 is occupied
- `clear` empties both entries on the next clock edge

The neuron core pops automatically whenever:

- `ena = 1`
- `rst_n = 1`
- `have_out_r = 0`
- `out_valid = 1`

There is no separate "run" command for service once an event has entered the FIFO.

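
The FIFO properties can be explored with a cycle-level behavioral model. This is a sketch, not the RTL: it gates pushes on the level registered before the clock edge (matching the state-only fullness check described for `cmd_ready`), and giving `clear` priority over push and pop in the same cycle is an assumption.

```python
class EventFifoModel:
    """Two-entry, in-order FIFO stepped one clock edge at a time."""

    DEPTH = 2

    def __init__(self):
        self.slots = []

    @property
    def level(self) -> int:
        return len(self.slots)

    @property
    def out_valid(self) -> bool:
        return bool(self.slots)            # slot 0 occupied

    def step(self, push=None, pop=False, clear=False):
        """Apply one clock edge and return the popped entry, if any."""
        if clear:                          # assumption: clear wins the cycle
            self.slots = []
            return None
        can_push = self.level < self.DEPTH  # uses the level before this edge
        popped = self.slots.pop(0) if pop and self.slots else None
        if push is not None and can_push:
            self.slots.append(push)
        return popped
```

With one entry stored, a simultaneous push and pop leaves the level at `1`; with two entries stored, the push is refused even though a pop happens on the same edge.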
## 8. Output Beat Encoding

`uo_out[7:0]` is generated by the `EMIT` micro-op and stored in `neuron_state`.

Layout:

- `uo_out[7] = 1`: valid marker
- `uo_out[6:5]`: emitted tag literal
- `uo_out[4:1]`: `last_sid`
- `uo_out[0]`: `spike_flag`

Only the first `EMIT` encountered during one event service pass is kept.

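
Decoding a beat on the host is a matter of masking the four fields in the layout above. A minimal sketch, with a hypothetical function name and dictionary keys:

```python
def decode_beat(beat: int) -> dict:
    """Split one uo_out byte into the fields of the output beat layout."""
    beat &= 0xFF
    return {
        "valid": bool(beat >> 7),          # uo_out[7]
        "tag": (beat >> 5) & 0b11,         # uo_out[6:5]
        "last_sid": (beat >> 1) & 0xF,     # uo_out[4:1]
        "spike": bool(beat & 1),           # uo_out[0]
    }
```

For example, `0xD3` (`0b1_10_1001_1`) decodes to a valid beat with tag `2`, `last_sid` `9`, and the spike flag set.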
## 9. Practical Programming Sequence

A typical host-side setup looks like this:

1. Program `CSR_UCODE_PTR` if you want to start writing microcode somewhere other than byte `0`.
2. Stream microcode bytes with `CMD_UCODE`.
3. Program `CSR_UCODE_LEN`.
4. Program `CSR_VEC_BASE_01` and `CSR_VEC_BASE_23`.
5. Program any desired initial RF values with the `CSR_INIT_*` registers.
6. Program weights with `CMD_WEIGHT`.
7. Send events with `CMD_EVENT`.

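
The setup steps above can be turned into an ordered list of `(ui_in, uio_in[7:2])` pairs, each of which would then be sent through the input handshake. This is a sketch under assumptions: the helper names are hypothetical, the microcode bytes passed in are opaque placeholders (the instruction encoding is not described in this document), step 5 is skipped so the reset defaults are kept, and unused payload bits are driven as zero.

```python
CMD_CSR, CMD_WEIGHT, CMD_UCODE, CMD_EVENT = 0b00, 0b01, 0b10, 0b11
CSR_UCODE_PTR, CSR_UCODE_LEN = 0x1, 0x2
CSR_VEC_BASE_01, CSR_VEC_BASE_23 = 0x3, 0x4
WEIGHT_CODES = {0: 0b00, 1: 0b01, -1: 0b11}


def pins(kind, field, low2, sideband=0):
    """Return (ui_in, uio_in[7:2]) for one command."""
    ui_in = (kind << 6) | ((field & 0xF) << 2) | (low2 & 0b11)
    return ui_in, sideband & 0x3F


def csr(addr, data):
    """CMD_CSR write: data[1:0] on ui_in, data[7:2] on the sideband."""
    return pins(CMD_CSR, addr, data, data >> 2)


def build_setup(ucode, last_step, vec_bases, weights, events):
    """Assemble the command list in the documented order."""
    b0, b1, b2, b3 = vec_bases
    cmds = [csr(CSR_UCODE_PTR, 0)]                                   # step 1
    cmds += [pins(CMD_UCODE, 0, b, b >> 2) for b in ucode]           # step 2
    cmds.append(csr(CSR_UCODE_LEN, last_step))                       # step 3
    cmds.append(csr(CSR_VEC_BASE_01, (b1 << 4) | b0))                # step 4
    cmds.append(csr(CSR_VEC_BASE_23, (b3 << 4) | b2))
    # Step 5 (CSR_INIT_*) is omitted in this sketch.
    cmds += [pins(CMD_WEIGHT, s, WEIGHT_CODES[w]) for s, w in weights]  # step 6
    cmds += [pins(CMD_EVENT, sid, tag, t) for sid, tag, t in events]    # step 7
    return cmds
```
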
Because non-event commands are blocked while `busy_r = 1`, the clean operating model is:

- program while idle
- then enqueue events
- then optionally perform more programming only after the current event retires

For a complete worked example, including how to map several core instances into a fully connected layer, see `docs/layer_examples.md`.
