
silicon: NUCLEO-G474RE anchor protocol scaffolding (board-prep) #37

Merged
avrabe merged 5 commits into main from feat/silicon-anchor-nucleo-g474re
May 11, 2026

Conversation

@avrabe
Contributor

@avrabe avrabe commented May 3, 2026

Summary

Scaffolding for the silicon-anchor protocol, so that when the
NUCLEO-G474RE arrives this week, taking a capture is one command:
`bash silicon/capture.sh --board nucleo_g474re --variant baseline --sweep long`.
No firmware code touched; no CI changes. Manual flow only,
recorded directly into the repo.

Companion to PR #36 (overhead compensation + SCOPE.md). Composes
cleanly on top — when #36 lands, silicon captures will already
emit the `overhead_cycles,` metadata and be apples-to-apples
with the post-compensation Renode CI numbers.

Architecture

  • CI = Renode, deterministic, parallel-safe, every PR.
  • Silicon = manual, periodic, single board shared across the
    team, recorded as immutable evidence in `silicon/runs//`.
    Citeable from any blog post via stable git URL.

The relationship `silicon_median / renode_median` per RPM step is
what the anchor establishes. Once consistent across multiple
captures it can be cited as the Renode-silicon multiplier for that
bench/board combination.
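
As an illustration of the ratio described above, here is a minimal sketch that groups samples by RPM step, takes the median of each group, and divides silicon by Renode. The `(rpm, cycles)` tuple layout, the function names, and the sample numbers are assumptions made for this example; they are not the actual `events.csv` schema or the analyzer's code.

```python
from collections import defaultdict
from statistics import median


def median_by_rpm(rows):
    """Group (rpm, cycles) pairs by RPM step and take the median of each group."""
    groups = defaultdict(list)
    for rpm, cycles in rows:
        groups[rpm].append(cycles)
    return {rpm: median(values) for rpm, values in groups.items()}


def anchor_multipliers(silicon_rows, renode_rows):
    """Return silicon_median / renode_median for every RPM step present in both runs."""
    silicon = median_by_rpm(silicon_rows)
    renode = median_by_rpm(renode_rows)
    shared_steps = sorted(silicon.keys() & renode.keys())
    return {rpm: silicon[rpm] / renode[rpm] for rpm in shared_steps}


# Made-up sample numbers, for illustration only.
silicon_rows = [(1000, 420), (1000, 440), (1000, 430), (2000, 450), (2000, 470)]
renode_rows = [(1000, 200), (1000, 220), (1000, 210), (2000, 230), (2000, 230)]
multipliers = anchor_multipliers(silicon_rows, renode_rows)
for rpm, ratio in multipliers.items():
    print(f"{rpm} rpm: silicon/renode = {ratio:.3f}")
```

Only RPM steps present in both captures are compared, so a truncated silicon run yields fewer ratios rather than misleading ones.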

What's in this PR

| Path | What |
| --- | --- |
| `silicon/README.md` | Protocol doc: why we anchor, the recorded-run-in-git convention, capture procedure for NUCLEO-G474RE, comparison workflow against Renode CI, anchor cadence, don't-do-this list (overwriting dated dirs, mixing pre/post-overhead-compensation captures, claiming WCET) |
| `silicon/capture.sh` | Build + flash + capture + tag + manifest, one invocation. Auto-detects the USB serial port on macOS / Linux. Refuses to overwrite an existing dated dir. |
| `silicon/capture.py` | Cross-platform pyserial UART capture. Reads until `=== END ===` or wall-clock timeout. |
| `silicon/boards/nucleo_g474re/` | Board overlay (currently empty; Zephyr defaults are right) + board notes (clock, UART, programming) |
| `silicon/runs/.gitkeep` | Placeholder; first dated capture goes here |
| `README.md` | Methodology section now points at `silicon/` |
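
The "reads until `=== END ===` or wall-clock timeout" behaviour can be sketched roughly as below. This is not the contents of `silicon/capture.py`; the function name and return shape are assumptions. The loop only needs an object with a `readline()` method that returns bytes, so an in-memory stream stands in for the serial port here.

```python
import io
import time

SENTINEL = b"=== END ==="


def capture_until_sentinel(port, timeout_s, clock=time.monotonic):
    """Read lines from `port` until the sentinel line or the deadline, whichever comes first.

    Returns (captured_bytes, saw_sentinel).
    """
    deadline = clock() + timeout_s
    captured = bytearray()
    while clock() < deadline:
        line = port.readline()
        if not line:
            continue  # nothing arrived within the read timeout; poll again until the deadline
        captured += line
        if SENTINEL in line:
            return bytes(captured), True
    return bytes(captured), False


# Stand-in for the UART: an in-memory stream instead of a real serial port.
fake_uart = io.BytesIO(b"rpm,cycles\n1000,430\n=== END ===\nignored\n")
data, saw_end = capture_until_sentinel(fake_uart, timeout_s=1.0)
print(saw_end, len(data))
```

On hardware, the same function could presumably be handed a `serial.Serial(port_name, 115200, timeout=1)` object from pyserial, where the per-read `timeout` keeps `readline()` from blocking past the wall-clock deadline.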

Each captured run will commit:

  • `output.csv` — raw firmware UART
  • `events.csv` — tagged through `tag_events.py`
  • `firmware.elf` + `firmware.elf.sha256`
  • `manifest.txt` — board, MCU, gale_sha, rustc, west, zephyr_sha, ELF sha256, capture timestamp, port, timeout
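
A minimal sketch of how the ELF hash and a `key: value` manifest might be produced for one run directory. The real work is done by `capture.sh`; the helper names and the `elf_sha256` and `captured_at` keys below are hypothetical, and the ELF bytes are a placeholder.

```python
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large ELFs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(run_dir, fields):
    """Write firmware.elf.sha256 and manifest.txt next to firmware.elf in run_dir."""
    run_dir = Path(run_dir)
    digest = sha256_of(run_dir / "firmware.elf")
    (run_dir / "firmware.elf.sha256").write_text(f"{digest}  firmware.elf\n")
    lines = [f"{key}: {value}" for key, value in fields.items()]
    lines.append(f"elf_sha256: {digest}")  # hypothetical key name
    lines.append(f"captured_at: {datetime.now(timezone.utc).isoformat()}")  # hypothetical key name
    (run_dir / "manifest.txt").write_text("\n".join(lines) + "\n")
    return digest


# Demo against a throwaway directory with placeholder ELF bytes.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "firmware.elf").write_bytes(b"\x7fELF placeholder")
    digest = write_manifest(tmp, {"board": "nucleo_g474re", "port": "/dev/ttyACM0"})
    manifest_text = (Path(tmp) / "manifest.txt").read_text()
print(manifest_text)
```

Recording the hash both in its own file and in the manifest lets a reader verify the committed ELF without parsing the manifest.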

CSV files are small (~50–500 KB per long-sweep run). At one
capture per board per major bench-relevant commit, repo growth is
modest.

Test plan

  • `bash -n capture.sh` syntax-clean
  • `python3 -c 'import ast; ast.parse(open(...))'` clean
  • First end-to-end capture on the NUCLEO-G474RE when it
    arrives — that's the validation that matters; this PR is
    preparation only.

Out of scope

  • ESP32-C3-DevKit-RUST-1 board: separate work, will need its own
    Renode equivalent built first (RISC-V port of engine_control)
    before silicon makes sense. Tracked separately.
  • Analyzer `--silicon-anchor` flag for single-call Renode-vs-silicon
    side-by-side rendering: deferred until the first capture exists
    to test against.

🤖 Generated with Claude Code

@codecov

codecov Bot commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


avrabe added a commit that referenced this pull request May 9, 2026
A publication-grade silicon-anchor capture is the matrix
  variant ∈ {baseline, gale} × tick_source ∈ {systick, lptim}
not just two variants — LPTIM has different jitter and ISR-overhead
characteristics than the Cortex-M default SysTick, so the silicon /
renode multiplier must be reported per tick_source to be meaningful.

Changes:

- capture.sh
  - new --tick-source {systick,lptim} flag (default: systick)
  - OVERLAY_CONFIG composed from up to 3 ordered layers:
      1. gale overlay (when --variant gale)
      2. board silicon overlay (silicon/boards/<board>/prj.conf)
      3. tick-source overlay (silicon/boards/<board>/prj-tick-<src>.conf)
  - tick_source embedded in BUILD_DIR and RUN_DIR so 4 runs don't collide
  - manifest gains `tick_source:` field
  - summary block + post-capture commit hint reflect the 4-run protocol

- silicon/boards/nucleo_g474re/prj-tick-lptim.conf
  - new overlay enabling STM32_LPTIM_TIMER and disabling CORTEX_M_SYSTICK
  - documented clock-source caveat: LSE-clocked LPTIM cannot sustain
    the bench's 100 kHz tick; a DT overlay layering LPTIM1 onto PCLK1
    is needed for apples-to-apples vs SysTick — flagged in the board
    README as a follow-up

- silicon/README.md
  - run-dir naming now includes tick_source
  - capture procedure shows the 4-run loop
  - smoke-run instruction added (drop --sweep long, omit --tick-source)
  - commit hint updated to grab all 4 dirs at once

- silicon/boards/nucleo_g474re/README.md
  - new "Kernel tick sources" section with the per-source overlay table
    and the LPTIM clock-source caveat
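
The layering and the 4-run matrix described above can be sketched as follows. The board paths follow the patterns named in this commit message; the gale overlay path, the function name, and the `exists` callback are hypothetical, and joining with `;` assumes the build accepts a semicolon-separated `OVERLAY_CONFIG` list.

```python
from itertools import product
from pathlib import PurePosixPath


def compose_overlays(board, variant, tick_source, gale_overlay, exists):
    """Build the ordered overlay list: gale, then board, then tick source.

    Layers whose file is absent are skipped, hence "up to 3" layers.
    """
    board_dir = PurePosixPath("silicon/boards") / board
    candidates = []
    if variant == "gale":
        candidates.append(PurePosixPath(gale_overlay))
    candidates.append(board_dir / "prj.conf")
    candidates.append(board_dir / f"prj-tick-{tick_source}.conf")
    return ";".join(str(path) for path in candidates if exists(path))


# Files assumed to be present for this demo (the gale path is a made-up placeholder).
present = {
    "overlays/gale.conf",
    "silicon/boards/nucleo_g474re/prj.conf",
    "silicon/boards/nucleo_g474re/prj-tick-lptim.conf",
}


def exists(path):
    return str(path) in present


matrix = list(product(["baseline", "gale"], ["systick", "lptim"]))
for variant, tick_source in matrix:
    overlays = compose_overlays("nucleo_g474re", variant, tick_source, "overlays/gale.conf", exists)
    print(f"{variant}/{tick_source}: {overlays}")
```

Because later overlays win on conflicting symbols, placing the tick-source layer last lets it override both the gale and board settings.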

No firmware code touched (still consistent with PR #37's stated scope).
Smart-data emission (DWT counters + STM32 self-monitoring) is the
follow-up PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@avrabe avrabe force-pushed the feat/silicon-anchor-nucleo-g474re branch from 2732035 to e5d8827 Compare May 11, 2026 03:26
avrabe and others added 5 commits May 11, 2026 05:28
CI = Renode (deterministic, parallel-safe). Silicon captures are
manual and periodic, taken on one shared board per architecture.
Recorded captures live in the repo as immutable evidence, citeable
from any blog post via stable git URLs. This commit is the
scaffolding — protocol doc, build wrapper, board overlay, capture
script — that makes a silicon capture a flash-and-go operation
the moment hardware is in hand.

Files:

  silicon/README.md
    Protocol: why we silicon-anchor, the recorded-run-in-git
    convention, the capture procedure for the NUCLEO-G474RE, the
    comparison workflow against Renode CI, anchor cadence, and
    the don't-do-this list (overwriting, mixing pre/post-overhead-
    compensation captures, claiming WCET).

  silicon/capture.sh
    Build + flash + capture + tag + manifest, in one invocation.
    --board nucleo_g474re --variant {baseline,gale} [--sweep ...].
    Auto-detects the serial port on macOS / Linux. Refuses to
    overwrite an existing dated dir.

  silicon/capture.py
    Cross-platform pyserial UART capture. Reads until '=== END ===',
    times out at the wall clock, writes the raw stream to a file.

  silicon/boards/nucleo_g474re/{README.md,prj.conf}
    Board notes + (currently empty) Kconfig overlay. Cortex-M4F + FPU
    @ 170 MHz, ST-Link/V3E with VCP at 115200, DWT_CYCCNT works
    identically to stm32f4_disco. Closest production-shape silicon
    to our existing Renode target.

  silicon/runs/.gitkeep
    Placeholder; first dated capture goes in here.

Each captured run will commit:
  - output.csv (raw firmware UART)
  - events.csv (tagged through tag_events.py)
  - firmware.elf + firmware.elf.sha256
  - manifest.txt (board, MCU, gale_sha, rustc, west, zephyr_sha,
    ELF sha256, capture timestamp, port, timeout)

Manual flow only — no CI changes. README updated to point at
silicon/ from the methodology section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-on fixups on the silicon-anchor capture wrapper, surfaced while
preparing first-capture for the NUCLEO-G474RE on macOS:

1. --help printed nothing on macOS. The sed extractor used GNU-only `\?`
   for "0 or 1 space"; on BSD sed the pattern is treated as a literal `?`
   and never matches. Replaced with a portable awk one-liner that also
   skips the shebang line.

2. The manifest's `csv_sha256:` line had `| awk '{print $1}'` outside
   the `$(...)` command-substitution, so the manifest got the literal
   pipeline text instead of the hash. Wrapped the `||` group in
   `{ ...; }` so the pipe applies to either branch.

Both are cosmetic, but they block automated parsing of the manifest and
discovery of the script's own usage.
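
For readers unfamiliar with the header-comment help pattern behind fix 1, the extraction logic amounts to roughly the following. The actual fix is an awk one-liner inside `capture.sh`; this Python restatement, its function name, and the sample script text are illustrative assumptions only.

```python
import re


def extract_usage(script_text):
    """Return the leading comment block of a script as help text.

    Skips the shebang, strips '#' plus at most one following space,
    and stops at the first non-comment line.
    """
    usage = []
    for index, line in enumerate(script_text.splitlines()):
        if index == 0 and line.startswith("#!"):
            continue
        if not line.startswith("#"):
            break
        usage.append(re.sub(r"^# ?", "", line))
    return "\n".join(usage)


# Made-up script header, for illustration only.
sample = (
    "#!/usr/bin/env bash\n"
    "# Usage: capture.sh --board BOARD\n"
    "#\n"
    "#   --variant VARIANT\n"
    "set -e\n"
    "# not part of the help\n"
)
print(extract_usage(sample))
```

The `# ?` pattern is the "0 or 1 space" match that the original sed expression spelled with the GNU-only `\?`.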
…CK=n alone

Local smoke build of the silicon-anchor scaffolding on real Zephyr SDK
1.0.1 + arm-zephyr-eabi-gcc 14.3.0 against the actual Zephyr workspace
revealed the original `prj-tick-lptim.conf` doesn't actually switch the
kernel tick to LPTIM. Both `baseline/lptim` and `gale/lptim` built
configurations failed to link with:

  zephyr/kernel/libkernel.a(timeout.c.obj): in function `elapsed':
    timeout.c:70: undefined reference to `sys_clock_elapsed'
  zephyr/kernel/libkernel.a(busy_wait.c.obj):
    misc.h:26: undefined reference to `sys_clock_cycle_get_32'

…meaning *no* tick driver was being compiled in. Setting
`CONFIG_STM32_LPTIM_TIMER=y` was being silently ignored by Kconfig
because of unmet dependencies in
`zephyr/drivers/timer/Kconfig.stm32_lptim`:

  depends on dt_nodelabel_exists(stm32_lp_tick_source)  ← OK on G4
  depends on DT_HAS_ST_STM32_LPTIM_ENABLED              ← OK on G4
  depends on CLOCK_CONTROL && PM                        ← MISSING
  select TICKLESS_CAPABLE

Upstream `nucleo_g474re.dts` already labels `&lptim1` as the
`stm32_lp_tick_source` and sets `status="okay"` with LSI clocks, so the
DT side is fine — the only piece missing was `CONFIG_PM=y`, which lets
`STM32_LPTIM_TIMER`'s `default y` fire and the driver source actually
compile.

Replaces `CONFIG_STM32_LPTIM_TIMER=y` (redundant once PM enables it via
default) with `CONFIG_PM=y`. Keeps `CONFIG_CORTEX_M_SYSTICK=n` so the
SysTick driver doesn't compile in parallel and race with LPTIM for the
system-clock-driver init slot. Comment block reframed to explain the
real Kconfig dependency chain rather than the speculative DT-overlay
caveat.
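
Because Kconfig drops an assignment with unmet dependencies without failing the build, a post-build check of the merged `.config` (normally `build/zephyr/.config` in a Zephyr build) would have caught this earlier. The sketch below is a hypothetical helper, not part of this PR; the sample `.config` excerpts are made up to mirror the before and after states described above.

```python
import re


def enabled_symbols(dotconfig_text):
    """Return the set of Kconfig symbols set to y in a generated .config."""
    return set(re.findall(r"^CONFIG_(\w+)=y$", dotconfig_text, flags=re.MULTILINE))


def check_tick_driver(dotconfig_text, wanted="STM32_LPTIM_TIMER", unwanted="CORTEX_M_SYSTICK"):
    """List problems with the tick-driver selection; an empty list means it looks right."""
    enabled = enabled_symbols(dotconfig_text)
    problems = []
    if wanted not in enabled:
        problems.append(f"{wanted} is not enabled")
    if unwanted in enabled:
        problems.append(f"{unwanted} is still enabled")
    return problems


# Made-up excerpts: before the fix (PM missing) and after it (CONFIG_PM=y).
before_fix = (
    "CONFIG_CLOCK_CONTROL=y\n"
    "# CONFIG_PM is not set\n"
    "# CONFIG_STM32_LPTIM_TIMER is not set\n"
    "# CONFIG_CORTEX_M_SYSTICK is not set\n"
)
after_fix = (
    "CONFIG_CLOCK_CONTROL=y\n"
    "CONFIG_PM=y\n"
    "CONFIG_STM32_LPTIM_TIMER=y\n"
    "# CONFIG_CORTEX_M_SYSTICK is not set\n"
)
print(check_tick_driver(before_fix))
print(check_tick_driver(after_fix))
```

Checking the generated file rather than the overlay is the point: the overlay states intent, while `.config` shows what Kconfig actually accepted.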

Verified locally: all 4 variants (baseline/gale × systick/lptim) now
link cleanly. The lptim variant carries the PM subsystem (~120 KB
ELF growth, 1% extra flash, ~600 B extra RAM) — that's the cost of
using LPTIM as the kernel tick on this part.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eset

Two fixes surfaced when running capture.sh on the bench for real:

1. west flash on nucleo_g474re defaults to the stm32cubeprogrammer
   runner, which requires ST's proprietary STM32CubeProgrammer.app
   that most Linux/macOS dev setups don't have installed. The board
   also configures the openocd runner (which is brew-installable on
   macOS, package-managed on Linux), but it's not the default.
   Add a --runner flag to capture.sh, default openocd, with
   pass-through to `west flash`. Include the choice in the manifest.

2. Even with the openocd runner, west flash via Zephyr 4.4.0-rc3 on
   STM32G4 + CONFIG_PM=y leaves the chip *halted* after writing the
   image — no implicit reset+run is issued, so the firmware never
   starts and the UART stays silent. Add an explicit
     openocd init reset run sleep 200 exit
   step between flash and the serial capture. NB: do NOT pipe openocd
   through head/grep — SIGPIPE on early close kills openocd before
   it processes `reset run`, leaving the chip halted just the same.
   Capture full openocd output to /tmp/silicon-reset-<board>.log
   instead, with a 0.5s grace before opening the serial port so the
   sentinel-search window aligns cleanly with the bench's CSV stream.
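
The "log everything, never pipe" point in fix 2 can be sketched like this. `capture.sh` does this in shell; the Python helper below, its name, and the stand-in command are assumptions. A real call would presumably pass an openocd command such as `["openocd", "-f", "<board cfg>", "-c", "init", "-c", "reset run", "-c", "sleep 200", "-c", "exit"]`, where the config path is a placeholder.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def reset_and_wait(cmd, log_path, grace_s=0.5):
    """Run a reset command to completion with all output sent to a log file.

    Writing to a file (rather than piping into head/grep) means the child can
    never be killed by SIGPIPE before it has processed its final commands.
    """
    with open(log_path, "wb") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=False)
    time.sleep(grace_s)  # short grace period before the serial port is opened
    return result.returncode


# Stand-in command so the sketch runs without openocd or a board attached.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "silicon-reset-demo.log"
    returncode = reset_and_wait([sys.executable, "-c", "print('reset run')"], log_file, grace_s=0.1)
    log_bytes = log_file.read_bytes()
print(returncode, log_bytes)
```

Returning the exit code lets the caller abort the capture when the reset itself failed, instead of waiting out the serial timeout on a halted chip.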

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@avrabe avrabe force-pushed the feat/silicon-anchor-nucleo-g474re branch from e5d8827 to ace0c3a Compare May 11, 2026 03:28
@avrabe avrabe merged commit 9d8e9f8 into main May 11, 2026
55 of 59 checks passed
@avrabe avrabe deleted the feat/silicon-anchor-nucleo-g474re branch May 11, 2026 03:29