Show current CPU clock, DDR clock, and per-die HPM characterization #161

@widgetii

Motivation

When two boards of the same chip variant behave differently (encoder fps, sensor capture rate, throughput), the first useful diagnostic question is "are they running at the same CPU / DDR clock?". Today the answer requires manual devmem + per-platform PLL formula decoding. A bundled tool would surface this in one command.

Importantly: on V4 family Hisilicon/Goke (gk7205v200/v300, hi3516ev200, etc.) the mask ROM sets PLL multipliers at boot based on per-die HPM (Hardware Performance Monitor) characterization — different physical chips, even from the same batch, can end up at different runtime clocks. Today there's no easy way to spot this.

Concrete case from the field

Two goke,gk7205v300 boards, same OTP 0x12020084 = 0x02115888, same chip ID at 0x12020088/8c/90, same hardware — but:

Register                         Board A (OpenIPC)   Board B (vendor)   Decoded
0x12010080 (CRG32 / DDR_CKSEL)   0x00000549          0x00000549         DDR @ 450 MHz (both)
0x12010014 (CPU PLL FBDIV)       0x01770000          0x018F0000         CPU = 952 vs 1144 MHz
0x1201000c (peripheral PLL)      0x018F0000          0x01970000         peri = 1144 vs 1208 MHz
0x1202015c (HPM_CHECK_REG)       0x00F30000          0x00F60000         HPM = 243 vs 246
0x120280d8 (HPM_CORE_REG0)       0x81080102          0x80AF00AE         per-die monitor reading

Same batch, different per-die HPM → mask ROM picks different PLL multiplier → 20% CPU clock gap → 36% memcpy throughput gap → wire fps difference on encoder.

Until I dumped these registers and decoded them manually, the issue looked like a software / configuration problem. It wasn't — it was silicon-binning, invisible without this view.

Proposed subcommand

ipctool clocks         (or: ipctool freq)

Output (V4 / gk7205v300 example):

Chip:           gk7205v300
OTP_CPU_CLK:    0x02115888  (mux_chn=0)
HPM:            sys_hpm_core=243 (range 190-310, [bin: low])
                core0=0x102 core1=0x102

PLL frequencies (derived from CRG):
  CPU PLL          952 MHz    (CRG[0x14]: FBDIV=0x77)
  Peripheral PLL   1144 MHz   (CRG[0x0c]: FBDIV=0x8F)
  DDR cksel        450 MHz    (CRG[0x80] bits[5:3]=001)

CPU running:       952 MHz    (cpufreq governor 'performance')
DDR controller @   0x120d0000:
  PLL frequency      450 MHz
  Data rate          1800 Mbps  (DDR3-1800 equiv)

The bin level (low/high/...) should reflect the documented HPM thresholds the mask ROM uses to select the PLL multiplier; those are platform-specific. For V4 family the thresholds are visible in u-boot source (u-boot-gk7205v200/arch/arm/cpu/armv7/gk7205v300/lowlevel_init_v300.c: HPM_CORE_VALUE_MIN/MAX, HPM_CORE_MIN/MAX).

A --json variant should be added so this is easy to consume from scripts (e.g., comparing a fleet of boards).
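As a rough sketch of what the --json variant could emit, the helper below formats a few of the values from the example output into a single JSON object. The struct, the function name, and every JSON key are hypothetical placeholders for illustration, not an existing ipctool format.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical report structure; field names are illustrative only. */
struct clocks_report {
    const char *chip;      /* e.g. "gk7205v300" */
    uint32_t hpm;          /* sys_hpm_core value */
    uint32_t cpu_pll_mhz;
    uint32_t peri_pll_mhz;
    uint32_t ddr_mhz;
};

/*
 * Formats the report as one JSON object into buf.
 * Returns the snprintf() result: the number of characters that would
 * have been written, so a value >= len means the output was truncated.
 * Note: the chip string is not JSON-escaped; this assumes plain
 * alphanumeric chip names.
 */
static int clocks_to_json(char *buf, size_t len, const struct clocks_report *r)
{
    return snprintf(buf, len,
        "{\"chip\":\"%s\",\"hpm\":%u,\"cpu_pll_mhz\":%u,"
        "\"peri_pll_mhz\":%u,\"ddr_mhz\":%u}",
        r->chip, (unsigned)r->hpm, (unsigned)r->cpu_pll_mhz,
        (unsigned)r->peri_pll_mhz, (unsigned)r->ddr_mhz);
}
```

One object per board on a single line would keep fleet comparison simple: collect the output from each unit and diff or sort on the clock fields.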

Implementation notes

The register addresses + bit decode are well-defined per-SoC-family — V4 family alone is ~5 chips with the same CRG layout. Suggested structure:

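#include <stdint.h>

struct clock_info {
    const char *name;      // human-readable clock domain name
    uint32_t reg;          // physical address of the CRG register
    int fbdiv_shift;       // bit offset of FBDIV in the reg
    uint32_t fbdiv_mask;   // mask applied after shifting (encodes the field width)
    int refdiv;            // typically 1
    int postdiv;           // typically 3 on V4 family
    int input_mhz;         // reference clock in MHz (24 MHz crystal)
};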

Then a per-platform table maps each PLL/clock domain.
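A minimal sketch of such a table and its decode step might look like the following. The register addresses come from the dump in this issue. The FBDIV position (bits [23:16], i.e. shift 16 and mask 0xFF) is an assumption inferred from 0x01770000 decoding to FBDIV=0x77, and the names v4_clocks and clock_mhz are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the struct suggested above; the mask is applied after the shift. */
struct clock_info {
    const char *name;
    uint32_t reg;          /* physical address */
    int fbdiv_shift;       /* bit offset of FBDIV */
    uint32_t fbdiv_mask;   /* mask applied after shifting */
    int refdiv;
    int postdiv;
    int input_mhz;
};

/*
 * Hypothetical V4 table. Addresses are taken from the register dump in
 * this issue; shift/mask are inferred, not taken from a datasheet.
 */
static const struct clock_info v4_clocks[] = {
    { "CPU PLL",        0x12010014, 16, 0xFF, 1, 3, 24 },
    { "Peripheral PLL", 0x1201000c, 16, 0xFF, 1, 3, 24 },
};

static const size_t v4_clock_count = sizeof(v4_clocks) / sizeof(v4_clocks[0]);

/* Decodes a raw register value into MHz: input * FBDIV / (refdiv * postdiv). */
static uint32_t clock_mhz(const struct clock_info *ci, uint32_t raw)
{
    uint32_t fbdiv = (raw >> ci->fbdiv_shift) & ci->fbdiv_mask;
    return (uint32_t)ci->input_mhz * fbdiv / (uint32_t)(ci->refdiv * ci->postdiv);
}
```

With the raw values from the two boards, clock_mhz(&v4_clocks[0], 0x01770000) yields 952 and clock_mhz(&v4_clocks[0], 0x018F0000) yields 1144, matching the manually decoded figures. Keeping the raw register read separate from the decode lets the table be unit-tested without hardware.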

Where helpful, also report:

  • HPM bin classification (low/medium/high) per documented thresholds
  • Voltage / SVB state (SYS_CTRL_VOLT_REG etc.)
  • A short note when the chip is on the low bin so users understand why their identical-spec board underperforms.
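The bin classification could be sketched roughly as below. The threshold struct is a placeholder: real values would have to come from the per-platform u-boot sources, and none are hard-coded here. Extracting the HPM value from the upper 16 bits of HPM_CHECK_REG is an assumption inferred from 0x00F30000 decoding to 243. All names are hypothetical.

```c
#include <stdint.h>

enum hpm_bin { HPM_BIN_LOW, HPM_BIN_MEDIUM, HPM_BIN_HIGH };

/* Per-platform thresholds; to be filled in from documented values. */
struct hpm_thresholds {
    uint32_t low_max;     /* hpm <= low_max    -> low    */
    uint32_t medium_max;  /* hpm <= medium_max -> medium, above -> high */
};

/* Assumption: the HPM value sits in the upper half-word of HPM_CHECK_REG. */
static uint32_t hpm_from_check_reg(uint32_t raw)
{
    return (raw >> 16) & 0xFFFF;
}

/* Maps an HPM value onto a bin using the supplied thresholds. */
static enum hpm_bin hpm_classify(uint32_t hpm, const struct hpm_thresholds *t)
{
    if (hpm <= t->low_max)
        return HPM_BIN_LOW;
    if (hpm <= t->medium_max)
        return HPM_BIN_MEDIUM;
    return HPM_BIN_HIGH;
}

/* Returns the label printed in the "[bin: ...]" field. */
static const char *hpm_bin_name(enum hpm_bin bin)
{
    switch (bin) {
    case HPM_BIN_LOW:    return "low";
    case HPM_BIN_MEDIUM: return "medium";
    default:             return "high";
    }
}
```

Passing the thresholds in as data, rather than compiling them into the classifier, keeps the per-family differences in one table alongside the clock definitions.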

Why this matters operationally

  • Diagnosing fps gaps: confirms hardware-vs-software root cause in seconds.
  • Fleet uniformity check: identify which units in a batch landed in the lower bin (potentially useful for QA / RMA decisions on dev boards).
  • Cross-chip comparison: lets people quickly answer "is the cv500 actually running at 700 MHz like the datasheet says, or is the mask ROM downclocking it?"
  • Bootloader sanity: confirms u-boot / mask ROM brought everything up at the expected speeds; a stuck PLL / failed training shows up here.

Related

Companion request: OpenIPC/ipctool#160 — DDR bandwidth benchmarking subcommand. The bandwidth tool measures the result; this tool explains the cause when results differ.

For V4 family decode, the formulas are:

  • f_pll = INPUT_HZ × FBDIV / (REFDIV × POSTDIV1 × POSTDIV2)
  • on gk7205v300: INPUT=24 MHz, REFDIV × POSTDIV1 × POSTDIV2 = 3 typically
  • DDR cksel field (CRG 0x80 bits [5:3]):
    • 0b000 → 24 MHz, 0b001 → 450 MHz, 0b011 → 300 MHz, 0b100 → 297 MHz
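The DDR cksel mapping above can be expressed as a small lookup. This is a sketch that covers only the four encodings listed in this issue; the function name is hypothetical, and returning 0 for any other encoding is an illustrative choice rather than documented behaviour.

```c
#include <stdint.h>

/*
 * Decodes the DDR clock select field (bits 3..5 of CRG offset 0x80)
 * into MHz. Returns 0 for encodings not listed in this issue.
 */
static unsigned ddr_cksel_mhz(uint32_t crg80)
{
    switch ((crg80 >> 3) & 0x7) {
    case 0x0: return 24;
    case 0x1: return 450;
    case 0x3: return 300;
    case 0x4: return 297;
    default:  return 0;   /* unknown encoding */
    }
}
```

For the value seen on both boards, 0x00000549, the field is (0x549 >> 3) & 0x7 = 1, which gives 450 MHz.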

Per-family table needed for other Hisilicon V1-V5 / Goke variants; the V4 case is the well-documented one.
