std_detect: support detecting more features on aarch64 Windows #155856

Merged: rust-bors[bot] merged 1 commit into rust-lang:main from imazen:winarm-stable-features on Apr 30, 2026

@lilith
Contributor

@lilith lilith commented Apr 27, 2026

Wires IsProcessorFeaturePresent calls for the PF_ARM_* constants exposed in Windows SDK 26100 (Windows 11 24H2), plus an architectural derivation for rdm. All eight feature names have been stable in is_aarch64_feature_detected! on Linux/Darwin/BSD since Rust 1.60 — this brings the Windows backend to parity.

| Feature | Source on Windows |
|---|---|
| `fp16` | `PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE` (value 67) |
| `i8mm` | `PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE` (value 66) |
| `bf16` | `PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE` (value 68) |
| `sha3` | `PF_ARM_SHA3` (value 64) **AND** `PF_ARM_SHA512` (value 65) |
| `lse2` | `PF_ARM_LSE2_AVAILABLE` (value 62) |
| `f32mm` | `PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE` (value 58) |
| `f64mm` | `PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE` (value 59) |
| `rdm` | derived from `PF_ARM_V82_DP` (see below) |
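
For readers who want the one-flag rows of the table in code form, here is a minimal sketch. It is not the std_detect implementation: `DIRECT_FEATURES`, `detect_direct`, and the `probe` closure are hypothetical names, with `probe` standing in for a call to `IsProcessorFeaturePresent`. The numeric values are the ones listed in the table.

```rust
// PF_ARM_* values as listed in the PR's table (Windows SDK 26100).
const PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE: u32 = 58;
const PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE: u32 = 59;
const PF_ARM_LSE2_AVAILABLE: u32 = 62;
const PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE: u32 = 66;
const PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE: u32 = 67;
const PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE: u32 = 68;

/// Hypothetical table: features that map to exactly one PF_ARM_* flag.
const DIRECT_FEATURES: [(&str, u32); 6] = [
    ("fp16", PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE),
    ("i8mm", PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE),
    ("bf16", PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE),
    ("lse2", PF_ARM_LSE2_AVAILABLE),
    ("f32mm", PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE),
    ("f64mm", PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE),
];

/// Returns the names whose flag the probe reports as present.
/// `probe` is an assumed stand-in for `IsProcessorFeaturePresent`.
fn detect_direct(probe: impl Fn(u32) -> bool) -> Vec<&'static str> {
    DIRECT_FEATURES
        .iter()
        .copied()
        .filter(|&(_, flag)| probe(flag))
        .map(|(name, _)| name)
        .collect()
}
```

Passing the probe as a closure keeps the sketch testable on any host; on Windows the closure would wrap the real API call.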

PF_ARM_SVE_F32MM / PF_ARM_SVE_F64MM (values 58 / 59) were already present as commented-out placeholders from rust-lang/stdarch#1749. They have direct stable Feature mappings (f32mm, f64mm), unlike their sibling values 52 / 53 / 57 (SVE_BF16, SVE_EBF16, SVE_I8MM), which have no SVE-specific stdarch Feature name and therefore remain commented out.

sha3 requires both PF_ARM_SHA3 (FEAT_SHA3) and PF_ARM_SHA512 (FEAT_SHA512), matching the existing convention from rust-lang/stdarch#1749 where sve2-aes is set only when both PF_ARM_SVE_AES and PF_ARM_SVE_PMULL128 are present.
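
The two-flag rule for sha3 can be sketched as follows. This is an illustration under assumptions, not the merged code: `detect_sha3` and `probe` are hypothetical names, and the constants use the abbreviated spellings and values given in this description.

```rust
// Abbreviated names and values as written in the PR description.
const PF_ARM_SHA3: u32 = 64;
const PF_ARM_SHA512: u32 = 65;

/// `sha3` is reported only when both FEAT_SHA3 and FEAT_SHA512 are present.
/// `probe` is an assumed stand-in for `IsProcessorFeaturePresent`.
fn detect_sha3(probe: impl Fn(u32) -> bool) -> bool {
    probe(PF_ARM_SHA3) && probe(PF_ARM_SHA512)
}
```

A core that reports only one of the two flags is treated as not having `sha3`, which mirrors the `sve2-aes` convention mentioned above.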

rdm derivation

There is no PF_ARM_RDM_* constant; Microsoft has never defined one. We derive it from PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE (FEAT_DotProd) via the following architectural chain:

  1. FEAT_DotProd is an optional v8.2-A feature, so its presence implies the core implements at least v8.1-A.
  2. Per Arm ARM K.a §D17.2.91: "In an ARMv8.1 implementation, if FEAT_AdvSIMD is implemented, FEAT_RDM is implemented."
  3. AdvSIMD is universally implemented on every Windows-on-ARM SKU.
  4. Therefore: DotProd ⇒ v8.1-A baseline + AdvSIMD ⇒ FEAT_RDM.

This is the same derivation .NET 10 uses. Its source comment is quoted verbatim here (dotnet/runtime PR 109493, shipped in v10.0.0 at src/native/minipal/cpufeatures.c):

> "IsProcessorFeaturePresent does not have a dedicated flag for RDM, so we enable it by implication.
> 1) DP is an optional instruction set for Armv8.2, which may be included only in processors implementing at least Armv8.1.
> 2) Armv8.1 requires RDM when AdvSIMD is implemented, and AdvSIMD is a baseline requirement of .NET.
> Therefore, by documented standard, DP cannot exist here without RDM. In practice, there is only one CPU supported by Windows that includes RDM without DP, so this implication also has little practical chance of a false negative."

The "one CPU with RDM without DP" trade-off applies equally to us: we accept a possible false negative on that single SKU rather than introducing a more aggressive heuristic.
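
The implication and its one-sided error can be modeled in a few lines. This is a sketch, not the merged code: `ModelCpu`, `probe_for`, and `detect_rdm` are hypothetical, and the value 43 for `PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE` is an assumption taken from the Windows SDK headers rather than from this PR's text.

```rust
// Assumed value; the PR description names the constant but not its number.
const PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE: u32 = 43;

/// Hypothetical model of what a core actually implements.
#[derive(Clone, Copy)]
struct ModelCpu {
    dotprod: bool,
    rdm: bool,
}

/// Builds a probe that behaves like the OS query: it can report DotProd,
/// but has no flag for RDM at all.
fn probe_for(cpu: ModelCpu) -> impl Fn(u32) -> bool {
    move |flag| flag == PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE && cpu.dotprod
}

/// `rdm` by implication: DotProd present => at least v8.1-A with AdvSIMD => RDM.
fn detect_rdm(probe: impl Fn(u32) -> bool) -> bool {
    probe(PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE)
}
```

On a model core with `rdm: true` but `dotprod: false`, `detect_rdm` returns `false`. That is the accepted false negative; the rule never reports `rdm` on a core where DotProd is absent.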

Tests

Adds println! lines to the existing aarch64_windows() test in library/std_detect/tests/cpu-detection.rs for each newly-detected feature, mirroring the existing single-line pattern. No structural assertions added.
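
As a rough illustration of what such print-only checks look like from user code, the sketch below queries the eight names through the stable `std::arch::is_aarch64_feature_detected!` macro. The helper `report_new_features` is hypothetical and is not the test in `cpu-detection.rs`; it returns formatted lines rather than printing them so that it can be exercised on any host.

```rust
/// On aarch64: one "name: bool" line per feature covered by this PR.
#[cfg(target_arch = "aarch64")]
fn report_new_features() -> Vec<String> {
    use std::arch::is_aarch64_feature_detected;
    vec![
        format!("fp16: {}", is_aarch64_feature_detected!("fp16")),
        format!("i8mm: {}", is_aarch64_feature_detected!("i8mm")),
        format!("bf16: {}", is_aarch64_feature_detected!("bf16")),
        format!("sha3: {}", is_aarch64_feature_detected!("sha3")),
        format!("lse2: {}", is_aarch64_feature_detected!("lse2")),
        format!("f32mm: {}", is_aarch64_feature_detected!("f32mm")),
        format!("f64mm: {}", is_aarch64_feature_detected!("f64mm")),
        format!("rdm: {}", is_aarch64_feature_detected!("rdm")),
    ]
}

/// On other architectures the macro does not exist, so report nothing.
#[cfg(not(target_arch = "aarch64"))]
fn report_new_features() -> Vec<String> {
    Vec::new()
}
```

Like the PR's test additions, this asserts nothing about which features are present; the values depend entirely on the machine it runs on.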

Scope

Stable feature names only. The unstable SME family (sme, sme2, sme2p1, sme_*, ssve_fp8*) and other unstable additions tracked under #127764 are intentionally out of scope here to keep this PR minimal — happy to do a follow-up.

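References

- Tracking issue: #127764 (`stdarch_aarch64_feature_detection`)
- Precedent: rust-lang/stdarch#1749 (taiki-e, merged 2025-03-24), which added the SVE constants this builds on
- MS docs: IsProcessorFeaturePresent (https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-isprocessorfeaturepresent), full PF_ARM_* table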

r? @Amanieu

cc @taiki-e (author of rust-lang/stdarch#1749, would appreciate your eyes on the rdm inference)

@rustbot label +T-libs +O-windows +O-ARM

@rustbot
Collaborator

rustbot commented Apr 27, 2026

Some changes occurred in std_detect

cc @Amanieu, @folkertdev, @sayantn

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 27, 2026
@rustbot rustbot added O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state O-windows Operating system: Windows labels Apr 27, 2026

@lilith lilith force-pushed the winarm-stable-features branch from 4995e28 to c9383ed Compare April 27, 2026 05:21
Wire IsProcessorFeaturePresent for the PF_ARM_* constants exposed in
Windows SDK 26100 (Win11 24H2):

  fp16   PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE  (67)
  i8mm   PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE  (66)
  bf16   PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE  (68)
  sha3   PF_ARM_SHA3 (64) AND PF_ARM_SHA512 (65)
  lse2   PF_ARM_LSE2_AVAILABLE                   (62)
  f32mm  PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE (58)
  f64mm  PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE (59)

Also derive `rdm` from FEAT_DotProd. There is no PF_ARM_RDM_* constant;
FEAT_DotProd is an optional v8.2-A feature only present on cores that
implement at least v8.1-A, and v8.1-A with AdvSIMD mandates FEAT_RDM
(Arm ARM K.a §D17.2.91). AdvSIMD is universal on Windows-on-ARM. See
PR description for full rationale and .NET 10 precedent.

All eight feature names have been stable in `is_aarch64_feature_detected!`
on Linux/Darwin/BSD since Rust 1.60.
@lilith lilith force-pushed the winarm-stable-features branch from c9383ed to f0a751d Compare April 27, 2026 07:54
lilith added a commit to imazen/winarm-cpufeatures that referenced this pull request Apr 27, 2026
…ss-arch

Tests:

- `tests/name_parity.rs` rewritten around a single
  `for_every_name!($cb)` macro so the canonical name list lives in one
  place. Both macros now exercise all 73 names on every supported
  target.

- New `winarm_is_superset_of_std_on_windows`: asserts that for every
  name std reports present on Windows ARM64, winarm also reports
  present. Holds today (we wire more PF_ARM_* than std does on
  Windows) and continues to hold once rust-lang/rust#155856 lands —
  that PR adds Windows IPFP coverage for fp16/bf16/i8mm/lse2/sha3/
  f32mm/f64mm/rdm, all of which we already detect. If std ever grows
  Windows IPFP coverage for a name we don't track, this test fails
  with a clear "winarm needs an enum entry" message.

- New `fast_and_full_agree_on_non_windows`: asserts the two macros
  produce identical answers on non-Windows targets (where there's no
  registry layer to differ on).

- New `non_aarch64_always_false`: ensures the non-aarch64 stub stays
  honest — every documented name returns false.

CI:

- New `feature-combos` job: matrix of `--no-default-features` /
  default / `--features registry` runs `cargo build`, `cargo test`,
  `cargo clippy --all-targets -- -D warnings` for each combination on
  Linux x64. Catches dead-code-under-cfg / missing-import bugs that
  only surface in specific feature subsets — the exact class of issue
  we hit during the cache restructure.

- New `feature-combos-nightly` job: `--features nightly-stdarch` and
  `--features registry,nightly-stdarch` on nightly rustc. Exercises
  the unstable stdarch dispatch.

- New `cross-test` job: actually runs `cross test` under QEMU on
  i686-unknown-linux-gnu and armv7-unknown-linux-gnueabihf. Both
  exercise the non-aarch64 stub at runtime — what was missing before
  (the prior `cross` job was build-only).

- Renamed `cross` to `cross-build` for clarity, and trimmed it to the
  targets we can't run under QEMU (aarch64-pc-windows-msvc,
  aarch64-apple-darwin, wasm32-unknown-unknown). The aarch64-linux
  target is covered by native `ubuntu-24.04-arm`; i686/armv7 by the
  new `cross-test` job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lilith added a commit to imazen/winarm-cpufeatures that referenced this pull request Apr 27, 2026
README:
- Note the registry runtime default is now on (sandboxed callers opt
  out via set_registry_enabled(false)).
- Spell out the future-proof passthrough story for non-Windows
  aarch64: future stdarch additions Just Work without crate updates.
- Add a "Relationship to upstream std" section pointing at
  rust-lang/rust#155856 and listing the names it covers vs the names
  that remain this crate's value-add.

CHANGELOG:
- Update registry-layer description to match the new default-on
  runtime gate.
- Update the std-passthrough description to make the future-proof
  property explicit.
- Update non-aarch64 stub description to mention the typo-validation
  via compile_error catch-all.
- Add a "Relationship to upstream std" subsection mirroring the
  README's, naming the parity test that locks the invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lilith added a commit to imazen/winarm-cpufeatures that referenced this pull request Apr 27, 2026
Replaces the `compile_error!` catch-all in both macros with a `:tt`
passthrough to `::std::arch::is_aarch64_feature_detected!`. Names in
our 73-entry table use the cache (with the value-add); names we don't
track defer to std, matching whatever std knows.

This means future stdarch additions (e.g. when
rust-lang/rust#155856 lands stable Windows-aarch64 IPFP coverage for
8 features, or any later additions) are picked up automatically
without crate updates.

Wrinkle handled: the 32 names std gates behind
`#![feature(stdarch_aarch64_feature_detection)]` would error at the
user's call site under a naive `:tt` passthrough, because the user's
crate doesn't have the unstable feature gate enabled. The non-Windows
aarch64 macros keep specific arms for those 32 names, routing through
`is_detected` / `is_detected_full` so the unstable gate stays
contained in our crate. The 41 stable names and any future additions
go through the `:tt` catch-all.

Behavior change on Windows aarch64: previously the fast macro
emitted `compile_error!` for the 33 Registry-classified names. They
now silently return `false`, matching std's behaviour (std's IPFP
wiring on Windows doesn't see them either). Use
`is_aarch64_feature_detected_full!` (with the `registry` Cargo
feature) to actually detect them.

Non-aarch64 macros also collapsed to a single `:literal` arm
returning `false` — std doesn't compile there to passthrough to, but
silently accepting any string is preferable to blocking
cross-platform code on a crate update for new names. Cross-platform
CI on aarch64 targets catches typos via std's validation there.

Net: winarm-cpufeatures becomes a strict supplement to std, never
overrides std's "yes" with our "compile error".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lilith added a commit to imazen/winarm-cpufeatures that referenced this pull request Apr 27, 2026
Reflects the simplifications from the recent api: passthrough
restructure. Removes the "future stdarch additions Just Work"
hand-waving (we don't auto-handle them on stable for unstable names),
clarifies what works on stable Rust on Windows aarch64 (everything,
all 73 names, no nightly opt-in), and points at rust-lang/rust#155856
as the upstream that closes part of the gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Amanieu
Member

Amanieu commented Apr 29, 2026

@bors r+

@rust-bors
Contributor

rust-bors Bot commented Apr 29, 2026

📌 Commit f0a751d has been approved by Amanieu

It is now in the queue for this repository.

@rust-bors rust-bors Bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 29, 2026
jhpratt added a commit to jhpratt/rust that referenced this pull request Apr 29, 2026
rust-bors Bot pushed a commit that referenced this pull request Apr 29, 2026
Rollup of 9 pull requests

Successful merges:

 - #151742 (Remove redundant information in `rustc_abi::Variants`)
 - #155856 (std_detect: support detecting more features on aarch64 Windows)
 - #155861 (Suggest `[const] Trait` bounds in more places)
 - #155899 (`dlltool`: Set the working directory to workaround `--temp-prefix` bug)
 - #155916 (Update with new LLVM 22 target for `wasm32-wali-linux-musl` target)
 - #155935 (remap OUT_DIR paths to fix build script path leakage in crate metadata. )
 - #155950 (use the new `//@ needs-asm-mnemonic: ret` more)
 - #155949 (Update `opt_ast_lowering_delayed_lints` query to allow "stealing" lints, allowing to use `FnOnce` instead of `Fn`)
 - #155951 (Make `FlatMapInPlaceVec` an unsafe trait.)
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Apr 29, 2026
rust-bors Bot pushed a commit that referenced this pull request Apr 29, 2026
…uwer

Rollup of 12 pull requests

Successful merges:

 - #155189 (simd_reduce_min/max: remove float support)
 - #155721 (When archive format is wrong produce an error instead of ICE)
 - #155794 (privacy: share effective visibility initialization)
 - #155856 (std_detect: support detecting more features on aarch64 Windows)
 - #155861 (Suggest `[const] Trait` bounds in more places)
 - #155899 (`dlltool`: Set the working directory to workaround `--temp-prefix` bug)
 - #155916 (Update with new LLVM 22 target for `wasm32-wali-linux-musl` target)
 - #155935 (remap OUT_DIR paths to fix build script path leakage in crate metadata. )
 - #155950 (use the new `//@ needs-asm-mnemonic: ret` more)
 - #155958 (ci(free-disk-space): remove more tools and fix warnings)
 - #155949 (Update `opt_ast_lowering_delayed_lints` query to allow "stealing" lints, allowing to use `FnOnce` instead of `Fn`)
 - #155951 (Make `FlatMapInPlaceVec` an unsafe trait.)
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Apr 29, 2026
rust-bors Bot pushed a commit that referenced this pull request Apr 29, 2026
…uwer

Rollup of 22 pull requests

Successful merges:

 - #154149 (resolve: Extend `ambiguous_import_visibilities` deprecation lint to glob-vs-glob ambiguities)
 - #155189 (simd_reduce_min/max: remove float support)
 - #155453 (apply Cortex-A53 errata 843419 mitigation to the AArch64 Linux targets)
 - #155562 (Add a missing `GenericTypeVisitable`, and avoid having interner traits for `FnSigKind` and `Abi`)
 - #155608 (rustc_middle: Implement the `partial_cmp` operation for `DefId`s)
 - #155721 (When archive format is wrong produce an error instead of ICE)
 - #155794 (privacy: share effective visibility initialization)
 - #155832 (c-variadic: more precise compatibility check in const-eval)
 - #155856 (std_detect: support detecting more features on aarch64 Windows)
 - #155861 (Suggest `[const] Trait` bounds in more places)
 - #155899 (`dlltool`: Set the working directory to workaround `--temp-prefix` bug)
 - #155916 (Update with new LLVM 22 target for `wasm32-wali-linux-musl` target)
 - #155935 (remap OUT_DIR paths to fix build script path leakage in crate metadata. )
 - #155950 (use the new `//@ needs-asm-mnemonic: ret` more)
 - #155958 (ci(free-disk-space): remove more tools and fix warnings)
 - #155966 (miri subtree update)
 - #155711 (bump curl-sys and openssl-sys to support OpenSSL 4.0.x)
 - #155831 (Add `AcceptContext::expect_key_value`)
 - #155877 (Avoid misleading return-type note for foreign `Fn` callees)
 - #155949 (Update `opt_ast_lowering_delayed_lints` query to allow "stealing" lints, allowing to use `FnOnce` instead of `Fn`)
 - #155951 (Make `FlatMapInPlaceVec` an unsafe trait.)
 - #155967 (Fix `doc_cfg` feature for extern items)
rust-bors Bot pushed a commit that referenced this pull request Apr 29, 2026
…uwer

Rollup of 21 pull requests

Successful merges:

 - #155966 (miri subtree update)
 - #154149 (resolve: Extend `ambiguous_import_visibilities` deprecation lint to glob-vs-glob ambiguities)
 - #155189 (simd_reduce_min/max: remove float support)
 - #155562 (Add a missing `GenericTypeVisitable`, and avoid having interner traits for `FnSigKind` and `Abi`)
 - #155608 (rustc_middle: Implement the `partial_cmp` operation for `DefId`s)
 - #155721 (When archive format is wrong produce an error instead of ICE)
 - #155794 (privacy: share effective visibility initialization)
 - #155832 (c-variadic: more precise compatibility check in const-eval)
 - #155856 (std_detect: support detecting more features on aarch64 Windows)
 - #155861 (Suggest `[const] Trait` bounds in more places)
 - #155899 (`dlltool`: Set the working directory to workaround `--temp-prefix` bug)
 - #155916 (Update with new LLVM 22 target for `wasm32-wali-linux-musl` target)
 - #155935 (remap OUT_DIR paths to fix build script path leakage in crate metadata. )
 - #155950 (use the new `//@ needs-asm-mnemonic: ret` more)
 - #155958 (ci(free-disk-space): remove more tools and fix warnings)
 - #155711 (bump curl-sys and openssl-sys to support OpenSSL 4.0.x)
 - #155831 (Add `AcceptContext::expect_key_value`)
 - #155877 (Avoid misleading return-type note for foreign `Fn` callees)
 - #155949 (Update `opt_ast_lowering_delayed_lints` query to allow "stealing" lints, allowing to use `FnOnce` instead of `Fn`)
 - #155951 (Make `FlatMapInPlaceVec` an unsafe trait.)
 - #155967 (Fix `doc_cfg` feature for extern items)
@rust-bors rust-bors Bot merged commit 4b21499 into rust-lang:main Apr 30, 2026
11 checks passed
@rustbot rustbot added this to the 1.97.0 milestone Apr 30, 2026
rust-timer added a commit that referenced this pull request Apr 30, 2026
Rollup merge of #155856 - imazen:winarm-stable-features, r=Amanieu
