Arm64 SVE: Support scalable constant vectors and masks #127520

Open: a74nh wants to merge 42 commits into dotnet:main from a74nh:truemasknode_github

a74nh commented Apr 28, 2026

Adds support to GenTreeVecCon and GenTreeMskCon for constants with unknown sizes. Instead of holding a blob of data, the constant is represented as one of: a repeated value, a sequence with start and step values, or a value in the first lane with the rest zeroed. To handle this, the base type is also required.
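A minimal sketch of the three representations described above, for readers unfamiliar with the idea. The names `ScalableKind`, `ScalableConst` and `Materialize` are hypothetical and are not the PR's actual types (the review summary names those as simdscalable_t and simdmaskscalable_t). The sketch also ignores the base type and treats every lane as a 64-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical encoding of a constant whose lane count is not known at JIT time.
enum class ScalableKind
{
    Broadcast, // every lane holds `index`
    Sequence,  // lane i holds index + i * step
    Scalar     // lane 0 holds `index`, every other lane is zero
};

struct ScalableConst
{
    ScalableKind kind;
    int64_t      index; // repeated value, start value, or lane-0 value
    int64_t      step;  // only meaningful for Sequence
};

// Expands the compact form into concrete lanes once a lane count is known.
// This is only for illustration; a JIT would emit instructions instead.
std::vector<int64_t> Materialize(const ScalableConst& c, size_t laneCount)
{
    std::vector<int64_t> lanes(laneCount, 0);
    for (size_t i = 0; i < laneCount; i++)
    {
        switch (c.kind)
        {
            case ScalableKind::Broadcast:
                lanes[i] = c.index;
                break;
            case ScalableKind::Sequence:
                lanes[i] = c.index + static_cast<int64_t>(i) * c.step;
                break;
            case ScalableKind::Scalar:
                lanes[i] = (i == 0) ? c.index : 0;
                break;
        }
    }
    return lanes;
}
```

The same `ScalableConst` describes the constant for any vector length; only `Materialize` needs the lane count.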

As this new structure is slightly bigger than a simd16, the simd_t typedef is increased to simd32 size.

For vector constants, a vector is scalable if it is of type TYP_SIMD.

For mask constants, the type is always TYP_MASK. However, on Arm64, masks are only used by SVE. Therefore, to tell whether a mask is scalable, JitUseScalableVectorT is checked.

The IsAllBitsSet() on mask constants is updated to include a base type. A mask that is all set for TYP_LONG will not be all set for TYP_BYTE: only the lowest bit of each 8-bit group is set, giving 1000000010000000...
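A rough sketch of why the base type matters for this check. It assumes the SVE predicate layout, with one predicate bit per byte of vector data and only the lowest bit of each element's group significant. `AllTrueMask` and `IsAllTrueFor` are hypothetical helpers and model only the first 64 predicate bits; they are not the PR's IsAllBitsSet().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Builds the all-true pattern for elements of `elemBytes` bytes,
// assuming one predicate bit per byte of vector data.
uint64_t AllTrueMask(size_t elemBytes)
{
    assert(elemBytes > 0);
    uint64_t mask = 0;
    for (size_t bit = 0; bit < 64; bit += elemBytes)
    {
        mask |= (uint64_t{1} << bit);
    }
    return mask;
}

// A mask counts as all-true for a base type when every bit that is
// significant for that element size is set.
bool IsAllTrueFor(uint64_t mask, size_t elemBytes)
{
    const uint64_t expected = AllTrueMask(elemBytes);
    return (mask & expected) == expected;
}
```

Under these assumptions, `IsAllTrueFor(AllTrueMask(8), 8)` holds while `IsAllTrueFor(AllTrueMask(8), 1)` does not, which is the distinction the base type parameter captures.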

Given two scalable constants, it may not be possible to add them together to produce a third scalable constant. In that case they will remain as two vectors in the IR.
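To illustrate the folding limitation, here is a conservative sketch using a hypothetical `ScalableConst` and `TryFoldAdd` (neither is taken from the PR). Broadcasts and sequences combine into another broadcast or sequence, and two first-lane scalars combine into a scalar, but a first-lane scalar plus a broadcast or sequence has no compact form, so the fold is refused. Arithmetic wraps at 64 bits; truncation to a narrower base type is left out.

```cpp
#include <cassert>
#include <cstdint>

enum class Kind { Broadcast, Sequence, Scalar };

struct ScalableConst
{
    Kind     kind;
    uint64_t index; // repeated value, start value, or lane-0 value
    uint64_t step;  // only meaningful for Sequence
};

// A broadcast behaves like a sequence whose step is zero.
static uint64_t StepOf(const ScalableConst& c)
{
    return (c.kind == Kind::Sequence) ? c.step : 0;
}

// Returns true and writes `*result` when a + b is still representable.
// Returns false when the two constants must stay as separate nodes.
bool TryFoldAdd(const ScalableConst& a, const ScalableConst& b, ScalableConst* result)
{
    assert(result != nullptr);
    const bool aIsScalar = (a.kind == Kind::Scalar);
    const bool bIsScalar = (b.kind == Kind::Scalar);

    if (!aIsScalar && !bIsScalar)
    {
        // Lane i is (a.index + b.index) + i * (stepA + stepB).
        const uint64_t step = StepOf(a) + StepOf(b);
        result->kind  = (step == 0) ? Kind::Broadcast : Kind::Sequence;
        result->index = a.index + b.index;
        result->step  = step;
        return true;
    }

    if (aIsScalar && bIsScalar)
    {
        // Only lane 0 is non-zero in both inputs.
        *result = ScalableConst{Kind::Scalar, a.index + b.index, 0};
        return true;
    }

    // Lane 0 would differ from the pattern in the remaining lanes.
    return false;
}
```

The sketch refuses some cases that could be folded (for example adding a first-lane scalar whose value is zero), which keeps it short at the cost of missed folds.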

To show this implementation is workable, scalable support is added for:

  • Sve.CreateTrueMask*()
  • Sve.CreateFalseMask*()
  • Vector.Create()
  • Vector.CreateScalar()
  • Vector.CreateScalarUnsafe()
  • Vector.CreateSequence()

Fixes #125057

Copilot AI review requested due to automatic review settings April 28, 2026 17:32
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 28, 2026
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label Apr 28, 2026
@dotnet-policy-service

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@a74nh a74nh force-pushed the truemasknode_github branch from 4754486 to 7fac1f9 Compare April 29, 2026 14:36

a74nh commented Apr 29, 2026

Taking this out of draft now.

Because of the very limited support for scalable SVE, this is currently very hard to test. I've been working off the top of @snickolls-arm's WIP branch with all his code in, which allows me to call handwritten tests. In current HEAD, there are too many errors before getting to my code.

There's still a lot of work to do on top of this. E.g., I need to get generic ops working, plus all the other Vector APIs which create constants. But I didn't want this PR to grow too big. The important part is that this serves as a base for further constant work.

@dotnet/arm64-contrib @jakobbotsch @tannergooding

@a74nh a74nh requested review from Copilot and removed request for Copilot April 30, 2026 11:04
Copilot AI left a comment

Pull request overview

Note: Copilot was unable to run its full agentic suite in this review.

This PR adds Arm64 SVE “scalable VectorT” support across the JIT, including new encodings for scalable vector/mask constants and updates to value numbering, folding, lowering, LSRA, and codegen to recognize and emit SVE-friendly patterns.

Changes:

  • Introduce new scalable constant representations (simdscalable_t, simdmaskscalable_t) and plumb them through GenTree constant nodes and hashing.
  • Extend value numbering and folding to create/consume scalable SIMD constants on Arm64.
  • Implement Arm64 SVE VectorT intrinsics import and codegen pathways (create/broadcast/sequence), plus mask handling updates.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/valuenum.h | Adds VN support for scalable SIMD constants on Arm64 |
| src/coreclr/jit/valuenum.cpp | Creates/broadcasts scalable SIMD VN constants and dumps them |
| src/coreclr/jit/simd.h | Defines new scalable vector/mask constant encodings and helper APIs |
| src/coreclr/jit/simd.cpp | Implements scalable vector/mask helpers and conversions |
| src/coreclr/jit/lsraarm64.cpp | Reserves temps for scalable vector constants that can’t be directly encoded |
| src/coreclr/jit/lowerarmarch.cpp | Updates mask lowering + VectorT intrinsic handling |
| src/coreclr/jit/hwintrinsiclistarm64sve.h | Enables VectorT intrinsics for SVE |
| src/coreclr/jit/hwintrinsiccodegenarm64.cpp | Emits SVE instructions for VectorT intrinsics |
| src/coreclr/jit/hwintrinsicarm64.cpp | Imports VectorT intrinsics and updates true/false mask creation |
| src/coreclr/jit/hwintrinsic.h | Marks VectorT_* as special cases for scalar/broadcast creation |
| src/coreclr/jit/gentree.h | Extends vector/mask constants to support scalable encodings |
| src/coreclr/jit/gentree.cpp | Adds scalable constant construction, hashing, folding, and printing |
| src/coreclr/jit/emitarm64.h | Repositions signed-immediate helpers used by new SVE paths |
| src/coreclr/jit/compiler.hpp | Extends bitmask helpers for >64-register targets |
| src/coreclr/jit/compiler.h | Adds new compiler helpers for scalable vector/mask constants |
| src/coreclr/jit/codegenarm64.cpp | Adds emission for scalable vector/mask constants |

Comment thread src/coreclr/jit/simd.h
Comment thread src/coreclr/jit/simd.cpp Outdated
Comment thread src/coreclr/jit/simd.h Outdated
Comment thread src/coreclr/jit/simd.cpp
Comment thread src/coreclr/jit/codegenarm64.cpp Outdated
Comment thread src/coreclr/jit/codegenarm64.cpp
Comment thread src/coreclr/jit/gentree.h Outdated
Comment thread src/coreclr/jit/gentree.h Outdated
@a74nh a74nh requested a review from Copilot May 15, 2026 14:00
Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

src/coreclr/jit/gentree.cpp:1

  • For integral base types smaller than 64 bits, assigning IntegralValue() directly to the 64-bit gtSimdScalableIndex can sign-extend (e.g., int8 -1 becomes 0xFFFF...FFFF) and/or leave non-canonical upper bits, making semantically identical constants compare/hash differently (VN/CSE/folding regressions). Store the scalar in a canonical form by zero-initializing the struct and then writing only sizeof(baseType) bytes (e.g., via memcpy or by assigning through gtSimdScalableIndexU8/U16/U32/U64/signed views as appropriate), ensuring upper bits are cleared.
    src/coreclr/jit/gentree.cpp:1
  • Same canonicalization issue as the broadcast/scalar path: gtSimdScalableIndex/gtSimdScalableStep are compared/hashed as 64-bit values, so sign-extension or non-zero upper bits for smaller base types can cause equivalent sequences to be treated as different constants. Canonicalize index and step by writing only the element-size bytes and clearing the remaining bytes.
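The canonicalization suggested in these two comments could look roughly like the following. `CanonicalizeLane` is a hypothetical helper, not code from the PR; it keeps only the low bytes of a widened scalar so that equal element values always produce the same 64-bit encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: truncates a widened (possibly sign-extended) scalar
// to `elemBytes` bytes and zero-fills the upper bits of the 64-bit field.
uint64_t CanonicalizeLane(int64_t widened, size_t elemBytes)
{
    assert(elemBytes == 1 || elemBytes == 2 || elemBytes == 4 || elemBytes == 8);

    const uint64_t bits = static_cast<uint64_t>(widened);
    if (elemBytes == 8)
    {
        // Shifting a 64-bit value by 64 is undefined, so handle this case directly.
        return bits;
    }

    const uint64_t mask = (uint64_t{1} << (elemBytes * 8)) - 1;
    return bits & mask;
}
```

With this, an int8 value of -1 and a byte value of 255 both encode as 0xFF, so comparisons and hashes over the full 64-bit field agree.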

Comment thread src/coreclr/jit/codegenarm64.cpp Outdated
Comment thread src/coreclr/jit/emitarm64.h Outdated
Comment thread src/coreclr/jit/valuenum.h Outdated
@a74nh a74nh requested a review from Copilot May 15, 2026 15:05
Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Comment thread src/coreclr/jit/codegenarm64.cpp Outdated
Comment thread src/coreclr/jit/codegenarm64.cpp
Comment thread src/coreclr/jit/lsraarm64.cpp
Comment thread src/coreclr/jit/simd.h Outdated
Comment thread src/coreclr/jit/simd.h
@a74nh a74nh requested a review from Copilot May 18, 2026 09:30
Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/jit/lsraarm64.cpp
Comment thread src/coreclr/jit/simd.cpp
@a74nh a74nh requested a review from Copilot May 18, 2026 10:44
Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/coreclr/jit/gentree.cpp:1

  • Grammar: change 'Attempts to folds' to 'Attempts to fold'.

Comment thread src/coreclr/jit/valuenum.cpp Outdated
Comment thread src/coreclr/jit/simd.h
Comment thread src/coreclr/jit/codegenarm64.cpp
@a74nh a74nh requested a review from Copilot May 18, 2026 11:18
Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

src/coreclr/jit/gentree.cpp:1

  • For integral base types smaller than 64-bit (e.g., TYP_BYTE/TYP_SHORT), assigning IntegralValue() directly into the 64-bit gtSimdScalableIndex can produce non-canonical encodings (notably via sign-extension), causing semantically identical constants to compare/hash differently (GenTree equality/hash and VN constant maps rely on the full 64-bit fields). Canonicalize before storing by truncating/masking to the element width (and/or writing via the appropriately-sized union field / memcpy) so that the stored representation is consistent across all creators (IR construction vs VN broadcast paths).
    src/coreclr/jit/gentree.cpp:1
  • Same canonicalization issue as the broadcast case: for small integral element types, gtSimdScalableIndex/gtSimdScalableStep should be stored in a canonical (truncated-to-element-width) form. Without truncation, two equivalent sequences can diverge depending on whether they were created from widened IL constants vs value-numbered broadcasts, breaking equality/hash consistency and potentially blocking optimizations.


a74nh commented May 18, 2026

This is ready for reviews now.
All the Copilot reviews have been resolved or won't be fixed.

In the background I'm going to merge this with all of @snickolls-arm's latest work and do some more testing.

Comment thread src/coreclr/jit/gentree.cpp Outdated


Development

Successfully merging this pull request may close these issues.

Design for Vector Constants with agnostic SVE

4 participants