JIT: Transform '(cmp & x) | (cmp & y)' to 'cmp & (x | y)' by BoyBaykiller · Pull Request #126070 · dotnet/runtime

BoyBaykiller · 2026-03-25T02:41:37Z

No description provided.

dotnet-policy-service · 2026-03-25T02:42:49Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

BoyBaykiller · 2026-03-25T03:27:14Z

I am having trouble identifying certain cases.
Currently I detect IR like this:

[000007] -----------                            \--*  OR        int   
[000003] -----------                               +--*  AND       int   
[000001] -----------                               |  +--*  LCL_VAR   int    V02 arg2         
[000002] -----------                               |  \--*  CNS_INT   int    256
[000006] -----------                               \--*  AND       int   
[000004] -----------                                  +--*  LCL_VAR   int    V02 arg2          (last use)
[000005] -----------                                  \--*  CNS_INT   int    512

This works fine for (flags & 256) | (flags & 512),
but breaks down if I add an operation in front, like 1 + (flags & 256) | (flags & 512). Then the IR is:

[000008] -----------                         \--*  OR        int   
[000004] -----------                            +--*  ADD       int   
[000000] -----------                            |  +--*  CNS_INT   int    1
[000003] -----------                            |  \--*  AND       int   
[000001] -----------                            |     +--*  LCL_VAR   int    V01 arg1         
[000002] -----------                            |     \--*  CNS_INT   int    256
[000007] -----------                            \--*  AND       int   
[000005] -----------                               +--*  LCL_VAR   int    V01 arg1         
[000006] -----------                               \--*  CNS_INT   int    512

When adding explicit parentheses like 1 + ((flags & 256) | (flags & 512)) it works again:

[000008] -----------                         \--*  ADD       int   
[000000] -----------                            +--*  CNS_INT   int    1
[000007] -----------                            \--*  OR        int   
[000003] -----------                               +--*  AND       int   
[000001] -----------                               |  +--*  LCL_VAR   int    V01 arg1         
[000002] -----------                               |  \--*  CNS_INT   int    256
[000006] -----------                               \--*  AND       int   
[000004] -----------                                  +--*  LCL_VAR   int    V01 arg1         
[000005] -----------                                  \--*  CNS_INT   int    512

But this shouldn't be needed. It shouldn't get confused by some extra commutative arithmetic like the ADD(1) here. What is the general way to fix this?

huoyaoyuan · 2026-03-25T06:54:31Z

but breaks down if I add an operation in front, like 1 + (flags & 256) | (flags & 512). Then the IR is:

The operator priority of | is lower than +. 1 + (flags & 256) | (flags & 512) means (1 + (flags & 256)) | (flags & 512). Check sharplab.
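The precedence point can be checked outside the JIT. The sketch below uses C++, where + also binds tighter than |, and the function names are made up for illustration. Note that with the masks 256 and 512 the two groupings happen to produce the same value (the set bits never overlap), so only the tree shape differs; with masks 1 and 2 the values differ as well.

```cpp
#include <cassert>

// The expression is deliberately left unparenthesized, so the compiler's
// "suggest parentheses" diagnostic is silenced for this one function.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"
constexpr int asWritten(int flags, int m1, int m2)
{
    return 1 + (flags & m1) | (flags & m2);
}
#pragma GCC diagnostic pop

// How the parser groups it: the ADD becomes an operand of the OR.
constexpr int asParsed(int flags, int m1, int m2)
{
    return (1 + (flags & m1)) | (flags & m2);
}

// The grouping the original comment had in mind.
constexpr int asIntended(int flags, int m1, int m2)
{
    return 1 + ((flags & m1) | (flags & m2));
}

// The unparenthesized form always matches asParsed.
static_assert(asWritten(3, 1, 2) == asParsed(3, 1, 2), "+ binds tighter than |");
// With overlapping result bits the two groupings give different values.
static_assert(asParsed(3, 1, 2) != asIntended(3, 1, 2), "groupings differ");
```

For example, asParsed(3, 1, 2) evaluates (1 + 1) | 2, while asIntended(3, 1, 2) evaluates 1 + (1 | 2).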

EgorBo · 2026-03-25T11:15:13Z

src/coreclr/jit/morph.cpp

+    // Fold "(cmp & x) | (cmp & y)" to "cmp & (x | y)".
+    if (varTypeIsIntegralOrI(orOp) && op1->OperIs(GT_AND) && op2->OperIs(GT_AND))
+    {
+        if (GenTree::Compare(op1->gtGetOp1(), op2->gtGetOp1()))
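For plain values, the fold in this hunk rests on AND distributing over OR. A minimal brute-force check of that identity, written as standalone C++ with hypothetical helper names (it says nothing about side effects or about how the JIT evaluates the operands):

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: (c & x) | (c & y).
constexpr std::uint32_t unfolded(std::uint32_t c, std::uint32_t x, std::uint32_t y)
{
    return (c & x) | (c & y);
}

// Right-hand side of the fold: c & (x | y).
constexpr std::uint32_t folded(std::uint32_t c, std::uint32_t x, std::uint32_t y)
{
    return c & (x | y);
}

// Compares both forms over every triple of 8-bit values. Bitwise operators
// act on each bit position independently, so 8-bit operands already cover
// every per-bit combination.
inline bool identityHoldsFor8Bit()
{
    for (std::uint32_t c = 0; c < 256; ++c)
        for (std::uint32_t x = 0; x < 256; ++x)
            for (std::uint32_t y = 0; y < 256; ++y)
                if (unfolded(c, x, y) != folded(c, x, y))
                    return false;
    return true;
}
```

The identity only covers the computed value; whether the rewrite is legal for a given tree also depends on what evaluating c does.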


I don't think this is legal: cmp may have side effects, or be a local whose value is changed by x or y.

Can you give an example?

Of a side effect? Just a tree that increments a field: the original does it twice, and you remove one of them.
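The concern can be reproduced in a few lines of standalone C++. The Counter type below is a hypothetical stand-in for an operand with a persistent side effect; it is not JIT code.

```cpp
#include <cassert>

// Hypothetical operand with a side effect: every read bumps a counter,
// much like a tree that increments a field each time it is evaluated.
struct Counter
{
    int calls = 0;
    int value;

    explicit Counter(int v) : value(v) {}

    int read()
    {
        ++calls;
        return value;
    }
};

// Original shape: the operand is evaluated twice.
inline int unfolded(Counter& cmp, int x, int y)
{
    return (cmp.read() & x) | (cmp.read() & y);
}

// Folded shape: the operand is evaluated once, so one side effect is dropped.
inline int folded(Counter& cmp, int x, int y)
{
    return cmp.read() & (x | y);
}
```

Both functions return the same value for the same Counter contents, but unfolded leaves calls at 2 and folded leaves it at 1, which is the observable difference the fold would introduce.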

ok let me see

I copied a check from fgRecognizeAndMorphBitwiseRotation that should make it safe:

if (((tree->gtFlags & GTF_PERSISTENT_SIDE_EFFECTS) != 0) || ((tree->gtFlags & GTF_ORDER_SIDEEFF) != 0))
{
    // We can't do anything if the tree has stores, calls, or volatile reads. Note that we allow
    // GTF_EXCEPT side effect since any exceptions thrown by the original tree will be thrown by
    // the transformed tree as well.
    return nullptr;
}

This means the following trees are no longer transformed, which is correct:

int BugPersSideEffects(int flags)
{
    int res = (Consume(flags) & 256) | (Consume(flags) & 512);
    return res;
}

int BugOrderSideeff(ref int flags)
{
    Consume(flags);
    return (Volatile.Read(ref flags) & 256) | (Volatile.Read(ref flags) & 512);
}
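To make the shape of the guarded fold concrete, here is a minimal sketch on a toy expression tree. Everything in it (Node, Oper, kSideEffect, sameTree, tryFold) is hypothetical and far simpler than the JIT's GenTree; the single kSideEffect bit stands in for the GTF_PERSISTENT_SIDE_EFFECTS and GTF_ORDER_SIDEEFF checks quoted above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy IR for illustration only. None of these names come from the JIT.
enum class Oper { And, Or, Local, Const };

constexpr unsigned kNoFlags    = 0;
constexpr unsigned kSideEffect = 1; // stand-in for stores, calls, ordered reads

struct Node
{
    Oper                  oper  = Oper::Const;
    int                   value = 0; // local number or constant value
    unsigned              flags = kNoFlags;
    std::unique_ptr<Node> op1;
    std::unique_ptr<Node> op2;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr leaf(Oper oper, int value, unsigned flags = kNoFlags)
{
    NodePtr n = std::make_unique<Node>();
    n->oper  = oper;
    n->value = value;
    n->flags = flags;
    return n;
}

// Builds a binary node; flags propagate upward from both operands.
NodePtr binary(Oper oper, NodePtr a, NodePtr b)
{
    NodePtr n = std::make_unique<Node>();
    n->oper  = oper;
    n->flags = a->flags | b->flags;
    n->op1   = std::move(a);
    n->op2   = std::move(b);
    return n;
}

// Structural equality on operator, value, and operands (flags are ignored).
bool sameTree(const Node* a, const Node* b)
{
    if (a == nullptr || b == nullptr)
        return a == b;
    return a->oper == b->oper && a->value == b->value &&
           sameTree(a->op1.get(), b->op1.get()) &&
           sameTree(a->op2.get(), b->op2.get());
}

// Rewrites OR(AND(c, x), AND(c, y)) into AND(c, OR(x, y)).
// Returns false and leaves the tree untouched if the shape does not match
// or if any part of the tree is flagged as having a side effect.
bool tryFold(NodePtr& tree)
{
    Node* t = tree.get();
    if (t->oper != Oper::Or || !t->op1 || !t->op2)
        return false;
    Node* lhs = t->op1.get();
    Node* rhs = t->op2.get();
    if (lhs->oper != Oper::And || rhs->oper != Oper::And)
        return false;
    if ((t->flags & kSideEffect) != 0)
        return false; // conservative: dropping the second "c" could be observable
    if (!sameTree(lhs->op1.get(), rhs->op1.get()))
        return false;

    NodePtr c = std::move(lhs->op1);
    NodePtr x = std::move(lhs->op2);
    NodePtr y = std::move(rhs->op2);
    tree = binary(Oper::And, std::move(c), binary(Oper::Or, std::move(x), std::move(y)));
    return true;
}
```

Like the quoted check, this sketch tests the flags on the whole tree, so it rejects more than strictly necessary in exchange for a very simple legality test.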

Interestingly, I've run superpmi and there wasn't a single case where this fires.

🤷 the fact that your current PR finds diffs implies there are such patterns, but we just can't do it in morph.

Adding the PERSISTENT_SIDE_EFFECTS/ORDER_SIDEEFF correctness check doesn't result in any additional diffs compared to not having the check. I've tested this by putting assert(false) in it and running replay.

Ah, I misunderstood you. Well, you still need to add the checks

Yeah, of course :)
However, the check in its current form misses some useful cases. For example:

int TransformOnInd(ref int flags)
{
    Consume(flags); // null check
    int res = (flags & 256) | (flags & 512);
    return res;
}

Here we bail because of the GTF_ORDER_SIDEEFF != 0 check. Am I supposed to use a different flag? What I want to know is: can this IND be removed (no null check attached and not volatile)?

If I knew how to do that then I could also handle this:

int TransformOnInd2(ref int flags)
{
    int res = (flags & 256) | (flags & 512);
    return res;
}

If I understand correctly, the first load can't be removed because it also serves as the null check, but the second one still can, so the transformation can still be done. For this case there is the additional problem that GenTree::Compare returns false when the IND flags aren't the same.

BoyBaykiller · 2026-03-26T01:55:39Z

The operator priority of | is lower than +. 1 + (flags & 256) | (flags & 512) means (1 + (flags & 256)) | (flags & 512). Check sharplab.

@huoyaoyuan Bad example from me. Say we have foo | (flags & 256) | (flags & 512).
So the order of operations is (foo | (flags & 256)) | (flags & 512). But since | is associative, the compiler should be free to treat it as foo | ((flags & 256) | (flags & 512)) and still apply the transformation where applicable: https://godbolt.org/z/KEa9915cM
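The regrouping relies on | being associative. A small standalone C++ sketch of the two forms follows; the function names are illustrative, and it only covers side-effect-free operands.

```cpp
#include <cassert>
#include <cstdint>

// Grouping as written in source: (foo | (flags & x)) | (flags & y).
constexpr std::uint32_t asWritten(std::uint32_t foo, std::uint32_t flags,
                                  std::uint32_t x, std::uint32_t y)
{
    return (foo | (flags & x)) | (flags & y);
}

// Reassociated so the two ANDs become siblings, then folded:
// foo | ((flags & x) | (flags & y))  ==  foo | (flags & (x | y)).
constexpr std::uint32_t reassociatedAndFolded(std::uint32_t foo, std::uint32_t flags,
                                              std::uint32_t x, std::uint32_t y)
{
    return foo | (flags & (x | y));
}

// Exhaustive check over 4-bit operands; bitwise operators work per bit,
// so small widths already exercise every per-bit combination.
inline bool formsAgreeFor4Bit()
{
    for (std::uint32_t foo = 0; foo < 16; ++foo)
        for (std::uint32_t flags = 0; flags < 16; ++flags)
            for (std::uint32_t x = 0; x < 16; ++x)
                for (std::uint32_t y = 0; y < 16; ++y)
                    if (asWritten(foo, flags, x, y) != reassociatedAndFolded(foo, flags, x, y))
                        return false;
    return true;
}

static_assert(asWritten(1, 0x300, 256, 512) == reassociatedAndFolded(1, 0x300, 256, 512),
              "both groupings compute the same value");
```

A matcher that only looks at the immediate operands of an OR never sees the two ANDs as siblings in the first grouping, which is why a reassociation step is needed before the fold can fire.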

BoyBaykiller · 2026-03-28T02:36:29Z

I will close this for now, as I've encountered two general, pre-existing issues:

1. I opened this PR to make progress towards "JIT: Optimize some bitwise ops in context of enum flags" #125899. However, to handle the original pattern we are missing a general function that reorders arithmetic operations when doing so enables this (or another) transformation:
   https://github.com/BoyBaykiller/runtime/blob/205c820fb048066d4f64ac5996d9ecdd7260c864/src/coreclr/jit/morph.cpp#L10717-L10719
2. The transformation can't be done if the op2 tree contains volatile loads. To check for that I should use the GTF_ORDER_SIDEEFF flag, but this flag is also set in other cases where the transformation can in fact be done. So there seems to be no good way to tell from the flags alone.
   https://github.com/BoyBaykiller/runtime/blob/205c820fb048066d4f64ac5996d9ecdd7260c864/src/coreclr/jit/morph.cpp#L10726-L10729

Fixing these would help existing transformations too. @EgorBo

BoyBaykiller added 2 commits March 25, 2026 03:08

* Transform '(cmp & x) | (cmp & y)' to 'cmp & (x | y)'

1c6892f

* allow TYP_LONG

5ab2bef

github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 25, 2026

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 25, 2026

BoyBaykiller closed this Mar 25, 2026

BoyBaykiller reopened this Mar 25, 2026

EgorBo reviewed Mar 25, 2026

* add option to ingore IND flags in GenTree::Compare

205c820

* add check for GTF_PERSISTENT_SIDE_EFFECTS so we dont remove a tree when we shouldnt (volatile check still missing)

BoyBaykiller closed this Mar 28, 2026

This was referenced Mar 28, 2026

Unable to pull image from mcr.microsoft.com #117164

Open

modpowTest.FastReducer_AssertFailure_RegressionTest hanging/timing out #126212

Open
