JIT: Transform '(cmp & x) | (cmp & y)' to 'cmp & (x | y)'#126070

Closed
BoyBaykiller wants to merge 3 commits into dotnet:main from BoyBaykiller:transform-double-and-or

@BoyBaykiller
Contributor

No description provided.

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 25, 2026
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 25, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BoyBaykiller
Contributor Author

I am having a problem identifying certain cases.
Currently I detect such IR:

[000007] -----------                            \--*  OR        int   
[000003] -----------                               +--*  AND       int   
[000001] -----------                               |  +--*  LCL_VAR   int    V02 arg2         
[000002] -----------                               |  \--*  CNS_INT   int    256
[000006] -----------                               \--*  AND       int   
[000004] -----------                                  +--*  LCL_VAR   int    V02 arg2          (last use)
[000005] -----------                                  \--*  CNS_INT   int    512

Which works fine for (flags & 256) | (flags & 512),
but breaks down if I add an operation in front like 1 + (flags & 256) | (flags & 512). Then the IR is:

[000008] -----------                         \--*  OR        int   
[000004] -----------                            +--*  ADD       int   
[000000] -----------                            |  +--*  CNS_INT   int    1
[000003] -----------                            |  \--*  AND       int   
[000001] -----------                            |     +--*  LCL_VAR   int    V01 arg1         
[000002] -----------                            |     \--*  CNS_INT   int    256
[000007] -----------                            \--*  AND       int   
[000005] -----------                               +--*  LCL_VAR   int    V01 arg1         
[000006] -----------                               \--*  CNS_INT   int    512

When adding explicit parentheses like 1 + ((flags & 256) | (flags & 512)) it works again:

[000008] -----------                         \--*  ADD       int   
[000000] -----------                            +--*  CNS_INT   int    1
[000007] -----------                            \--*  OR        int   
[000003] -----------                               +--*  AND       int   
[000001] -----------                               |  +--*  LCL_VAR   int    V01 arg1         
[000002] -----------                               |  \--*  CNS_INT   int    256
[000006] -----------                               \--*  AND       int   
[000004] -----------                                  +--*  LCL_VAR   int    V01 arg1         
[000005] -----------                                  \--*  CNS_INT   int    512

But this shouldn't be needed. It shouldn't get confused by some extra commutative arithmetic like the ADD(1) here. What is the general way to fix this?
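For reference, the rewrite matched in the dumps above rests on AND distributing over OR. The following is a minimal C++ sketch that checks the identity exhaustively on small operands; the function names are illustrative and do not come from the PR.

```cpp
#include <cstdint>

// Original shape: the same value masked twice, results OR-ed together.
constexpr std::uint32_t unfolded(std::uint32_t c, std::uint32_t x, std::uint32_t y)
{
    return (c & x) | (c & y);
}

// Folded shape: OR the masks first, then mask once.
constexpr std::uint32_t folded(std::uint32_t c, std::uint32_t x, std::uint32_t y)
{
    return c & (x | y);
}

// The concrete case from the IR dumps: masks 256 and 512 combine into 768.
static_assert(unfolded(0x300, 256, 512) == folded(0x300, 256, 512), "shapes must agree");

// Compare both shapes over every triple of 4-bit operands.
// Bitwise operators act on each bit independently, so a small bit width
// already covers every per-bit combination.
bool identityHoldsFor4Bits()
{
    for (std::uint32_t c = 0; c < 16; ++c)
        for (std::uint32_t x = 0; x < 16; ++x)
            for (std::uint32_t y = 0; y < 16; ++y)
                if (unfolded(c, x, y) != folded(c, x, y))
                    return false;
    return true;
}
```

The identity only says the two shapes compute the same value; it says nothing about how often the shared operand is evaluated, which is where the legality questions later in the thread come from.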

@huoyaoyuan
Member

> but breaks down if I add an operation in front like 1 + (flags & 256) | (flags & 512). Then the IR is:

The operator precedence of | is lower than that of +, so 1 + (flags & 256) | (flags & 512) means (1 + (flags & 256)) | (flags & 512). Check sharplab.
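A small C++ sketch of the precedence point (C++ and C# agree that + binds tighter than |). The function names and the parameter k are assumptions made for illustration. With the constant 1 from the thread, both groupings happen to produce the same value because the set bits never overlap, yet the trees differ, and the tree shape is what the pattern match sees. A constant of 256 makes the values differ as well.

```cpp
// Silence the "suggest parentheses" warning on purpose: the missing
// parentheses are the point of this example.
#pragma GCC diagnostic ignored "-Wparentheses"

// The expression exactly as written in the comment, with the constant as a parameter.
constexpr int asWritten(int k, int flags)
{
    return k + (flags & 256) | (flags & 512);
}

// How the compiler groups it: the ADD becomes an operand of the OR.
constexpr int howItParses(int k, int flags)
{
    return (k + (flags & 256)) | (flags & 512);
}

// The grouping that exposes the "(cmp & x) | (cmp & y)" pattern.
constexpr int whatWasIntended(int k, int flags)
{
    return k + ((flags & 256) | (flags & 512));
}

static_assert(asWritten(256, 768) == howItParses(256, 768), "+ binds tighter than |");
static_assert(asWritten(256, 768) != whatWasIntended(256, 768), "the groupings differ");
```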

// Fold "(cmp & x) | (cmp & y)" to "cmp & (x | y)".
if (varTypeIsIntegralOrI(orOp) && op1->OperIs(GT_AND) && op2->OperIs(GT_AND))
{
    if (GenTree::Compare(op1->gtGetOp1(), op2->gtGetOp1()))
Member

I don't think this is legal: cmp may have side effects, or be a local whose value is changed via x or y.

Contributor Author

Can you give an example?

Member

> Can you give an example?

Of a side effect? Just a tree that increments a field: the original does it twice, and you remove one of them.
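The objection can be illustrated outside the JIT with a minimal C++ sketch. Counted is a hypothetical stand-in for a cmp operand whose evaluation has an observable effect (here, bumping a counter). Both shapes return the same value, but the folded one performs the effect only once.

```cpp
// Hypothetical operand with a side effect: every read bumps a counter.
struct Counted
{
    int value;
    int reads;

    explicit Counted(int v) : value(v), reads(0) {}

    int read()
    {
        ++reads; // the observable side effect
        return value;
    }
};

// Original shape: the operand is evaluated twice.
inline int unfolded(Counted& c)
{
    return (c.read() & 256) | (c.read() & 512);
}

// Folded shape: the operand is evaluated once, so one side effect disappears.
inline int folded(Counted& c)
{
    return c.read() & (256 | 512);
}
```

After unfolded the counter is 2, after folded it is 1, so a caller that observes the counter can tell the two apart even though the returned values match.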

Contributor Author

OK, let me see.

Contributor Author

I copied a check from fgRecognizeAndMorphBitwiseRotation that should make it safe:

if (((tree->gtFlags & GTF_PERSISTENT_SIDE_EFFECTS) != 0) || ((tree->gtFlags & GTF_ORDER_SIDEEFF) != 0))
{
    // We can't do anything if the tree has stores, calls, or volatile reads. Note that we allow
    // GTF_EXCEPT side effect since any exceptions thrown by the original tree will be thrown by
    // the transformed tree as well.
    return nullptr;
}

This means these trees are no longer transformed, which is correct:

int BugPersSideEffects(int flags)
{
    int res = (Consume(flags) & 256) | (Consume(flags) & 512);
    return res;
}

int BugOrderSideeff(ref int flags)
{
    Consume(flags);
    return (Volatile.Read(ref flags) & 256) | (Volatile.Read(ref flags) & 512);
}
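To make the shape of the guarded fold concrete, here is a minimal sketch on a toy expression tree. It is not the JIT's GenTree API: Node, kPersistentSideEffect, kOrderSideEffect, sameTree, and tryFoldAndOr are all hypothetical names that only mirror the roles of the flags and helpers discussed above.

```cpp
#include <memory>
#include <utility>

// Toy expression tree. NOT the JIT's GenTree; every name here is hypothetical.
enum class Op { Local, Const, And, Or, Call };

enum : unsigned
{
    kPersistentSideEffect = 1u << 0, // stands in for stores and calls
    kOrderSideEffect      = 1u << 1  // stands in for volatile reads and other ordering constraints
};

struct Node
{
    Op       op;
    int      value; // local number or constant value; unused for And/Or
    unsigned flags; // side-effect summary for the whole subtree
    std::unique_ptr<Node> lhs, rhs;
};

using NodePtr = std::unique_ptr<Node>;

NodePtr leaf(Op op, int value, unsigned flags = 0)
{
    return NodePtr(new Node{op, value, flags, nullptr, nullptr});
}

NodePtr binary(Op op, NodePtr l, NodePtr r)
{
    unsigned flags = l->flags | r->flags; // children's effects propagate upward
    return NodePtr(new Node{op, 0, flags, std::move(l), std::move(r)});
}

// Structural equality, including the flags.
bool sameTree(const Node* a, const Node* b)
{
    if (a == nullptr || b == nullptr)
        return a == b;
    return a->op == b->op && a->value == b->value && a->flags == b->flags &&
           sameTree(a->lhs.get(), b->lhs.get()) && sameTree(a->rhs.get(), b->rhs.get());
}

// Rewrites "(cmp & x) | (cmp & y)" into "cmp & (x | y)" in place.
// Returns true if the rewrite fired.
bool tryFoldAndOr(NodePtr& tree)
{
    if (tree->op != Op::Or || tree->lhs->op != Op::And || tree->rhs->op != Op::And)
        return false;

    // Dropping one copy of cmp is only safe when nothing in the tree has these effects.
    if ((tree->flags & (kPersistentSideEffect | kOrderSideEffect)) != 0)
        return false;

    if (!sameTree(tree->lhs->lhs.get(), tree->rhs->lhs.get()))
        return false;

    NodePtr cmp = std::move(tree->lhs->lhs);
    NodePtr x   = std::move(tree->lhs->rhs);
    NodePtr y   = std::move(tree->rhs->rhs);
    tree        = binary(Op::And, std::move(cmp), binary(Op::Or, std::move(x), std::move(y)));
    return true;
}
```

In this sketch the guard looks at the summary flags of the whole OR tree, so an effect anywhere under it blocks the rewrite. That is conservative in the same way as the copied check: it can reject trees where only an unrelated part carries the flag.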

Interestingly, I've run superpmi and there wasn't a single case where this fires.

Member

> Interestingly, I've run superpmi and there wasn't a single case where this fires.

🤷 the fact that your current PR finds diffs implies there are such patterns, but we just can't do it in morph.

Contributor Author

Adding the PERSISTENT_SIDE_EFFECTS/ORDER_SIDEEFF correctness check doesn't result in any additional diffs compared to not having this correctness check. I've tested this by putting assert(false) in it and running replay.

Member

> Adding the PERSISTENT_SIDE_EFFECTS/ORDER_SIDEEFF correctness check doesn't result in any additional diffs compared to not having this correctness check. I've tested this by putting assert(false) in it and running replay.

Ah, I misunderstood you. Well, you still need to add the checks.

Contributor Author

@BoyBaykiller BoyBaykiller Mar 26, 2026

Yeah, of course : )
However, the check in its current form misses some useful cases. For example:

int TransformOnInd(ref int flags)
{
    Consume(flags); // null check

    int res = (flags & 256) | (flags & 512);

    return res;
}

Here we bail because of the GTF_ORDER_SIDEEFF != 0 check. Am I supposed to use a different flag? What I want to know is "can this IND be removed (no null check attached and not volatile)?".

If I knew how to do that then I could also handle this:

int TransformOnInd2(ref int flags)
{
    int res = (flags & 256) | (flags & 512);

    return res;
}

The first load can't be removed because it's also the null check, if I understand correctly. But the second one still can, so the transformation can still be done. For this case there is the additional problem that GenTree::Compare returns false when the IND flags aren't the same.

@BoyBaykiller
Contributor Author

> The operator priority of | is lower than +. 1 + (flags & 256) | (flags & 512) means (1 + (flags & 256)) | (flags & 512). Check sharplab.

@huoyaoyuan Bad example from me. Say we have foo | (flags & 256) | (flags & 512).
So the order of operations is (foo | (flags & 256)) | (flags & 512). But since | is associative, the compiler should be free to see it as foo | ((flags & 256) | (flags & 512)) and still apply the transformation where applicable: https://godbolt.org/z/KEa9915cM
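One way to picture the regrouping is to flatten an OR chain into its operands and merge the masks that apply to the same local. The sketch below does this on a deliberately simplified model in which every operand has the form "local & mask". MaskedTerm and mergeMaskedTerms are hypothetical names, not JIT functions, and the model assumes operands without side effects.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// One operand of a flattened OR chain: "local & mask". Hypothetical model, not JIT IR.
struct MaskedTerm
{
    int           local; // which local variable is being masked
    std::uint32_t mask;  // constant mask applied to it
};

// OR is associative and commutative, so the operands of a chain such as
// "(v & m1) | (w & m3) | (v & m2)" can be regrouped freely. Terms on the same
// local are merged into a single "v & (m1 | m2)".
std::vector<MaskedTerm> mergeMaskedTerms(const std::vector<MaskedTerm>& chain)
{
    std::map<int, std::uint32_t> merged; // local -> combined mask
    for (const MaskedTerm& term : chain)
    {
        merged[term.local] |= term.mask; // operator[] starts a new entry at 0
    }

    std::vector<MaskedTerm> result;
    for (const auto& entry : merged)
    {
        result.push_back(MaskedTerm{entry.first, entry.second});
    }
    return result; // ordered by local number
}
```

For the chain in the comment, the two terms on flags collapse into flags & 768 regardless of where foo sits in the chain.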

* add check for GTF_PERSISTENT_SIDE_EFFECTS so we don't remove a tree when we shouldn't (volatile check still missing)
@BoyBaykiller
Contributor Author

I will close this for now, as I've encountered two general, pre-existing issues:

  1. I opened this PR to make progress towards JIT: Optimize some bitwise ops in context of enum flags #125899. However, to handle the original pattern we are missing a general function that reorders arithmetic operations when doing so enables this (or another) transformation:
    https://github.com/BoyBaykiller/runtime/blob/205c820fb048066d4f64ac5996d9ecdd7260c864/src/coreclr/jit/morph.cpp#L10717-L10719
  2. The transformation can't be done if the op2 tree contains volatile loads. To check for that I should use the GTF_ORDER_SIDEEFF flag. But this flag is also set in other cases where the transformation can in fact be done, so there seems to be no good way to tell from the flags alone.
    https://github.com/BoyBaykiller/runtime/blob/205c820fb048066d4f64ac5996d9ecdd7260c864/src/coreclr/jit/morph.cpp#L10726-L10729

Fixing these would help existing transformations too. @EgorBo
