Skip to content

Conversation

@eme64
Copy link
Contributor

@eme64 eme64 commented Jan 5, 2026

In VLoopMemorySlices::find_memory_slices, we analyze the memory slices. In some cases, we only find loads in the slice, and no phi. So the memory input of the loads comes from before the loop. When I refactored the code, I made the assumption that all loads should have the same memory input. After all: any store before the loop would have to have happened before we enter the loop, and execute any loads from the loop. The assumption held for a long time, but now we have a counter example.

Summary: one load has its memory input optimized, the other is not put on the IGVN worklist again, and keeps the old memory input (even though in this case we could have optimized it just the same). Thus, both choices of memory input are correct, and the assumption of the assert does not hold.

Solution: Just bail out of auto vectorization if this assumption is violated. This is an edge case, and the assert has not been hit until the fuzzer found this example.
Alternatives: we could track the multiple memory inputs, but that would be more effort to implement, and hard to test because it is difficult to create examples.


Details

Below, look at 1145 LoadB and 1131 LoadB. One has memory input Param 7 (initial program state), the other 711 Phi (outer loop). Both loads are inside the 1147 CountedLoop. But their states come from outside, both originally from 711 Phi. But then 1145 LoadB is optimized with LoadBNode::Ideal -> LoadNode::Ideal -> LoadNode::split_through_phi: it realizes that the backedge of the 711 Phi only goes by the 593 CallStaticJava, which cannot modify the Byte::value field of the LoadB (unless it was to use reflection, but that unlocks undefined behavior anyway, so it can be ignored). So it is ok to split through the phi, as the Byte::value cannot be modified during the outer loop.

image

1131 LoadB could also do the same optimization, but it just does not end up on the IGVN worklist. The issue is that we don't have any adequate notification that goes down through the MergeMem - Call - Proj structure. We did not want to have that until now, because in theory we could have a long series of calls, and the traversals could become too expensive.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8373453: C2 SuperWord: must handle load slices that have loads with different memory inputs (Bug - P3)(⚠️ The fixVersion in this issue is [26] but the fixVersion in .jcheck/conf is 27, a new backport will be created when this pr is integrated.)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/29028/head:pull/29028
$ git checkout pull/29028

Update a local copy of the PR:
$ git checkout pull/29028
$ git pull https://git.openjdk.org/jdk.git pull/29028/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 29028

View PR using the GUI difftool:
$ git pr show -t 29028

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/29028.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 5, 2026

👋 Welcome back epeter! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 5, 2026

@eme64 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8373453: C2 SuperWord: must handle load slices that have loads with different memory inputs

Reviewed-by: kvn, thartmann, qamai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 21 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title JDK-8373453 8373453: C2 SuperWord: assert(_inputs.at(alias_idx) == nullptr || _inputs.at(alias_idx) == load->in(1)) failed: not yet touched or the same input Jan 5, 2026
@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jan 5, 2026
@openjdk
Copy link

openjdk bot commented Jan 5, 2026

@eme64 The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@eme64 eme64 changed the title 8373453: C2 SuperWord: assert(_inputs.at(alias_idx) == nullptr || _inputs.at(alias_idx) == load->in(1)) failed: not yet touched or the same input 8373453: C2 SuperWord: C2 SuperWord: must handle load slices that have loads with different memory inputs Jan 5, 2026
@eme64 eme64 changed the title 8373453: C2 SuperWord: C2 SuperWord: must handle load slices that have loads with different memory inputs 8373453: C2 SuperWord: must handle load slices that have loads with different memory inputs Jan 5, 2026
@eme64 eme64 marked this pull request as ready for review January 5, 2026 11:33
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 5, 2026
@mlbridge
Copy link

mlbridge bot commented Jan 5, 2026

Webrevs

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good. I approve this conservative fix.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 5, 2026
Copy link
Member

@TobiHartmann TobiHartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me too. Great that we have a regression test for this rare case now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot-compiler hotspot-compiler-dev@openjdk.org ready Pull request is ready to be integrated rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

4 participants