[HLK] Modify Wave match test logic to support modifications in different lanes and vector position#7991
Conversation
alsepkow
left a comment
There was a problem hiding this comment.
Made some comments. It looks like we need a few fixes on this iteration still.
| const UINT HighWaves = NumWaves - LowWaves; | ||
| LowWaveMask = (LowWaves < 64) ? (1ULL << LowWaves) - 1 : ~0ULL; | ||
| HighWaveMask = (HighWaves < 64) ? (1ULL << HighWaves) - 1 : ~0ULL; | ||
| LowBits &= LowWaveMask; |
There was a problem hiding this comment.
The assignment to LowBits and HighBits seems redundant? They're initialized to zero so the result of these operations will still always just be 0?
d884ec7 to
f0d2cfd
Compare
79cc228 to
2138c2a
Compare
| return Word; | ||
| } | ||
|
|
||
| void StoreWords(UINT *Dest, std::bitset<128> LanesState) { |
There was a problem hiding this comment.
Update LanesState to be a const reference to avoid the copy every time this is called.
|
|
||
| const uint64_t LowExpected = ~1ULL & LowWaveMask; | ||
| const uint64_t HighExpected = ~0ULL & HighWaveMask; | ||
| const uint64_t LowActiveLanes = (LanesState & LowWaveMask).to_ullong(); |
There was a problem hiding this comment.
I think LanesState is intended to represent if we expect the values between 'this' lane and other lanes to match? So, I don't think it actually represents a state? In other words, 'ChangedLanes' and 'UnchangedLanes' are intended to represent the expected values on lane for the comparison of each vector element across lanes (the definition of the WaveMatch intrinsic).
You could better convey this by changing the names 'UnchangedLanes' to 'DefaultExpectedValue'. And 'ChangedLanes' to 'ExpectedValue'.
| const UINT MidLaneID = WaveSize / 2; | ||
| const UINT LastLaneID = std::min(WaveSize - 1, VectorSize - 1); | ||
|
|
||
| std::bitset<128> UnchangedLanes; |
There was a problem hiding this comment.
nit: I suggesting adding a comment to further clarify the reason for a bitest.
// Use a std::bitset<128> to represent the uint4 returned by WaveMatch as its convenient this way in c++ #Resolved
| } | ||
|
|
||
| void WriteExpectedValueForLane(UINT *Dest, const UINT Lane, | ||
| const std::bitset<128> &LanesState) { |
There was a problem hiding this comment.
| const std::bitset<128> &LanesState) { | |
| const std::bitset<128> &ExpectedValue) { |
| HighWaveMask = ComputeWaveMask(HighWaves); | ||
| } | ||
|
|
||
| void WriteExpectedValueForLane(UINT *Dest, const UINT Lane, |
There was a problem hiding this comment.
| void WriteExpectedValueForLane(UINT *Dest, const UINT Lane, | |
| void WriteExpectedValueForLane(UINT *Dest, const UINT LaneID, |
| // WaveMatch as well. | ||
| // For this test, the shader arranges it so that lanes 0, WAVE_SIZE/2 and | ||
| // WAVE_SIZE-1 are different from all the other lanes, also those | ||
| // lanes modify the vector at positions 0, WAVE_SIZE/2 and WAVE_SIZE-1. |
There was a problem hiding this comment.
'WAVE_SIZE-1' Isn't accurate if the wave size is larger than the vector size
|
|
||
| struct WaveMatchExpectedResultWritter { | ||
| private: | ||
| UINT LowWaves; |
There was a problem hiding this comment.
LowWave and HighWaves don't need to be members. You only use them to compute the wave mask when constructing the WaveMatchExpectedResultWriter
| const uint64_t HighActiveLanes = | ||
| ((LanesState >> 64) & HighWaveMask).to_ullong(); | ||
|
|
||
| const UINT LaneIndex = 4 * Lane; |
There was a problem hiding this comment.
nit: Naming, this isn't a 'LaneIndex'.
Just change it to something simple like 'I'
|
|
||
| const UINT LaneIndex = 4 * Lane; | ||
| Dest[LaneIndex + 0] = static_cast<UINT>(LowActiveLanes); | ||
| Dest[LaneIndex + 1] = static_cast<UINT>(LowActiveLanes << 32); |
There was a problem hiding this comment.
Did you test this with values in the 'upper' half of the low/high lanes?
I think this and the subsequent shift should be a right shift, not a left shift?
There was a problem hiding this comment.
The 32 bit shifts are correct, I just investigated it. I fixed another bug that I found investigating this though
There was a problem hiding this comment.
You still swapped them to RHS though?
There was a problem hiding this comment.
You are right, I change those to right shift, sorry for the confusion
| WaveMatchExpectedResultWritter(UINT WaveSize) { | ||
| const UINT LowWaves = std::min(64U, WaveSize); | ||
| const UINT HighWaves = WaveSize - LowWaves; | ||
| LowWaveMask = ComputeWaveMask(LowWaves); |
There was a problem hiding this comment.
Given that we only ever call ComputeWaveMask with a value of 64 or less we have no need for the helper anymore.
There was a problem hiding this comment.
I removed the wrapper and made that a one-liner, however, I still need to check if the shift is below 64, because there is a bug when shifting 64 bits, that cause the number to be zeroed instead of being filled with one, damyan and I investigated and found this issue together.
| (LowWaves < 64) ? (1ULL << LowWaves) - 1 : ~0ULL; | ||
| void WriteExpectedValueForLane(UINT *Dest, const UINT LaneID, | ||
| const std::bitset<128> &ExpectedValue) { | ||
| const uint64_t LowActiveLanes = (ExpectedValue & LowWaveMask).to_ullong(); |
There was a problem hiding this comment.
nit: naming, while there is a relationship between the active lanes and the results. These values don't explicitly represent active/inactive lanes.
I would just call these low/high (or lo/hi as I've seen that in multiple places referencing similar things).
| // all the other lanes. Besides that all other lines write their result of | ||
| // WaveMatch as well. | ||
| static constexpr std::bitset<128> ComputeWaveMask(UINT NumWaves) { | ||
| return (NumWaves < 64) ? (1ULL << NumWaves) - 1 : ~0UL; |
There was a problem hiding this comment.
Just a note that ~0UL should have been ~0ULL, but we're getting rid of that anyways.
damyanp
left a comment
There was a problem hiding this comment.
LGTM, some suggestions and a question.
| // Making sure all lanes finish updating their vectors. | ||
| AllMemoryBarrierWithGroupSync(); |
There was a problem hiding this comment.
Out of interest, did you find that this was needed for the test to work?
I was under impression that this test arranged for only a single wave to run at a time, so I wouldn't think that this would have any effect.
There was a problem hiding this comment.
Agree that the small size, and the test complexity, do make that likely unnecessary.
I just didn't want to rely on chance, I added that as a precaution to make sure my test results are always consistent.
There was a problem hiding this comment.
If this test is designed to run a single wave then things will go wrong if more than one wave ends up running. Sprinkling memory barriers in because they might be needed can hide other bugs as well as make it harder to understand what the code is doing.
Seeing this I spent a good few minutes trying to read through to code to figure out where more than one wave might be dispatched.
damyanp
left a comment
There was a problem hiding this comment.
LGTM, one editorial suggestion.
|
|
||
| static void WriteExpectedValueForLane(UINT *Dest, const UINT LaneID, | ||
| const std::bitset<128> &ExpectedValue) { | ||
| // We need the mask to always be 32 bits, this calculation assurers that. |
There was a problem hiding this comment.
Fix typo:
| // We need the mask to always be 32 bits, this calculation assurers that. | |
| // We need the mask to always be 32 bits, this calculation assures that. |
My preference would be to remove this comment, it's pretty clear what the code is doing without it and the IMO the comment makes it harder to read.
| // We need the mask to always be 32 bits, this calculation assurers that. |
This patch modifies the Wave Match test to test modifications in different lanes and vector
indexes. This is achieved by forcing lanes
0,WAVE_SIZE/2andWAVE_SIZE -1, to modifythe vector at indexes
0,WAVE_SIZE/2orWAVE_SIZE -1, respectively.