Cherry-pick RDNA1 fix into 6.1 release#1916
Cherry-pick RDNA1 fix into 6.1 release#1916GZGavinZhao wants to merge 1 commit intoROCm:release/rocm-rel-6.1from
Conversation
Fixes ROCm#1757. Enables architectures that don't have optimized logic files to also produce libraries when `--separate-architectures` or `--lazy-library-loading` is turned on. Previously, one must disable both of these two flags in order for rocBLAS to run on architectures like `gfx1010`. Test plan: ``` cmake -GNinja -B build -S . \ -DCMAKE_C_COMPILER=hipcc \ -DCMAKE_CXX_COMPILER=hipcc \ -DBUILD_CLIENTS_TESTS=OFF \ -DBUILD_CLIENTS_BENCHMARKS=OFF \ -DBUILD_CLIENTS_SAMPLES=OFF \ -DBUILD_TESTING=OFF \ -DBUILD_WITH_TENSILE=ON \ -DTensile_PRINT_DEBUG=ON \ -DTensile_LIBRARY_FORMAT=msgpack \ -DTensile_CPU_THREADS=14 \ -DTensile_LAZY_LIBRARY_LOADING=ON \ -DAMDGPU_TARGETS="..." ``` With `AMDGPU_TARGETS` being one of the following - `AMDGPU_TARGETS=gfx1010` - `AMDGPU_TARGETS=gfx1030;gfx1010` - `AMDGPU_TARGETS=gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102` In all three cases, `$ROCM_PATH/lib/rocblas/library/TensileLibrary_lazy_gfx1010.dat` is produced and all other `*.dat` files remain unchanged. Signed-off-by: Gavin Zhao <git@gzgz.dev>
|
Rocm6.1 release branch is already finalized and no further release plan for 6.1. |
|
@nakajee Thank you for your quick response. Would you mind if I create a PR to |
|
Nevermind, I see this is already included in |
To be clear, there is a ROCm 6.1.1 release planned. Though, perhaps there are no changes planned for Tensile specifically. @GZGavinZhao, it is not a simple process to cherry-pick a change into a patch release. There is an internal change management process for cherry-picks. Unfortunately, it is mostly handled manually and it is very time-consuming. And even a small change in Tensile requires retesting a large suite of libraries and applications for performance and correctness. This is a great PR and it highlights some of the difficulties with the existing cherry-pick process. I'm hopeful we can improve to the point that a PR like this could be accepted in the future. |
|
@cgmb I see. I understand that the cherry-picking process may require extensive testing, hence this PR is more of a request to cherry-pick. I thought that maybe some Tensile changes are planned for the next patch release, so if this change can be considered together that'd be great, but releasing in ROCm 6.2 is totally fine as well. Anyway, Thank you so much for your explanation! |
This PR requests to cherry-pick #1897 into ROCm 6.1 release.
#1897 enables RDNA1 (gfx101*) users to run rocBLAS with the rocblas package from AMD's official repository. Since rocBLAS is almost ubiquitous in ML workflows, this means that all RDNA1 users are essentially unable to run any programs related to ML right now (ROCm/ROCm#2527).
Admittedly, RDNA1 GPUs are not officially supported, but this is such a small change that would benefit all users of RDNA1 GPUs, so I believe is very worth it.
This is a NFC for all architectures besides
gfx1010.#1897 requires #1888 to function properly, but it seems like #1888 is already cherry-picked into the
release/rocm-rel-6.1branch in #1905, so cherry-picking #1897 would not break anything.