[KVCache] Fix attention kernel for ROCm #16551

Merged
tqchen merged 1 commit into apache:main from MasterJH5574:tvm-dev/2024-02-09-rocm-kv-cache
Feb 11, 2024

Conversation

@MasterJH5574
Contributor

When compiling a TensorIR function to the ROCm backend, we need to be careful to store and load local registers in a consistent manner (using either vector access or scalar access throughout); otherwise there will be correctness issues.

This PR fixes the attention kernel for ROCm. It also adds tests for the float32 dtype and for head dimensions other than 128. A sketch of the access pattern in question follows below.
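For illustration, here is a minimal TVMScript sketch of the pattern the description refers to. It is a hypothetical copy kernel (the function, buffer names, and shapes are made up for this example, not taken from the PR's actual attention kernel): a local-scope register buffer is written with vectorized stores and read back with vectorized loads, so both sides of the access agree.

```python
# A minimal sketch, NOT the PR's actual attention kernel: it only
# illustrates the consistent-access rule for local registers on ROCm.
import tvm
from tvm.script import tir as T


@T.prim_func
def copy_via_local(A: T.Buffer((128,), "float16"),
                   B: T.Buffer((128,), "float16")):
    T.func_attr({"global_symbol": "copy_via_local", "tir.noalias": True})
    for bx in T.thread_binding(1, thread="blockIdx.x"):
        for tx in T.thread_binding(16, thread="threadIdx.x"):
            with T.block("copy"):
                # Per-thread registers ("local" scope).
                regs = T.alloc_buffer((8,), "float16", scope="local")
                # Vectorized store into the registers ...
                for v in T.vectorized(8):
                    regs[v] = A[tx * 8 + v]
                # ... paired with a vectorized load back out. Mixing a
                # vectorized store with scalar loads (or vice versa) is
                # the kind of inconsistency that produces wrong results
                # on ROCm, per the description above.
                for v in T.vectorized(8):
                    B[tx * 8 + v] = regs[v]
```

Something like `tvm.build(copy_via_local, target="rocm")` would exercise the relevant codegen path; the point of the sketch is only that the store loop and the load loop over `regs` use the same (vectorized) access style.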

@tqchen tqchen merged commit 0449a16 into apache:main Feb 11, 2024