Added support for normal MLA kernel by annanyapr · Pull Request #17624 · apache/tvm

annanyapr · 2025-02-05T07:40:29Z

Refactored _attention_prefill_ragged to allow the v dimension to differ from the q/k dimension. This can be used for MLA attention in DeepSeek models.
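For illustration, here is a minimal pure-Python sketch of attention where the v dimension differs from the q/k dimension. It is not the TIR kernel from this PR: the function name `attention`, the 1/sqrt(d_qk) softmax scale, and the toy shapes are all assumptions made for the example, and it omits causal masking, multiple heads, and ragged batching.

```python
import math


def attention(q, k, v):
    """Single-head scaled dot-product attention on nested lists.

    q: [n_q][d_qk], k: [n_kv][d_qk], v: [n_kv][d_v]; returns [n_q][d_v].
    Only the q.k dot product needs matching widths, so d_v may differ from d_qk.
    """
    d_qk = len(q[0])
    d_v = len(v[0])
    scale = 1.0 / math.sqrt(d_qk)  # assumed scale; a real kernel may take it as a parameter
    out = []
    for q_row in q:
        # Attention scores of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax: subtract the max before exponentiating.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted sum of value rows; the output width follows d_v, not d_qk.
        out.append(
            [sum(w * v_row[j] for w, v_row in zip(weights, v)) / total for j in range(d_v)]
        )
    return out


# Hypothetical shapes for illustration only: d_qk = 4, d_v = 2.
q = [[1.0, 0.0, 0.0, 0.0]]
k = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, k, v)
```

The output has one row per query and d_v columns, which is the shape relationship the refactor has to accommodate.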

annanyapr · 2025-02-05T07:42:47Z

@MasterJH5574 can you take a look?

annanyapr · 2025-02-20T04:18:30Z

@MasterJH5574 TVM seems to be building correctly, and tvm/tests/python/relax/test_runtime_builtin_paged_attention_kv_cache_tir.py seems to be working fine.

MasterJH5574

LGTM, thanks! We are good to go after CI passes.

* Refactored code to allow for different v dimension from q/k dimension
* Made a small fix after the rebase
* Made changes to the runtime to support normal kernel
* Fixed a compilation issue
* Fix lint

---------

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

annanyapr force-pushed the generic-attention branch from 3eac670 to e3ac7b5 Compare February 10, 2025 15:03

annanyapr changed the title ~~Refactored code to allow for different v dimension from q/k dimension~~ Added support for normal MLA kernel Feb 17, 2025

annanyapr force-pushed the generic-attention branch from 506386d to 7569674 Compare February 17, 2025 20:38

annanyapr added 3 commits February 19, 2025 22:03

Refactored code to allow for different v dimension from q/k dimension

c2f0f86

Made a small fix after the rebase

7548bb6

Made changes to the runtime to support normal kernel

acd9fa0

annanyapr force-pushed the generic-attention branch from 7569674 to acd9fa0 Compare February 20, 2025 03:03

Fixed a compilation issue

ca86ca7

MasterJH5574 approved these changes Feb 20, 2025

MasterJH5574 force-pushed the generic-attention branch from 1e4b697 to a533a11 Compare February 20, 2025 15:52

Fix lint

bd88313

MasterJH5574 force-pushed the generic-attention branch from a533a11 to bd88313 Compare February 20, 2025 16:57

MasterJH5574 merged commit 6d92f2a into apache:main Feb 20, 2025

ysh329 mentioned this pull request Apr 19, 2025

[Release] v0.20.0 Release Candidate Notes #17860

Closed

kurisu6912 mentioned this pull request Sep 5, 2025

kurisu add assume attr patch 1 tile-ai/tvm#8

Closed

MasterJH5574 merged 5 commits into apache:main from annanyapr:generic-attention
