cudnn flash attention GQA support#555

Closed
kocchop wants to merge 2 commits into
AI-Hypercomputer:mainfrom
kocchop:cudnn_flash_dpa
@kocchop kocchop commented Mar 26, 2024

Collaborator
  1. Switched to the new stable API for cuDNN flash attention.
  2. Added support for grouped-query attention (GQA).
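
For context on what GQA changes: several query heads share one key/value head, so K and V carry fewer heads than Q. The sketch below is a plain JAX reference for that computation, not the cuDNN kernel and not the code in this PR. The function name `gqa_attention` and the `[batch, seq, heads, head_dim]` layout are assumptions made for illustration, and masking is omitted for brevity.

```python
import jax
import jax.numpy as jnp


def gqa_attention(q, k, v):
    """Unfused reference for grouped-query attention (illustrative only).

    q:    [batch, q_len,  num_q_heads,  head_dim]
    k, v: [batch, kv_len, num_kv_heads, head_dim]
    num_q_heads must be a multiple of num_kv_heads.
    """
    b, t, n_q, d = q.shape
    n_kv = k.shape[2]
    assert n_q % n_kv == 0, "query heads must divide evenly into KV heads"
    group = n_q // n_kv  # query heads served by each KV head

    # Split the query-head axis into (kv_head, group) so that query head h
    # attends with KV head h // group.
    q = q.reshape(b, t, n_kv, group, d)

    # Scaled dot-product scores: [batch, kv_head, group, q_len, kv_len].
    scores = jnp.einsum("btkgd,bskd->bkgts", q, k) / jnp.sqrt(d)
    probs = jax.nn.softmax(scores, axis=-1)

    # Weighted sum over values, then merge (kv_head, group) back into heads.
    out = jnp.einsum("bkgts,bskd->btkgd", probs, v)
    return out.reshape(b, t, n_q, d)
```

With `num_kv_heads == num_q_heads` this reduces to ordinary multi-head attention, and with `num_kv_heads == 1` it is multi-query attention.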

@kocchop kocchop requested a review from rwitten as a code owner March 26, 2024 17:15
kocchop commented Mar 28, 2024

Collaborator Author

@rwitten here is the PR for GQA support in Flash Attention. Please let me know if you have any suggestions.

@rwitten rwitten left a comment

Collaborator

Delegating to @yangyuwei for real approval

@yangyuwei yangyuwei left a comment

Collaborator

Thanks for adding GQA support!

copybara-service Bot pushed a commit that referenced this pull request Apr 8, 2024
--
190a1a73753a665087be14f6f82823d24608cce6 by Md Fahim Faysal Khan <mdfahimfaysa@nvidia.com>:

added support for GQA

--
9022b351c2d0e26fc1abbad228379b383cd3fa8a by Md Fahim Faysal Khan <mdfahimfaysa@nvidia.com>:

added GQA support for cudnn flash attention

COPYBARA_INTEGRATE_REVIEW=#555 from kocchop:cudnn_flash_dpa 9022b351c2d0e26fc1abbad228379b383cd3fa8a
PiperOrigin-RevId: 622915708
@yangyuwei

Collaborator

The code changes have been pushed to GitHub via this commit (f04ba76), triggered by copybara. I'll close this PR. Thank you!

@yangyuwei yangyuwei closed this Apr 8, 2024
@kocchop kocchop deleted the cudnn_flash_dpa branch May 16, 2025 01:40