Pull requests: vllm-project/flash-attention
(forked from Dao-AILab/flash-attention)
Combine kernel: increase pipeline depth from 4 to 8 stages (#124, opened Mar 4, 2026 by jmkuebler)
[Frontend] Add FP8 output quantization support to FlashAttention backend (#113, opened Jan 3, 2026 by sachinkumarsingh092)
[Kernel] Add attention sinks for FlashAttention-2 (#103, opened Oct 19, 2025 by dudugong-gitch)
Removed the assertion imposed on cu_seqlens_k and seqused_k (#59, opened Mar 29, 2025 by chenyang78; see the usage sketch after this list)
Add back flash_attn_func api (and support FA3) [Don't Merge Yet] (#40, opened Jan 26, 2025 by LucasWilkinson)
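PRs #59 and #40 both touch the Python attention entry points of this fork. As context only, here is a minimal sketch of a variable-length attention call using cu_seqlens_k, assuming the upstream flash_attn package's flash_attn_varlen_func interface and a CUDA device; the seqused_k argument named in #59 appears to be an optional per-sequence key-length override in some versions of this interface and is omitted here. This is an illustrative sketch, not the content of either PR.

```python
# Sketch only: assumes the upstream flash_attn package is installed and a CUDA GPU is available.
import torch
from flash_attn import flash_attn_varlen_func

nheads, headdim = 8, 64
# Two sequences of lengths 3 and 5, packed into one (total_tokens, nheads, headdim) tensor.
seqlens = torch.tensor([3, 5], dtype=torch.int32, device="cuda")
total_tokens = int(seqlens.sum())
q = torch.randn(total_tokens, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn(total_tokens, nheads, headdim, dtype=torch.float16, device="cuda")
v = torch.randn(total_tokens, nheads, headdim, dtype=torch.float16, device="cuda")

# cu_seqlens_q / cu_seqlens_k are cumulative sequence lengths, shape (batch + 1,), int32.
cu_seqlens = torch.cat(
    [torch.zeros(1, dtype=torch.int32, device="cuda"), seqlens.cumsum(0, dtype=torch.int32)]
)

out = flash_attn_varlen_func(
    q, k, v,
    cu_seqlens_q=cu_seqlens,
    cu_seqlens_k=cu_seqlens,
    max_seqlen_q=int(seqlens.max()),
    max_seqlen_k=int(seqlens.max()),
    causal=True,
)
print(out.shape)  # torch.Size([8, 8, 64]): one output row per packed token
```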