{executorch][llama] support mqa by kimishpatel · Pull Request #3080 · pytorch/executorch

kimishpatel · 2024-04-17T04:45:41Z

Stack from ghstack (oldest at bottom):

-> {executorch][llama] support mqa #3080

This diff adds support for multi-query attention (MQA) for SDPA with a KV cache.

Differential Revision: D56228316
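For readers unfamiliar with the term, here is a minimal sketch of what multi-query attention with a KV cache computes during one decode step. It is not the ExecuTorch kernel from this diff (the PR text does not show that code); the function name `mqa_decode_step` and all helper names are hypothetical, and plain Python lists stand in for tensors. The key point it illustrates is that every query head attends over the same single cached K/V head, so the cache grows by one entry per token regardless of the number of query heads.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mqa_decode_step(q_heads, k_new, v_new, k_cache, v_cache):
    """One decode step of multi-query attention (hypothetical helper).

    q_heads: list of n_heads query vectors, each of length head_dim
    k_new, v_new: the single shared key/value vector for the new token
    k_cache, v_cache: lists of past shared key/value vectors, mutated in place
    """
    # The cache stores ONE key and ONE value per position, shared by all heads.
    k_cache.append(k_new)
    v_cache.append(v_new)

    scale = 1.0 / math.sqrt(len(k_new))
    outputs = []
    for q in q_heads:
        # Each query head scores against the same shared keys.
        scores = [dot(q, k) * scale for k in k_cache]
        weights = softmax(scores)
        # Weighted sum of the same shared values.
        out = [
            sum(w * v[d] for w, v in zip(weights, v_cache))
            for d in range(len(v_new))
        ]
        outputs.append(out)
    return outputs


# Two query heads, head_dim = 2, two decode steps.
k_cache, v_cache = [], []
step1 = mqa_decode_step([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [1.0, 2.0], k_cache, v_cache)
step2 = mqa_decode_step([[0.5, 0.5], [1.0, -1.0]], [0.0, 1.0], [3.0, 4.0], k_cache, v_cache)
```

In standard multi-head attention the cache would hold one K/V pair per head per position; sharing a single pair across heads is what reduces the cache size.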


pytorch-bot · 2024-04-17T04:45:45Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/3080

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures

As of commit 26aced7 with merge base 458d743:

NEW FAILURES - The following jobs have failed:

pull / test-llama-runner-linux (fp32, buck2, portable) / linux-job (gh)
RuntimeError: Command docker exec -t e5007c6eb293ed5504c6c916acf6c92b429db9f5fa481ccc64fa47bd432ecfcf /exec failed with exit code 3
pull / test-llama-runner-linux (fp32, buck2, xnnpack+kv+custom) / linux-job (gh)
RuntimeError: Command docker exec -t 4712c9a4779ab7b9702f2fdd0ed10d51b68b723a908ff7f9a8589f6b373515f1 /exec failed with exit code 3
pull / test-llama-runner-linux (fp32, cmake, portable) / linux-job (gh)
RuntimeError: Command docker exec -t e593a949e45ab8ac9097cff51e53b474a79ddf1f14f6c5ec247a13984a21f228 /exec failed with exit code 2
pull / test-llama-runner-linux (fp32, cmake, xnnpack+kv+custom) / linux-job (gh)
RuntimeError: Command docker exec -t 60bb05e52ecb8295287c57d8e151cb7d9c754b92abac6875f035fca8ea0cea18 /exec failed with exit code 2
pull / test-llama-runner-linux-android (cmake) / linux-job (gh)
RuntimeError: Command docker exec -t d20ca19c55fb9e54135dd2f95e780e45a79a82f44f1c5af8a6cb1c7e4c85d69b /exec failed with exit code 2

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-04-17T04:45:53Z

This pull request was exported from Phabricator. Differential Revision: D56228316

This diff adds support for multi query attention for sdpa with kv cache
Differential Revision: [D56228316](https://our.internmc.facebook.com/intern/diff/D56228316/)
ghstack-source-id: 222819128
Pull Request resolved: #3080

mikekgfb

Please ensure all tests pass before landing!

larryliu0820 · 2024-04-17T05:47:35Z

 # Any targets that should be shared between fbcode and xplat must be defined in
 # targets.bzl. This file can contain fbcode-only targets.

+load("@fbcode_macros//build_defs:python_unittest.bzl", "python_unittest")


Use runtime? You can do

runtime.python_unittest( ... )

kimishpatel · Apr 17, 2024

oh this was due to arc lint. let me fix.


Pull Request resolved: #3080
This diff adds support for multi query attention for sdpa with kv cache
Differential Revision: [D56228316](https://our.internmc.facebook.com/intern/diff/D56228316/)
ghstack-source-id: 222855405

facebook-github-bot · 2024-04-17T14:09:05Z

This pull request was exported from Phabricator. Differential Revision: D56228316

kimishpatel · 2024-04-17T14:56:04Z

bunch of pre-existing failures with re2.h that @larryliu0820 is fixing

facebook-github-bot · 2024-04-17T16:09:27Z

This pull request has been merged in bae0387.

{executorch][llama] support mqa

2ff9055


facebook-github-bot added the CLA Signed label Apr 17, 2024

facebook-github-bot added the fb-exported label Apr 17, 2024

mikekgfb approved these changes Apr 17, 2024


larryliu0820 reviewed Apr 17, 2024


Update on "{executorch][llama] support mqa"

26aced7


facebook-github-bot closed this in bae0387 Apr 17, 2024

facebook-github-bot added the Merged label Apr 17, 2024

mergennachin mentioned this pull request Apr 26, 2024

disclaimer #3376

Closed

kimishpatel wants to merge 2 commits into gh/kimishpatel/54/base from gh/kimishpatel/54/head


4 participants
