Qualcomm AI Engine Direct - Delegate mutable buffer and fix the mutable buffer issue by shewu-quic · Pull Request #11782 · pytorch/executorch

shewu-quic · 2025-06-18T09:53:31Z

Summary:

- Add a parameter to support mutable buffer delegation in the QNN backend
  - Set the same memory address for the I/O of a mutable buffer at runtime
  - Ref: Qualcomm AI Engine Direct - Delegated mutable buffer #6727
- Avoid annotating the input node, because mutable buffers will be folded during the convert_pt2e process
- Deprecate use_legacy_export in executorch llama
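
The idea of sharing one memory address between the input and the output of a mutable buffer can be illustrated with a small conceptual sketch. It uses plain Python `bytearray` storage rather than any QNN or ExecuTorch API; `run_delegate_step` and the buffer names are hypothetical stand-ins for a delegated graph that reads a cache as input and emits the updated cache as output.

```python
# Conceptual sketch only: plain Python buffers stand in for backend tensors.
# None of these names come from the QNN backend or ExecuTorch.


def run_delegate_step(cache_in: memoryview, cache_out: memoryview,
                      pos: int, value: int) -> None:
    """Hypothetical delegated step: output = input with one entry updated."""
    if cache_in.obj is not cache_out.obj:
        # Separate storage: the whole cache must be copied to the output.
        cache_out[:] = cache_in
    # Write the new entry into the output buffer.
    cache_out[pos] = value


# Case 1: input and output alias the same storage (same memory address).
storage = bytearray(4)
for pos, value in enumerate([7, 8, 9]):
    run_delegate_step(memoryview(storage), memoryview(storage), pos, value)

# Case 2: input and output use separate storage and nothing copies the
# output back into the input between steps.
in_buf = bytearray(4)
out_buf = bytearray(4)
for pos, value in enumerate([7, 8]):
    run_delegate_step(memoryview(in_buf), memoryview(out_buf), pos, value)

print(list(storage))  # every update is retained
print(list(out_buf))  # the update from the first step is lost
```

With aliased storage, the update from each step is visible to the next step without an extra copy. With separate storage, the caller would have to copy the output back into the input after every call to get the same result.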

pytorch-bot · 2025-06-18T09:53:34Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11782

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 5fde193 with merge base 44d2643:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-moshi-linux / linux-job (gh) (trunk failure)
test_exported_decoder_xnnpack

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2025-06-18T09:54:11Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

cccclai · 2025-06-18T18:16:49Z

> Avoid annotating the input node because mutable buffers will be folded during the convert_pt2e process.

Is the input node still folded after we land pytorch/ao#2345?

shewu-quic · 2025-06-19T02:18:51Z

> > Avoid annotating the input node because mutable buffers will be folded during the convert_pt2e process.
>
> Is the input node still folded after we land pytorch/ao#2345?

Yes, unless we apply run_decomposition after export. I think we can wait until run_decomposition becomes a pass and no longer requires re-tracing. After that, we can go back to annotating the mutable buffer. What do you think?

shewu-quic · 2025-06-19T04:22:15Z

BTW, we previously submitted a PR to deprecate the convert_bmm_to_matmul pass. That change results in multiple partitions for Meta's llama, because that flow does not use to_edge_transform_and_lower_to_qnn. So I added the pass back and set False as the default value for activation.

facebook-github-bot · 2025-06-19T20:08:34Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cccclai · 2025-06-19T22:07:41Z

I ran into this error for some internal use cases.

TypeError: LLMEdgeManager.__init__() got an unexpected keyword argument 'use_legacy_export'

Can we turn it off in this PR, and I will have another PR to remove this arg?
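
A `TypeError` like the one above is the usual result of removing a keyword argument while some callers still pass it. One common way to stage such a removal is to keep accepting the argument, ignore its value, and emit a deprecation warning. The sketch below uses a hypothetical `EdgeManagerSketch` class; it is not the real `LLMEdgeManager` signature, and `model_name` is an assumed placeholder parameter.

```python
import warnings
from typing import Optional


class EdgeManagerSketch:
    """Hypothetical stand-in for a manager class; not the real LLMEdgeManager."""

    def __init__(self, model_name: str,
                 use_legacy_export: Optional[bool] = None) -> None:
        if use_legacy_export is not None:
            # The argument is still accepted so existing callers keep working,
            # but its value no longer changes behavior.
            warnings.warn(
                "use_legacy_export is deprecated and ignored",
                DeprecationWarning,
                stacklevel=2,
            )
        self.model_name = model_name


# An old call site that still passes the deprecated argument.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    manager = EdgeManagerSketch("llama", use_legacy_export=False)

print(manager.model_name, len(caught))
```

Once no caller passes the argument any more, it can be dropped from the signature in a follow-up change.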

facebook-github-bot · 2025-06-20T17:17:45Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

shewu-quic requested review from cccclai, jackzhxng, larryliu0820, lucylq and swolchok as code owners June 18, 2025 09:53

facebook-github-bot added the CLA Signed label Jun 18, 2025

shewu-quic mentioned this pull request Jun 18, 2025

[Draft] Qualcomm AI Engine Direct - Unexpected graph for mutable buffer after export during Quantization #11309

Closed

manuelcandales added the partner: qualcomm label Jun 18, 2025

Fixed the CI for meta's llama

19c5aa1

shewu-quic requested a review from mergennachin as a code owner June 19, 2025 04:15

cccclai approved these changes Jun 19, 2025

revert change

5fde193

cccclai merged commit e0f81d8 into pytorch:main Jun 20, 2025
102 of 103 checks passed

cccclai added a commit to cccclai/executorch-1 that referenced this pull request Jul 7, 2025

Remove the legacy export (pytorch#12218)

e2cc5b3

Summary: As title, try to see if we can get rid of the legacy export. It should be fixed with pytorch#11782 Differential Revision: D77761473

cccclai merged 3 commits into pytorch:main from CodeLinaro:dev1/hutton/fix_mutable_buffer_issue
