[pytorch/executorch][diff_train] Cortex-M backend: Add AoT scratch-buffer planning. (#19636) #19782
Closed
SS-JIA wants to merge 1 commit into
This was referenced May 26, 2026
Stack from ghstack (oldest at bottom):
AoT scratch-buffer planning is added for conv, depthwise conv, transpose conv, and bmm.
Add scratch tensors to the operator signatures. Each scratch tensor is
then assigned an exir.memory.alloc, and ExecuTorch memory-plans these
allocs automatically.
Introduce required_cmsis_buffer_size, which computes the buffer size
from node properties and the Cortex-M configuration. It relies on
functions registered per target in
backends/cortex_m/passes/scratch_buffer_sizes.py.
ConvertToCortexMPass uses it to set the size of the allocs.
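A per-target registry of size functions can be sketched as follows. This is only an illustration of the pattern described above, not the code in scratch_buffer_sizes.py: the names NodeInfo, register_buffer_size, and required_buffer_size_sketch, the target strings, and the size formula are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


# Hypothetical stand-in for the node properties the real pass reads
# from the graph; the actual pass works on graph nodes, not this class.
@dataclass(frozen=True)
class NodeInfo:
    op: str                         # e.g. "conv", "bmm"
    input_shape: Tuple[int, ...]    # assumed NHWC layout
    kernel_shape: Tuple[int, int]   # (KH, KW)


SizeFn = Callable[[NodeInfo], int]

# Maps (target, op) to the function that computes the scratch size in bytes.
_REGISTRY: Dict[Tuple[str, str], SizeFn] = {}


def register_buffer_size(target: str, op: str) -> Callable[[SizeFn], SizeFn]:
    """Decorator that registers a size function for one target/op pair."""
    def decorator(fn: SizeFn) -> SizeFn:
        _REGISTRY[(target, op)] = fn
        return fn
    return decorator


@register_buffer_size("generic", "conv")
def _conv_generic(node: NodeInfo) -> int:
    # Assumption for this sketch: the generic path needs no scratch memory.
    return 0


@register_buffer_size("dsp", "conv")
def _conv_dsp(node: NodeInfo) -> int:
    # Placeholder formula, NOT the CMSIS-NN one: two int16 columns of
    # C_in * KH * KW elements each.
    c_in = node.input_shape[-1]
    kh, kw = node.kernel_shape
    return 2 * c_in * kh * kw * 2


def required_buffer_size_sketch(node: NodeInfo, target: str) -> int:
    """Look up the registered size function and evaluate it for the node."""
    try:
        fn = _REGISTRY[(target, node.op)]
    except KeyError:
        raise NotImplementedError(
            f"no scratch-size function for op={node.op!r} on target={target!r}"
        )
    return fn(node)
```

With this shape, supporting a new target only requires registering more functions. The lookup itself stays unchanged, and an unregistered combination fails loudly at export time instead of producing an undersized alloc.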
Finally, modify the kernels to use the new scratch tensor instead
of allocating temporary memory. Add a new macro,
CORTEX_M_ENABLE_RUNTIME_CHECKS,
which enables a safety check that the AoT-computed buffer size equals
the buffer size computed at runtime. Enable it when testing.
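The intent of the check can be modeled in a few lines. The real check lives in the kernels behind the macro; the Python below is only a model of the idea, and check_scratch_size, ScratchSizeMismatch, and the enabled flag (standing in for the macro being defined) are hypothetical names.

```python
class ScratchSizeMismatch(RuntimeError):
    """Raised when the planned and runtime scratch sizes disagree."""


def check_scratch_size(aot_bytes: int, runtime_bytes: int, enabled: bool = True) -> None:
    # `enabled` models the macro: when it is off, the check costs nothing.
    if not enabled:
        return
    if aot_bytes != runtime_bytes:
        raise ScratchSizeMismatch(
            f"AoT planned {aot_bytes} bytes, runtime requires {runtime_bytes} bytes"
        )
```

This model checks for equality rather than "at least as large", as the description above states, so over-allocation is reported as well as under-allocation.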
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani @psiddh @AdrianLundell
Signed-off-by: Erik Lundell <erik.lundell@arm.com>
Internal:
<< DO NOT EDIT BELOW THIS LINE >>
GitHub Author: Erik Lundell <erik.lundell@arm.com>
GitHub Repo: pytorch/executorch
GitHub Pull Request: #19636
Initially generated by: https://www.internalfb.com/intern/sandcastle/job/13510801582847443/
This was imported as part of a DiffTrain.
Please review this as soon as possible. Since it is a direct copy of a commit on
GitHub, there shouldn't be much to do.
diff-train-source-id: b581615
Co-authored-by: Måns Nilsson <mans.nilsson@arm.com>
Differential Revision: D106339880