6 changes: 4 additions & 2 deletions backends/arm/README.md
@@ -138,8 +138,10 @@ The delegated Python API flow is:
For complete examples of that flow, including quantization and target-specific
compile specs, see:

- `docs/source/backends/arm-ethos-u/tutorials/ethos-u-getting-started.md`
- `docs/source/backends/arm-vgf/tutorials/vgf-getting-started.md`
- [Arm Ethos-U tutorial](../../docs/source/backends/arm-ethos-u/tutorials/ethos-u-getting-started.md)
- [Arm VGF tutorial](../../docs/source/backends/arm-vgf/tutorials/vgf-getting-started.md)
- [Arm Cortex-M backend overview](../../docs/source/backends/arm-cortex-m/arm-cortex-m-overview.md)
- [Ethos-U porting guide](../../examples/arm/ethos-u-porting-guide.md)

Additional examples are available in `examples/arm`.

208 changes: 65 additions & 143 deletions examples/arm/README.md
@@ -5,175 +5,97 @@ This source code is licensed under the BSD-style license found in the
LICENSE file in the root directory of this source tree.
-->

## ExecuTorch for Arm backends Ethos-U, VGF and Cortex-M
# Examples for Arm backends Ethos-U, VGF and Cortex-M

This project contains scripts to help you set up and run a PyTorch
model on an Arm backend via ExecuTorch. This backend supports Ethos-U and VGF as
targets (using TOSA), but you can also use the Ethos-U example runner as an example
on Cortex-M if you do not delegate the model.
This directory contains documentation and scripts to
help you set up and run a PyTorch model on the Arm backend
via ExecuTorch.

The main scripts are `setup.sh`, `run.sh` and
`backends/arm/scripts/aot_arm_compiler.py`.
## setup.sh

`setup.sh` installs the needed tools. Use `--root-dir <FOLDER>` to change the
path of the scratch folder where it downloads and generates build artifacts.
If you supply it, you must also pass the same folder to `run.sh` with
`--scratch-dir=<FOLDER>`. If not supplied, both scripts use `examples/arm/arm-scratch`.
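
Because both scripts must agree on the scratch folder, it can help to keep the path in a single shell variable. The sketch below is a dry run: it only prints the two commands rather than executing them. The `SCRATCH_DIR` variable and the `print_scratch_cmds` helper are illustrative names and not part of `setup.sh` or `run.sh`; only the flags come from the description above.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print matching setup.sh / run.sh invocations that share one
# scratch folder. SCRATCH_DIR and print_scratch_cmds are hypothetical names.
SCRATCH_DIR="${SCRATCH_DIR:-examples/arm/arm-scratch}"

print_scratch_cmds() {
    local dir="$1"
    # setup.sh takes the folder via --root-dir, run.sh via --scratch-dir=
    printf './examples/arm/setup.sh --i-agree-to-the-contained-eula --root-dir %s\n' "$dir"
    printf './examples/arm/run.sh --scratch-dir=%s --model_name=examples/arm/example_modules/add.py --target=ethos-u85-128\n' "$dir"
}

print_scratch_cmds "$SCRATCH_DIR"
```

Once the printed commands look right, they can be copied into the shell and run from the ExecuTorch root folder.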
`setup.sh` downloads the Arm cross-compilation toolchain and Corstone FVP
simulators, installs the Python dependencies for TOSA, Ethos-U Vela, and
Cortex-M/CMSIS-NN, and generates `setup_path.sh` scripts for adding those tools
to your environment. Optional flags also install VGF/MLSDK and Vulkan
dependencies.

`run.sh` builds, runs and tests a model. It calls CMake for you and, if you want to run on a simulator, starts it as well.
The script calls `aot_arm_compiler.py` to convert a model and include it in the build/run.

For bare-metal Ethos-U builds `run.sh` configures the standalone
`examples/arm/executor_runner/standalone` CMake entry point automatically. If
`--build-dir` is omitted, the script creates and owns a build tree under
`arm_test/<target>_<build_type>`. Supplying `--build-dir` reuses an existing tree
(for example a VGF host build or out-of-tree configuration) and `run.sh`
verifies it exposes the runner options it needs before compiling.

Build and test artifacts are placed under the `arm_test` folder by default;
this can be changed with `--et_build_root=<FOLDER>`.

`aot_arm_compiler.py` converts a Python model or a saved .pt model to a PTE file. It is used by `run.sh`
and other test scripts, but can also be used directly.


## Create a PTE file for Arm backends

`aot_arm_compiler.py` is an easy-to-use example flow for compiling your PyTorch model to a PTE file for the Arm backend.
It can generate PTE files for the supported targets (`-t`) or for non-delegated (Cortex-M) use, with different memory modes.
It accepts either a Python file as input or a model from examples/models via `--model_name`.
It also supports generating Devtools artifacts such as BundleIO BPTE files and ETRecords. Run it with `--help` to check its capabilities.

Select the model to convert with `--model_name=<MODELNAME/FILE>`. It supports models from examples/models, or models
from a Python file that defines `ModelUnderTest` and `ModelInputs`.
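
As a rough illustration of such a Python file, the shell sketch below writes a minimal model file that defines the two expected names and then prints (without running) a matching conversion command. The file name `my_add_model.py`, the `Add` class, and the input shapes are assumptions made for this example; this is not the content of `examples/arm/example_modules/add.py`.

```bash
#!/usr/bin/env bash
# Sketch: write a minimal model file exposing the two names the compiler script
# looks for. The file name and the Add class are hypothetical.
WORK_DIR="$(mktemp -d)"
MODEL_FILE="$WORK_DIR/my_add_model.py"

cat > "$MODEL_FILE" <<'EOF'
import torch


class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y


# A torch.nn.Module instance to convert.
ModelUnderTest = Add()
# Tuple of example inputs passed to forward().
ModelInputs = (torch.ones(5), torch.ones(5))
EOF

# Print (not run) the conversion command for the generated file.
echo "python3 -m backends.arm.scripts.aot_arm_compiler --model_name=$MODEL_FILE --target=ethos-u55-128"
```

The script writes to a temporary folder so it does not touch the source tree; in practice the model file would live wherever is convenient for your project.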

```
$ python3 -m backends.arm.scripts.aot_arm_compiler --help
```

This is how you generate a BundleIO BPTE for a simple add example:
Example to install the default Arm backend dependencies and add them to your current shell:

```bash
./examples/arm/setup.sh --i-agree-to-the-contained-eula
source examples/arm/arm-scratch/setup_path.sh
```
$ python3 -m backends.arm.scripts.aot_arm_compiler --model_name=examples/arm/example_modules/add.py --target=ethos-u55-128 --bundleio
```

The example model file defines two extra variables that the script picks up to make this work.

`ModelUnderTest` should be a `torch.nn.Module` instance.

`ModelInputs` should be a tuple of inputs to the forward function.


You can also use the models from examples/models directly by using the short name, e.g.:

```
$ python3 -m backends.arm.scripts.aot_arm_compiler --model_name=mv2 --target=ethos-u55-64
```


`aot_arm_compiler.py` is called from the scripts below, so you don't need to call it yourself, but running it by hand can be useful in some cases.

## Host VGF example applications
## run.sh

The Arm examples directory also contains host-side VGF reference flows for
specific tasks:
`run.sh` is an end-to-end helper for building and executing an Arm backend
example. It sources the `setup_path.sh` script generated by `setup.sh`, runs
`aot_arm_compiler.py` to convert the selected model to a `.pte` or `.bpte`,
builds the matching runner with CMake, and starts the simulator or runtime for
the selected target when `--build_only` is not set.

- `examples/arm/image_classification_example_vgf` for DEiT image
classification.
- `examples/arm/super_resolution_example_vgf` for Swin2SR image
super-resolution.


## ExecuTorch on Arm Ethos-U55/U65 and U85

This example code will help you get going with the Corstone&trade;-300/320 platforms and
run on the FVP. It can also be used as a starting guide when porting to your board/HW.

We will start from a PyTorch model in Python, export it, and convert it to a `.pte`
file, a binary format adopted by ExecuTorch. Then we will take the `.pte`
model file and embed it in a bare-metal application, executor_runner. The
resulting executor_runner binary contains not only the `.pte` data but
also the software components needed to run standalone on a bare-metal system.
The build flow picks up the non-delegated ops from the generated PTE file and
adds CPU implementations of them.
Lastly, we will run the executor_runner binary on a Corstone&trade;-300/320 FVP Simulator platform.


### Example workflow

Below is an example workflow to build an application for Ethos-U55/85. The script requires an internet connection:

```
# Step [1] - setup necessary tools
$ cd <EXECUTORCH-ROOT-FOLDER>
$ ./examples/arm/setup.sh --i-agree-to-the-contained-eula

# Step [2] - Set up the path to the tools. The `setup.sh` script has generated a script that you need to source every time you restart your shell.
$ source examples/arm/arm-scratch/setup_path.sh
Build and test artifacts are written to `arm_test` by default. Use
`--et_build_root=<FOLDER>` to choose another build root.

# Step [3] - build and run ExecuTorch and executor_runner baremetal example application
# on a Corstone(TM)-320 FVP to run a simple PyTorch model from a file.
$ ./examples/arm/run.sh --model_name=examples/arm/example_modules/add.py --target=ethos-u85-128
```

The argument `--model_name=<MODEL>` is passed to `aot_arm_compiler.py`, so you can use it in the same way,
e.g. you can also use the models from examples/models directly, as above.
For example, after running `setup.sh` and sourcing the generated
`setup_path.sh`, build and run a model on an Ethos-U85 target with:

```
$ ./examples/arm/run.sh --model_name=mv2 --target=ethos-u55-64
```bash
./examples/arm/run.sh --model_name=examples/arm/example_modules/add.py --target=ethos-u85-128
```
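
Since this command relies on the tools installed by `setup.sh`, a small pre-flight check can make failures easier to read. The sketch below only tests whether the generated `setup_path.sh` exists at the default scratch location and prints the command to run next; `check_arm_env` is a hypothetical helper, not something provided by the scripts.

```bash
#!/usr/bin/env bash
# Sketch: check for the setup_path.sh that setup.sh generates before calling run.sh.
# check_arm_env is a hypothetical helper; the default path is the default scratch location.
check_arm_env() {
    local setup_path="${1:-examples/arm/arm-scratch/setup_path.sh}"
    if [ -f "$setup_path" ]; then
        echo "found: $setup_path"
        return 0
    fi
    echo "missing: $setup_path (run ./examples/arm/setup.sh first)" >&2
    return 1
}

if check_arm_env; then
    echo "ready: ./examples/arm/run.sh --model_name=examples/arm/example_modules/add.py --target=ethos-u85-128"
fi
```

Pass a different path as the first argument if you used a non-default scratch folder.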

By default, the runner sets all inputs to "1". You are expected to add or change the
input-handling code for your hardware target to give the model proper input, for example from your camera
or microphone hardware.
For bundled input/output and ETDump testing:

While testing, you can use the `--bundleio` flag to take the input from the Python model file and
generate a .bpte instead of a .pte file. This embeds the example input data and reference output
in the .bpte file, which is used to verify the model's output. You can also use `--etdump` to generate
ETRecord and ETDump trace files from your target (they are printed as base64 strings in the serial log).

Keep in mind that CPU cycles are NOT accurate on the FVP simulator, so it cannot be used for
performance measurements; you need to run on an FPGA or actual ASIC to get good results from `--etdump`.
The printed NPU cycle numbers are still usable and closer to real values if the timing
adaptor is set up correctly.

```
# Build + run with BundleIO and ETDump
$ ./examples/arm/run.sh --model_name=lstm --target=ethos-u85-128 --bundleio --etdump
```bash
./examples/arm/run.sh --model_name=lstm --target=ethos-u85-128 --bundleio --etdump
```

For Cortex-M testing, use a Cortex-M target and bundled I/O:

### Ethos-U minimal example

See the Jupyter notebook `ethos_u_minimal_example.ipynb` for an explained minimal example of the full flow for running a
PyTorch module on the EthosUDelegate. The notebook runs directly in some IDEs, such as VS Code; otherwise it can be run in
your browser using
```
pip install jupyter
jupyter notebook ethos_u_minimal_example.ipynb
```bash
./examples/arm/run.sh --model_name=mv2 --target=cortex-m55+int8 --bundleio
```

## ExecuTorch on ARM Cortex-M
## Example Contents

For Cortex-M, run the script without delegating (e.g. with `--no_delegate`). The build flow already supports picking up
the non-delegated ops from the generated PTE file and adding CPU implementations of them, so this works out of the box in
most cases.
### Notebook examples

To run mobilenet_v2 on the Cortex-M55 only, without using the Ethos-U, try this:
- [ethos_u_minimal_example.ipynb](ethos_u_minimal_example.ipynb) - Minimal
Ethos-U AOT, runtime build, and FVP execution flow.
- [vgf_minimal_example.ipynb](vgf_minimal_example.ipynb) - Minimal VGF
lowering and host execution flow.
- [cortex_m_mv2_example.ipynb](cortex_m_mv2_example.ipynb) - Cortex-M
MobileNetV2 export, quantization, runtime build, and FVP execution flow.
- [pruning_minimal_example.ipynb](pruning_minimal_example.ipynb) - Model
conditioning and pruning flow for Ethos-U85.
- [quantizer_tutorial.ipynb](quantizer_tutorial.ipynb) - Quantizer tutorial
for TOSA, Ethos-U, and VGF quantizers.

```
$ ./examples/arm/run.sh --model_name=mv2 --target=ethos-u55-128 --no_delegate
```
### Application examples

- [image_classification_example_ethos_u](image_classification_example_ethos_u/)
- End-to-end DEiT-Tiny image classification flow for Ethos-U, including
model fine-tuning, export, bare-metal runtime build, and Corstone-320 FVP
execution.
- [image_classification_example_vgf](image_classification_example_vgf/) -
DEiT-Tiny image classification flow for VGF host execution.
- [super_resolution_example_vgf](super_resolution_example_vgf) - Swin2SR image
super-resolution.
- [example_modules/add.py](example_modules/add.py) - Small external model file
usable with `run.sh --model_name=examples/arm/example_modules/add.py`.

### Online Tutorial
### Utility examples and guides

We also have a [tutorial](https://pytorch.org/executorch/stable/backends-arm-ethos-u) explaining the steps performed in these
scripts, expected results, possible problems and more. It is a step-by-step guide
you can follow to better understand this delegate.
- [ethos-u-porting-guide.md](ethos-u-porting-guide.md) - Notes for adapting
the example Ethos-U runtime integration to another target.
- [export_standalone_tosa_graph.py](export_standalone_tosa_graph.py) -
Example of exporting a standalone TOSA graph with multiple outputs.
- [composable_quantizer_example.py](composable_quantizer_example.py) - Minimal
script showing experimental composable quantizer use.
- [visualize.py](visualize.py) - Helper used by `run.sh --model_explorer` to
visualize TOSA or PTE graphs.

### Project Templates
## Project Templates

These project templates provide alternative starting points with different toolchains and build systems:
