Fix for TOSA BI clamp ops (#3092)
Summary:
Min/max range values need to be in quantized form.

Pull Request resolved: #3092

Reviewed By: mergennachin

Differential Revision: D56476931

Pulled By: digantdesai

fbshipit-source-id: 80fe1e4981c048653f808ef1ad9339997eb853a6
freddan80 authored and facebook-github-bot committed Apr 24, 2024
commit b0a400c2d92b261d602b19778f654e67d1ce93d8
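The core of the fix: for TOSA BI (quantized) graphs, the clamp range must be expressed in the quantized integer domain, so the float min/max are converted using the input's quantization parameters. A minimal standalone sketch of that conversion (the function name and example values are illustrative, not the ExecuTorch API):

```python
def quantize_clamp_bounds(min_fp, max_fp, scale, zp, qmin, qmax):
    """Map a float clamp range into the quantized integer domain.

    Standard affine quantization: q = round(x / scale) + zp,
    clipped to the integer type's [qmin, qmax] range.
    """
    min_qs = max(round(min_fp / scale) + zp, qmin)
    max_qs = min(round(max_fp / scale) + zp, qmax)
    return min_qs, max_qs

# e.g. an int8 ReLU6 (clamp to [0.0, 6.0]) with scale=0.1, zp=-128:
print(quantize_clamp_bounds(0.0, 6.0, 0.1, -128, -128, 127))  # (-128, -68)
```

This mirrors the logic added to `op_hardtanh.py` below, where the zero-point and scale come from the preceding quantize node.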
23 changes: 10 additions & 13 deletions backends/arm/operators/op_addmm.py
@@ -73,7 +73,7 @@ def define_node(
             quant_node = input_node.all_input_nodes[0]
         else:
             quant_node = input_node
-        input_zp = get_quant_node_args(quant_node)[1]
+        input_zp = get_quant_node_args(quant_node).zp
         attr.ConvAttribute(
             pad=pad_attr,
             stride=stride_attr,
@@ -111,24 +111,21 @@ def define_node(
         # rank > 2 linear layer
         if input_node.target == exir_ops.edge.aten.view_copy.default:
             quant_node = input_node.all_input_nodes[0]
-            input_scale, _ = get_quant_node_args(quant_node)
+            input_scale = get_quant_node_args(quant_node).scale
             consumer_node = list(node.users)[0]
             consumer_consumer_node = list(consumer_node.users)[0]
-            (
-                consumer_node_scale,
-                consumer_node_node_zp,
-            ) = get_quant_node_args(consumer_consumer_node)
-
+            quant_args = get_quant_node_args(consumer_consumer_node)
+            consumer_node_scale = quant_args.scale
+            consumer_node_node_zp = quant_args.zp
         else:
-            input_scale, _ = get_quant_node_args(input_node)
+            input_scale = get_quant_node_args(input_node).scale
             consumer_node = list(node.users)[0]
-            (
-                consumer_node_scale,
-                consumer_node_node_zp,
-            ) = get_quant_node_args(consumer_node)
+            quant_args = get_quant_node_args(consumer_node)
+            consumer_node_scale = quant_args.scale
+            consumer_node_node_zp = quant_args.zp
 
         weight_node_q_node = weight_node.all_input_nodes[0]
-        weight_scale, _ = get_quant_node_args(weight_node_q_node)
+        weight_scale = get_quant_node_args(weight_node_q_node).scale
 
         output_rescale_scale = (input_scale * weight_scale) / consumer_node_scale
         (
4 changes: 2 additions & 2 deletions backends/arm/operators/op_common.py
@@ -31,8 +31,8 @@ def build_avg_pool_2d_common(
     output_zp = 0
 
     if is_quant_node:
-        _, input_zp = get_quant_node_args(node.args[0])
-        _, output_zp = get_quant_node_args(list(node.users)[0])
+        input_zp = get_quant_node_args(node.args[0]).zp
+        output_zp = get_quant_node_args(list(node.users)[0]).zp
 
     attr = ts.TosaSerializerAttribute()
     attr.PoolAttribute(
2 changes: 1 addition & 1 deletion backends/arm/operators/op_conv2d.py
@@ -80,7 +80,7 @@ def define_node(
         )
 
         input_zp = (
-            get_quant_node_args(node.all_input_nodes[0])[1] if is_quant_node else 0
+            get_quant_node_args(node.all_input_nodes[0]).zp if is_quant_node else 0
         )
 
         attr.ConvAttribute(
31 changes: 26 additions & 5 deletions backends/arm/operators/op_hardtanh.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Arm Limited and/or its affiliates.
+# Copyright 2023-2024 Arm Limited and/or its affiliates.
 #
 # This source code is licensed under the BSD-style license found in the
 # LICENSE file in the root directory of this source tree.
@@ -11,6 +11,8 @@
     register_node_visitor,
 )
 from executorch.backends.arm.tosa_mapping import TosaArg
+
+from executorch.backends.arm.tosa_quant_utils import get_quant_node_args
 from serializer.tosa_serializer import TosaOp
 
 
@@ -30,12 +32,31 @@ def define_node(
         is_quant_node: bool,
     ) -> None:
         attr = ts.TosaSerializerAttribute()
+
+        if is_quant_node:
+            # Get quant parameters
+            scale, zp, qmin, qmax = get_quant_node_args(node.all_input_nodes[0])
+            # Convert to quantized representation
+            clamp_min_qs = round((inputs[1].number / scale) + zp)
+            clamp_min_qs = max(clamp_min_qs, qmin)
+            clamp_max_qs = round((inputs[2].number / scale) + zp)
+            clamp_max_qs = min(clamp_max_qs, qmax)
+            # Set fp values to 0.0 since they are not used
+            clamp_min_fp = 0.0
+            clamp_max_fp = 0.0
+        else:
+            clamp_min_fp = inputs[1].number
+            clamp_max_fp = inputs[2].number
+            # Set qs values to 0 since they are not used
+            clamp_min_qs = 0
+            clamp_max_qs = 0
+
         attr.ClampAttribute(
             tosa_graph.builder,
-            int(inputs[1].number),
-            int(inputs[2].number),
-            inputs[1].number,
-            inputs[2].number,
+            clamp_min_qs,
+            clamp_max_qs,
+            clamp_min_fp,
+            clamp_max_fp,
         )
 
         tosa_graph.addOperator(TosaOp.Op().CLAMP, [inputs[0].name], [output.name], attr)
12 changes: 7 additions & 5 deletions backends/arm/operators/op_placeholder.py
@@ -50,11 +50,13 @@ def process_placeholder(
         weight_node = weight_node_permuted.all_input_nodes[0]
 
         if input_node.target == exir_ops.edge.aten.view_copy.default:
-            input_node_scale, _ = get_quant_node_args(input_node.all_input_nodes[0])
+            input_node_scale = get_quant_node_args(
+                input_node.all_input_nodes[0]
+            ).scale
        else:
-            input_node_scale, _ = get_quant_node_args(input_node)
+            input_node_scale = get_quant_node_args(input_node).scale
 
-        weight_node_scale, _ = get_quant_node_args(weight_node)
+        weight_node_scale = get_quant_node_args(weight_node).scale
 
         bias_values_quantized = (
             (parameter_values / (input_node_scale * weight_node_scale))
@@ -81,8 +83,8 @@
             bias_node,
         ) = consumer_node.all_input_nodes
 
-        input_node_scale, _ = get_quant_node_args(input_node)
-        weight_node_scale, _ = get_quant_node_args(weight_node)
+        input_node_scale = get_quant_node_args(input_node).scale
+        weight_node_scale = get_quant_node_args(weight_node).scale
 
         bias_scales = input_node_scale * weight_node_scale
         parameter_values_quantized = (
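For context, the surrounding `process_placeholder` code folds a float bias into the quantized domain by dividing by the product of the input and weight scales (bias conventionally uses scale = s_in * s_w and zero-point 0). A standalone sketch of that rule, with illustrative names and values:

```python
def quantize_bias(bias_fp: float, input_scale: float, weight_scale: float) -> int:
    # Bias in affine quantization conventionally uses scale = s_in * s_w and zp = 0,
    # so quantization reduces to dividing by the combined scale and rounding.
    return round(bias_fp / (input_scale * weight_scale))

# e.g. a bias of 0.5 with input scale 0.02 and weight scale 0.05:
print(quantize_bias(0.5, 0.02, 0.05))  # 500
```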
67 changes: 54 additions & 13 deletions backends/arm/test/ops/test_conv_combos.py
@@ -12,6 +12,7 @@
 import torch
 from executorch.backends.arm.test import common
 from executorch.backends.arm.test.tester.arm_tester import ArmTester
+from parameterized import parameterized
 
 logger = logging.getLogger(__name__)
 logger.setLevel(logging.INFO)
@@ -126,6 +127,32 @@ def forward(self, x):
         return x
 
 
+class ComboConvRelu6(torch.nn.Module):
+    edge_op_list = [
+        "executorch_exir_dialects_edge__ops_aten_convolution_default",
+        "executorch_exir_dialects_edge__ops_aten_hardtanh_default",
+    ]
+
+    test_data = [
+        (20 * torch.randn(1, 3, 256, 256),),
+        (5 * torch.randn(1, 3, 256, 256),),
+        (torch.randn(1, 3, 256, 256),),
+        (-5 * torch.randn(1, 3, 256, 256),),
+    ]
+
+    def __init__(self):
+        super().__init__()
+        self.conv2d = torch.nn.Conv2d(
+            in_channels=3, out_channels=3, kernel_size=3, stride=1, groups=1
+        )
+        self.relu6 = torch.nn.ReLU6()
+
+    def forward(self, x):
+        x = self.conv2d(x)
+        x = self.relu6(x)
+        return x
+
+
 class TestConvCombos(unittest.TestCase):
     def _test_conv_combo_tosa_MI_pipeline(
         self, module: torch.nn.Module, test_data: Tuple[torch.Tensor]
@@ -222,15 +249,9 @@ def test_conv_batchnorm_relu_tosa_MI(self):
         model = ComboConvBatchnormRelu()
         self._test_conv_combo_tosa_MI_pipeline(model, model.get_inputs())
 
-    # TODO(MLETORCH-85): Investigate numerical issue. This diff is present in legacy
-    # testcase as well (and also not tested). For now, just increase the
-    # tolerance, such that we don't skip the test entirely (i.e. we maintain
-    # functionality).
     def test_conv_batchnorm_relu_tosa_BI(self):
         model = ComboConvBatchnormRelu()
-        self._test_conv_combo_tosa_BI_pipeline(
-            model, model.get_inputs(), atol=1.0, rtol=1.0
-        )
+        self._test_conv_combo_tosa_BI_pipeline(model, model.get_inputs())
 
     @unittest.skipIf(
         not common.VELA_INSTALLED,
@@ -240,21 +261,41 @@ def test_conv_batchnorm_relu_u55_BI(self):
         model = ComboConvBatchnormRelu()
         self._test_conv_combo_u55_BI_pipeline(model, model.get_inputs())
 
+    ##################
+    ## Conv + ReLU6 ##
+    ##################
+    @parameterized.expand(ComboConvRelu6.test_data)
+    def test_conv_relu6_tosa_MI(self, test_data: torch.Tensor):
+        model = ComboConvRelu6()
+        test_data = (test_data,)
+        self._test_conv_combo_tosa_MI_pipeline(model, test_data)
+
+    @parameterized.expand(ComboConvRelu6.test_data)
+    def test_conv_relu6_tosa_BI(self, test_data: torch.Tensor):
+        model = ComboConvRelu6()
+        test_data = (test_data,)
+        self._test_conv_combo_tosa_BI_pipeline(model, test_data)
+
+    @parameterized.expand(ComboConvRelu6.test_data)
+    @unittest.skipIf(
+        not common.VELA_INSTALLED,
+        "There is no point in running U55 tests if the Vela tool is not installed",
+    )
+    def test_conv_relu6_u55_BI(self, test_data: torch.Tensor):
+        model = ComboConvRelu6()
+        test_data = (test_data,)
+        self._test_conv_combo_u55_BI_pipeline(model, test_data)
+
     ###############################
     ## Block bottleneck residual ##
     ###############################
     def test_block_bottleneck_residual_tosa_MI(self):
         model = ComboBlockBottleneckResidual()
         self._test_conv_combo_tosa_MI_pipeline(model, model.get_inputs())
 
-    # TODO(MLETORCH-85): Investigate numerical issue. This diff was present in legacy
-    # testcase as well. For now, just increase the tolerance, such that
-    # we don't skip the test entirely (i.e. we maintain functionality).
     def test_block_bottleneck_residual_tosa_BI(self):
         model = ComboBlockBottleneckResidual()
-        self._test_conv_combo_tosa_BI_pipeline(
-            model, model.get_inputs(), atol=1.0, rtol=1.0
-        )
+        self._test_conv_combo_tosa_BI_pipeline(model, model.get_inputs())
 
     @unittest.skipIf(
         not common.VELA_INSTALLED,
29 changes: 25 additions & 4 deletions backends/arm/tosa_quant_utils.py
@@ -6,8 +6,10 @@
 # Utiliy functions for TOSA quantized lowerings
 
 import math
+from typing import NamedTuple
 
 import serializer.tosa_serializer as ts
+import torch.fx
 from executorch.backends.arm.tosa_mapping import TosaArg
 from executorch.exir.dialects._ops import ops as exir_ops
 from serializer.tosa_serializer import TosaOp, TosaSerializerTensor
@@ -17,7 +19,14 @@
 dq_q_ops = [q_op, dq_op]
 
 
-def is_quant_node(node):
+class QuantArgs(NamedTuple):
+    scale: float
+    zp: int
+    qmin: int
+    qmax: int
+
+
+def is_quant_node(node: torch.fx.Node):
     consumer_node = list(node.users)[0]
     input = node.all_input_nodes[0]
 
@@ -41,10 +50,22 @@ def is_quant_arg(arg):
     return consumer_node.target == q_op
 
 
-def get_quant_node_args(node):
+def get_quant_node_args(node: torch.fx.Node):
+    """
+    Get the quantization parameters from a quant node.
+
+    Args:
+        node: The quant node.
+    Returns:
+        QuantArgs: scale, zp, qmin, qmax
+    """
     quant_args = [TosaArg(arg) for arg in node.args]
-    # Return the scale and zp
-    return quant_args[1].number, quant_args[2].number
+    return QuantArgs(
+        quant_args[1].number,
+        quant_args[2].number,
+        quant_args[3].number,
+        quant_args[4].number,
+    )
 
 
 # Check if scale32 mode is used for given output element type
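The `QuantArgs` NamedTuple is what lets the call sites above replace fragile positional unpacking (`scale, zp = ...`) with named access, while staying unpack-compatible for callers that want all four values. A minimal sketch of the pattern, with illustrative values:

```python
from typing import NamedTuple

class QuantArgs(NamedTuple):
    scale: float
    zp: int
    qmin: int
    qmax: int

args = QuantArgs(scale=0.1, zp=-128, qmin=-128, qmax=127)

# Named access replaces positional indexing like get_quant_node_args(node)[1]:
print(args.scale, args.zp)  # 0.1 -128

# Still a tuple, so 4-way unpacking (as in op_hardtanh.py) keeps working:
scale, zp, qmin, qmax = args
```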
2 changes: 1 addition & 1 deletion backends/xnnpack/test/tester/tester.py
@@ -595,7 +595,7 @@ def _assert_outputs_equal(model_output, ref_output, atol=1e-03, rtol=1e-03):
                 f"Output {i} does not match reference output.\n"
                 f"\tGiven atol: {atol}, rtol: {rtol}.\n"
                 f"\tOutput tensor shape: {model.shape}, dtype: {model.dtype}\n"
-                f"\tDifference: max: {torch.max(model-ref)}, abs: {torch.max(torch.abs(model-ref))}.\n"
+                f"\tDifference: max: {torch.max(model-ref)}, abs: {torch.max(torch.abs(model-ref))}, mean abs error: {torch.mean(torch.abs(model-ref))}.\n"
                 f"\t-- Model vs. Reference --\n"
                 f"\t Numel: {model.numel()}, {ref.numel()}\n"
                 f"\tMedian: {model.median()}, {ref.median()}\n"
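The assertion message gains a mean-absolute-error term alongside the two max terms, which is useful for judging whether a tolerance failure is a broad drift or a single outlier. The three statistics amount to the following (a plain-Python sketch of the torch expressions, no torch dependency):

```python
def diff_stats(model, ref):
    diffs = [m - r for m, r in zip(model, ref)]
    max_diff = max(diffs)                               # torch.max(model - ref)
    max_abs = max(abs(d) for d in diffs)                # torch.max(torch.abs(model - ref))
    mean_abs = sum(abs(d) for d in diffs) / len(diffs)  # torch.mean(torch.abs(model - ref))
    return max_diff, max_abs, mean_abs

# A single outlier: max abs error is large, mean abs error stays small.
print(diff_stats([1.0, 2.0, 3.5], [1.0, 2.5, 3.0]))
```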