Add CoreML Quantize #5228
Conversation
@cccclai 🙏
Hey, could you share the command to run the script? The arg list is getting long, and it's hard to guess...
Sure. This is not the final command, though: we are adding fused SDPA.
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hey, could you rebase? We ran into a land race: another PR touching the same file merged first...
Rebased ✅ GitHub is not showing a conflict yet, though. Is the conflicting change Meta-internal only for now? (Do I need to wait until it gets exported?)
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Motivation
Short term: TorchAO int4 quantization yields a float zero point, which CoreML does not yet support well, so we need CoreML's own int4 quantization for now.
Intermediate term: until torch implements all the quantization schemes CoreML supports (e.g. palettization, sparsification, joint compression, ...), it would be great to have a way to use and experiment with those CoreML quantizations.
Solution
In CoreML preprocess, we add the CoreML quantization config as a compile spec, as sketched below.
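A minimal sketch of how passing the config could look. The spec key `op_linear_quantizer_config`, the JSON encoding, and the config fields are illustrative assumptions, not necessarily the exact API added by this PR; only `CompileSpec` itself is the standard ExecuTorch mechanism.

```python
import json

from executorch.exir.backend.compile_spec_schema import CompileSpec

# Hypothetical int4 weight-quantization config, modeled on the options
# coremltools exposes for linear weight quantization. Field names and
# values are assumptions for illustration.
quant_config = {
    "mode": "linear_symmetric",
    "dtype": "int4",
    "granularity": "per_channel",
}

# Encode the config as a compile spec; the CoreML preprocess step would
# decode it and apply the corresponding coremltools quantization when
# compiling the delegated model.
compile_specs = [
    CompileSpec("op_linear_quantizer_config", json.dumps(quant_config).encode("utf-8")),
]

# compile_specs would then be handed to the CoreML partitioner/backend
# alongside the usual compute-unit and deployment-target specs.
```

Routing the config through a compile spec keeps the quantization decision at export time, so CoreML-only schemes can be used without waiting for torch-side quantizer support.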