Add CoreML Quantize #5228
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5228
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Unrelated Failure as of commit 554382a with merge base 7e374d7.
NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 3bc734f to 455cea4
Force-pushed from 455cea4 to 06dba4b
@cccclai 🙏
Hey, could you share the command to run the script? The arg list is getting longer and it's hard to guess...
Sure
This is not the final command, though: we are adding fused sdpa
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hey, could you rebase? We ran into a land race: another PR touching the same file merged first...
Force-pushed from f639e7c to 554382a
Rebased ✅ GitHub is not showing a conflict yet, though. Is the conflicting change Meta-internal only for now? (And I need to wait until it gets exported?)
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Motivation
Short term: TorchAO int4 quantization yields a float zero point, which CoreML does not yet support well, so we need CoreML's own int4 quantization for now.
Intermediate term: Until torch implements all CoreML-supported quantizations (e.g. palettization, sparsification, joint compression...), it is useful to have a way to use and experiment with those CoreML quantizations.
Solution
In CoreML preprocess, we add the CoreML quantization config as a compile spec.
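A minimal sketch of the idea, with the caveat that the class shape, key name, and config fields below are illustrative assumptions rather than the PR's actual API: the quantization config is serialized into a (key, bytes) compile spec entry, which the CoreML preprocess step can later deserialize and act on.

```python
# Hypothetical sketch: passing a CoreML quantization config through a
# compile spec. The "quantization_config" key, the config fields, and this
# CompileSpec dataclass are assumptions for illustration; the PR defines
# its own names inside the CoreML backend.
import json
from dataclasses import dataclass


@dataclass
class CompileSpec:
    # Mirrors the general (key, value-bytes) compile-spec shape.
    key: str
    value: bytes


def make_quantization_compile_spec(nbits: int, mode: str) -> CompileSpec:
    """Serialize a quantization config into a compile spec entry."""
    config = {"nbits": nbits, "mode": mode}
    return CompileSpec("quantization_config", json.dumps(config).encode("utf-8"))


def read_quantization_config(spec: CompileSpec) -> dict:
    """What preprocess would do: recover the config from the spec bytes."""
    return json.loads(spec.value.decode("utf-8"))


spec = make_quantization_compile_spec(nbits=4, mode="linear_symmetric")
print(read_quantization_config(spec))  # {'nbits': 4, 'mode': 'linear_symmetric'}
```

Carrying the config as serialized bytes keeps the compile-spec interface generic: the ahead-of-time export code does not need to know which quantization options the backend supports, and new options can be added without changing the spec plumbing.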