
Add CoreML Quantize #5228


Closed · YifanShenSZ wants to merge 2 commits from the coreml-quantization branch

Conversation

@YifanShenSZ (Collaborator) commented Sep 10, 2024

Motivation

Short term: TorchAO int4 quantization yields a float zero point, but CoreML does not yet have good support for that. We need CoreML's own int4 quantization for now.

Intermediate term: Until torch implements all CoreML-supported quantizations (e.g. palettization, sparsification, joint compression, ...), it would be great to have a way to use and experiment with those CoreML quantizations.

Solution

In the CoreML preprocess step, we add the CoreML quantization config as a compile spec.
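As a rough sketch of the kind of config involved: the ExecuTorch-side wiring and option names below are assumptions for illustration, not necessarily what this PR adds; only the coremltools calls are the documented ct.optimize.coreml API (coremltools 8.0+ for int4 / per-block granularity).

```python
# Sketch only: the sort of coremltools weight-quantization config that a
# compile-spec option like "--coreml-quantize b4w" could translate into.
import coremltools.optimize as cto

# 4-bit, block-wise linear weight quantization ("b4w").
op_config = cto.coreml.OpLinearQuantizerConfig(
    mode="linear_symmetric",
    dtype="int4",
    granularity="per_block",
    block_size=32,
)
config = cto.coreml.OptimizationConfig(global_config=op_config)

# Inside the CoreML backend's preprocess, something along these lines would
# then be applied to the converted model before it is serialized:
#   mlmodel = cto.coreml.linear_quantize_weights(mlmodel, config=config)
```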

pytorch-bot bot commented Sep 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5228

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Unrelated Failure

As of commit 554382a with merge base 7e374d7:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Sep 10, 2024
@YifanShenSZ force-pushed the coreml-quantization branch 2 times, most recently from 3bc734f to 455cea4, on September 10, 2024 at 19:14
@YifanShenSZ changed the title from "add coreml quantize" to "Add CoreML Quantize" on Sep 10, 2024
@YifanShenSZ (Collaborator, Author)

@cccclai 🙏

@YifanShenSZ marked this pull request as ready for review on September 10, 2024 at 20:32
@cccclai (Contributor) commented Sep 10, 2024

Hey, could you share the command to run the script? The arg list is getting longer and it's hard to guess...

@YifanShenSZ (Collaborator, Author)

> Hey, could you share the command to run the script? The arg list is getting longer and it's hard to guess...

Sure

python -m examples.models.llama2.export_llama -c Meta-Llama-3-8B/consolidated.00.pth -p Meta-Llama-3-8B/params.json --disable_dynamic_shape -kv --coreml --coreml-quantize b4w --coreml-enable-state

This is not the final command, though: we are also adding fused SDPA.

@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai (Contributor) commented Sep 11, 2024

Hey, could you rebase? We ran into a land race: the other PR touching the same file merged first...

@YifanShenSZ (Collaborator, Author) commented Sep 11, 2024

Rebased ✅

GitHub is not showing a conflict yet, though. Is the conflicting change Meta-internal only for now? (And do I need to wait until it gets exported?)

@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@cccclai merged this pull request in 4da3c5d.
