quant params from static inputs #573
This pull request was exported from Phabricator. Differential Revision: D49850149
Summary: Making sure we have the macOS build too. To find the hash, see https://hud.pytorch.org/hud/pytorch/pytorch/nightly for the target date. To test, check that a file like `torch-2.2.0.dev20231005-cp311-none-macosx_11_0_arm64.whl` exists at https://download.pytorch.org/whl/nightly/torch/ and run `./install_requirements.sh` on a Mac to confirm that all the dependencies install successfully.
Pull Request resolved: pytorch/executorch#644
Reviewed By: angelayi
Differential Revision: D49973900
Pulled By: shoumikhin
fbshipit-source-id: 0f1b170a77c126b0ce7a46dc9f7c7d92cbe45847
Summary: Since we allow tensor constants to be quantized inputs, we need to adjust the from_inputs API to check whether the input is static. If it is static, we take the first q node in the get_attr --> q --> dq chain. If it is not static, we just take the dq node to create the QuantParams object. Previously, static quantized inputs were accepted only for weights and biases.
Pull Request resolved: pytorch/executorch#573
Reviewed By: digantdesai
Differential Revision: D49850149
fbshipit-source-id: 5bf3c4de63a4454fcd7d7bea2f7d113f4f33d937
This pull request has been merged in 6230f8f.
* code beautification
* code beautification, move functions together
* rewrite model rewriter
* rewrite quantizers
* weights is none check
* typo
* not weight -> weight is not None
* fix dimensions for parallel prefill
* test
* typo
* bfloat16 on ARM with MacOS 14
* precision for a8w4
* sdpa_kv
* fixes
* inline qlq definition
* trial and error
* qdq not working
* ci
* not so fast with bf16=fast
* typo, and handle fast across maxcos version...
* typo
* type cast
Summary:
Since we allow tensor constants to be quantized inputs, we need to adjust the from_inputs API to check whether the input is static. If it is static, we take the first q node in the get_attr --> q --> dq chain. If it is not static, we just take the dq node to create the QuantParams object.
Previously, static quantized inputs were accepted only for weights and biases.
Differential Revision: D49850149
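The node selection described in the summary can be sketched roughly as follows. This is only an illustration of the static versus non-static branch, not the code in this PR: the `Node` class, the `"quantize"`/`"dequantize"` target strings, and the `quant_params_source` helper are all hypothetical stand-ins, and the real change operates on the exported graph and builds a QuantParams object from the chosen node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a graph node (not torch.fx.Node)."""
    name: str
    op: str                      # "get_attr", "placeholder", or "call_function"
    target: str = ""             # "quantize" or "dequantize" for call_function nodes
    args: List["Node"] = field(default_factory=list)


def is_quant(node: Node) -> bool:
    return node.op == "call_function" and node.target == "quantize"


def is_dequant(node: Node) -> bool:
    return node.op == "call_function" and node.target == "dequantize"


def quant_params_source(dq: Node) -> Node:
    """Pick the node that should supply the quantization parameters.

    Static input (get_attr --> q --> dq): return the q node.
    Non-static input: return the dq node itself.
    """
    if not is_dequant(dq):
        raise ValueError(f"expected a dequantize node, got {dq.name}")
    producer = dq.args[0] if dq.args else None
    is_static = (
        producer is not None
        and is_quant(producer)
        and bool(producer.args)
        and producer.args[0].op == "get_attr"
    )
    return producer if is_static else dq


# Static case: a tensor constant feeds q, which feeds dq.
const = Node("const", "get_attr")
q = Node("q", "call_function", "quantize", [const])
dq_static = Node("dq_static", "call_function", "dequantize", [q])

# Non-static case: dq consumes an already-quantized runtime input.
x = Node("x", "placeholder")
dq_dynamic = Node("dq_dynamic", "call_function", "dequantize", [x])

print(quant_params_source(dq_static).name)
print(quant_params_source(dq_dynamic).name)
```

In this sketch the static check keys on the `get_attr` producer; whether the actual implementation detects static inputs the same way is an assumption.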