Qualcomm AI Engine Direct - Quantizer refine for qat #6513
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6513
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 086cca4 with merge base 41a57e6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @cccclai, we have implemented the draft PR for
We have another local branch that tests QAT MobileNet V2 based on it, though we have encountered some problems. Thank you. :)
@navsud can you review this PR?
Overall LGTM.
cc: @cccclai for final approval.
```python
from torch.ao.quantization.observer import UniformQuantizationObserverBase

class ParamObserver(UniformQuantizationObserverBase):
```
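For readers following along, a rough illustration of what an observer built on `UniformQuantizationObserverBase` can look like. This is a hypothetical sketch, not the actual implementation from this PR: it tracks running per-channel min/max and reuses the base class's `_calculate_qparams` helper.

```python
import torch
from torch.ao.quantization.observer import UniformQuantizationObserverBase

# Hypothetical sketch of a per-channel parameter observer; the real
# observer in this PR may differ (e.g. in how statistics are collected).
class PerChannelParamObserver(UniformQuantizationObserverBase):
    def __init__(self, ch_axis=0, dtype=torch.qint8,
                 qscheme=torch.per_channel_symmetric, **kwargs):
        super().__init__(dtype=dtype, qscheme=qscheme, **kwargs)
        self.ch_axis = ch_axis
        self.register_buffer("min_val", torch.tensor(float("inf")))
        self.register_buffer("max_val", torch.tensor(float("-inf")))

    def forward(self, x_orig):
        # Track running min/max per channel, reducing over every other dim.
        x = x_orig.detach().float()
        dims = tuple(d for d in range(x.dim()) if d != self.ch_axis)
        self.min_val = torch.minimum(self.min_val, torch.amin(x, dim=dims))
        self.max_val = torch.maximum(self.max_val, torch.amax(x, dim=dims))
        return x_orig

    def calculate_qparams(self):
        # Base-class helper turns per-channel min/max into scale/zero-point.
        return self._calculate_qparams(self.min_val, self.max_val)
```

Subclassing the base this way keeps the qparam math (symmetric range handling, eps clamping) in one place, which is why it is generic enough to live upstream.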
This looks great and is generic enough to be added into torch/ao/quantization/observer.py. If possible, please move it there as a follow-up.
How about renaming it to PerChannelParamObserver() or PerChannelWeightObserver?
It's great to hear you like it! We have changed the name and added a TODO for it.
```python
qscheme=torch.per_tensor_symmetric,
ch_axis=0,
reduce_range=True,
observer=MovingAverageMinMaxObserver,
```
This should be PerChannelMovingAverageMinMaxObserver and qscheme=torch.per_channel_symmetric?
Sorry for the misunderstanding. This quant config is specifically for per-tensor quantization. The per-channel one is here, and we use a condition in a quantizer member function here to assign the per-channel quant config.
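For context, a hedged sketch (not the exact config builders from this PR) of how the per-tensor and per-channel QAT variants typically differ only in the observer class and `qscheme` handed to `FakeQuantize`:

```python
import torch
from torch.ao.quantization import FakeQuantize
from torch.ao.quantization.observer import (
    MovingAverageMinMaxObserver,
    MovingAveragePerChannelMinMaxObserver,
)

# Per-tensor activation fake-quant, in the spirit of the config above.
act_fake_quant = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    dtype=torch.quint8,
    qscheme=torch.per_tensor_affine,
    reduce_range=True,
)

# Per-channel weight fake-quant, which a quantizer can select instead
# when per-channel quantization is requested.
weight_fake_quant = FakeQuantize.with_args(
    observer=MovingAveragePerChannelMinMaxObserver,
    dtype=torch.qint8,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
)
```

So a per-tensor config keeping `MovingAverageMinMaxObserver` with `torch.per_tensor_symmetric` is consistent; the per-channel observer/qscheme pair belongs in the separate per-channel config path.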
```diff
-    get_default_8bit_qat_proto,
-    get_default_8bit_qnn_ptq_config,
+    get_8a8w_qnn_ptq_config,
+    get_8a8w_qnn_qat_config,
```
As a follow-up PR, can we unify/simplify get_8a8w_qnn_ptq_config() and get_8a8w_qnn_qat_config() into get_8a8w_qnn_config() with an argument is_train=True/False, which will define whether we are doing PTQ vs. QAT?
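The suggested unification could be sketched roughly as follows; the two stub functions here are placeholders standing in for the real PTQ/QAT config builders, and the dict return values are purely illustrative:

```python
# Placeholder stand-ins for the real config builders in this PR;
# the actual functions return QuantizationConfig objects.
def get_8a8w_qnn_ptq_config():
    return {"mode": "ptq"}

def get_8a8w_qnn_qat_config():
    return {"mode": "qat"}

# Single entry point, with is_train selecting QAT vs. PTQ as suggested.
def get_8a8w_qnn_config(is_train: bool = False):
    return get_8a8w_qnn_qat_config() if is_train else get_8a8w_qnn_ptq_config()
```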
No problem. We would like to keep them separate at first. Once they seem stable, we will raise a PR to merge them.
Hi @navsud, thank you very much for reviewing! We just uploaded a patch based on the comments. There are just two questions we would like to ask.
Thank you for reading! If anything is unclear, please feel free to let me know and I will try to describe it in more detail. :D
- Reorganize qualcomm/quantizer
- Split quantizer/utils.py into
  - qconfig
  - annotators
  - observers directory
- Change corresponding callees
- Rename get_default_Nbit_qnn_ptq_config to get_NaNw_qnn_ptq_config
- Add 16a4w conv test (it is not compared with the original model)
- Move and rename param_observer.py to per_channel_param_observer.py
- Add TODO to merge qconfig
- Add TODO for per_channel_param_observer.py
Force-pushed from d542309 to 0e57c97
LGTM
Hi @cccclai, could you take a glance at this PR when you are free, and perhaps merge it if it looks good to you? Thanks :D
Looks good to me - thank you for adding the feature!
I noticed that I didn't import it to internal... In the worst case, if this diff breaks some tests, we may need to revert and reland it...
It's fine. If so, just let me know what I should fix and I will do it ASAP. :)
This reverts commit 068f43c.
@chunit-quic Unfortunately it's reverted because it breaks internal tests... Can you submit a PR again?
No problem. The new PR is PR #6747.