Qualcomm AI Engine Direct - Quantizer refine for qat #6513


Merged
4 commits merged into pytorch:main on Nov 6, 2024

Conversation

chunit-quic
Collaborator

  • Reorganize qualcomm/quantizer
  • Split quantizer/utils.py into
    -- qconfig
    -- annotators
    -- observers directory
  • Change corresponding callees
  • Rename get_default_Nbit_qnn_ptq_config to get_NaNw_qnn_ptq_config (a usage sketch follows below)
  • Add 16a4w conv test* (it is not compared with the original model)
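
For context, this is roughly how the renamed helpers read at a call site. This is a hedged sketch only: the import path (the new qconfig module described above) and the exact helper list are assumptions, not confirmed by this PR.

# Hedged sketch: the import path and helper names follow the new
# qconfig module described above and may differ from the final code.
from executorch.backends.qualcomm.quantizer.qconfig import (
    get_8a8w_qnn_ptq_config,   # was get_default_8bit_qnn_ptq_config
    get_16a4w_qnn_ptq_config,  # "NaMw" = N-bit activations, M-bit weights
)

act8_weight8_config = get_8a8w_qnn_ptq_config()
act16_weight4_config = get_16a4w_qnn_ptq_config()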

pytorch-bot bot commented Oct 28, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6513

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 086cca4 with merge base 41a57e6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 28, 2024
@chunit-quic chunit-quic marked this pull request as draft October 28, 2024 03:58
@chunit-quic
Collaborator Author

Hi @cccclai,

We've implemented this draft PR to:

  1. Refine the API of QnnQuantizer
  2. Add more QAT quant configs
  3. Add a 16a4w QAT conv test case

We have another local branch testing QAT MobileNetV2 on top of it, but we've run into some problems.
During the QAT process, the accuracy/loss seems reasonable if we only quantize conv ops. Yet if we quantize any other op, accuracy/loss drops significantly. Is there any known issue for this? Or maybe it just results from some misconfiguration on our part. (We can make a patch or update the local branch here if you are interested in it.)

Thank you. :)

@cccclai
Contributor

cccclai commented Oct 28, 2024

@navsud can you review this PR?

Contributor

@navsud navsud left a comment

Overall LGTM.
cc: @cccclai for final approval.

from torch.ao.quantization.observer import UniformQuantizationObserverBase


class ParamObserver(UniformQuantizationObserverBase):
Contributor

This looks great and is generic enough to be added into torch/ao/quantization/observer.py. If possible, please move it there as a follow-up.

How about renaming it to PerChannelParamObserver() or PerChannelWeightObserver?
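
For illustration, a minimal sketch of what such a per-channel observer could look like on top of UniformQuantizationObserverBase. The class body below is hypothetical, not the PR's actual implementation.

# Hypothetical sketch, not the PR's implementation: a per-channel
# min/max observer built on UniformQuantizationObserverBase.
import torch
from torch.ao.quantization.observer import UniformQuantizationObserverBase


class PerChannelParamObserver(UniformQuantizationObserverBase):
    def __init__(self, ch_axis=0, **kwargs):
        super().__init__(**kwargs)
        self.ch_axis = ch_axis
        self.register_buffer("min_val", torch.tensor(float("inf")))
        self.register_buffer("max_val", torch.tensor(float("-inf")))

    def forward(self, x):
        # Track running extremes, reducing over every dimension
        # except the channel axis.
        dims = [d for d in range(x.dim()) if d != self.ch_axis]
        detached = x.detach().float()
        self.min_val = torch.minimum(self.min_val, torch.amin(detached, dim=dims))
        self.max_val = torch.maximum(self.max_val, torch.amax(detached, dim=dims))
        return x

    def calculate_qparams(self):
        # _calculate_qparams comes from the base class and respects
        # the configured qscheme (e.g. torch.per_channel_symmetric).
        return self._calculate_qparams(self.min_val, self.max_val)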

Collaborator Author

Glad you like it! Changed the name and added a TODO for it.

qscheme=torch.per_tensor_symmetric,
ch_axis=0,
reduce_range=True,
observer=MovingAverageMinMaxObserver,
Contributor

This should be PerChannelMovingAverageMinMaxObserver and qscheme=torch.per_channel_symmetric?

Collaborator Author

Sorry for the misunderstanding. This quant config is specifically per-tensor. The per-channel one is here, and we use a condition in a quantizer member function here to assign the per-channel quant config.
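
To make the distinction concrete, here is a hedged sketch contrasting a per-tensor and a per-channel weight spec using the PT2E QuantizationSpec API; the field values are illustrative, not the PR's exact configs.

# Illustrative contrast only; values are not the PR's exact configs.
import torch
from torch.ao.quantization.observer import (
    MovingAverageMinMaxObserver,
    MovingAveragePerChannelMinMaxObserver,
)
from torch.ao.quantization.quantizer import QuantizationSpec

# Per-tensor weight spec: one scale/zero-point for the whole tensor.
per_tensor_weight = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_tensor_symmetric,
    observer_or_fake_quant_ctr=MovingAverageMinMaxObserver.with_args(
        reduce_range=True
    ),
)

# Per-channel counterpart: one scale/zero-point per output channel,
# selected by the quantizer member-function condition mentioned above.
per_channel_weight = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
    observer_or_fake_quant_ctr=MovingAveragePerChannelMinMaxObserver.with_args(),
)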

get_default_8bit_qat_proto,
get_default_8bit_qnn_ptq_config,
get_8a8w_qnn_ptq_config,
get_8a8w_qnn_qat_config,
Contributor

As a follow-up PR, can we unify/simplify get_8a8w_qnn_ptq_config() and get_8a8w_qnn_qat_config() into a single get_8a8w_qnn_config() with an is_train=True/False argument that decides whether we are doing PTQ or QAT?
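
In other words, something like the following sketch; get_8a8w_qnn_config and is_train are the reviewer's proposal, not an existing API.

# Sketch of the proposed unification; get_8a8w_qnn_config and is_train
# are the reviewer's suggestion, not an existing API.
# get_8a8w_qnn_ptq_config / get_8a8w_qnn_qat_config are the existing
# helpers shown in the import list above.
def get_8a8w_qnn_config(is_train: bool = False):
    """Return the QAT config when training, else the PTQ config."""
    if is_train:
        return get_8a8w_qnn_qat_config()
    return get_8a8w_qnn_ptq_config()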

Collaborator Author

@chunit-quic chunit-quic Oct 29, 2024

No problem. We would like to keep them separate at first; once things seem stable, we will raise a PR to merge them.

@chunit-quic
Collaborator Author

Overall LGTM. cc: @cccclai for final approval.

Hi @navsud

Thank you very much for reviewing! We just uploaded a patch based on your comments. There are just two questions we would like to ask:

  1. Does QAT only work for ops with weights at the moment? I tried to run QAT on a single add op and got a complaint that there were no parameters to be trained. Is that expected?
  2. Similar to point 1, we are not quite sure whether we quantize the ops of MobileNetV2 properly. Accuracy drops significantly during QAT if we add a quant config to any op other than convolution. More details can be found in the first comment of this PR.

Thank you for reading! If anything is unclear, please feel free to let me know and I will try to describe it in more detail. :D

Joey Tsai added 3 commits November 4, 2024 17:46
- Reorganize qualcomm/quantizer
- Split quantizer/utils.py into
-- qconfig
-- annotators
-- observers directory
- Change corresponding callees
- Rename get_default_Nbit_qnn_ptq_config to get_NaNw_qnn_ptq_config
- Add 16a4w conv test* (it is not compared with the original model)
- Move and rename param_observer.py to per_channel_param_observer.py
- Add todo to merge qconfig
- Add todo for per_channel_param_observer.py
@chunit-quic chunit-quic force-pushed the dev1/chunit/qat_quantizer_refine branch from d542309 to 0e57c97 Compare November 4, 2024 09:47
@chunit-quic chunit-quic changed the title [Qualcomm AI Engine Direct - Quantizer refine for qat] Qualcomm AI Engine Direct - Quantizer refine for qat Nov 4, 2024
Contributor

@navsud navsud left a comment

LGTM

@chunit-quic
Collaborator Author

Hi @cccclai, could you take a glance at this PR when you are free, and perhaps merge it if it looks good to you? Thanks :D

Contributor

@cccclai cccclai left a comment

Looks good to me - thank you for adding the feature!

@cccclai cccclai merged commit 068f43c into pytorch:main Nov 6, 2024
39 checks passed
@cccclai
Contributor

cccclai commented Nov 6, 2024

I noticed that I didn't import to internal... in the worst case, if this diff breaks some tests, we may need to revert and reland it...

@chunit-quic
Collaborator Author

I noticed that I didn't import to internal... in the worst case, if this diff breaks some tests, we may need to revert and reland it...

It's fine. If so, just let me know what I should fix and I will do it ASAP. :)

@cccclai
Contributor

cccclai commented Nov 8, 2024

@chunit-quic Unfortunately it's reverted because it breaks an internal test... can you submit a PR again?

@chunit-quic
Copy link
Collaborator Author

@chunit-quic Unfortunately it's reverted because it breaks an internal test... can you submit a PR again?

No problem. The new PR is PR6747
