Add choose_qparams_per_token_asymmetric for llama on XNNPACK
Summary:
XNNPACK uses asymmetric activation quantization,
but the existing `choose_qparams_per_token` assumed symmetric
quantization (the zero point is always 0). This caused significant
numerical discrepancies between eager and lowered models.
This commit adds an asymmetric variant,
`choose_qparams_per_token_asymmetric`, to match what XNNPACK computes.
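
For illustration, here is a minimal pure-Python sketch of per-token asymmetric qparam selection. It is not the code from this commit: the function name, the int8 range, the `eps` floor, and the choice to extend each token's range to include 0.0 are all assumptions made for the example.

```python
from typing import List, Tuple


def choose_qparams_per_token_asymmetric_sketch(
    rows: List[List[float]],
    qmin: int = -128,  # assumed int8 range
    qmax: int = 127,
    eps: float = 1e-9,  # assumed floor to avoid a zero scale
) -> Tuple[List[float], List[int]]:
    """Return one (scale, zero_point) pair per token (row)."""
    scales: List[float] = []
    zero_points: List[int] = []
    for row in rows:
        # Extend the range to include 0.0 so that zero is exactly
        # representable (an assumption of this sketch).
        lo = min(min(row), 0.0)
        hi = max(max(row), 0.0)
        scale = max((hi - lo) / (qmax - qmin), eps)
        # Asymmetric: the zero point shifts so that `lo` maps near qmin.
        # A symmetric scheme would fix the zero point at 0 instead.
        zero_point = qmin - round(lo / scale)
        zero_point = max(qmin, min(qmax, zero_point))
        scales.append(scale)
        zero_points.append(zero_point)
    return scales, zero_points


# One all-non-negative token and one all-non-positive token.
scales, zero_points = choose_qparams_per_token_asymmetric_sketch(
    [[0.0, 2.55], [-2.55, 0.0]]
)
```

With a symmetric scheme both tokens above would get a zero point of 0 and use only half of the int8 range; the asymmetric variant moves the zero point to an end of the range, which is the kind of mismatch the summary describes.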
Reviewed By: digantdesai
Differential Revision: D54323650
fbshipit-source-id: afd1e8f8b582bc8c07d4b03752ab71caa30c2bb0