Commit e3165a7

update pin
1 parent d7cadfc commit e3165a7

2 files changed, +3 -3 lines changed

docs/quantization.md

Lines changed: 2 additions & 2 deletions
@@ -128,12 +128,12 @@ It takes arguments bitwidth (1, 2, 3, 4, 5, 6, 7), groupsize, and has_weight_zer
 The argument has_weight_zeros indicates whether the weights are quantized with scales only (has_weight_zeros: false) or with both scales and zeros (has_weight_zeros: true).
 Roughly speaking, {bitwidth: 4, groupsize: 32, has_weight_zeros: false} is similar to GGML's Q4_0 quantization scheme.
 
-You should expect high performance on ARM CPU if bitwidth is 1, 2, 3, 4, 5, or 6 and groupsize is divisible by 16. With other platforms and argument choices, a slow fallback kernel will be used. You will see warnings about this during quantization.
+You should expect high performance on ARM CPU if groupsize is divisible by 16. With other platforms and argument choices, a slow fallback kernel will be used. You will see warnings about this during quantization.
 
 #### embedding:wx
 The quantization scheme embedding:wx quantizes embeddings in a groupwise manner with the specified bitwidth and groupsize. It takes arguments bitwidth (1, 2, 3, 4, 5, 6, 7) and groupsize. Unlike linear:a8wxdq, embedding:wx always quantizes with scales and zeros.
 
-You should expect high performance on ARM CPU if bitwidth is 1, 2, 3, 4, 5, or 6 and groupsize is divisible by 32. With other platforms and argument choices, a slow fallback kernel will be used. You will see warnings about this during quantization.
+You should expect high performance on ARM CPU if groupsize is divisible by 32. With other platforms and argument choices, a slow fallback kernel will be used. You will see warnings about this during quantization.
 
 ### Setup
 To use linear:a8wxdq and embedding:wx, you must set up the torchao experimental kernels. These will only work on devices with ARM CPUs, for example on Mac computers with Apple Silicon.
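
The updated wording ties the fast ARM path to groupsize alone, for every accepted bitwidth. A minimal sketch of that rule as the new text states it, not of the actual kernel dispatch in torchao: `expects_fast_kernel` is a hypothetical helper, and detecting ARM through `platform.machine()` returning `arm64` or `aarch64` is an assumption.

```python
import platform

# Groupsize divisors taken from the updated doc text in this diff.
FAST_GROUPSIZE_DIVISOR = {"linear:a8wxdq": 16, "embedding:wx": 32}
# Bitwidths the docs list as accepted arguments for both schemes.
SUPPORTED_BITWIDTHS = range(1, 8)
# Assumption: machine strings that identify an ARM CPU.
ARM_MACHINES = {"arm64", "aarch64"}


def expects_fast_kernel(scheme, bitwidth, groupsize, machine=None):
    """Hypothetical helper: True if the documented fast-path condition holds."""
    if scheme not in FAST_GROUPSIZE_DIVISOR:
        raise ValueError(f"unknown scheme: {scheme}")
    if bitwidth not in SUPPORTED_BITWIDTHS:
        raise ValueError(f"bitwidth must be in 1..7, got {bitwidth}")
    if groupsize <= 0:
        raise ValueError(f"groupsize must be positive, got {groupsize}")
    # Fall back to the current host when no machine string is supplied.
    machine = (machine or platform.machine()).lower()
    on_arm = machine in ARM_MACHINES
    return on_arm and groupsize % FAST_GROUPSIZE_DIVISOR[scheme] == 0
```

Under this reading, a groupsize of 48 would take the fast path for linear:a8wxdq but the fallback for embedding:wx, since 48 is divisible by 16 but not by 32.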

install/.pins/torchao-pin.txt

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-f1b4c8e9bc10cf80b47bef5a19555956523bb0b3
+c8f1174a06dcc0102849c8348ca6573bde8847a9
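
Both the old and new pin values are full 40-character commit hashes. A minimal sketch of reading and sanity-checking such a pin, assuming the file holds exactly one lowercase hexadecimal SHA-1 and nothing else; `parse_pin` and `read_pin` are hypothetical helpers, not part of torchchat's install scripts.

```python
import re
from pathlib import Path

# A full git SHA-1 commit id: exactly 40 lowercase hex characters.
SHA1_RE = re.compile(r"[0-9a-f]{40}")


def parse_pin(text):
    """Return the pinned commit hash, or raise ValueError if malformed."""
    pin = text.strip()
    if not SHA1_RE.fullmatch(pin):
        raise ValueError(f"not a full 40-character commit hash: {pin!r}")
    return pin


def read_pin(path="install/.pins/torchao-pin.txt"):
    """Hypothetical helper: read and validate the pin file at `path`."""
    return parse_pin(Path(path).read_text(encoding="utf-8"))
```

A check like this would reject an abbreviated hash such as `e3165a7`, which is the kind of mistake that is easy to make when updating a pin by hand.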
