Commit 7733f0c

ggml : support AVX512VNNI (#6280)
This change causes some quants (e.g. Q4_0, Q8_0) to go faster on some architectures (e.g. AMD Zen 4).
1 parent a32b77c commit 7733f0c

1 file changed

+1
-1
lines changed

ggml-quants.c

Lines changed: 1 addition & 1 deletion
@@ -132,7 +132,7 @@ static inline __m256 sum_i16_pairs_float(const __m256i x) {
 }
 
 static inline __m256 mul_sum_us8_pairs_float(const __m256i ax, const __m256i sy) {
-#if __AVXVNNI__
+#if defined(__AVXVNNI__) || defined(__AVX512VNNI__)
     const __m256i zero = _mm256_setzero_si256();
     const __m256i summed_pairs = _mm256_dpbusd_epi32(zero, ax, sy);
     return _mm256_cvtepi32_ps(summed_pairs);
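
For readers without VNNI hardware, the following is a minimal scalar sketch of what the guarded fast path computes. It assumes the usual semantics of `_mm256_dpbusd_epi32`: each of the eight 32-bit lanes accumulates four unsigned-8-bit by signed-8-bit products, without saturation. The names `dpbusd_lane_ref`, `mul_sum_us8_pairs_float_ref`, and `vnni_path_enabled` are hypothetical helpers introduced for illustration only and are not part of ggml.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reports whether the compile-time guard from the diff
 * would select the VNNI path for the current compiler flags. */
static int vnni_path_enabled(void) {
#if defined(__AVXVNNI__) || defined(__AVX512VNNI__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical scalar reference for one 32-bit lane of dpbusd:
 * acc plus the sum of four (unsigned 8-bit) * (signed 8-bit) products. */
static int32_t dpbusd_lane_ref(int32_t acc, const uint8_t a[4], const int8_t b[4]) {
    for (int i = 0; i < 4; i++) {
        acc += (int32_t) a[i] * (int32_t) b[i];
    }
    return acc;
}

/* Hypothetical scalar counterpart of mul_sum_us8_pairs_float: 32 input bytes
 * per operand, 8 output floats (one per 32-bit lane), starting from zero. */
static void mul_sum_us8_pairs_float_ref(const uint8_t ax[32], const int8_t sy[32], float out[8]) {
    assert(ax && sy && out);
    for (int lane = 0; lane < 8; lane++) {
        out[lane] = (float) dpbusd_lane_ref(0, ax + 4 * lane, sy + 4 * lane);
    }
}
```

A reference like this can be compared against the intrinsic path in a unit test. The guard helper also shows why the commit moves from `#if __AVXVNNI__` to `defined(...)` checks: the branch is now taken when either feature macro is defined by the compiler.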
