Commit 312a927

ggml : fix quantize_row_q8_0() ARM_NEON rounding
1 parent 2c4f9b6 commit 312a927

1 file changed: +1 -2 lines changed

ggml.c

Lines changed: 1 addition & 2 deletions
@@ -1102,8 +1102,7 @@ static void quantize_row_q8_0(const float * restrict x, void * restrict vy, int
 
         for (int l = 0; l < 8; l++) {
             const float32x4_t v = vmulq_n_f32(srcv[l], id);
-            //TODO: rounding
-            const int32x4_t vi = vcvtq_s32_f32(v);
+            const int32x4_t vi = vcvtnq_s32_f32(v);
 
             y[i].qs[4*l + 0] = vgetq_lane_s32(vi, 0);
             y[i].qs[4*l + 1] = vgetq_lane_s32(vi, 1);
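
Note on the change: vcvtq_s32_f32 converts each lane by truncating toward zero, while vcvtnq_s32_f32 rounds to the nearest integer with ties going to even. The sketch below is a portable scalar illustration of that difference for a single lane. It is not code from ggml.c: the helper names convert_truncate and convert_nearest are hypothetical, and the sketch assumes the scaled value is small enough that its fractional part is computed exactly, which is an assumption about the 8-bit quantized range.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for one lane of vcvtq_s32_f32:
 * a C float-to-int cast truncates toward zero. */
static int32_t convert_truncate(float v) {
    return (int32_t) v;
}

/* Hypothetical scalar stand-in for one lane of vcvtnq_s32_f32:
 * round to nearest, ties to even. Assumes |v| is small (for example
 * within the int8 range), so v - t is exact and t fits in int32_t. */
static int32_t convert_nearest(float v) {
    const int32_t t    = (int32_t) v;      /* truncated toward zero   */
    const float   frac = v - (float) t;    /* remainder in (-1, 1)    */
    const int     odd  = (t & 1) != 0;     /* parity decides the ties */

    if (frac >  0.5f || (frac ==  0.5f && odd)) return t + 1;
    if (frac < -0.5f || (frac == -0.5f && odd)) return t - 1;
    return t;
}
```

With these helpers, 2.7f converts to 2 under truncation and to 3 under round-to-nearest, and -2.7f converts to -2 and -3 respectively. The ties 2.5f and 3.5f round to 2 and 4. Truncation always pulls the quantized value toward zero, which is the bias that the old TODO comment was flagging.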
