Commit 0141ee7

zhongjuzhe authored and Incarnation-p-lee committed
RISC-V: Refine unsigned avg_floor/avg_ceil
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

    vsetivli zero,8,e8,mf2,ta,ma
    vle8.v v3,0(a1)
    vle8.v v2,0(a2)
    vwaddu.vv v1,v3,v2
    vsetvli zero,zero,e16,m1,ta,ma
    vadd.vi v1,v1,1
    vsetvli zero,zero,e8,mf2,ta,ma
    vnsrl.wi v1,v1,1
    vse8.v v1,0(a0)
    ret

After this patch:

    vsetivli zero,8,e8,mf2,ta,ma
    csrwi vxrm,0
    vle8.v v1,0(a1)
    vle8.v v2,0(a2)
    vaaddu.vv v1,v1,v2
    vse8.v v1,0(a0)
    ret

Note on signed averaging addition

Based on the RVV spec, there is also a variant for signed averaging addition called vaadd. But AFAIU, no matter which rounding mode is used, we cannot achieve the semantics of signed averaging addition through vaadd. Thus this patch only introduces vaaddu.

More details in:
riscvarchive/riscv-v-spec#935
riscvarchive/riscv-v-spec#934

Tested on both RV32 and RV64 with no regressions.

Ok for trunk?

gcc/ChangeLog:

    * config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
    (avg<v_double_trunc>3_floor): New pattern.
    (<u>avg<v_double_trunc>3_ceil): Remove.
    (avg<v_double_trunc>3_ceil): New pattern.
    (uavg<mode>3_floor): Ditto.
    (uavg<mode>3_ceil): Ditto.
    * config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
    (enum insn_type): Ditto.
    * config/riscv/riscv-v.cc: Ditto.
    * config/riscv/vector-iterators.md (ashiftrt): Remove.
    (ASHIFTRT): Ditto.
    * config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

    * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
    * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
    * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
    * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
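
To make the before/after sequences in the commit message concrete, here is a minimal scalar sketch in C of what each one computes per 8-bit element. It is not taken from the patch. The function names are hypothetical, and the model of vaaddu with vxrm = 0 (add the shifted-out bit back after the right shift) is an assumption based on the RVV spec's description of round-to-nearest-up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old sequence, modelled per element: vwaddu.vv, vadd.vi 1, vnsrl.wi 1.
   (Hypothetical helper name; not part of the patch.)  */
static void avg_ceil_widen(uint8_t *dst, const uint8_t *a,
                           const uint8_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++) {
    uint16_t wide = (uint16_t)((uint16_t)a[i] + (uint16_t)b[i]); /* vwaddu.vv */
    wide = (uint16_t)(wide + 1);                                 /* vadd.vi 1 */
    dst[i] = (uint8_t)(wide >> 1);                               /* vnsrl.wi 1 */
  }
}

/* New sequence, modelled per element: csrwi vxrm,0 then vaaddu.vv.
   ASSUMPTION: the sum is formed without overflow and the rounding
   increment for vxrm = 0 (rnu) is the bit shifted out.  */
static void avg_ceil_vaaddu(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++) {
    unsigned sum = (unsigned)a[i] + (unsigned)b[i]; /* 9-bit intermediate */
    dst[i] = (uint8_t)((sum >> 1) + (sum & 1u));
  }
}

/* Returns the number of (a, b) byte pairs on which the two models disagree.  */
static unsigned count_mismatches(void)
{
  unsigned bad = 0;
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++) {
      uint8_t a = (uint8_t)x, b = (uint8_t)y, r1, r2;
      avg_ceil_widen(&r1, &a, &b, 1);
      avg_ceil_vaaddu(&r2, &a, &b, 1);
      bad += (r1 != r2);
    }
  return bad;
}

/* Exhaustive self-check over all 65536 input pairs.  */
void self_check(void)
{
  assert(count_mismatches() == 0);
}
```

Calling `self_check()` exercises every pair of byte inputs. Under the stated assumption the two models agree everywhere, which is why the widening add, the increment, and the narrowing shift can collapse into one instruction.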
1 parent 57792c3 commit 0141ee7

13 files changed, +86 -44 lines changed

gcc/config/riscv/autovec.md

Lines changed: 38 additions & 12 deletions
@@ -2345,47 +2345,47 @@
 ;; op[0] = (narrow) ((wide) op[1] + (wide) op[2] + 1)) >> 1;
 ;; -------------------------------------------------------------------------

-(define_expand "<u>avg<v_double_trunc>3_floor"
+(define_expand "avg<v_double_trunc>3_floor"
  [(set (match_operand:<V_DOUBLE_TRUNC> 0 "register_operand")
    (truncate:<V_DOUBLE_TRUNC>
-    (<ext_to_rshift>:VWEXTI
+    (ashiftrt:VWEXTI
      (plus:VWEXTI
-      (any_extend:VWEXTI
+      (sign_extend:VWEXTI
        (match_operand:<V_DOUBLE_TRUNC> 1 "register_operand"))
-      (any_extend:VWEXTI
+      (sign_extend:VWEXTI
        (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand"))))))]
   "TARGET_VECTOR"
 {
   /* First emit a widening addition. */
   rtx tmp1 = gen_reg_rtx (<MODE>mode);
   rtx ops1[] = {tmp1, operands[1], operands[2]};
-  insn_code icode = code_for_pred_dual_widen (PLUS, <CODE>, <MODE>mode);
+  insn_code icode = code_for_pred_dual_widen (PLUS, SIGN_EXTEND, <MODE>mode);
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops1);

   /* Then a narrowing shift. */
   rtx ops2[] = {operands[0], tmp1, const1_rtx};
-  icode = code_for_pred_narrow_scalar (<EXT_TO_RSHIFT>, <MODE>mode);
+  icode = code_for_pred_narrow_scalar (ASHIFTRT, <MODE>mode);
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops2);
   DONE;
 })

-(define_expand "<u>avg<v_double_trunc>3_ceil"
+(define_expand "avg<v_double_trunc>3_ceil"
  [(set (match_operand:<V_DOUBLE_TRUNC> 0 "register_operand")
    (truncate:<V_DOUBLE_TRUNC>
-    (<ext_to_rshift>:VWEXTI
+    (ashiftrt:VWEXTI
      (plus:VWEXTI
       (plus:VWEXTI
-       (any_extend:VWEXTI
+       (sign_extend:VWEXTI
         (match_operand:<V_DOUBLE_TRUNC> 1 "register_operand"))
-       (any_extend:VWEXTI
+       (sign_extend:VWEXTI
         (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand")))
       (const_int 1)))))]
   "TARGET_VECTOR"
 {
   /* First emit a widening addition. */
   rtx tmp1 = gen_reg_rtx (<MODE>mode);
   rtx ops1[] = {tmp1, operands[1], operands[2]};
-  insn_code icode = code_for_pred_dual_widen (PLUS, <CODE>, <MODE>mode);
+  insn_code icode = code_for_pred_dual_widen (PLUS, SIGN_EXTEND, <MODE>mode);
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops1);

   /* Then add 1. */
@@ -2396,11 +2396,37 @@

   /* Finally, a narrowing shift. */
   rtx ops3[] = {operands[0], tmp2, const1_rtx};
-  icode = code_for_pred_narrow_scalar (<EXT_TO_RSHIFT>, <MODE>mode);
+  icode = code_for_pred_narrow_scalar (ASHIFTRT, <MODE>mode);
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops3);
   DONE;
 })

+;; csrwi vxrm, 2
+;; vaaddu.vv vd, vs2, vs1
+(define_expand "uavg<mode>3_floor"
+ [(match_operand:V_VLSI 0 "register_operand")
+  (match_operand:V_VLSI 1 "register_operand")
+  (match_operand:V_VLSI 2 "register_operand")]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred (UNSPEC_VAADDU, <MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP_VXRM_RDN, operands);
+  DONE;
+})
+
+;; csrwi vxrm, 0
+;; vaaddu.vv vd, vs2, vs1
+(define_expand "uavg<mode>3_ceil"
+ [(match_operand:V_VLSI 0 "register_operand")
+  (match_operand:V_VLSI 1 "register_operand")
+  (match_operand:V_VLSI 2 "register_operand")]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred (UNSPEC_VAADDU, <MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP_VXRM_RNU, operands);
+  DONE;
+})
+
 ;; -------------------------------------------------------------------------
 ;; ---- [FP] Rounding.
 ;; -------------------------------------------------------------------------
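
The two new expanders differ only in the rounding mode they request: `csrwi vxrm, 2` for floor and `csrwi vxrm, 0` for ceil. As a rough illustration of why those values give floor and ceil, the following C sketch models the unsigned round-off step for all four vxrm encodings. It is written from my reading of the RVV spec's `roundoff_unsigned` definition and is an assumption, not code from GCC. The enum and function names are chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* vxrm encodings; 0 and 2 match the csrwi comments in the patch, the
   other two are ASSUMED from the RVV spec.  */
enum vxrm_mode { VXRM_RNU = 0, VXRM_RNE = 1, VXRM_RDN = 2, VXRM_ROD = 3 };

/* Sketch of roundoff_unsigned(v, d) = (v >> d) + r for 1 <= d <= 31,
   where the increment r depends on the rounding mode.  */
static uint32_t roundoff_unsigned(uint32_t v, unsigned d, enum vxrm_mode m)
{
  if (d == 0)
    return v; /* nothing is shifted out, so nothing to round */

  uint32_t bit_d1 = (v >> (d - 1)) & 1u;        /* v[d-1] */
  uint32_t bit_d = (v >> d) & 1u;               /* v[d] */
  uint32_t low_d1 = v & ((1u << (d - 1)) - 1u); /* v[d-2:0], empty when d == 1 */
  uint32_t low_d = v & ((1u << d) - 1u);        /* v[d-1:0] */
  uint32_t r = 0;

  switch (m) {
  case VXRM_RNU: r = bit_d1; break;                           /* round to nearest, ties up */
  case VXRM_RNE: r = bit_d1 & ((low_d1 != 0) | bit_d); break; /* ties to even */
  case VXRM_RDN: r = 0; break;                                /* truncate */
  case VXRM_ROD: r = (bit_d == 0) & (low_d != 0); break;      /* round to odd */
  }
  return (v >> d) + r;
}

/* Averaging shifts by one bit: rdn yields the floor, rnu the ceiling.  */
void self_check(void)
{
  assert(roundoff_unsigned(3, 1, VXRM_RDN) == 1);
  assert(roundoff_unsigned(3, 1, VXRM_RNU) == 2);
}
```

With `d = 1`, as in an averaging add, `VXRM_RDN` drops the low bit of the sum (floor) and `VXRM_RNU` adds it back (ceil), which is the pairing the two expanders rely on.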

gcc/config/riscv/riscv-protos.h

Lines changed: 8 additions & 0 deletions
@@ -366,6 +366,12 @@ enum insn_flags : unsigned int

   /* Means INSN has FRM operand and the value is FRM_RNE. */
   FRM_RNE_P = 1 << 19,
+
+  /* Means INSN has VXRM operand and the value is VXRM_RNU. */
+  VXRM_RNU_P = 1 << 20,
+
+  /* Means INSN has VXRM operand and the value is VXRM_RDN. */
+  VXRM_RDN_P = 1 << 21,
 };

 enum insn_type : unsigned int
@@ -426,6 +432,8 @@ enum insn_type : unsigned int
   BINARY_OP_TAMU = __MASK_OP_TAMU | BINARY_OP_P,
   BINARY_OP_TUMA = __MASK_OP_TUMA | BINARY_OP_P,
   BINARY_OP_FRM_DYN = BINARY_OP | FRM_DYN_P,
+  BINARY_OP_VXRM_RNU = BINARY_OP | VXRM_RNU_P,
+  BINARY_OP_VXRM_RDN = BINARY_OP | VXRM_RDN_P,

   /* Ternary operator. Always have real merge operand. */
   TERNARY_OP = HAS_DEST_P | HAS_MASK_P | USE_ALL_TRUES_MASK_P | HAS_MERGE_P

gcc/config/riscv/riscv-v.cc

Lines changed: 11 additions & 0 deletions
@@ -207,6 +207,13 @@ template <int MAX_OPERANDS> class insn_expander
     add_input_operand (frm_rtx, Pmode);
   }

+  void
+  add_rounding_mode_operand (enum fixed_point_rounding_mode rounding_mode)
+  {
+    rtx frm_rtx = gen_int_mode (rounding_mode, Pmode);
+    add_input_operand (frm_rtx, Pmode);
+  }
+
   /* Return the vtype mode based on insn_flags.
      vtype mode mean the mode vsetvl insn set. */
   machine_mode
@@ -334,6 +341,10 @@ template <int MAX_OPERANDS> class insn_expander
       add_rounding_mode_operand (FRM_RMM);
     else if (m_insn_flags & FRM_RNE_P)
       add_rounding_mode_operand (FRM_RNE);
+    else if (m_insn_flags & VXRM_RNU_P)
+      add_rounding_mode_operand (VXRM_RNU);
+    else if (m_insn_flags & VXRM_RDN_P)
+      add_rounding_mode_operand (VXRM_RDN);

     gcc_assert (insn_data[(int) icode].n_operands == m_opno);
     expand (icode, any_mem_p);
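
The riscv-protos.h and riscv-v.cc hunks work together: an `insn_type` value carries a VXRM flag bit, and the expander turns that bit into a trailing rounding-mode operand. The following C sketch is a loose, standalone model of that dispatch, not the GCC implementation. The two `*_P` bit positions are copied from the hunk above. Everything prefixed `SKETCH_` or `sketch_` is hypothetical, including the stand-in value for `BINARY_OP`, whose real definition is not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits as they appear in the riscv-protos.h hunk.  */
enum { VXRM_RNU_P = 1 << 20, VXRM_RDN_P = 1 << 21 };

/* HYPOTHETICAL stand-in for BINARY_OP; the real flag set is not shown.  */
enum { SKETCH_BINARY_OP = 0x7 };

/* Composite types, built the same way the patch builds BINARY_OP_VXRM_*.  */
enum {
  SKETCH_BINARY_OP_VXRM_RNU = SKETCH_BINARY_OP | VXRM_RNU_P,
  SKETCH_BINARY_OP_VXRM_RDN = SKETCH_BINARY_OP | VXRM_RDN_P
};

/* vxrm values, taken from the "csrwi vxrm, 0" / "csrwi vxrm, 2" comments.  */
enum { SKETCH_VXRM_RNU = 0, SKETCH_VXRM_RDN = 2 };

/* Returns the vxrm value to append as an operand, or -1 when the
   instruction type carries no VXRM flag.  */
static int sketch_vxrm_operand(uint32_t insn_flags)
{
  if (insn_flags & VXRM_RNU_P)
    return SKETCH_VXRM_RNU;
  else if (insn_flags & VXRM_RDN_P)
    return SKETCH_VXRM_RDN;
  return -1;
}

/* The ceil expander's type maps to 0, the floor expander's type to 2.  */
void self_check(void)
{
  assert(sketch_vxrm_operand(SKETCH_BINARY_OP_VXRM_RNU) == 0);
  assert(sketch_vxrm_operand(SKETCH_BINARY_OP_VXRM_RDN) == 2);
}
```

Encoding the rounding mode as a flag bit keeps the call sites in autovec.md to a single `emit_vlmax_insn` call, at the cost of one more branch in the shared expander.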

gcc/config/riscv/vector-iterators.md

Lines changed: 0 additions & 5 deletions
@@ -3581,11 +3581,6 @@
 (define_code_attr nmsub_nmadd [(plus "nmsub") (minus "nmadd")])
 (define_code_attr nmsac_nmacc [(plus "nmsac") (minus "nmacc")])

-(define_code_attr ext_to_rshift [(sign_extend "ashiftrt")
-                                 (zero_extend "lshiftrt")])
-(define_code_attr EXT_TO_RSHIFT [(sign_extend "ASHIFTRT")
-                                 (zero_extend "LSHIFTRT")])
-
 (define_code_iterator and_ior [and ior])

 (define_code_iterator any_float_binop [plus mult minus div])

gcc/config/riscv/vector.md

Lines changed: 6 additions & 6 deletions
@@ -4239,8 +4239,8 @@
    (set_attr "mode" "<MODE>")])

 (define_insn "@pred_<sat_op><mode>"
-  [(set (match_operand:VI 0 "register_operand" "=vd, vd, vr, vr")
-        (if_then_else:VI
+  [(set (match_operand:V_VLSI 0 "register_operand" "=vd, vd, vr, vr")
+        (if_then_else:V_VLSI
           (unspec:<VM>
             [(match_operand:<VM> 1 "vector_mask_operand" " vm, vm,Wc1,Wc1")
              (match_operand 5 "vector_length_operand" " rK, rK, rK, rK")
@@ -4251,10 +4251,10 @@
              (reg:SI VL_REGNUM)
              (reg:SI VTYPE_REGNUM)
              (reg:SI VXRM_REGNUM)] UNSPEC_VPREDICATE)
-          (unspec:VI
-            [(match_operand:VI 3 "register_operand" " vr, vr, vr, vr")
-             (match_operand:VI 4 "register_operand" " vr, vr, vr, vr")] VSAT_OP)
-          (match_operand:VI 2 "vector_merge_operand" " vu, 0, vu, 0")))]
+          (unspec:V_VLSI
+            [(match_operand:V_VLSI 3 "register_operand" " vr, vr, vr, vr")
+             (match_operand:V_VLSI 4 "register_operand" " vr, vr, vr, vr")] VSAT_OP)
+          (match_operand:V_VLSI 2 "vector_merge_operand" " vu, 0, vu, 0")))]
   "TARGET_VECTOR"
   "v<sat_op>.vv\t%0,%3,%4%p1"
   [(set_attr "type" "<sat_insn_type>")

gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-1.c

Lines changed: 2 additions & 2 deletions
@@ -26,9 +26,9 @@ DEF_AVG_FLOOR (uint8_t, uint16_t, 1024)
 DEF_AVG_FLOOR (uint8_t, uint16_t, 2048)

 /* { dg-final { scan-assembler-times {vwadd\.vv} 10 } } */
-/* { dg-final { scan-assembler-times {vwaddu\.vv} 10 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*2} 10 } } */
 /* { dg-final { scan-assembler-times {vnsra\.wi} 10 } } */
-/* { dg-final { scan-assembler-times {vnsrl\.wi} 10 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 10 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
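
The `DEF_AVG_FLOOR` macro used by these tests is defined in a header that is not part of this diff. As a rough idea of the kind of loop the scan patterns are checking, here is a hypothetical definition in C. The macro body, the generated function names, and the parameter order are all assumptions made for illustration. The scan counts suggest the file instantiates both signed and unsigned element types (`vwadd.vv` plus `vnsra.wi` for signed, `vaaddu.vv` for unsigned), which is an inference rather than something the diff shows.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL macro: the real DEF_AVG_FLOOR is not shown in this diff.
   TYPE is the element type, TYPE2 the double-width type used for the
   intermediate sum, NUM the fixed trip count.  */
#define DEF_AVG_FLOOR(TYPE, TYPE2, NUM)                                 \
  void avg_floor_##TYPE##_##NUM (TYPE *restrict dst,                    \
                                 const TYPE *restrict a,                \
                                 const TYPE *restrict b)                \
  {                                                                     \
    for (int i = 0; i < NUM; i++)                                       \
      dst[i] = (TYPE) (((TYPE2) a[i] + (TYPE2) b[i]) >> 1);             \
  }

/* One unsigned instantiation with a tiny trip count, for demonstration.  */
DEF_AVG_FLOOR (uint8_t, uint16_t, 4)

/* Floor averaging never overflows the narrow type: (255 + 255) >> 1 == 255.  */
void self_check(void)
{
  const uint8_t a[4] = {255, 1, 0, 200};
  const uint8_t b[4] = {255, 2, 0, 100};
  uint8_t dst[4];
  avg_floor_uint8_t_4 (dst, a, b);
  assert (dst[0] == 255 && dst[1] == 1 && dst[2] == 0 && dst[3] == 150);
}
```

After this patch, a loop of this shape over unsigned elements is expected to compile to `csrwi vxrm,2` plus `vaaddu.vv`, which is what the updated `scan-assembler-times` lines count.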

gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-2.c

Lines changed: 2 additions & 2 deletions
@@ -24,9 +24,9 @@ DEF_AVG_FLOOR (uint16_t, uint32_t, 512)
 DEF_AVG_FLOOR (uint16_t, uint32_t, 1024)

 /* { dg-final { scan-assembler-times {vwadd\.vv} 9 } } */
-/* { dg-final { scan-assembler-times {vwaddu\.vv} 9 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*2} 9 } } */
 /* { dg-final { scan-assembler-times {vnsra\.wi} 9 } } */
-/* { dg-final { scan-assembler-times {vnsrl\.wi} 9 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 9 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */

gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-3.c

Lines changed: 2 additions & 2 deletions
@@ -22,9 +22,9 @@ DEF_AVG_FLOOR (uint32_t, uint64_t, 256)
 DEF_AVG_FLOOR (uint32_t, uint64_t, 512)

 /* { dg-final { scan-assembler-times {vwadd\.vv} 8 } } */
-/* { dg-final { scan-assembler-times {vwaddu\.vv} 8 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*2} 8 } } */
 /* { dg-final { scan-assembler-times {vnsra\.wi} 8 } } */
-/* { dg-final { scan-assembler-times {vnsrl\.wi} 8 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */

gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-4.c

Lines changed: 3 additions & 3 deletions
@@ -26,10 +26,10 @@ DEF_AVG_CEIL (uint8_t, uint16_t, 1024)
 DEF_AVG_CEIL (uint8_t, uint16_t, 2048)

 /* { dg-final { scan-assembler-times {vwadd\.vv} 10 } } */
-/* { dg-final { scan-assembler-times {vwaddu\.vv} 10 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 10 } } */
 /* { dg-final { scan-assembler-times {vnsra\.wi} 10 } } */
-/* { dg-final { scan-assembler-times {vnsrl\.wi} 10 } } */
-/* { dg-final { scan-assembler-times {vadd\.vi} 20 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 10 } } */
+/* { dg-final { scan-assembler-times {vadd\.vi} 10 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */

gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-5.c

Lines changed: 3 additions & 3 deletions
@@ -24,10 +24,10 @@ DEF_AVG_CEIL (uint16_t, uint32_t, 512)
 DEF_AVG_CEIL (uint16_t, uint32_t, 1024)

 /* { dg-final { scan-assembler-times {vwadd\.vv} 9 } } */
-/* { dg-final { scan-assembler-times {vwaddu\.vv} 9 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 9 } } */
 /* { dg-final { scan-assembler-times {vnsra\.wi} 9 } } */
-/* { dg-final { scan-assembler-times {vnsrl\.wi} 9 } } */
-/* { dg-final { scan-assembler-times {vadd\.vi} 18 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 9 } } */
+/* { dg-final { scan-assembler-times {vadd\.vi} 9 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */

gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-6.c

Lines changed: 3 additions & 3 deletions
@@ -22,10 +22,10 @@ DEF_AVG_CEIL (uint16_t, uint32_t, 256)
 DEF_AVG_CEIL (uint16_t, uint32_t, 512)

 /* { dg-final { scan-assembler-times {vwadd\.vv} 8 } } */
-/* { dg-final { scan-assembler-times {vwaddu\.vv} 8 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 8 } } */
 /* { dg-final { scan-assembler-times {vnsra\.wi} 8 } } */
-/* { dg-final { scan-assembler-times {vnsrl\.wi} 8 } } */
-/* { dg-final { scan-assembler-times {vadd\.vi} 16 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 8 } } */
+/* { dg-final { scan-assembler-times {vadd\.vi} 8 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */

gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c

Lines changed: 4 additions & 3 deletions
@@ -4,7 +4,8 @@
 #include "vec-avg-template.h"

 /* { dg-final { scan-assembler-times {\tvwadd\.vv} 6 } } */
-/* { dg-final { scan-assembler-times {\tvwaddu\.vv} 6 } } */
-/* { dg-final { scan-assembler-times {\tvadd\.vi} 6 } } */
-/* { dg-final { scan-assembler-times {\tvnsrl.wi} 6 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 3 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*2} 3 } } */
+/* { dg-final { scan-assembler-times {\tvadd\.vi} 3 } } */
 /* { dg-final { scan-assembler-times {\tvnsra.wi} 6 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 6 } } */

gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c

Lines changed: 4 additions & 3 deletions
@@ -4,7 +4,8 @@
 #include "vec-avg-template.h"

 /* { dg-final { scan-assembler-times {\tvwadd\.vv} 6 } } */
-/* { dg-final { scan-assembler-times {\tvwaddu\.vv} 6 } } */
-/* { dg-final { scan-assembler-times {\tvadd\.vi} 6 } } */
-/* { dg-final { scan-assembler-times {\tvnsrl\.wi} 6 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 3 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*2} 3 } } */
+/* { dg-final { scan-assembler-times {\tvadd\.vi} 3 } } */
 /* { dg-final { scan-assembler-times {\tvnsra\.wi} 6 } } */
+/* { dg-final { scan-assembler-times {vaaddu\.vv} 6 } } */
