[RISCV] Lower vector_reverse for zvfhmin/zvfbfmin #110218
@llvm/pr-subscribers-backend-risc-v
Author: Luke Lau (lukel97)
Changes
Previously we crashed because we had no lowering for f16/bf16 scalable vectors.
Full diff: https://github.com/llvm/llvm-project/pull/110218.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index bd796efd836c75..d58e9d276f8279 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1077,8 +1077,8 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_UINT_TO_FP},
VT, Custom);
setOperationAction({ISD::CONCAT_VECTORS, ISD::INSERT_SUBVECTOR,
- ISD::EXTRACT_SUBVECTOR, ISD::VECTOR_INTERLEAVE,
- ISD::VECTOR_DEINTERLEAVE},
+ ISD::EXTRACT_SUBVECTOR, ISD::VECTOR_DEINTERLEAVE,
+ ISD::VECTOR_INTERLEAVE, ISD::VECTOR_REVERSE},
VT, Custom);
MVT EltVT = VT.getVectorElementType();
if (isTypeLegal(EltVT))
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
index a27c3a416816e2..18749f00a10a52 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
@@ -2952,7 +2952,7 @@ foreach vti = NoGroupFloatVectors in {
}
}
-foreach vti = AllFloatVectors in {
+foreach vti = !listconcat(AllFloatVectors, AllBFloatVectors) in {
defvar ivti = GetIntVTypeInfo<vti>.Vti;
let Predicates = GetVTypePredicates<ivti>.Predicates in {
def : Pat<(vti.Vector
diff --git a/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll b/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll
index 9d0cb22eb5f475..a6c6db345032ec 100644
--- a/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll
@@ -1,10 +1,16 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-UNKNOWN
-; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-256
-; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-512
-; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-UNKNOWN
-; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-256
-; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-512
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-256
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-512
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-256
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-512
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-256
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-512
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-256
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-512
;
; VECTOR_REVERSE - masks
@@ -1515,6 +1521,113 @@ define <vscale x 8 x i64> @reverse_nxv8i64(<vscale x 8 x i64> %a) {
; VECTOR_REVERSE - floating point
;
+define <vscale x 1 x bfloat> @reverse_nxv1bf16(<vscale x 1 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv1bf16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: srli a0, a0, 3
+; CHECK-NEXT: addi a0, a0, -1
+; CHECK-NEXT: vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-NEXT: vid.v v9
+; CHECK-NEXT: vrsub.vx v10, v9, a0
+; CHECK-NEXT: vrgather.vv v9, v8, v10
+; CHECK-NEXT: vmv1r.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <vscale x 1 x bfloat> @llvm.vector.reverse.nxv1bf16(<vscale x 1 x bfloat> %a)
+ ret <vscale x 1 x bfloat> %res
+}
+
+define <vscale x 2 x bfloat> @reverse_nxv2bf16(<vscale x 2 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv2bf16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: srli a0, a0, 2
+; CHECK-NEXT: addi a0, a0, -1
+; CHECK-NEXT: vsetvli a1, zero, e16, mf2, ta, ma
+; CHECK-NEXT: vid.v v9
+; CHECK-NEXT: vrsub.vx v10, v9, a0
+; CHECK-NEXT: vrgather.vv v9, v8, v10
+; CHECK-NEXT: vmv1r.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <vscale x 2 x bfloat> @llvm.vector.reverse.nxv2bf16(<vscale x 2 x bfloat> %a)
+ ret <vscale x 2 x bfloat> %res
+}
+
+define <vscale x 4 x bfloat> @reverse_nxv4bf16(<vscale x 4 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv4bf16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: srli a0, a0, 1
+; CHECK-NEXT: addi a0, a0, -1
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT: vid.v v9
+; CHECK-NEXT: vrsub.vx v10, v9, a0
+; CHECK-NEXT: vrgather.vv v9, v8, v10
+; CHECK-NEXT: vmv.v.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <vscale x 4 x bfloat> @llvm.vector.reverse.nxv4bf16(<vscale x 4 x bfloat> %a)
+ ret <vscale x 4 x bfloat> %res
+}
+
+define <vscale x 8 x bfloat> @reverse_nxv8bf16(<vscale x 8 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv8bf16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: srli a0, a0, 1
+; CHECK-NEXT: addi a0, a0, -1
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT: vid.v v10
+; CHECK-NEXT: vrsub.vx v12, v10, a0
+; CHECK-NEXT: vrgather.vv v11, v8, v12
+; CHECK-NEXT: vrgather.vv v10, v9, v12
+; CHECK-NEXT: vmv2r.v v8, v10
+; CHECK-NEXT: ret
+ %res = call <vscale x 8 x bfloat> @llvm.vector.reverse.nxv8bf16(<vscale x 8 x bfloat> %a)
+ ret <vscale x 8 x bfloat> %res
+}
+
+define <vscale x 16 x bfloat> @reverse_nxv16bf16(<vscale x 16 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv16bf16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: srli a0, a0, 1
+; CHECK-NEXT: addi a0, a0, -1
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT: vid.v v12
+; CHECK-NEXT: vrsub.vx v16, v12, a0
+; CHECK-NEXT: vrgather.vv v15, v8, v16
+; CHECK-NEXT: vrgather.vv v14, v9, v16
+; CHECK-NEXT: vrgather.vv v13, v10, v16
+; CHECK-NEXT: vrgather.vv v12, v11, v16
+; CHECK-NEXT: vmv4r.v v8, v12
+; CHECK-NEXT: ret
+ %res = call <vscale x 16 x bfloat> @llvm.vector.reverse.nxv16bf16(<vscale x 16 x bfloat> %a)
+ ret <vscale x 16 x bfloat> %res
+}
+
+define <vscale x 32 x bfloat> @reverse_nxv32bf16(<vscale x 32 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv32bf16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vmv8r.v v16, v8
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: srli a0, a0, 1
+; CHECK-NEXT: addi a0, a0, -1
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT: vid.v v8
+; CHECK-NEXT: vrsub.vx v24, v8, a0
+; CHECK-NEXT: vrgather.vv v15, v16, v24
+; CHECK-NEXT: vrgather.vv v14, v17, v24
+; CHECK-NEXT: vrgather.vv v13, v18, v24
+; CHECK-NEXT: vrgather.vv v12, v19, v24
+; CHECK-NEXT: vrgather.vv v11, v20, v24
+; CHECK-NEXT: vrgather.vv v10, v21, v24
+; CHECK-NEXT: vrgather.vv v9, v22, v24
+; CHECK-NEXT: vrgather.vv v8, v23, v24
+; CHECK-NEXT: ret
+ %res = call <vscale x 32 x bfloat> @llvm.vector.reverse.nxv32bf16(<vscale x 32 x bfloat> %a)
+ ret <vscale x 32 x bfloat> %res
+}
+
define <vscale x 1 x half> @reverse_nxv1f16(<vscale x 1 x half> %a) {
; CHECK-LABEL: reverse_nxv1f16:
; CHECK: # %bb.0:
The vrgather change here seems related to any shuffle we lower to a vrgather. Is there something which makes reverse special here? I'd rather see this tested via normal shuffle lowering if not.
Just a quick update on this: I'm working on splitting out the vrgather_vl patterns, but this involves lowering build_vector for bf16 to test it, which involves a few other small fixes. Most ops aren't lowered yet on fixed-length vectors; so far I've only been focusing on scalable vectors.
LGTM
Luke - Your comment about work required for my alternate testing approach is sufficient for me to be okay with this as a stepping stone. Once you get the fixed vector bits working, please make sure you add shuffle testing for this as well.
Previously we crashed because we had no lowering for f16/bf16 scalable vectors. Because the lowering uses vrgather_vv_vl, we need to add bf16 patterns for it.
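For readers following the CHECK lines in the test, here is a rough scalar model of the `vid.v` / `vrsub.vx` / `vrgather.vv` sequence the reverse lowers to. It is only a sketch for illustration, not code from the patch: the names `Lane`, `Reg`, `vid`, `vrsub_vx`, `vrgather_vv`, `reverse_m1` and `reverse_group` are hypothetical, and each register is modelled as a plain vector of raw 16-bit lanes (a reverse only moves bits, so bf16 and f16 behave the same here).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model types: one raw 16-bit lane, one m1 register of VLMAX lanes.
using Lane = std::uint16_t;
using Reg = std::vector<Lane>;

// Models vid.v: lane i receives its own index i.
Reg vid(std::size_t vlmax) {
  Reg r(vlmax);
  for (std::size_t i = 0; i < vlmax; ++i)
    r[i] = static_cast<Lane>(i);
  return r;
}

// Models vrsub.vx: lane i receives x - v[i].
Reg vrsub_vx(const Reg &v, Lane x) {
  Reg r(v.size());
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = static_cast<Lane>(x - v[i]);
  return r;
}

// Models vrgather.vv: lane i receives src[idx[i]], or 0 if the index is out of range.
Reg vrgather_vv(const Reg &src, const Reg &idx) {
  assert(src.size() == idx.size());
  Reg r(src.size());
  for (std::size_t i = 0; i < src.size(); ++i)
    r[i] = idx[i] < src.size() ? src[idx[i]] : Lane{0};
  return r;
}

// Single-register reverse: the indices are (VLMAX - 1) - i, as in the
// addi/vid.v/vrsub.vx/vrgather.vv sequence of the m1 and fractional-LMUL tests.
Reg reverse_m1(const Reg &src) {
  assert(!src.empty());
  Reg idx = vrsub_vx(vid(src.size()), static_cast<Lane>(src.size() - 1));
  return vrgather_vv(src, idx);
}

// Register-group reverse (LMUL > 1): each m1 register is reversed on its own
// and the registers are written back in the opposite order.
std::vector<Reg> reverse_group(const std::vector<Reg> &group) {
  std::vector<Reg> out(group.size());
  for (std::size_t r = 0; r < group.size(); ++r)
    out[group.size() - 1 - r] = reverse_m1(group[r]);
  return out;
}
```

In this model, `reverse_group` mirrors what the `nxv8bf16` and larger tests appear to do: one index vector is computed at `m1` and reused for every `vrgather.vv`, with the destination registers numbered in the opposite order to the sources.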
3381ca3 to 407ada0