
[RISCV] Lower vector_reverse for zvfhmin/zvfbfmin #110218

Merged: 1 commit into llvm:main from zvfhmin-zvfbfmin/vector-reverse on Oct 2, 2024

Conversation

@lukel97 (Contributor) commented on Sep 27, 2024

Previously we crashed because we had no lowering for f16/bf16 scalable vectors.
Because the lowering uses vrgather_vv_vl, we need to add bf16 patterns for it.
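
For reference, a minimal reproducer sketch modeled on the tests added below (the function name is illustrative). Before this patch, selecting this under a zvfhmin/zvfbfmin configuration crashed:

; llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin
declare <vscale x 1 x bfloat> @llvm.vector.reverse.nxv1bf16(<vscale x 1 x bfloat>)

define <vscale x 1 x bfloat> @reverse(<vscale x 1 x bfloat> %a) {
  %res = call <vscale x 1 x bfloat> @llvm.vector.reverse.nxv1bf16(<vscale x 1 x bfloat> %a)
  ret <vscale x 1 x bfloat> %res
}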

@llvmbot (Member) commented on Sep 27, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

Previously we crashed because we had no lowering for f16/bf16 scalable vectors.
Because the lowering uses vrgather_vv_vl, we need to add bf16 patterns for it.


Full diff: https://github.com/llvm/llvm-project/pull/110218.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+2-2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll (+119-6)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index bd796efd836c75..d58e9d276f8279 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1077,8 +1077,8 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
                           ISD::VP_UINT_TO_FP},
                          VT, Custom);
       setOperationAction({ISD::CONCAT_VECTORS, ISD::INSERT_SUBVECTOR,
-                          ISD::EXTRACT_SUBVECTOR, ISD::VECTOR_INTERLEAVE,
-                          ISD::VECTOR_DEINTERLEAVE},
+                          ISD::EXTRACT_SUBVECTOR, ISD::VECTOR_DEINTERLEAVE,
+                          ISD::VECTOR_INTERLEAVE, ISD::VECTOR_REVERSE},
                          VT, Custom);
       MVT EltVT = VT.getVectorElementType();
       if (isTypeLegal(EltVT))
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
index a27c3a416816e2..18749f00a10a52 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
@@ -2952,7 +2952,7 @@ foreach vti = NoGroupFloatVectors in {
   }
 }
 
-foreach vti = AllFloatVectors in {
+foreach vti = !listconcat(AllFloatVectors, AllBFloatVectors) in {
   defvar ivti = GetIntVTypeInfo<vti>.Vti;
   let Predicates = GetVTypePredicates<ivti>.Predicates in {
     def : Pat<(vti.Vector
diff --git a/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll b/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll
index 9d0cb22eb5f475..a6c6db345032ec 100644
--- a/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll
@@ -1,10 +1,16 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-UNKNOWN
-; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-256
-; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-512
-; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-UNKNOWN
-; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-256
-; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-512
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-256
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-512
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-256
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfh,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-512
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-256
+; RUN: llc -mtriple=riscv32 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,RV32-BITS-512
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-UNKNOWN
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=256 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-256
+; RUN: llc -mtriple=riscv64 -mattr=+m,+v,+f,+d,+zfh,+zvfhmin,+zvfbfmin -riscv-v-vector-bits-max=512 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,RV64-BITS-512
 
 ;
 ; VECTOR_REVERSE - masks
@@ -1515,6 +1521,113 @@ define <vscale x 8 x i64> @reverse_nxv8i64(<vscale x 8 x i64> %a) {
 ; VECTOR_REVERSE - floating point
 ;
 
+define <vscale x 1 x bfloat> @reverse_nxv1bf16(<vscale x 1 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv1bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    csrr a0, vlenb
+; CHECK-NEXT:    srli a0, a0, 3
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-NEXT:    vid.v v9
+; CHECK-NEXT:    vrsub.vx v10, v9, a0
+; CHECK-NEXT:    vrgather.vv v9, v8, v10
+; CHECK-NEXT:    vmv1r.v v8, v9
+; CHECK-NEXT:    ret
+  %res = call <vscale x 1 x bfloat> @llvm.vector.reverse.nxv1bf16(<vscale x 1 x bfloat> %a)
+  ret <vscale x 1 x bfloat> %res
+}
+
+define <vscale x 2 x bfloat> @reverse_nxv2bf16(<vscale x 2 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv2bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    csrr a0, vlenb
+; CHECK-NEXT:    srli a0, a0, 2
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, mf2, ta, ma
+; CHECK-NEXT:    vid.v v9
+; CHECK-NEXT:    vrsub.vx v10, v9, a0
+; CHECK-NEXT:    vrgather.vv v9, v8, v10
+; CHECK-NEXT:    vmv1r.v v8, v9
+; CHECK-NEXT:    ret
+  %res = call <vscale x 2 x bfloat> @llvm.vector.reverse.nxv2bf16(<vscale x 2 x bfloat> %a)
+  ret <vscale x 2 x bfloat> %res
+}
+
+define <vscale x 4 x bfloat> @reverse_nxv4bf16(<vscale x 4 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv4bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    csrr a0, vlenb
+; CHECK-NEXT:    srli a0, a0, 1
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vid.v v9
+; CHECK-NEXT:    vrsub.vx v10, v9, a0
+; CHECK-NEXT:    vrgather.vv v9, v8, v10
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %res = call <vscale x 4 x bfloat> @llvm.vector.reverse.nxv4bf16(<vscale x 4 x bfloat> %a)
+  ret <vscale x 4 x bfloat> %res
+}
+
+define <vscale x 8 x bfloat> @reverse_nxv8bf16(<vscale x 8 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv8bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    csrr a0, vlenb
+; CHECK-NEXT:    srli a0, a0, 1
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vid.v v10
+; CHECK-NEXT:    vrsub.vx v12, v10, a0
+; CHECK-NEXT:    vrgather.vv v11, v8, v12
+; CHECK-NEXT:    vrgather.vv v10, v9, v12
+; CHECK-NEXT:    vmv2r.v v8, v10
+; CHECK-NEXT:    ret
+  %res = call <vscale x 8 x bfloat> @llvm.vector.reverse.nxv8bf16(<vscale x 8 x bfloat> %a)
+  ret <vscale x 8 x bfloat> %res
+}
+
+define <vscale x 16 x bfloat> @reverse_nxv16bf16(<vscale x 16 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv16bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    csrr a0, vlenb
+; CHECK-NEXT:    srli a0, a0, 1
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vid.v v12
+; CHECK-NEXT:    vrsub.vx v16, v12, a0
+; CHECK-NEXT:    vrgather.vv v15, v8, v16
+; CHECK-NEXT:    vrgather.vv v14, v9, v16
+; CHECK-NEXT:    vrgather.vv v13, v10, v16
+; CHECK-NEXT:    vrgather.vv v12, v11, v16
+; CHECK-NEXT:    vmv4r.v v8, v12
+; CHECK-NEXT:    ret
+  %res = call <vscale x 16 x bfloat> @llvm.vector.reverse.nxv16bf16(<vscale x 16 x bfloat> %a)
+  ret <vscale x 16 x bfloat> %res
+}
+
+define <vscale x 32 x bfloat> @reverse_nxv32bf16(<vscale x 32 x bfloat> %a) {
+; CHECK-LABEL: reverse_nxv32bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vmv8r.v v16, v8
+; CHECK-NEXT:    csrr a0, vlenb
+; CHECK-NEXT:    srli a0, a0, 1
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vid.v v8
+; CHECK-NEXT:    vrsub.vx v24, v8, a0
+; CHECK-NEXT:    vrgather.vv v15, v16, v24
+; CHECK-NEXT:    vrgather.vv v14, v17, v24
+; CHECK-NEXT:    vrgather.vv v13, v18, v24
+; CHECK-NEXT:    vrgather.vv v12, v19, v24
+; CHECK-NEXT:    vrgather.vv v11, v20, v24
+; CHECK-NEXT:    vrgather.vv v10, v21, v24
+; CHECK-NEXT:    vrgather.vv v9, v22, v24
+; CHECK-NEXT:    vrgather.vv v8, v23, v24
+; CHECK-NEXT:    ret
+  %res = call <vscale x 32 x bfloat> @llvm.vector.reverse.nxv32bf16(<vscale x 32 x bfloat> %a)
+  ret <vscale x 32 x bfloat> %res
+}
+
 define <vscale x 1 x half> @reverse_nxv1f16(<vscale x 1 x half> %a) {
 ; CHECK-LABEL: reverse_nxv1f16:
 ; CHECK:       # %bb.0:

@preames (Collaborator) left a comment

The vrgather change here seems relevant to any shuffle we lower to a vrgather; is there something that makes reverse special? If not, I'd rather see this tested via normal shuffle lowering.

@lukel97 (Contributor, Author) commented on Sep 30, 2024

The vrgather change here seems relevant to any shuffle we lower to a vrgather; is there something that makes reverse special? If not, I'd rather see this tested via normal shuffle lowering.

Just a quick update: I'm working on splitting out the vrgather_vl patterns, but testing that requires lowering build_vector for bf16, which involves a few other small fixes.

Most ops aren't lowered yet for fixed-length vectors; so far I've only been focusing on scalable vectors. A sketch of what that fixed-length coverage would look like follows.
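
For context, the fixed-length testing suggested above would look roughly like the sketch below: a plain reverse shufflevector, which also lowers through a vrgather. This is illustrative only, not a test from this patch, and the function name is made up:

define <4 x bfloat> @reverse_v4bf16(<4 x bfloat> %a) {
  ; Reverse mask <3,2,1,0>: handled by the generic shuffle lowering
  ; rather than the VECTOR_REVERSE node used for scalable vectors.
  %res = shufflevector <4 x bfloat> %a, <4 x bfloat> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  ret <4 x bfloat> %res
}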

@preames (Collaborator) left a comment

LGTM

Luke - Your comment about work required for my alternate testing approach is sufficient for me to be okay with this as a stepping stone. Once you get the fixed vector bits working, please make sure you add shuffle testing for this as well.

@lukel97 force-pushed the zvfhmin-zvfbfmin/vector-reverse branch from 3381ca3 to 407ada0 on October 1, 2024 13:03
@lukel97 merged commit 30f58ab into llvm:main on Oct 2, 2024
8 checks passed
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024