
[RISCV] Select Zvkb VANDN for shorter constant loading sequences #123345


Merged: 1 commit into llvm:main, Jan 22, 2025

Conversation

@pfusik (Contributor) commented Jan 17, 2025

This extends PR #120221 to vector instructions.

@pfusik (Contributor, Author) commented Jan 17, 2025

I'm trying to extend #120221 to VANDN.

Reproducer:

void mask(short *a) {
    for (int i = 0; i < 256; i++)
        a[i] &= 0x7fff;
}

I understand this will also need an update in RISCVDAGToDAGISel::selectInvLogicImm to allow splat-vector uses, but gdb shows it doesn't even enter this function, so my TableGen pattern must be wrong, though I don't know why. Please help.
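
For context, vandn.vx computes vs2 & ~x[rs1], so the inverted mask ~0x7fff = 0xffff8000 can be loaded with a single lui instead of lui+addi. Condensed from the vandn_vx_imm16 test added in this PR:

    # without Zvkb
    lui      a0, 8
    addi     a0, a0, -1       # a0 = 0x7fff
    vand.vx  v8, v8, a0

    # with Zvkb
    lui      a0, 1048568      # a0 = 0xffff8000 = ~0x7fff
    vandn.vx v8, v8, a0       # v8 & ~a0 == v8 & 0x7fff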

@pfusik changed the title from "[RISCV] Select Zvkb VANDN for shorter constant loading sequences" to "[RISCV] Select Zvbb VANDN for shorter constant loading sequences" on Jan 17, 2025
@pfusik changed the title from "[RISCV] Select Zvbb VANDN for shorter constant loading sequences" to "[RISCV] Select Zvkb VANDN for shorter constant loading sequences" on Jan 17, 2025
@topperc (Collaborator) commented Jan 17, 2025

I understand this will also need an update in RISCVDAGToDAGISel::selectInvLogicImm to allow splat-vector uses, but gdb shows it doesn't even enter this function, so my TableGen pattern must be wrong, though I don't know why. Please help.

RISCVDAGToDAGISel::selectInvLogicImm is getting executed when I try to test it.

It failed this check:

  for (const SDNode *U : N->users()) {
    if (!ISD::isBitwiseLogicOp(U->getOpcode()))
      return false;
  }
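
In the vector reproducer the immediate's direct user is the splat node, not a logic op, so the loop above bails out before the new pattern is ever tried. A rough sketch of the DAG shape, illustrative rather than an actual dump:

  // The splatted mask reaches the AND only through VMV_V_X_VL:
  //   t1 = Constant<32767>
  //   t2 = RISCVISD::VMV_V_X_VL undef, t1, vl    // the only user of t1
  //   t3 = ISD::AND t2, x                        // logic op one level up
  // isBitwiseLogicOp(VMV_V_X_VL) is false, hence the early return; the
  // patch below extends the check to look through such splats.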

@pfusik (Contributor, Author) commented Jan 20, 2025

RISCVDAGToDAGISel::selectInvLogicImm is getting executed when I try to test it.

You are right, thank you.
I made the silly mistake of debugging clang while rebuilding only llc.

@pfusik marked this pull request as ready for review on January 20, 2025 at 16:24
@llvmbot (Member) commented Jan 20, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Piotr Fusik (pfusik)

Changes

This extends PR #120221 to vector instructions.


Full diff: https://github.com/llvm/llvm-project/pull/123345.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+18-1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoZvk.td (+21)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vandn-sdnode.ll (+180-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vandn-vp.ll (+114)
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 36292e3d572cb2..9855028ead9e20 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -3224,8 +3224,25 @@ bool RISCVDAGToDAGISel::selectInvLogicImm(SDValue N, SDValue &Val) {
 
   // Abandon this transform if the constant is needed elsewhere.
   for (const SDNode *U : N->users()) {
-    if (!ISD::isBitwiseLogicOp(U->getOpcode()))
+    switch (U->getOpcode()) {
+    case ISD::AND:
+    case ISD::OR:
+    case ISD::XOR:
+      if (!(Subtarget->hasStdExtZbb() || Subtarget->hasStdExtZbkb()))
+        return false;
+      break;
+    case RISCVISD::VMV_V_X_VL:
+      if (!Subtarget->hasStdExtZvkb())
+        return false;
+      if (!all_of(U->users(), [](const SDNode *V) {
+            return V->getOpcode() == ISD::AND ||
+                   V->getOpcode() == RISCVISD::AND_VL;
+          }))
+        return false;
+      break;
+    default:
       return false;
+    }
   }
 
   // For 64-bit constants, the instruction sequences get complex,
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZvk.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZvk.td
index c69d8885175219..430d75e5cec5b2 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZvk.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZvk.td
@@ -624,6 +624,13 @@ foreach vti = AllIntegerVectors in {
                  vti.RegClass:$rs2,
                  vti.ScalarRegClass:$rs1,
                  vti.AVL, vti.Log2SEW, TA_MA)>;
+    def : Pat<(vti.Vector (and (riscv_splat_vector invLogicImm:$rs1),
+                               vti.RegClass:$rs2)),
+              (!cast<Instruction>("PseudoVANDN_VX_"#vti.LMul.MX)
+                 (vti.Vector (IMPLICIT_DEF)),
+                 vti.RegClass:$rs2,
+                 invLogicImm:$rs1,
+                 vti.AVL, vti.Log2SEW, TA_MA)>;
   }
 }
 
@@ -758,6 +765,20 @@ foreach vti = AllIntegerVectors in {
                  GPR:$vl,
                  vti.Log2SEW,
                  TAIL_AGNOSTIC)>;
+
+    def : Pat<(vti.Vector (riscv_and_vl (riscv_splat_vector invLogicImm:$rs1),
+                                        (vti.Vector vti.RegClass:$rs2),
+                                        (vti.Vector vti.RegClass:$passthru),
+                                        (vti.Mask V0),
+                                        VLOpFrag)),
+              (!cast<Instruction>("PseudoVANDN_VX_"#vti.LMul.MX#"_MASK")
+                 vti.RegClass:$passthru,
+                 vti.RegClass:$rs2,
+                 invLogicImm:$rs1,
+                 (vti.Mask V0),
+                 GPR:$vl,
+                 vti.Log2SEW,
+                 TAIL_AGNOSTIC)>;
   }
 }
 
diff --git a/llvm/test/CodeGen/RISCV/rvv/vandn-sdnode.ll b/llvm/test/CodeGen/RISCV/rvv/vandn-sdnode.ll
index ea8b166c156cb6..cf73dceaae3064 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vandn-sdnode.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vandn-sdnode.ll
@@ -1,8 +1,10 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
 ; RUN: llc -mtriple=riscv32 -mattr=+v -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,CHECK-RV32
 ; RUN: llc -mtriple=riscv64 -mattr=+v -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,CHECK-RV64
-; RUN: llc -mtriple=riscv32 -mattr=+v,+zvkb -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK-ZVKB,CHECK-ZVKB32
-; RUN: llc -mtriple=riscv64 -mattr=+v,+zvkb -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK-ZVKB,CHECK-ZVKB64
+; RUN: llc -mtriple=riscv32 -mattr=+v,+zvkb -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK-ZVKB,CHECK-ZVKB-NOZBB,CHECK-ZVKB32
+; RUN: llc -mtriple=riscv64 -mattr=+v,+zvkb -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK-ZVKB,CHECK-ZVKB-NOZBB,CHECK-ZVKB64
+; RUN: llc -mtriple=riscv32 -mattr=+v,+zvkb,+zbb -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK-ZVKB,CHECK-ZVKB-ZBB,CHECK-ZVKB32
+; RUN: llc -mtriple=riscv64 -mattr=+v,+zvkb,+zbb -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK-ZVKB,CHECK-ZVKB-ZBB,CHECK-ZVKB64
 
 define <vscale x 1 x i8> @vandn_vv_nxv1i8(<vscale x 1 x i8> %x, <vscale x 1 x i8> %y) {
 ; CHECK-LABEL: vandn_vv_nxv1i8:
@@ -1931,3 +1933,179 @@ define <vscale x 8 x i64> @vandn_vx_swapped_nxv8i64(i64 %x, <vscale x 8 x i64> %
   %b = and <vscale x 8 x i64> %splat, %y
   ret <vscale x 8 x i64> %b
 }
+
+define <vscale x 1 x i16> @vandn_vx_imm16(<vscale x 1 x i16> %x) {
+; CHECK-LABEL: vandn_vx_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a0, 8
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a0
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-LABEL: vandn_vx_imm16:
+; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    lui a0, 1048568
+; CHECK-ZVKB-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-ZVKB-NEXT:    vandn.vx v8, v8, a0
+; CHECK-ZVKB-NEXT:    ret
+  %a = and <vscale x 1 x i16> splat (i16 32767), %x
+  ret <vscale x 1 x i16> %a
+}
+
+define <vscale x 1 x i16> @vandn_vx_swapped_imm16(<vscale x 1 x i16> %x) {
+; CHECK-LABEL: vandn_vx_swapped_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a0, 8
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a0
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-LABEL: vandn_vx_swapped_imm16:
+; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    lui a0, 1048568
+; CHECK-ZVKB-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-ZVKB-NEXT:    vandn.vx v8, v8, a0
+; CHECK-ZVKB-NEXT:    ret
+  %a = and <vscale x 1 x i16> %x, splat (i16 32767)
+  ret <vscale x 1 x i16> %a
+}
+
+define <vscale x 1 x i64> @vandn_vx_imm64(<vscale x 1 x i64> %x) {
+; CHECK-RV32-LABEL: vandn_vx_imm64:
+; CHECK-RV32:       # %bb.0:
+; CHECK-RV32-NEXT:    addi sp, sp, -16
+; CHECK-RV32-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-RV32-NEXT:    lui a0, 1044480
+; CHECK-RV32-NEXT:    li a1, 255
+; CHECK-RV32-NEXT:    sw a1, 8(sp)
+; CHECK-RV32-NEXT:    sw a0, 12(sp)
+; CHECK-RV32-NEXT:    addi a0, sp, 8
+; CHECK-RV32-NEXT:    vsetvli a1, zero, e64, m1, ta, ma
+; CHECK-RV32-NEXT:    vlse64.v v9, (a0), zero
+; CHECK-RV32-NEXT:    vand.vv v8, v8, v9
+; CHECK-RV32-NEXT:    addi sp, sp, 16
+; CHECK-RV32-NEXT:    .cfi_def_cfa_offset 0
+; CHECK-RV32-NEXT:    ret
+;
+; CHECK-RV64-LABEL: vandn_vx_imm64:
+; CHECK-RV64:       # %bb.0:
+; CHECK-RV64-NEXT:    li a0, -1
+; CHECK-RV64-NEXT:    slli a0, a0, 56
+; CHECK-RV64-NEXT:    addi a0, a0, 255
+; CHECK-RV64-NEXT:    vsetvli a1, zero, e64, m1, ta, ma
+; CHECK-RV64-NEXT:    vand.vx v8, v8, a0
+; CHECK-RV64-NEXT:    ret
+;
+; CHECK-ZVKB32-LABEL: vandn_vx_imm64:
+; CHECK-ZVKB32:       # %bb.0:
+; CHECK-ZVKB32-NEXT:    addi sp, sp, -16
+; CHECK-ZVKB32-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-ZVKB32-NEXT:    lui a0, 1044480
+; CHECK-ZVKB32-NEXT:    li a1, 255
+; CHECK-ZVKB32-NEXT:    sw a1, 8(sp)
+; CHECK-ZVKB32-NEXT:    sw a0, 12(sp)
+; CHECK-ZVKB32-NEXT:    addi a0, sp, 8
+; CHECK-ZVKB32-NEXT:    vsetvli a1, zero, e64, m1, ta, ma
+; CHECK-ZVKB32-NEXT:    vlse64.v v9, (a0), zero
+; CHECK-ZVKB32-NEXT:    vand.vv v8, v8, v9
+; CHECK-ZVKB32-NEXT:    addi sp, sp, 16
+; CHECK-ZVKB32-NEXT:    .cfi_def_cfa_offset 0
+; CHECK-ZVKB32-NEXT:    ret
+;
+; CHECK-ZVKB64-LABEL: vandn_vx_imm64:
+; CHECK-ZVKB64:       # %bb.0:
+; CHECK-ZVKB64-NEXT:    lui a0, 1048560
+; CHECK-ZVKB64-NEXT:    srli a0, a0, 8
+; CHECK-ZVKB64-NEXT:    vsetvli a1, zero, e64, m1, ta, ma
+; CHECK-ZVKB64-NEXT:    vandn.vx v8, v8, a0
+; CHECK-ZVKB64-NEXT:    ret
+  %a = and <vscale x 1 x i64> %x, splat (i64 -72057594037927681)
+  ret <vscale x 1 x i64> %a
+}
+
+define <vscale x 1 x i16> @vandn_vx_multi_imm16(<vscale x 1 x i16> %x, <vscale x 1 x i16> %y) {
+; CHECK-LABEL: vandn_vx_multi_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a0, 4
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a0
+; CHECK-NEXT:    vand.vx v9, v9, a0
+; CHECK-NEXT:    vadd.vv v8, v8, v9
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-LABEL: vandn_vx_multi_imm16:
+; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    lui a0, 1048572
+; CHECK-ZVKB-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-ZVKB-NEXT:    vandn.vx v8, v8, a0
+; CHECK-ZVKB-NEXT:    vandn.vx v9, v9, a0
+; CHECK-ZVKB-NEXT:    vadd.vv v8, v8, v9
+; CHECK-ZVKB-NEXT:    ret
+  %a = and <vscale x 1 x i16> %x, splat (i16 16383)
+  %b = and <vscale x 1 x i16> %y, splat (i16 16383)
+  %c = add <vscale x 1 x i16> %a, %b
+  ret <vscale x 1 x i16> %c
+}
+
+define <vscale x 1 x i16> @vandn_vx_multi_scalar_imm16(<vscale x 1 x i16> %x, i16 %y) {
+; CHECK-LABEL: vandn_vx_multi_scalar_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a1, 8
+; CHECK-NEXT:    addi a1, a1, -1
+; CHECK-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a1
+; CHECK-NEXT:    or a0, a0, a1
+; CHECK-NEXT:    vadd.vx v8, v8, a0
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-NOZBB-LABEL: vandn_vx_multi_scalar_imm16:
+; CHECK-ZVKB-NOZBB:       # %bb.0:
+; CHECK-ZVKB-NOZBB-NEXT:    lui a1, 8
+; CHECK-ZVKB-NOZBB-NEXT:    addi a1, a1, -1
+; CHECK-ZVKB-NOZBB-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; CHECK-ZVKB-NOZBB-NEXT:    vand.vx v8, v8, a1
+; CHECK-ZVKB-NOZBB-NEXT:    or a0, a0, a1
+; CHECK-ZVKB-NOZBB-NEXT:    vadd.vx v8, v8, a0
+; CHECK-ZVKB-NOZBB-NEXT:    ret
+;
+; CHECK-ZVKB-ZBB-LABEL: vandn_vx_multi_scalar_imm16:
+; CHECK-ZVKB-ZBB:       # %bb.0:
+; CHECK-ZVKB-ZBB-NEXT:    lui a1, 1048568
+; CHECK-ZVKB-ZBB-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; CHECK-ZVKB-ZBB-NEXT:    vandn.vx v8, v8, a1
+; CHECK-ZVKB-ZBB-NEXT:    orn a0, a0, a1
+; CHECK-ZVKB-ZBB-NEXT:    vadd.vx v8, v8, a0
+; CHECK-ZVKB-ZBB-NEXT:    ret
+  %a = and <vscale x 1 x i16> %x, splat (i16 32767)
+  %b = or i16 %y, 32767
+  %head = insertelement <vscale x 1 x i16> poison, i16 %b, i32 0
+  %splat = shufflevector <vscale x 1 x i16> %head, <vscale x 1 x i16> poison, <vscale x 1 x i32> zeroinitializer
+  %c = add <vscale x 1 x i16> %a, %splat
+  ret <vscale x 1 x i16> %c
+}
+
+define <vscale x 1 x i16> @vand_vadd_vx_imm16(<vscale x 1 x i16> %x) {
+; CHECK-LABEL: vand_vadd_vx_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a0, 8
+; CHECK-NEXT:    addi a0, a0, -1
+; CHECK-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a0
+; CHECK-NEXT:    vadd.vx v8, v8, a0
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-LABEL: vand_vadd_vx_imm16:
+; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    lui a0, 8
+; CHECK-ZVKB-NEXT:    addi a0, a0, -1
+; CHECK-ZVKB-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
+; CHECK-ZVKB-NEXT:    vand.vx v8, v8, a0
+; CHECK-ZVKB-NEXT:    vadd.vx v8, v8, a0
+; CHECK-ZVKB-NEXT:    ret
+  %a = and <vscale x 1 x i16> %x, splat (i16 32767)
+  %b = add <vscale x 1 x i16> %a, splat (i16 32767)
+  ret <vscale x 1 x i16> %b
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/vandn-vp.ll b/llvm/test/CodeGen/RISCV/rvv/vandn-vp.ll
index 763b2908b10267..5d29b266546f59 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vandn-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vandn-vp.ll
@@ -1429,3 +1429,117 @@ define <vscale x 8 x i64> @vandn_vx_vp_nxv8i64(i64 %a, <vscale x 8 x i64> %b, <v
   %x = call <vscale x 8 x i64> @llvm.vp.and.nxv8i64(<vscale x 8 x i64> %b, <vscale x 8 x i64> %splat.not.a, <vscale x 8 x i1> %mask, i32 %evl)
   ret <vscale x 8 x i64> %x
 }
+
+define <vscale x 1 x i16> @vandn_vx_vp_imm16(<vscale x 1 x i16> %x, <vscale x 1 x i1> %mask, i32 zeroext %evl) {
+; CHECK-LABEL: vandn_vx_vp_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a1, 8
+; CHECK-NEXT:    addi a1, a1, -1
+; CHECK-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a1, v0.t
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-LABEL: vandn_vx_vp_imm16:
+; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    lui a1, 1048568
+; CHECK-ZVKB-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-ZVKB-NEXT:    vandn.vx v8, v8, a1, v0.t
+; CHECK-ZVKB-NEXT:    ret
+  %a = call <vscale x 1 x i16> @llvm.vp.and.nxv1i16(<vscale x 1 x i16> splat (i16 32767), <vscale x 1 x i16> %x, <vscale x 1 x i1> %mask, i32 %evl)
+  ret <vscale x 1 x i16> %a
+}
+
+define <vscale x 1 x i16> @vandn_vx_vp_swapped_imm16(<vscale x 1 x i16> %x, <vscale x 1 x i1> %mask, i32 zeroext %evl) {
+; CHECK-LABEL: vandn_vx_vp_swapped_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a1, 8
+; CHECK-NEXT:    addi a1, a1, -1
+; CHECK-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a1, v0.t
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-LABEL: vandn_vx_vp_swapped_imm16:
+; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    lui a1, 1048568
+; CHECK-ZVKB-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-ZVKB-NEXT:    vandn.vx v8, v8, a1, v0.t
+; CHECK-ZVKB-NEXT:    ret
+  %a = call <vscale x 1 x i16> @llvm.vp.and.nxv1i16(<vscale x 1 x i16> %x, <vscale x 1 x i16> splat (i16 32767), <vscale x 1 x i1> %mask, i32 %evl)
+  ret <vscale x 1 x i16> %a
+}
+
+define <vscale x 1 x i64> @vandn_vx_vp_imm64(<vscale x 1 x i64> %x, <vscale x 1 x i1> %mask, i32 zeroext %evl) {
+; CHECK-RV32-LABEL: vandn_vx_vp_imm64:
+; CHECK-RV32:       # %bb.0:
+; CHECK-RV32-NEXT:    addi sp, sp, -16
+; CHECK-RV32-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-RV32-NEXT:    lui a1, 1044480
+; CHECK-RV32-NEXT:    li a2, 255
+; CHECK-RV32-NEXT:    sw a2, 8(sp)
+; CHECK-RV32-NEXT:    sw a1, 12(sp)
+; CHECK-RV32-NEXT:    addi a1, sp, 8
+; CHECK-RV32-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-RV32-NEXT:    vlse64.v v9, (a1), zero
+; CHECK-RV32-NEXT:    vand.vv v8, v8, v9, v0.t
+; CHECK-RV32-NEXT:    addi sp, sp, 16
+; CHECK-RV32-NEXT:    .cfi_def_cfa_offset 0
+; CHECK-RV32-NEXT:    ret
+;
+; CHECK-RV64-LABEL: vandn_vx_vp_imm64:
+; CHECK-RV64:       # %bb.0:
+; CHECK-RV64-NEXT:    li a1, -1
+; CHECK-RV64-NEXT:    slli a1, a1, 56
+; CHECK-RV64-NEXT:    addi a1, a1, 255
+; CHECK-RV64-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-RV64-NEXT:    vand.vx v8, v8, a1, v0.t
+; CHECK-RV64-NEXT:    ret
+;
+; CHECK-ZVKB32-LABEL: vandn_vx_vp_imm64:
+; CHECK-ZVKB32:       # %bb.0:
+; CHECK-ZVKB32-NEXT:    addi sp, sp, -16
+; CHECK-ZVKB32-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-ZVKB32-NEXT:    lui a1, 1044480
+; CHECK-ZVKB32-NEXT:    li a2, 255
+; CHECK-ZVKB32-NEXT:    sw a2, 8(sp)
+; CHECK-ZVKB32-NEXT:    sw a1, 12(sp)
+; CHECK-ZVKB32-NEXT:    addi a1, sp, 8
+; CHECK-ZVKB32-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-ZVKB32-NEXT:    vlse64.v v9, (a1), zero
+; CHECK-ZVKB32-NEXT:    vand.vv v8, v8, v9, v0.t
+; CHECK-ZVKB32-NEXT:    addi sp, sp, 16
+; CHECK-ZVKB32-NEXT:    .cfi_def_cfa_offset 0
+; CHECK-ZVKB32-NEXT:    ret
+;
+; CHECK-ZVKB64-LABEL: vandn_vx_vp_imm64:
+; CHECK-ZVKB64:       # %bb.0:
+; CHECK-ZVKB64-NEXT:    lui a1, 1048560
+; CHECK-ZVKB64-NEXT:    srli a1, a1, 8
+; CHECK-ZVKB64-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-ZVKB64-NEXT:    vandn.vx v8, v8, a1, v0.t
+; CHECK-ZVKB64-NEXT:    ret
+  %a = call <vscale x 1 x i64> @llvm.vp.and.nxv1i64(<vscale x 1 x i64> %x, <vscale x 1 x i64> splat (i64 -72057594037927681), <vscale x 1 x i1> %mask, i32 %evl)
+  ret <vscale x 1 x i64> %a
+}
+
+define <vscale x 1 x i16> @vand_vadd_vx_vp_imm16(<vscale x 1 x i16> %x, <vscale x 1 x i1> %mask, i32 zeroext %evl) {
+; CHECK-LABEL: vand_vadd_vx_vp_imm16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lui a1, 8
+; CHECK-NEXT:    addi a1, a1, -1
+; CHECK-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-NEXT:    vand.vx v8, v8, a1, v0.t
+; CHECK-NEXT:    vadd.vx v8, v8, a1, v0.t
+; CHECK-NEXT:    ret
+;
+; CHECK-ZVKB-LABEL: vand_vadd_vx_vp_imm16:
+; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    lui a1, 8
+; CHECK-ZVKB-NEXT:    addi a1, a1, -1
+; CHECK-ZVKB-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-ZVKB-NEXT:    vand.vx v8, v8, a1, v0.t
+; CHECK-ZVKB-NEXT:    vadd.vx v8, v8, a1, v0.t
+; CHECK-ZVKB-NEXT:    ret
+  %a = call <vscale x 1 x i16> @llvm.vp.and.nxv1i16(<vscale x 1 x i16> splat (i16 32767), <vscale x 1 x i16> %x, <vscale x 1 x i1> %mask, i32 %evl)
+  %b = call <vscale x 1 x i16> @llvm.vp.add.nxv1i16(<vscale x 1 x i16> splat (i16 32767), <vscale x 1 x i16> %a, <vscale x 1 x i1> %mask, i32 %evl)
+  ret <vscale x 1 x i16> %b
+}

Review thread on llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:

        return false;
      if (!all_of(U->users(), [](const SDNode *V) {
            return V->getOpcode() == ISD::AND ||
                   V->getOpcode() == RISCVISD::AND_VL;
@pfusik (Contributor, Author):
Shall I check that this isn't the passthru operand? If so, how do I test it?

@topperc (Collaborator):

ISD::AND doesn't have a passthru. For RISCVISD::AND_VL you would need to loop through U->uses() instead of U->users(). That will give you an SDUse &. From there you can call SDUse::getUser() to get V and SDUse::getOperandNo() to get the operand number of the use. The passthru will be operand 2.
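
A minimal sketch of that uses()-based walk, for illustration only (per the discussion below, the PR ends up keeping the simpler users()-based check):

  // Abandon the transform unless every use of the splat is a plain AND
  // operand; operand 2 of RISCVISD::AND_VL is the passthru.
  for (const SDUse &Use : U->uses()) {
    const SDNode *V = Use.getUser();
    if (V->getOpcode() == ISD::AND)
      continue;
    if (V->getOpcode() == RISCVISD::AND_VL && Use.getOperandNo() != 2)
      continue;
    return false;
  }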

@pfusik (Contributor, Author):

How to test? Is @llvm.riscv.vand's first operand the passthru?

Would it be possible for the same VMV_V_X_VL to be the AND_VL passthru and one of the AND_VL "and" operands at the same time?

Would the following work?

      if (!all_of(U->users(), [U](const SDNode *V) {
            return V->getOpcode() == ISD::AND ||
                   (V->getOpcode() == RISCVISD::AND_VL &&
                    U != V->getOperand(2).getNode()); // not the passthru
          }))

@topperc (Collaborator):

How to test? Is @llvm.riscv.vand's first operand the passthru?

@llvm.riscv.vand uses ISD::INTRINSIC_WO_CHAIN rather than RISCVISD::AND_VL. RISCVISD::AND_VL is primarily created from @llvm.vp.and or from and on fixed vectors (see the IR excerpt after this comment). There may be other sequences that require an AND as part of a larger pattern.

I'm not sure if we ever use the passthru operand of RISCVISD::AND_VL for anything other than undef. There are many opcodes that are defined with the same operand structure, but we don't use the passthru for all of them.

Would it be possible for the same VMV_V_X_VL to be the AND_VL passthru and one of the AND_VL "and" operands at the same time?

I think it's possible.

Would the following work:

      if (!all_of(U->users(), [U](const SDNode *V) {
            return V->getOpcode() == ISD::AND ||
                   (V->getOpcode() == RISCVISD::AND_VL &&
                    U != V->getOperand(2).getNode()); // not the passthru
          }))

I think that works.
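
For reference, the vp tests added in this PR exercise exactly that @llvm.vp.and path; for example vandn_vx_vp_imm16 from vandn-vp.ll:

  define <vscale x 1 x i16> @vandn_vx_vp_imm16(<vscale x 1 x i16> %x, <vscale x 1 x i1> %mask, i32 zeroext %evl) {
    %a = call <vscale x 1 x i16> @llvm.vp.and.nxv1i16(<vscale x 1 x i16> splat (i16 32767), <vscale x 1 x i16> %x, <vscale x 1 x i1> %mask, i32 %evl)
    ret <vscale x 1 x i16> %a
  }

On scalable vectors this lowers to RISCVISD::AND_VL with a splatted immediate, which the masked VANDN pattern in RISCVInstrInfoZvk.td then matches.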

@pfusik (Contributor, Author):

Will you approve this PR as-is, then?
I don't think adding untested code is a good idea. In the unlikely event of U somehow appearing as a passthru, it would only lead to somewhat suboptimal code.

@topperc (Collaborator) left a review:

LGTM

@pfusik (Contributor, Author) commented Jan 22, 2025

Pre-commit test pushed as 527c030. PR rebased.

@pfusik merged commit ebb27cc into llvm:main on Jan 22, 2025. 8 checks passed.