[AArch64] Let patterns for NEON instructions check runtime mode. #95560

sdesmalen-arm · 2024-06-14T16:09:04Z

This helps identify any failures where the compiler might otherwise
silently emit instructions that are not valid for the given runtime
mode. We can probably do a similar thing for HasSVE predicates.

llvmbot · 2024-06-14T16:09:35Z

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-mc

Author: Sander de Smalen (sdesmalen-arm)

Changes

I've had to change:
HasNEON -> IsNeonAvailable (available only in non-streaming mode)
HasNEONorSME -> HasNEON (available in either mode)

In contrast, the Predicate HasSVE (and related) don't need a similar change because all patterns predicated with HasSVE use scalable vector types, and in AArch64ISelLowering we've already made sure that none of those type and operationss are legal if the function is not in the right mode (if it only has +sme, without +sve).

Patch is 106.20 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95560.diff

7 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+2-1)
(modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+57-57)
(modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+48-50)
(modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-load.ll (+107-315)
(modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-store.ll (+174-387)
(removed) llvm/test/MC/AArch64/SME/streaming-mode-neon-bf16.s (-16)
(removed) llvm/test/MC/AArch64/SME/streaming-mode-neon.s (-132)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index ac6f1e07c4184..7cbae90ef3ca4 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -22185,7 +22185,8 @@ static SDValue vectorToScalarBitmask(SDNode *N, SelectionDAG &DAG) {
   ComparisonResult = DAG.getSExtOrTrunc(ComparisonResult, DL, VecVT);
 
   SmallVector<SDValue, 16> MaskConstants;
-  if (VecVT == MVT::v16i8) {
+  if (DAG.getSubtarget<AArch64Subtarget>().isNeonAvailable() &&
+      VecVT == MVT::v16i8) {
     // v16i8 is a special case, as we have 16 entries but only 8 positional bits
     // per entry. We split it into two halves, apply the mask, zip the halves to
     // create 8x 16-bit values, and the perform the vector reduce.
diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 1f437d0ed6f8d..d67aabeee0010 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -5762,7 +5762,7 @@ multiclass FPMoveImmediate<string asm> {
 // AdvSIMD
 //----------------------------------------------------------------------------
 
-let Predicates = [HasNEON] in {
+let Predicates = [IsNEONAvailable] in {
 
 //----------------------------------------------------------------------------
 // AdvSIMD three register vector instructions
@@ -5966,14 +5966,14 @@ multiclass SIMDThreeSameVectorB<bit U, bits<5> opc, string asm,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDThreeSameVectorFP<bit U, bit S, bits<3> opc,
                                  string asm, SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDThreeSameVector<0, U, {S,0b10}, {0b00,opc}, V64,
                                       asm, ".4h",
         [(set (v4f16 V64:$Rd), (OpNode (v4f16 V64:$Rn), (v4f16 V64:$Rm)))]>;
   def v8f16 : BaseSIMDThreeSameVector<1, U, {S,0b10}, {0b00,opc}, V128,
                                       asm, ".8h",
         [(set (v8f16 V128:$Rd), (OpNode (v8f16 V128:$Rn), (v8f16 V128:$Rm)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDThreeSameVector<0, U, {S,0b01}, {0b11,opc}, V64,
                                       asm, ".2s",
         [(set (v2f32 V64:$Rd), (OpNode (v2f32 V64:$Rn), (v2f32 V64:$Rm)))]>;
@@ -5989,14 +5989,14 @@ let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDThreeSameVectorFPCmp<bit U, bit S, bits<3> opc,
                                     string asm,
                                     SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDThreeSameVector<0, U, {S,0b10}, {0b00,opc}, V64,
                                       asm, ".4h",
         [(set (v4i16 V64:$Rd), (OpNode (v4f16 V64:$Rn), (v4f16 V64:$Rm)))]>;
   def v8f16 : BaseSIMDThreeSameVector<1, U, {S,0b10}, {0b00,opc}, V128,
                                       asm, ".8h",
         [(set (v8i16 V128:$Rd), (OpNode (v8f16 V128:$Rn), (v8f16 V128:$Rm)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDThreeSameVector<0, U, {S,0b01}, {0b11,opc}, V64,
                                       asm, ".2s",
         [(set (v2i32 V64:$Rd), (OpNode (v2f32 V64:$Rn), (v2f32 V64:$Rm)))]>;
@@ -6011,7 +6011,7 @@ multiclass SIMDThreeSameVectorFPCmp<bit U, bit S, bits<3> opc,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDThreeSameVectorFPTied<bit U, bit S, bits<3> opc,
                                  string asm, SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDThreeSameVectorTied<0, U, {S,0b10}, {0b00,opc}, V64,
                                       asm, ".4h",
      [(set (v4f16 V64:$dst),
@@ -6020,7 +6020,7 @@ multiclass SIMDThreeSameVectorFPTied<bit U, bit S, bits<3> opc,
                                       asm, ".8h",
      [(set (v8f16 V128:$dst),
            (OpNode (v8f16 V128:$Rd), (v8f16 V128:$Rn), (v8f16 V128:$Rm)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDThreeSameVectorTied<0, U, {S,0b01}, {0b11,opc}, V64,
                                       asm, ".2s",
      [(set (v2f32 V64:$dst),
@@ -6480,14 +6480,14 @@ multiclass SIMDTwoVectorFP<bit U, bit S, bits<5> opc, string asm,
                            SDPatternOperator OpNode,
                            int fpexceptions = 1> {
   let mayRaiseFPException = fpexceptions, Uses = !if(fpexceptions,[FPCR],[]<Register>) in {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDTwoSameVector<0, U, {S,1}, opc, 0b11, V64,
                                 asm, ".4h", ".4h",
                           [(set (v4f16 V64:$Rd), (OpNode (v4f16 V64:$Rn)))]>;
   def v8f16 : BaseSIMDTwoSameVector<1, U, {S,1}, opc, 0b11, V128,
                                 asm, ".8h", ".8h",
                           [(set (v8f16 V128:$Rd), (OpNode (v8f16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDTwoSameVector<0, U, {S,0}, opc, 0b00, V64,
                                 asm, ".2s", ".2s",
                           [(set (v2f32 V64:$Rd), (OpNode (v2f32 V64:$Rn)))]>;
@@ -6538,14 +6538,14 @@ multiclass SIMDTwoVectorS<bit U, bit S, bits<5> opc, string asm,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDTwoVectorFPToInt<bit U, bit S, bits<5> opc, string asm,
                            SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDTwoSameVector<0, U, {S,1}, opc, 0b11, V64,
                                 asm, ".4h", ".4h",
                           [(set (v4i16 V64:$Rd), (OpNode (v4f16 V64:$Rn)))]>;
   def v8f16 : BaseSIMDTwoSameVector<1, U, {S,1}, opc, 0b11, V128,
                                 asm, ".8h", ".8h",
                           [(set (v8i16 V128:$Rd), (OpNode (v8f16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDTwoSameVector<0, U, {S,0}, opc, 0b00, V64,
                                 asm, ".2s", ".2s",
                           [(set (v2i32 V64:$Rd), (OpNode (v2f32 V64:$Rn)))]>;
@@ -6560,14 +6560,14 @@ multiclass SIMDTwoVectorFPToInt<bit U, bit S, bits<5> opc, string asm,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDTwoVectorIntToFP<bit U, bit S, bits<5> opc, string asm,
                            SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDTwoSameVector<0, U, {S,1}, opc, 0b11, V64,
                                 asm, ".4h", ".4h",
                           [(set (v4f16 V64:$Rd), (OpNode (v4i16 V64:$Rn)))]>;
   def v8f16 : BaseSIMDTwoSameVector<1, U, {S,1}, opc, 0b11, V128,
                                 asm, ".8h", ".8h",
                           [(set (v8f16 V128:$Rd), (OpNode (v8i16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDTwoSameVector<0, U, {S,0}, opc, 0b00, V64,
                                 asm, ".2s", ".2s",
                           [(set (v2f32 V64:$Rd), (OpNode (v2i32 V64:$Rn)))]>;
@@ -6720,14 +6720,14 @@ multiclass SIMDFPCmpTwoVector<bit U, bit S, bits<5> opc,
                               string asm, SDNode OpNode> {
 
   let mayRaiseFPException = 1, Uses = [FPCR] in {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16rz : BaseSIMDCmpTwoVector<0, U, {S,1}, 0b11, opc, V64,
                                      asm, ".4h", "0.0",
                                      v4i16, v4f16, OpNode>;
   def v8i16rz : BaseSIMDCmpTwoVector<1, U, {S,1}, 0b11, opc, V128,
                                      asm, ".8h", "0.0",
                                      v8i16, v8f16, OpNode>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2i32rz : BaseSIMDCmpTwoVector<0, U, {S,0}, 0b00, opc, V64,
                                      asm, ".2s", "0.0",
                                      v2i32, v2f32, OpNode>;
@@ -6739,7 +6739,7 @@ multiclass SIMDFPCmpTwoVector<bit U, bit S, bits<5> opc,
                                      v2i64, v2f64, OpNode>;
   }
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : InstAlias<asm # "\t$Vd.4h, $Vn.4h, #0",
                   (!cast<Instruction>(NAME # v4i16rz) V64:$Vd, V64:$Vn), 0>;
   def : InstAlias<asm # "\t$Vd.8h, $Vn.8h, #0",
@@ -6751,7 +6751,7 @@ multiclass SIMDFPCmpTwoVector<bit U, bit S, bits<5> opc,
                   (!cast<Instruction>(NAME # v4i32rz) V128:$Vd, V128:$Vn), 0>;
   def : InstAlias<asm # "\t$Vd.2d, $Vn.2d, #0",
                   (!cast<Instruction>(NAME # v2i64rz) V128:$Vd, V128:$Vn), 0>;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : InstAlias<asm # ".4h\t$Vd, $Vn, #0",
                   (!cast<Instruction>(NAME # v4i16rz) V64:$Vd, V64:$Vn), 0>;
   def : InstAlias<asm # ".8h\t$Vd, $Vn, #0",
@@ -7394,7 +7394,7 @@ multiclass SIMDThreeScalarHSTied<bit U, bit R, bits<5> opc, string asm> {
 
 multiclass SIMDFPThreeScalar<bit U, bit S, bits<3> opc, string asm,
                              SDPatternOperator OpNode = null_frag,
-                             Predicate pred = HasNEON> {
+                             Predicate pred = IsNEONAvailable> {
   let mayLoad = 0, mayStore = 0, hasSideEffects = 0, mayRaiseFPException = 1, Uses = [FPCR] in {
     let Predicates = [pred] in {
     def NAME#64 : BaseSIMDThreeScalar<U, {S,0b11}, {0b11,opc}, FPR64, asm,
@@ -7419,10 +7419,10 @@ multiclass SIMDThreeScalarFPCmp<bit U, bit S, bits<3> opc, string asm,
       [(set (i64 FPR64:$Rd), (OpNode (f64 FPR64:$Rn), (f64 FPR64:$Rm)))]>;
     def NAME#32 : BaseSIMDThreeScalar<U, {S,0b01}, {0b11,opc}, FPR32, asm,
       [(set (i32 FPR32:$Rd), (OpNode (f32 FPR32:$Rn), (f32 FPR32:$Rm)))]>;
-    let Predicates = [HasNEON, HasFullFP16] in {
+    let Predicates = [IsNEONAvailable, HasFullFP16] in {
     def NAME#16 : BaseSIMDThreeScalar<U, {S,0b10}, {0b00,opc}, FPR16, asm,
       []>;
-    } // Predicates = [HasNEON, HasFullFP16]
+    } // Predicates = [IsNEONAvailable, HasFullFP16]
   }
 
   def : Pat<(v1i64 (OpNode (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),
@@ -7571,7 +7571,7 @@ multiclass SIMDFPCmpTwoScalar<bit U, bit S, bits<5> opc, string asm,
   let mayRaiseFPException = 1, Uses = [FPCR] in {
   def v1i64rz  : BaseSIMDCmpTwoScalar<U, {S,1}, 0b00, opc, FPR64, asm, "0.0">;
   def v1i32rz  : BaseSIMDCmpTwoScalar<U, {S,0}, 0b00, opc, FPR32, asm, "0.0">;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16rz  : BaseSIMDCmpTwoScalar<U, {S,1}, 0b11, opc, FPR16, asm, "0.0">;
   }
   }
@@ -7580,7 +7580,7 @@ multiclass SIMDFPCmpTwoScalar<bit U, bit S, bits<5> opc, string asm,
                   (!cast<Instruction>(NAME # v1i64rz) FPR64:$Rd, FPR64:$Rn), 0>;
   def : InstAlias<asm # "\t$Rd, $Rn, #0",
                   (!cast<Instruction>(NAME # v1i32rz) FPR32:$Rd, FPR32:$Rn), 0>;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : InstAlias<asm # "\t$Rd, $Rn, #0",
                   (!cast<Instruction>(NAME # v1i16rz) FPR16:$Rd, FPR16:$Rn), 0>;
   }
@@ -7603,7 +7603,7 @@ multiclass SIMDTwoScalarD<bit U, bits<5> opc, string asm,
 
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPTwoScalar<bit U, bit S, bits<5> opc, string asm,
-                           Predicate pred = HasNEON> {
+                           Predicate pred = IsNEONAvailable> {
   let Predicates = [pred] in {
   def v1i64       : BaseSIMDTwoScalar<U, {S,1}, 0b00, opc, FPR64, FPR64, asm,[]>;
   def v1i32       : BaseSIMDTwoScalar<U, {S,0}, 0b00, opc, FPR32, FPR32, asm,[]>;
@@ -7620,7 +7620,7 @@ multiclass SIMDFPTwoScalarCVT<bit U, bit S, bits<5> opc, string asm,
                                 [(set FPR64:$Rd, (OpNode (f64 FPR64:$Rn)))]>;
   def v1i32 : BaseSIMDTwoScalar<U, {S,0}, 0b00, opc, FPR32, FPR32, asm,
                                 [(set FPR32:$Rd, (OpNode (f32 FPR32:$Rn)))]>;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16 : BaseSIMDTwoScalar<U, {S,1}, 0b11, opc, FPR16, FPR16, asm,
                                 [(set (f16 FPR16:$Rd), (OpNode (f16 FPR16:$Rn)))]>;
   }
@@ -7698,7 +7698,7 @@ multiclass SIMDPairwiseScalarD<bit U, bits<5> opc, string asm> {
 
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPPairwiseScalar<bit S, bits<5> opc, string asm> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v2i16p : BaseSIMDPairwiseScalar<0, {S,0}, opc, FPR16Op, V64,
                                       asm, ".2h">;
   }
@@ -7763,14 +7763,14 @@ multiclass SIMDAcrossLanesHSD<bit U, bits<5> opcode, string asm> {
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPAcrossLanes<bits<5> opcode, bit sz1, string asm,
                             SDPatternOperator intOp> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16v : BaseSIMDAcrossLanes<0, 0, {sz1, 0}, opcode, FPR16, V64,
                                    asm, ".4h",
         [(set (f16 FPR16:$Rd), (intOp (v4f16 V64:$Rn)))]>;
   def v8i16v : BaseSIMDAcrossLanes<1, 0, {sz1, 0}, opcode, FPR16, V128,
                                    asm, ".8h",
         [(set (f16 FPR16:$Rd), (intOp (v8f16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v4i32v : BaseSIMDAcrossLanes<1, 1, {sz1, 0}, opcode, FPR32, V128,
                                    asm, ".4s",
         [(set FPR32:$Rd, (intOp (v4f32 V128:$Rn)))]>;
@@ -7880,7 +7880,7 @@ class SIMDMovAlias<string asm, string size, Instruction inst,
 multiclass SMov {
   // SMOV with vector index of 0 are legal in Scalable Matrix Extension (SME)
   // streaming mode.
-  let Predicates = [HasNEONorSME] in {
+  let Predicates = [HasNEON] in {
     def vi8to32_idx0 : SIMDSMov<0, ".b", GPR32, VectorIndex0> {
       let Inst{20-16} = 0b00001;
     }
@@ -7927,7 +7927,7 @@ multiclass SMov {
 multiclass UMov {
   // UMOV with vector index of 0 are legal in Scalable Matrix Extension (SME)
   // streaming mode.
-  let Predicates = [HasNEONorSME] in {
+  let Predicates = [HasNEON] in {
     def vi8_idx0 : SIMDUMov<0, ".b", v16i8, GPR32, VectorIndex0> {
       let Inst{20-16} = 0b00001;
     }
@@ -8816,7 +8816,7 @@ multiclass SIMDThreeSameVectorFP8DOT2Index<string asm> {
 multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
                          SDPatternOperator OpNode> {
   let mayRaiseFPException = 1, Uses = [FPCR] in {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16_indexed : BaseSIMDIndexed<0, U, 0, 0b00, opc,
                                       V64, V64,
                                       V128_lo, VectorIndexH,
@@ -8842,7 +8842,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v2i32_indexed : BaseSIMDIndexed<0, U, 0, 0b10, opc,
                                       V64, V64,
@@ -8880,7 +8880,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
     let Inst{21} = 0;
   }
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16_indexed : BaseSIMDIndexed<1, U, 1, 0b00, opc,
                                       FPR16Op, FPR16Op, V128_lo, VectorIndexH,
                                       asm, ".h", "", "", ".h",
@@ -8893,7 +8893,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v1i32_indexed : BaseSIMDIndexed<1, U, 1, 0b10, opc,
                                       FPR32Op, FPR32Op, V128, VectorIndexS,
@@ -8920,7 +8920,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
   }
   } // mayRaiseFPException = 1, Uses = [FPCR]
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : Pat<(f16 (OpNode
                    (f16 (vector_extract (v8f16 V128:$Rn), (i64 0))),
                    (f16 (vector_extract (v8f16 V128:$Rm), VectorIndexH:$idx)))),
@@ -8928,7 +8928,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
               (f16 (EXTRACT_SUBREG V128:$Rn, hsub)), V128:$Rm, VectorIndexH:$idx)>;
   }
 
-  let Predicates = [HasNEON] in {
+  let Predicates = [IsNEONAvailable] in {
   def : Pat<(f32 (OpNode
                    (f32 (vector_extract (v4f32 V128:$Rn), (i64 0))),
                    (f32 (vector_extract (v4f32 V128:$Rm), VectorIndexS:$idx)))),
@@ -8944,7 +8944,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
 }
 
 multiclass SIMDFPIndexedTiedPatterns<string INST, SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   // Patterns for f16: DUPLANE, DUP scalar and vector_extract.
   def : Pat<(v8f16 (OpNode (v8f16 V128:$Rd), (v8f16 V128:$Rn),
                            (AArch64duplane16 (v8f16 V128_lo:$Rm),
@@ -8970,7 +8970,7 @@ multiclass SIMDFPIndexedTiedPatterns<string INST, SDPatternOperator OpNode> {
                          (vector_extract (v8f16 V128_lo:$Rm), VectorIndexH:$idx))),
             (!cast<Instruction>(INST # "v1i16_indexed") FPR16:$Rd, FPR16:$Rn,
                 V128_lo:$Rm, VectorIndexH:$idx)>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   // 2 variants for the .2s version: DUPLANE from 128-bit and DUP scalar.
   def : Pat<(v2f32 (OpNode (v2f32 V64:$Rd), (v2f32 V64:$Rn),
@@ -9021,7 +9021,7 @@ multiclass SIMDFPIndexedTiedPatterns<string INST, SDPatternOperator OpNode> {
 
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16_indexed : BaseSIMDIndexedTied<0, U, 0, 0b00, opc, V64, V64,
                                           V128_lo, VectorIndexH,
                                           asm, ".4h", ".4h", ".4h", ".h", []> {
@@ -9040,7 +9040,7 @@ multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v2i32_indexed : BaseSIMDIndexedTied<0, U, 0, 0b10, opc, V64, V64,
                                           V128, VectorIndexS,
@@ -9068,7 +9068,7 @@ multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
     let Inst{21} = 0;
   }
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16_indexed : BaseSIMDIndexedTied<1, U, 1, 0b00, opc,
                                       FPR16Op, FPR16Op, V128_lo, VectorIndexH,
                                       asm, ".h", "", "", ".h", []> {
@@ -9077,7 +9077,7 @@ multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v1i32_indexed : BaseSIMDIndexedTied<1, U, 1, 0b10, opc,
               ...
[truncated]

efriedma-quic

If I'm understanding correctly, this makes "HasNEONorSME" mean "this target supports NEON encodings", and "HasNEON" means "this target supports NEON encodings, and the current configuration of the SME registers allows executing the full NEON instruction set".

I don't think I like these names... can we rename them in a followup? Maybe use the same names as the Subtarget methods?

That said, the approach here makes sense.

efriedma-quic · 2024-06-14T22:51:24Z

llvm/lib/Target/AArch64/AArch64InstrInfo.td

@@ -5934,7 +5929,7 @@ def : Pat<(v2f64 (AArch64frsqrts (v2f64 FPR128:$Rn), (v2f64 FPR128:$Rm))),
 // Some float -> int -> float conversion patterns for which we want to keep the
 // int values in FP registers using the corresponding NEON instructions to
 // avoid more costly int <-> fp register transfers.
-let Predicates = [HasNEON] in {
+let Predicates = [HasNEONorSME] in {


Comment on the closing brace also needs to be updated.

aemerson · 2024-06-15T00:38:27Z

llvm/lib/Target/AArch64/AArch64InstrInfo.td

+// A subset of NEON instructions are legal in Streaming SVE execution mode.
+def HasNEONorSME     : Predicate<"Subtarget->hasNEON()">,
+                                 AssemblerPredicateWithAll<(any_of FeatureNEON), "neon">;


HasNEONorSME but the predicate doesn't check for SME?

paulwalker-arm · 2024-06-15T13:54:20Z

Personally I don't see value in a mass rename of HasNEON because the runtime mode checking only applies to isel and not assembly. I see the overriding of the predicate as just a convenient way to restrict only isel for the common case of most all NEON instructions not being safe to execute in streaming mode.

I guess I'm worried we'll introduce complexity where it's not needed because the modal behaviour is mostly handled before isel and if we get this far we're already toast. The rational for changing HasNEON is mainly to fail in a nicer way.

That said, I do think we're at the point where HasNEONorSME doesn't represent what we mean (i.e. it's hard to tell but I don't think it's the case that an implementation with SME but not NEON must have these instructions?) so perhaps HasNEONandStreamingSafe is a good way to show the separate feature (i.e. Inst) and runtime (isel) requirements whilst also implying blocks protected by HasNEON are not streaming safe.

This helps identify any failures where the compiler might otherwise silently emit instructions that are not valid for the given runtime mode. We can probably do a similar thing for HasSVE predicates.

…in HasNEONandIsStreamingSafe

paulwalker-arm

One observation but otherwise this looks good to me.

paulwalker-arm · 2024-06-18T11:21:00Z

llvm/lib/Target/AArch64/AArch64InstrInfo.td

@@ -6016,7 +6013,7 @@ def : Pat<(v2f64 (AArch64frsqrts (v2f64 FPR128:$Rn), (v2f64 FPR128:$Rm))),
 // Some float -> int -> float conversion patterns for which we want to keep the
 // int values in FP registers using the corresponding NEON instructions to
 // avoid more costly int <-> fp register transfers.
-let Predicates = [HasNEON] in {
+let Predicates = [HasNEONandIsStreamingSafe] in {


The predicate here doesn't match that used by the instruction's definition (e.g. defm FCVTZS : ...). This looks fine because those definitions don't include any patterns. However, that does suggest the HasNEONandIsStreamingSafe passed into some of those classes (e.g. defm FRECPE : ...) serves no purpose and can be removed rather than changed?

If you keep this change then please update the closing } comment because that still references HasNEON.

Good point, done!

…tructions are affected

…m#95560) This helps identify any failures where the compiler might otherwise silently emit instructions that are not valid for the given runtime mode.

sdesmalen-arm requested review from aemerson and paulwalker-arm June 14, 2024 16:09

llvmbot added backend:AArch64 mc Machine (object) code labels Jun 14, 2024

sdesmalen-arm force-pushed the sme-allow-neon-in-non-streaming-mode-only branch from 3490381 to 3172366 Compare June 14, 2024 17:19

efriedma-quic reviewed Jun 14, 2024

View reviewed changes

aemerson reviewed Jun 15, 2024

View reviewed changes

sdesmalen-arm added 4 commits June 17, 2024 15:57

[AArch64] Let patterns for NEON instructions check runtime mode.

4a33a6f

This helps identify any failures where the compiler might otherwise silently emit instructions that are not valid for the given runtime mode. We can probably do a similar thing for HasSVE predicates.

Rename HasNEONorSME to HasNEONAndIsStreamingSafe

f48cb9c

Fixed missing cases causing build failures, also decapitalised 'and' …

1a85524

…in HasNEONandIsStreamingSafe

Used wrong predicate in previous patch

97f2a88

sdesmalen-arm force-pushed the sme-allow-neon-in-non-streaming-mode-only branch from bd914db to 97f2a88 Compare June 17, 2024 15:16

Fix one more predicate

1c5c3b8

paulwalker-arm approved these changes Jun 18, 2024

View reviewed changes

sdesmalen-arm added 2 commits June 19, 2024 11:22

Fix predicates for UCVTF/SCVTF and update comment.

48c0acd

Move HasNEONandIsStreamingSafe predicate to multiclass, since all ins…

1e52110

…tructions are affected

sdesmalen-arm merged commit 93831c7 into llvm:main Jun 19, 2024
5 of 6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[AArch64] Let patterns for NEON instructions check runtime mode. #95560

[AArch64] Let patterns for NEON instructions check runtime mode. #95560

Uh oh!

sdesmalen-arm commented Jun 14, 2024 •

edited

Loading

Uh oh!

llvmbot commented Jun 14, 2024 •

edited

Loading

Uh oh!

efriedma-quic left a comment

Uh oh!

efriedma-quic Jun 14, 2024

Uh oh!

aemerson Jun 15, 2024

Uh oh!

paulwalker-arm commented Jun 15, 2024 •

edited

Loading

Uh oh!

paulwalker-arm left a comment

Uh oh!

paulwalker-arm Jun 18, 2024

Uh oh!

sdesmalen-arm Jun 19, 2024

Uh oh!

Uh oh!

Uh oh!

[AArch64] Let patterns for NEON instructions check runtime mode. #95560

[AArch64] Let patterns for NEON instructions check runtime mode. #95560

Uh oh!

Conversation

sdesmalen-arm commented Jun 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jun 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

efriedma-quic left a comment

Choose a reason for hiding this comment

Uh oh!

efriedma-quic Jun 14, 2024

Choose a reason for hiding this comment

Uh oh!

aemerson Jun 15, 2024

Choose a reason for hiding this comment

Uh oh!

paulwalker-arm commented Jun 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

paulwalker-arm left a comment

Choose a reason for hiding this comment

Uh oh!

paulwalker-arm Jun 18, 2024

Choose a reason for hiding this comment

Uh oh!

sdesmalen-arm Jun 19, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

sdesmalen-arm commented Jun 14, 2024 •

edited

Loading

llvmbot commented Jun 14, 2024 •

edited

Loading

paulwalker-arm commented Jun 15, 2024 •

edited

Loading