Skip to content

[AArch64] Let patterns for NEON instructions check runtime mode. #95560

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

sdesmalen-arm
Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm commented Jun 14, 2024

This helps identify any failures where the compiler might otherwise
silently emit instructions that are not valid for the given runtime
mode. We can probably do a similar thing for HasSVE predicates.

@llvmbot llvmbot added backend:AArch64 mc Machine (object) code labels Jun 14, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 14, 2024

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-mc

Author: Sander de Smalen (sdesmalen-arm)

Changes

I've had to change:
HasNEON -> IsNeonAvailable (available only in non-streaming mode)
HasNEONorSME -> HasNEON (available in either mode)

In contrast, the Predicate HasSVE (and related) don't need a similar change because all patterns predicated with HasSVE use scalable vector types, and in AArch64ISelLowering we've already made sure that none of those type and operationss are legal if the function is not in the right mode (if it only has +sme, without +sve).


Patch is 106.20 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95560.diff

7 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+2-1)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+57-57)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+48-50)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-load.ll (+107-315)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-store.ll (+174-387)
  • (removed) llvm/test/MC/AArch64/SME/streaming-mode-neon-bf16.s (-16)
  • (removed) llvm/test/MC/AArch64/SME/streaming-mode-neon.s (-132)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index ac6f1e07c4184..7cbae90ef3ca4 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -22185,7 +22185,8 @@ static SDValue vectorToScalarBitmask(SDNode *N, SelectionDAG &DAG) {
   ComparisonResult = DAG.getSExtOrTrunc(ComparisonResult, DL, VecVT);
 
   SmallVector<SDValue, 16> MaskConstants;
-  if (VecVT == MVT::v16i8) {
+  if (DAG.getSubtarget<AArch64Subtarget>().isNeonAvailable() &&
+      VecVT == MVT::v16i8) {
     // v16i8 is a special case, as we have 16 entries but only 8 positional bits
     // per entry. We split it into two halves, apply the mask, zip the halves to
     // create 8x 16-bit values, and the perform the vector reduce.
diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 1f437d0ed6f8d..d67aabeee0010 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -5762,7 +5762,7 @@ multiclass FPMoveImmediate<string asm> {
 // AdvSIMD
 //----------------------------------------------------------------------------
 
-let Predicates = [HasNEON] in {
+let Predicates = [IsNEONAvailable] in {
 
 //----------------------------------------------------------------------------
 // AdvSIMD three register vector instructions
@@ -5966,14 +5966,14 @@ multiclass SIMDThreeSameVectorB<bit U, bits<5> opc, string asm,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDThreeSameVectorFP<bit U, bit S, bits<3> opc,
                                  string asm, SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDThreeSameVector<0, U, {S,0b10}, {0b00,opc}, V64,
                                       asm, ".4h",
         [(set (v4f16 V64:$Rd), (OpNode (v4f16 V64:$Rn), (v4f16 V64:$Rm)))]>;
   def v8f16 : BaseSIMDThreeSameVector<1, U, {S,0b10}, {0b00,opc}, V128,
                                       asm, ".8h",
         [(set (v8f16 V128:$Rd), (OpNode (v8f16 V128:$Rn), (v8f16 V128:$Rm)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDThreeSameVector<0, U, {S,0b01}, {0b11,opc}, V64,
                                       asm, ".2s",
         [(set (v2f32 V64:$Rd), (OpNode (v2f32 V64:$Rn), (v2f32 V64:$Rm)))]>;
@@ -5989,14 +5989,14 @@ let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDThreeSameVectorFPCmp<bit U, bit S, bits<3> opc,
                                     string asm,
                                     SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDThreeSameVector<0, U, {S,0b10}, {0b00,opc}, V64,
                                       asm, ".4h",
         [(set (v4i16 V64:$Rd), (OpNode (v4f16 V64:$Rn), (v4f16 V64:$Rm)))]>;
   def v8f16 : BaseSIMDThreeSameVector<1, U, {S,0b10}, {0b00,opc}, V128,
                                       asm, ".8h",
         [(set (v8i16 V128:$Rd), (OpNode (v8f16 V128:$Rn), (v8f16 V128:$Rm)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDThreeSameVector<0, U, {S,0b01}, {0b11,opc}, V64,
                                       asm, ".2s",
         [(set (v2i32 V64:$Rd), (OpNode (v2f32 V64:$Rn), (v2f32 V64:$Rm)))]>;
@@ -6011,7 +6011,7 @@ multiclass SIMDThreeSameVectorFPCmp<bit U, bit S, bits<3> opc,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDThreeSameVectorFPTied<bit U, bit S, bits<3> opc,
                                  string asm, SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDThreeSameVectorTied<0, U, {S,0b10}, {0b00,opc}, V64,
                                       asm, ".4h",
      [(set (v4f16 V64:$dst),
@@ -6020,7 +6020,7 @@ multiclass SIMDThreeSameVectorFPTied<bit U, bit S, bits<3> opc,
                                       asm, ".8h",
      [(set (v8f16 V128:$dst),
            (OpNode (v8f16 V128:$Rd), (v8f16 V128:$Rn), (v8f16 V128:$Rm)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDThreeSameVectorTied<0, U, {S,0b01}, {0b11,opc}, V64,
                                       asm, ".2s",
      [(set (v2f32 V64:$dst),
@@ -6480,14 +6480,14 @@ multiclass SIMDTwoVectorFP<bit U, bit S, bits<5> opc, string asm,
                            SDPatternOperator OpNode,
                            int fpexceptions = 1> {
   let mayRaiseFPException = fpexceptions, Uses = !if(fpexceptions,[FPCR],[]<Register>) in {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDTwoSameVector<0, U, {S,1}, opc, 0b11, V64,
                                 asm, ".4h", ".4h",
                           [(set (v4f16 V64:$Rd), (OpNode (v4f16 V64:$Rn)))]>;
   def v8f16 : BaseSIMDTwoSameVector<1, U, {S,1}, opc, 0b11, V128,
                                 asm, ".8h", ".8h",
                           [(set (v8f16 V128:$Rd), (OpNode (v8f16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDTwoSameVector<0, U, {S,0}, opc, 0b00, V64,
                                 asm, ".2s", ".2s",
                           [(set (v2f32 V64:$Rd), (OpNode (v2f32 V64:$Rn)))]>;
@@ -6538,14 +6538,14 @@ multiclass SIMDTwoVectorS<bit U, bit S, bits<5> opc, string asm,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDTwoVectorFPToInt<bit U, bit S, bits<5> opc, string asm,
                            SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDTwoSameVector<0, U, {S,1}, opc, 0b11, V64,
                                 asm, ".4h", ".4h",
                           [(set (v4i16 V64:$Rd), (OpNode (v4f16 V64:$Rn)))]>;
   def v8f16 : BaseSIMDTwoSameVector<1, U, {S,1}, opc, 0b11, V128,
                                 asm, ".8h", ".8h",
                           [(set (v8i16 V128:$Rd), (OpNode (v8f16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDTwoSameVector<0, U, {S,0}, opc, 0b00, V64,
                                 asm, ".2s", ".2s",
                           [(set (v2i32 V64:$Rd), (OpNode (v2f32 V64:$Rn)))]>;
@@ -6560,14 +6560,14 @@ multiclass SIMDTwoVectorFPToInt<bit U, bit S, bits<5> opc, string asm,
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDTwoVectorIntToFP<bit U, bit S, bits<5> opc, string asm,
                            SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4f16 : BaseSIMDTwoSameVector<0, U, {S,1}, opc, 0b11, V64,
                                 asm, ".4h", ".4h",
                           [(set (v4f16 V64:$Rd), (OpNode (v4i16 V64:$Rn)))]>;
   def v8f16 : BaseSIMDTwoSameVector<1, U, {S,1}, opc, 0b11, V128,
                                 asm, ".8h", ".8h",
                           [(set (v8f16 V128:$Rd), (OpNode (v8i16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2f32 : BaseSIMDTwoSameVector<0, U, {S,0}, opc, 0b00, V64,
                                 asm, ".2s", ".2s",
                           [(set (v2f32 V64:$Rd), (OpNode (v2i32 V64:$Rn)))]>;
@@ -6720,14 +6720,14 @@ multiclass SIMDFPCmpTwoVector<bit U, bit S, bits<5> opc,
                               string asm, SDNode OpNode> {
 
   let mayRaiseFPException = 1, Uses = [FPCR] in {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16rz : BaseSIMDCmpTwoVector<0, U, {S,1}, 0b11, opc, V64,
                                      asm, ".4h", "0.0",
                                      v4i16, v4f16, OpNode>;
   def v8i16rz : BaseSIMDCmpTwoVector<1, U, {S,1}, 0b11, opc, V128,
                                      asm, ".8h", "0.0",
                                      v8i16, v8f16, OpNode>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v2i32rz : BaseSIMDCmpTwoVector<0, U, {S,0}, 0b00, opc, V64,
                                      asm, ".2s", "0.0",
                                      v2i32, v2f32, OpNode>;
@@ -6739,7 +6739,7 @@ multiclass SIMDFPCmpTwoVector<bit U, bit S, bits<5> opc,
                                      v2i64, v2f64, OpNode>;
   }
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : InstAlias<asm # "\t$Vd.4h, $Vn.4h, #0",
                   (!cast<Instruction>(NAME # v4i16rz) V64:$Vd, V64:$Vn), 0>;
   def : InstAlias<asm # "\t$Vd.8h, $Vn.8h, #0",
@@ -6751,7 +6751,7 @@ multiclass SIMDFPCmpTwoVector<bit U, bit S, bits<5> opc,
                   (!cast<Instruction>(NAME # v4i32rz) V128:$Vd, V128:$Vn), 0>;
   def : InstAlias<asm # "\t$Vd.2d, $Vn.2d, #0",
                   (!cast<Instruction>(NAME # v2i64rz) V128:$Vd, V128:$Vn), 0>;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : InstAlias<asm # ".4h\t$Vd, $Vn, #0",
                   (!cast<Instruction>(NAME # v4i16rz) V64:$Vd, V64:$Vn), 0>;
   def : InstAlias<asm # ".8h\t$Vd, $Vn, #0",
@@ -7394,7 +7394,7 @@ multiclass SIMDThreeScalarHSTied<bit U, bit R, bits<5> opc, string asm> {
 
 multiclass SIMDFPThreeScalar<bit U, bit S, bits<3> opc, string asm,
                              SDPatternOperator OpNode = null_frag,
-                             Predicate pred = HasNEON> {
+                             Predicate pred = IsNEONAvailable> {
   let mayLoad = 0, mayStore = 0, hasSideEffects = 0, mayRaiseFPException = 1, Uses = [FPCR] in {
     let Predicates = [pred] in {
     def NAME#64 : BaseSIMDThreeScalar<U, {S,0b11}, {0b11,opc}, FPR64, asm,
@@ -7419,10 +7419,10 @@ multiclass SIMDThreeScalarFPCmp<bit U, bit S, bits<3> opc, string asm,
       [(set (i64 FPR64:$Rd), (OpNode (f64 FPR64:$Rn), (f64 FPR64:$Rm)))]>;
     def NAME#32 : BaseSIMDThreeScalar<U, {S,0b01}, {0b11,opc}, FPR32, asm,
       [(set (i32 FPR32:$Rd), (OpNode (f32 FPR32:$Rn), (f32 FPR32:$Rm)))]>;
-    let Predicates = [HasNEON, HasFullFP16] in {
+    let Predicates = [IsNEONAvailable, HasFullFP16] in {
     def NAME#16 : BaseSIMDThreeScalar<U, {S,0b10}, {0b00,opc}, FPR16, asm,
       []>;
-    } // Predicates = [HasNEON, HasFullFP16]
+    } // Predicates = [IsNEONAvailable, HasFullFP16]
   }
 
   def : Pat<(v1i64 (OpNode (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),
@@ -7571,7 +7571,7 @@ multiclass SIMDFPCmpTwoScalar<bit U, bit S, bits<5> opc, string asm,
   let mayRaiseFPException = 1, Uses = [FPCR] in {
   def v1i64rz  : BaseSIMDCmpTwoScalar<U, {S,1}, 0b00, opc, FPR64, asm, "0.0">;
   def v1i32rz  : BaseSIMDCmpTwoScalar<U, {S,0}, 0b00, opc, FPR32, asm, "0.0">;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16rz  : BaseSIMDCmpTwoScalar<U, {S,1}, 0b11, opc, FPR16, asm, "0.0">;
   }
   }
@@ -7580,7 +7580,7 @@ multiclass SIMDFPCmpTwoScalar<bit U, bit S, bits<5> opc, string asm,
                   (!cast<Instruction>(NAME # v1i64rz) FPR64:$Rd, FPR64:$Rn), 0>;
   def : InstAlias<asm # "\t$Rd, $Rn, #0",
                   (!cast<Instruction>(NAME # v1i32rz) FPR32:$Rd, FPR32:$Rn), 0>;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : InstAlias<asm # "\t$Rd, $Rn, #0",
                   (!cast<Instruction>(NAME # v1i16rz) FPR16:$Rd, FPR16:$Rn), 0>;
   }
@@ -7603,7 +7603,7 @@ multiclass SIMDTwoScalarD<bit U, bits<5> opc, string asm,
 
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPTwoScalar<bit U, bit S, bits<5> opc, string asm,
-                           Predicate pred = HasNEON> {
+                           Predicate pred = IsNEONAvailable> {
   let Predicates = [pred] in {
   def v1i64       : BaseSIMDTwoScalar<U, {S,1}, 0b00, opc, FPR64, FPR64, asm,[]>;
   def v1i32       : BaseSIMDTwoScalar<U, {S,0}, 0b00, opc, FPR32, FPR32, asm,[]>;
@@ -7620,7 +7620,7 @@ multiclass SIMDFPTwoScalarCVT<bit U, bit S, bits<5> opc, string asm,
                                 [(set FPR64:$Rd, (OpNode (f64 FPR64:$Rn)))]>;
   def v1i32 : BaseSIMDTwoScalar<U, {S,0}, 0b00, opc, FPR32, FPR32, asm,
                                 [(set FPR32:$Rd, (OpNode (f32 FPR32:$Rn)))]>;
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16 : BaseSIMDTwoScalar<U, {S,1}, 0b11, opc, FPR16, FPR16, asm,
                                 [(set (f16 FPR16:$Rd), (OpNode (f16 FPR16:$Rn)))]>;
   }
@@ -7698,7 +7698,7 @@ multiclass SIMDPairwiseScalarD<bit U, bits<5> opc, string asm> {
 
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPPairwiseScalar<bit S, bits<5> opc, string asm> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v2i16p : BaseSIMDPairwiseScalar<0, {S,0}, opc, FPR16Op, V64,
                                       asm, ".2h">;
   }
@@ -7763,14 +7763,14 @@ multiclass SIMDAcrossLanesHSD<bit U, bits<5> opcode, string asm> {
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPAcrossLanes<bits<5> opcode, bit sz1, string asm,
                             SDPatternOperator intOp> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16v : BaseSIMDAcrossLanes<0, 0, {sz1, 0}, opcode, FPR16, V64,
                                    asm, ".4h",
         [(set (f16 FPR16:$Rd), (intOp (v4f16 V64:$Rn)))]>;
   def v8i16v : BaseSIMDAcrossLanes<1, 0, {sz1, 0}, opcode, FPR16, V128,
                                    asm, ".8h",
         [(set (f16 FPR16:$Rd), (intOp (v8f16 V128:$Rn)))]>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
   def v4i32v : BaseSIMDAcrossLanes<1, 1, {sz1, 0}, opcode, FPR32, V128,
                                    asm, ".4s",
         [(set FPR32:$Rd, (intOp (v4f32 V128:$Rn)))]>;
@@ -7880,7 +7880,7 @@ class SIMDMovAlias<string asm, string size, Instruction inst,
 multiclass SMov {
   // SMOV with vector index of 0 are legal in Scalable Matrix Extension (SME)
   // streaming mode.
-  let Predicates = [HasNEONorSME] in {
+  let Predicates = [HasNEON] in {
     def vi8to32_idx0 : SIMDSMov<0, ".b", GPR32, VectorIndex0> {
       let Inst{20-16} = 0b00001;
     }
@@ -7927,7 +7927,7 @@ multiclass SMov {
 multiclass UMov {
   // UMOV with vector index of 0 are legal in Scalable Matrix Extension (SME)
   // streaming mode.
-  let Predicates = [HasNEONorSME] in {
+  let Predicates = [HasNEON] in {
     def vi8_idx0 : SIMDUMov<0, ".b", v16i8, GPR32, VectorIndex0> {
       let Inst{20-16} = 0b00001;
     }
@@ -8816,7 +8816,7 @@ multiclass SIMDThreeSameVectorFP8DOT2Index<string asm> {
 multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
                          SDPatternOperator OpNode> {
   let mayRaiseFPException = 1, Uses = [FPCR] in {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16_indexed : BaseSIMDIndexed<0, U, 0, 0b00, opc,
                                       V64, V64,
                                       V128_lo, VectorIndexH,
@@ -8842,7 +8842,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v2i32_indexed : BaseSIMDIndexed<0, U, 0, 0b10, opc,
                                       V64, V64,
@@ -8880,7 +8880,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
     let Inst{21} = 0;
   }
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16_indexed : BaseSIMDIndexed<1, U, 1, 0b00, opc,
                                       FPR16Op, FPR16Op, V128_lo, VectorIndexH,
                                       asm, ".h", "", "", ".h",
@@ -8893,7 +8893,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v1i32_indexed : BaseSIMDIndexed<1, U, 1, 0b10, opc,
                                       FPR32Op, FPR32Op, V128, VectorIndexS,
@@ -8920,7 +8920,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
   }
   } // mayRaiseFPException = 1, Uses = [FPCR]
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def : Pat<(f16 (OpNode
                    (f16 (vector_extract (v8f16 V128:$Rn), (i64 0))),
                    (f16 (vector_extract (v8f16 V128:$Rm), VectorIndexH:$idx)))),
@@ -8928,7 +8928,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
               (f16 (EXTRACT_SUBREG V128:$Rn, hsub)), V128:$Rm, VectorIndexH:$idx)>;
   }
 
-  let Predicates = [HasNEON] in {
+  let Predicates = [IsNEONAvailable] in {
   def : Pat<(f32 (OpNode
                    (f32 (vector_extract (v4f32 V128:$Rn), (i64 0))),
                    (f32 (vector_extract (v4f32 V128:$Rm), VectorIndexS:$idx)))),
@@ -8944,7 +8944,7 @@ multiclass SIMDFPIndexed<bit U, bits<4> opc, string asm,
 }
 
 multiclass SIMDFPIndexedTiedPatterns<string INST, SDPatternOperator OpNode> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   // Patterns for f16: DUPLANE, DUP scalar and vector_extract.
   def : Pat<(v8f16 (OpNode (v8f16 V128:$Rd), (v8f16 V128:$Rn),
                            (AArch64duplane16 (v8f16 V128_lo:$Rm),
@@ -8970,7 +8970,7 @@ multiclass SIMDFPIndexedTiedPatterns<string INST, SDPatternOperator OpNode> {
                          (vector_extract (v8f16 V128_lo:$Rm), VectorIndexH:$idx))),
             (!cast<Instruction>(INST # "v1i16_indexed") FPR16:$Rd, FPR16:$Rn,
                 V128_lo:$Rm, VectorIndexH:$idx)>;
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   // 2 variants for the .2s version: DUPLANE from 128-bit and DUP scalar.
   def : Pat<(v2f32 (OpNode (v2f32 V64:$Rd), (v2f32 V64:$Rn),
@@ -9021,7 +9021,7 @@ multiclass SIMDFPIndexedTiedPatterns<string INST, SDPatternOperator OpNode> {
 
 let mayRaiseFPException = 1, Uses = [FPCR] in
 multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v4i16_indexed : BaseSIMDIndexedTied<0, U, 0, 0b00, opc, V64, V64,
                                           V128_lo, VectorIndexH,
                                           asm, ".4h", ".4h", ".4h", ".h", []> {
@@ -9040,7 +9040,7 @@ multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v2i32_indexed : BaseSIMDIndexedTied<0, U, 0, 0b10, opc, V64, V64,
                                           V128, VectorIndexS,
@@ -9068,7 +9068,7 @@ multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
     let Inst{21} = 0;
   }
 
-  let Predicates = [HasNEON, HasFullFP16] in {
+  let Predicates = [IsNEONAvailable, HasFullFP16] in {
   def v1i16_indexed : BaseSIMDIndexedTied<1, U, 1, 0b00, opc,
                                       FPR16Op, FPR16Op, V128_lo, VectorIndexH,
                                       asm, ".h", "", "", ".h", []> {
@@ -9077,7 +9077,7 @@ multiclass SIMDFPIndexedTied<bit U, bits<4> opc, string asm> {
     let Inst{21} = idx{1};
     let Inst{20} = idx{0};
   }
-  } // Predicates = [HasNEON, HasFullFP16]
+  } // Predicates = [IsNEONAvailable, HasFullFP16]
 
   def v1i32_indexed : BaseSIMDIndexedTied<1, U, 1, 0b10, opc,
               ...
[truncated]

@sdesmalen-arm sdesmalen-arm force-pushed the sme-allow-neon-in-non-streaming-mode-only branch from 3490381 to 3172366 Compare June 14, 2024 17:19
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding correctly, this makes "HasNEONorSME" mean "this target supports NEON encodings", and "HasNEON" means "this target supports NEON encodings, and the current configuration of the SME registers allows executing the full NEON instruction set".

I don't think I like these names... can we rename them in a followup? Maybe use the same names as the Subtarget methods?

That said, the approach here makes sense.

@@ -5934,7 +5929,7 @@ def : Pat<(v2f64 (AArch64frsqrts (v2f64 FPR128:$Rn), (v2f64 FPR128:$Rm))),
// Some float -> int -> float conversion patterns for which we want to keep the
// int values in FP registers using the corresponding NEON instructions to
// avoid more costly int <-> fp register transfers.
let Predicates = [HasNEON] in {
let Predicates = [HasNEONorSME] in {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on the closing brace also needs to be updated.

Comment on lines 237 to 239
// A subset of NEON instructions are legal in Streaming SVE execution mode.
def HasNEONorSME : Predicate<"Subtarget->hasNEON()">,
AssemblerPredicateWithAll<(any_of FeatureNEON), "neon">;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HasNEONorSME but the predicate doesn't check for SME?

@paulwalker-arm
Copy link
Collaborator

paulwalker-arm commented Jun 15, 2024

Personally I don't see value in a mass rename of HasNEON because the runtime mode checking only applies to isel and not assembly. I see the overriding of the predicate as just a convenient way to restrict only isel for the common case of most all NEON instructions not being safe to execute in streaming mode.

I guess I'm worried we'll introduce complexity where it's not needed because the modal behaviour is mostly handled before isel and if we get this far we're already toast. The rational for changing HasNEON is mainly to fail in a nicer way.

That said, I do think we're at the point where HasNEONorSME doesn't represent what we mean (i.e. it's hard to tell but I don't think it's the case that an implementation with SME but not NEON must have these instructions?) so perhaps HasNEONandStreamingSafe is a good way to show the separate feature (i.e. Inst) and runtime (isel) requirements whilst also implying blocks protected by HasNEON are not streaming safe.

This helps identify any failures where the compiler might otherwise
silently emit instructions that are not valid for the given runtime
mode. We can probably do a similar thing for HasSVE predicates.
@sdesmalen-arm sdesmalen-arm force-pushed the sme-allow-neon-in-non-streaming-mode-only branch from bd914db to 97f2a88 Compare June 17, 2024 15:16
Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One observation but otherwise this looks good to me.

@@ -6016,7 +6013,7 @@ def : Pat<(v2f64 (AArch64frsqrts (v2f64 FPR128:$Rn), (v2f64 FPR128:$Rm))),
// Some float -> int -> float conversion patterns for which we want to keep the
// int values in FP registers using the corresponding NEON instructions to
// avoid more costly int <-> fp register transfers.
let Predicates = [HasNEON] in {
let Predicates = [HasNEONandIsStreamingSafe] in {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The predicate here doesn't match that used by the instruction's definition (e.g. defm FCVTZS : ...). This looks fine because those definitions don't include any patterns. However, that does suggest the HasNEONandIsStreamingSafe passed into some of those classes (e.g. defm FRECPE : ...) serves no purpose and can be removed rather than changed?

If you keep this change then please update the closing } comment because that still references HasNEON.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done!

@sdesmalen-arm sdesmalen-arm merged commit 93831c7 into llvm:main Jun 19, 2024
5 of 6 checks passed
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
…m#95560)

This helps identify any failures where the compiler might otherwise
silently emit instructions that are not valid for the given runtime mode.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants