[AArch64][InstCombine] Bail from combining SRAD on +/-1 divisor #109274

MDevereau · 2024-09-19T11:58:47Z

This fixes a crash when svdiv's third parameter is svdup_s64(1)

MDevereau · 2024-09-19T11:59:06Z

Related to #109155

llvmbot · 2024-09-19T11:59:19Z

@llvm/pr-subscribers-backend-aarch64

Author: Matthew Devereau (MDevereau)

Changes

This fixes a crash when svdiv's third parameter is svdup_s64(1)

Full diff: https://github.com/llvm/llvm-project/pull/109274.diff

2 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+3)
(modified) llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll (+24)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 649ba1ac8749f5..686503a794d7be 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -1972,7 +1972,10 @@ static std::optional<Instruction *> instCombineSVESDIV(InstCombiner &IC,
   ConstantInt *SplatConstantInt = dyn_cast_or_null<ConstantInt>(SplatValue);
   if (!SplatConstantInt)
     return std::nullopt;
+
   APInt Divisor = SplatConstantInt->getValue();
+  if (Divisor.abs().getZExtValue() == 1)
+    return std::nullopt;
 
   if (Divisor.isPowerOf2()) {
     Constant *DivisorLog2 = ConstantInt::get(Int32Ty, Divisor.logBase2());
diff --git a/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll b/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll
index 68d871355e2a23..dc134ef109fe75 100644
--- a/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll
+++ b/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll
@@ -68,6 +68,30 @@ define <vscale x 4 x i32> @sdiv_i32_not_zero(<vscale x 4 x i32> %a, <vscale x 4
   ret <vscale x 4 x i32> %out
 }
 
+; Don't instcombine to SRAD when the divisor is (+/-)1
+define <vscale x 2 x i64> @divide_by_1(<vscale x 16 x i1> %p, <vscale x 2 x i64> %a) #0 {
+; CHECK-LABEL: @divide_by_1(
+; CHECK-NEXT:    [[TMP1:%.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[P:%.*]])
+; CHECK-NEXT:    [[TMP2:%.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> [[TMP1]], <vscale x 2 x i64> [[A:%.*]], <vscale x 2 x i64> shufflevector (<vscale x 2 x i64> insertelement (<vscale x 2 x i64> poison, i64 1, i64 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer))
+; CHECK-NEXT:    ret <vscale x 2 x i64> [[TMP2]]
+;
+  %1 = call <vscale x 2 x i64> @llvm.aarch64.sve.dup.x.nxv2i64(i64 1)
+  %2 = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> %p)
+  %3 = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> %2, <vscale x 2 x i64> %a, <vscale x 2 x i64> %1)
+  ret <vscale x 2 x i64> %3
+}
+
+define <vscale x 2 x i64> @divide_by_m1(<vscale x 16 x i1> %p, <vscale x 2 x i64> %a) #0 {
+; CHECK-LABEL: @divide_by_m1(
+; CHECK-NEXT:    [[TMP1:%.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[P:%.*]])
+; CHECK-NEXT:    [[TMP2:%.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> [[TMP1]], <vscale x 2 x i64> [[A:%.*]], <vscale x 2 x i64> shufflevector (<vscale x 2 x i64> insertelement (<vscale x 2 x i64> poison, i64 -1, i64 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer))
+; CHECK-NEXT:    ret <vscale x 2 x i64> [[TMP2]]
+;
+  %1 = call <vscale x 2 x i64> @llvm.aarch64.sve.dup.x.nxv2i64(i64 -1)
+  %2 = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> %p)
+  %3 = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> %2, <vscale x 2 x i64> %a, <vscale x 2 x i64> %1)
+  ret <vscale x 2 x i64> %3
+}
 
 declare <vscale x 4 x i32> @llvm.aarch64.sve.sdiv.nxv4i32(<vscale x 4 x i1>, <vscale x 4 x i32>, <vscale x 4 x i32>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1>, <vscale x 2 x i64>, <vscale x 2 x i64>)

llvmbot · 2024-09-19T11:59:19Z

@llvm/pr-subscribers-llvm-transforms

Author: Matthew Devereau (MDevereau)

Changes

This fixes a crash when svdiv's third parameter is svdup_s64(1)

Full diff: https://github.com/llvm/llvm-project/pull/109274.diff

2 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+3)
(modified) llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll (+24)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 649ba1ac8749f5..686503a794d7be 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -1972,7 +1972,10 @@ static std::optional<Instruction *> instCombineSVESDIV(InstCombiner &IC,
   ConstantInt *SplatConstantInt = dyn_cast_or_null<ConstantInt>(SplatValue);
   if (!SplatConstantInt)
     return std::nullopt;
+
   APInt Divisor = SplatConstantInt->getValue();
+  if (Divisor.abs().getZExtValue() == 1)
+    return std::nullopt;
 
   if (Divisor.isPowerOf2()) {
     Constant *DivisorLog2 = ConstantInt::get(Int32Ty, Divisor.logBase2());
diff --git a/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll b/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll
index 68d871355e2a23..dc134ef109fe75 100644
--- a/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll
+++ b/llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-sdiv.ll
@@ -68,6 +68,30 @@ define <vscale x 4 x i32> @sdiv_i32_not_zero(<vscale x 4 x i32> %a, <vscale x 4
   ret <vscale x 4 x i32> %out
 }
 
+; Don't instcombine to SRAD when the divisor is (+/-)1
+define <vscale x 2 x i64> @divide_by_1(<vscale x 16 x i1> %p, <vscale x 2 x i64> %a) #0 {
+; CHECK-LABEL: @divide_by_1(
+; CHECK-NEXT:    [[TMP1:%.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[P:%.*]])
+; CHECK-NEXT:    [[TMP2:%.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> [[TMP1]], <vscale x 2 x i64> [[A:%.*]], <vscale x 2 x i64> shufflevector (<vscale x 2 x i64> insertelement (<vscale x 2 x i64> poison, i64 1, i64 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer))
+; CHECK-NEXT:    ret <vscale x 2 x i64> [[TMP2]]
+;
+  %1 = call <vscale x 2 x i64> @llvm.aarch64.sve.dup.x.nxv2i64(i64 1)
+  %2 = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> %p)
+  %3 = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> %2, <vscale x 2 x i64> %a, <vscale x 2 x i64> %1)
+  ret <vscale x 2 x i64> %3
+}
+
+define <vscale x 2 x i64> @divide_by_m1(<vscale x 16 x i1> %p, <vscale x 2 x i64> %a) #0 {
+; CHECK-LABEL: @divide_by_m1(
+; CHECK-NEXT:    [[TMP1:%.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[P:%.*]])
+; CHECK-NEXT:    [[TMP2:%.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> [[TMP1]], <vscale x 2 x i64> [[A:%.*]], <vscale x 2 x i64> shufflevector (<vscale x 2 x i64> insertelement (<vscale x 2 x i64> poison, i64 -1, i64 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer))
+; CHECK-NEXT:    ret <vscale x 2 x i64> [[TMP2]]
+;
+  %1 = call <vscale x 2 x i64> @llvm.aarch64.sve.dup.x.nxv2i64(i64 -1)
+  %2 = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> %p)
+  %3 = call <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1> %2, <vscale x 2 x i64> %a, <vscale x 2 x i64> %1)
+  ret <vscale x 2 x i64> %3
+}
 
 declare <vscale x 4 x i32> @llvm.aarch64.sve.sdiv.nxv4i32(<vscale x 4 x i1>, <vscale x 4 x i32>, <vscale x 4 x i32>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.sdiv.nxv2i64(<vscale x 2 x i1>, <vscale x 2 x i64>, <vscale x 2 x i64>)

paulwalker-arm · 2024-09-19T12:06:39Z

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

  APInt Divisor = SplatConstantInt->getValue();
+  if (Divisor.abs().getZExtValue() == 1)
+    return std::nullopt;


For the +1 case, would it be wrong to always return Vec here?

Yeah that makes sense. I guess the predicate wouldn't matter in that case since it would be a no-op on the whole vector

[AArch64][InstCombine] Bail from combining SRAD on +/-1 divisor

b582da3

This fixes a crash when svdiv's third parameter is svdup_s64(1)

MDevereau requested review from davemgreen and paulwalker-arm September 19, 2024 11:58

llvmbot added backend:AArch64 llvm:transforms labels Sep 19, 2024

paulwalker-arm reviewed Sep 19, 2024

View reviewed changes

MDevereau added 2 commits September 19, 2024 12:52

Return 2nd param when divisor is 1

2a7cd16

/s/II.getOperand(1)/Vec

8d19ecd

paulwalker-arm approved these changes Sep 20, 2024

View reviewed changes

MDevereau merged commit 1808fc1 into llvm:main Sep 20, 2024
8 checks passed

MDevereau deleted the srad branch September 20, 2024 12:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[AArch64][InstCombine] Bail from combining SRAD on +/-1 divisor #109274

[AArch64][InstCombine] Bail from combining SRAD on +/-1 divisor #109274

Uh oh!

MDevereau commented Sep 19, 2024

Uh oh!

MDevereau commented Sep 19, 2024

Uh oh!

llvmbot commented Sep 19, 2024

Uh oh!

llvmbot commented Sep 19, 2024

Uh oh!

paulwalker-arm Sep 19, 2024

Uh oh!

MDevereau Sep 19, 2024

Uh oh!

Uh oh!

Uh oh!

[AArch64][InstCombine] Bail from combining SRAD on +/-1 divisor #109274

[AArch64][InstCombine] Bail from combining SRAD on +/-1 divisor #109274

Uh oh!

Conversation

MDevereau commented Sep 19, 2024

Uh oh!

MDevereau commented Sep 19, 2024

Uh oh!

llvmbot commented Sep 19, 2024

Uh oh!

llvmbot commented Sep 19, 2024

Uh oh!

paulwalker-arm Sep 19, 2024

Choose a reason for hiding this comment

Uh oh!

MDevereau Sep 19, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!