
[GISel] LegalizationArtifactCombiner: Elide redundant G_SEXT_INREG #93687


Merged

Conversation

s-barannikov
Contributor

This is similar to 373c343, but for targets with zero-or-negative-one booleans.
The difference in tests is mostly due to G_SEXT_INREG being illegal for some targets, in which case it gets expanded into a G_SHL/G_ASHR pair, which is not currently optimized by the combiner.

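A minimal MIR-style sketch (illustrative only, not taken from the patch or its tests) of the kind of sequence this targets, assuming a target whose legalized G_ICMP result has every bit equal to the sign bit (a zero-or-negative-one boolean):

    %cmp:_(s32) = G_ICMP intpred(slt), %a(s32), %b
    %t:_(s1) = G_TRUNC %cmp(s32)
    %ext:_(s32) = G_SEXT %t(s1)

The artifact combiner already folds the trunc/sext pair into %ext = G_SEXT_INREG %cmp, 1 (or, where G_SEXT_INREG is illegal, into a G_SHL/G_ASHR pair). With this change, when known-bits analysis can prove that %cmp is already sign-extended from bit 0, %ext simply reuses %cmp and no extra instructions are emitted.
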
@llvmbot
Member

llvmbot commented May 29, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-globalisel

Author: Sergei Barannikov (s-barannikov)

Changes

This is similar to 373c343, but for targets with zero-or-negative-one booleans.
The difference in tests is mostly due to G_SEXT_INREG being illegal for some targets, in which case it gets expanded into a G_SHL/G_ASHR pair, which is not currently optimized by the combiner.


Patch is 34.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93687.diff

13 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h (+11-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-min-max.mir (+24-36)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-select.mir (+6-9)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir (+1-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir (+1-2)
  • (modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-exts.mir (+1-4)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/constants.mir (+2-4)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/rem_and_div.ll (+8-24)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/sitofp_and_uitofp.ll (+4-12)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/jumptable.ll (-3)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-abs-rv32.mir (+13-19)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-abs-rv64.mir (+18-26)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-jump-table-brjt-rv64.mir (+1-2)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h b/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
index 2efc48e3be4c5..c6267ac67c12e 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
@@ -192,7 +192,8 @@ class LegalizationArtifactCombiner {
 
   bool tryCombineSExt(MachineInstr &MI,
                       SmallVectorImpl<MachineInstr *> &DeadInsts,
-                      SmallVectorImpl<Register> &UpdatedDefs) {
+                      SmallVectorImpl<Register> &UpdatedDefs,
+                      GISelObserverWrapper &Observer) {
     using namespace llvm::MIPatternMatch;
     assert(MI.getOpcode() == TargetOpcode::G_SEXT);
 
@@ -211,7 +212,14 @@ class LegalizationArtifactCombiner {
       uint64_t SizeInBits = SrcTy.getScalarSizeInBits();
       if (DstTy != MRI.getType(TruncSrc))
         TruncSrc = Builder.buildAnyExtOrTrunc(DstTy, TruncSrc).getReg(0);
-      Builder.buildSExtInReg(DstReg, TruncSrc, SizeInBits);
+      // Elide G_SEXT_INREG if possible. This is similar to eliding G_AND in
+      // tryCombineZExt. Refer to the comment in tryCombineZExt for rationale.
+      if (KB && KB->computeNumSignBits(TruncSrc) >=
+                    DstTy.getScalarSizeInBits() - SizeInBits + 1)
+        replaceRegOrBuildCopy(DstReg, TruncSrc, MRI, Builder, UpdatedDefs,
+                              Observer);
+      else
+        Builder.buildSExtInReg(DstReg, TruncSrc, SizeInBits);
       markInstAndDefDead(MI, *MRI.getVRegDef(SrcReg), DeadInsts);
       return true;
     }
@@ -1322,7 +1330,7 @@ class LegalizationArtifactCombiner {
       Changed = tryCombineZExt(MI, DeadInsts, UpdatedDefs, WrapperObserver);
       break;
     case TargetOpcode::G_SEXT:
-      Changed = tryCombineSExt(MI, DeadInsts, UpdatedDefs);
+      Changed = tryCombineSExt(MI, DeadInsts, UpdatedDefs, WrapperObserver);
       break;
     case TargetOpcode::G_UNMERGE_VALUES:
       Changed = tryCombineUnmergeValues(cast<GUnmerge>(MI), DeadInsts,
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-min-max.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-min-max.mir
index 35c9538627b35..403c12d1f4235 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-min-max.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-min-max.mir
@@ -221,11 +221,10 @@ body: |
     ; CHECK-NEXT: %vec:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: %vec1:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(slt), %vec(<2 x s64>), %vec1
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND %vec1, [[XOR]]
     ; CHECK-NEXT: %smin:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: $q0 = COPY %smin(<2 x s64>)
@@ -249,17 +248,15 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[DEF:%[0-9]+]]:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(slt), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR]]
     ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(slt), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG1:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP1]], 1
-    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG1]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG1]]
+    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP1]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP1]]
     ; CHECK-NEXT: [[AND3:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR1]]
     ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND2]], [[AND3]]
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
@@ -494,11 +491,10 @@ body: |
     ; CHECK-NEXT: %vec:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: %vec1:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(ult), %vec(<2 x s64>), %vec1
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND %vec1, [[XOR]]
     ; CHECK-NEXT: %umin:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: $q0 = COPY %umin(<2 x s64>)
@@ -522,17 +518,15 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[DEF:%[0-9]+]]:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(ult), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR]]
     ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(ult), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG1:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP1]], 1
-    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG1]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG1]]
+    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP1]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP1]]
     ; CHECK-NEXT: [[AND3:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR1]]
     ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND2]], [[AND3]]
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
@@ -767,11 +761,10 @@ body: |
     ; CHECK-NEXT: %vec:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: %vec1:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(sgt), %vec(<2 x s64>), %vec1
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND %vec1, [[XOR]]
     ; CHECK-NEXT: %smax:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: $q0 = COPY %smax(<2 x s64>)
@@ -795,17 +788,15 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[DEF:%[0-9]+]]:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(sgt), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR]]
     ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(sgt), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG1:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP1]], 1
-    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG1]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG1]]
+    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP1]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP1]]
     ; CHECK-NEXT: [[AND3:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR1]]
     ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND2]], [[AND3]]
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
@@ -1040,11 +1031,10 @@ body: |
     ; CHECK-NEXT: %vec:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: %vec1:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(ugt), %vec(<2 x s64>), %vec1
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND %vec, [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND %vec1, [[XOR]]
     ; CHECK-NEXT: %umax:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: $q0 = COPY %umax(<2 x s64>)
@@ -1068,17 +1058,15 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[DEF:%[0-9]+]]:_(<2 x s64>) = G_IMPLICIT_DEF
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(ugt), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR]]
     ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(ugt), [[DEF]](<2 x s64>), [[DEF]]
-    ; CHECK-NEXT: [[SEXT_INREG1:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP1]], 1
-    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG1]], [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[SEXT_INREG1]]
+    ; CHECK-NEXT: [[XOR1:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP1]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: [[AND2:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[ICMP1]]
     ; CHECK-NEXT: [[AND3:%[0-9]+]]:_(<2 x s64>) = G_AND [[DEF]], [[XOR1]]
     ; CHECK-NEXT: [[OR1:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND2]], [[AND3]]
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-select.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-select.mir
index 8b5cac1ec8735..92f8e524dbb31 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-select.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-select.mir
@@ -17,11 +17,10 @@ body:             |
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(sgt), [[COPY]](<2 x s64>), [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s64>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 -1
     ; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C1]](s64), [[C1]](s64)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR1]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[COPY1]], [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s64>) = G_XOR [[ICMP]], [[BUILD_VECTOR1]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[COPY1]], [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND [[COPY]], [[XOR]]
     ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: $q0 = COPY [[OR]](<2 x s64>)
@@ -52,11 +51,10 @@ body:             |
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[C]](s32), [[C]](s32)
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s32>) = G_ICMP intpred(sgt), [[COPY]](<2 x s32>), [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<2 x s32>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 -1
     ; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[C1]](s32), [[C1]](s32)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s32>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR1]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s32>) = G_AND [[COPY1]], [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<2 x s32>) = G_XOR [[ICMP]], [[BUILD_VECTOR1]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s32>) = G_AND [[COPY1]], [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s32>) = G_AND [[COPY]], [[XOR]]
     ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s32>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: $d0 = COPY [[OR]](<2 x s32>)
@@ -87,11 +85,10 @@ body:             |
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
     ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<16 x s8>) = G_BUILD_VECTOR [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8), [[C]](s8)
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<16 x s8>) = G_ICMP intpred(sgt), [[COPY]](<16 x s8>), [[BUILD_VECTOR]]
-    ; CHECK-NEXT: [[SEXT_INREG:%[0-9]+]]:_(<16 x s8>) = G_SEXT_INREG [[ICMP]], 1
     ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s8) = G_CONSTANT i8 -1
     ; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<16 x s8>) = G_BUILD_VECTOR [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8), [[C1]](s8)
-    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<16 x s8>) = G_XOR [[SEXT_INREG]], [[BUILD_VECTOR1]]
-    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<16 x s8>) = G_AND [[COPY1]], [[SEXT_INREG]]
+    ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(<16 x s8>) = G_XOR [[ICMP]], [[BUILD_VECTOR1]]
+    ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<16 x s8>) = G_AND [[COPY1]], [[ICMP]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<16 x s8>) = G_AND [[COPY]], [[XOR]]
     ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<16 x s8>) = G_OR [[AND]], [[AND1]]
     ; CHECK-NEXT: $q0 = COPY [[OR]](<16 x s8>)
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir
index eb86a981c9f1e..16ad07e0df58e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir
@@ -18,8 +18,7 @@ body: |
     ; CHECK-NEXT: [[SEXT_INREG1:%[0-9]+]]:_(s32) = G_SEXT_INREG [[COPY]], 7
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(slt), [[SEXT_INREG]](s32), [[SEXT_INREG1]]
     ; CHECK-NEXT: [[SEXT_INREG2:%[0-9]+]]:_(s32) = G_SEXT_INREG [[COPY1]], 7
-    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-    ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(slt), [[SEXT_INREG2]](s32), [[COPY2]]
+    ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(slt), [[SEXT_INREG2]](s32), [[C]]
     ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(s1) = G_XOR [[ICMP1]], [[ICMP]]
     ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 127
     ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[ADD]], [[C1]]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir
index 220450c5e4ec6..aa59de0118ad6 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir
@@ -18,8 +18,7 @@ body: |
     ; CHECK-NEXT: [[SEXT_INREG1:%[0-9]+]]:_(s32) = G_SEXT_INREG [[COPY]], 7
     ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(slt), [[SEXT_INREG]](s32), [[SEXT_INREG1]]
     ; CHECK-NEXT: [[SEXT_INREG2:%[0-9]+]]:_(s32) = G_SEXT_INREG [[COPY1]], 7
-    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-    ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(sgt), [[SEXT_INREG2]](s32), [[COPY2]]
+    ; CHECK-NEXT: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(sgt), [[SEXT_INREG2]](s32), [[C]]
     ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(s1) = G_XOR [[ICMP1]], [[ICMP]]
     ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 127
     ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[SUB]], [[C1]]
diff --git a/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-exts.mir b/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-exts.mir
index caf8db442dc53..1a2d7be5674e8 100644
--- a/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-exts.mir
+++ b/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-exts.mir
@@ -211,10 +211,7 @@ body:             |
     ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[V8]](s8)
     ; CHECK: [[SEXT:%[0-9]+]]:_(s32) = G_SEXT [[V8]](s8)
     ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SEXT]]
-    ; CHECK: [[BITS:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
-    ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[OR]], [[BITS]](s32)
-    ; CHECK: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[BITS]](s32)
-    ; CHECK: $r0 = COPY [[ASHR]]
+    ; CHECK: $r0 = COPY [[OR]]
 
     %5(s32) = G_SEXT %4(s16)
     $r0 = COPY %5
diff --git a/llvm/test/CodeGen/Mips/GlobalISel/legalizer/constants.mir b/llvm/test/CodeGen/Mips/GlobalISel/legalizer/constants.mir
index 0ced5d52d262c..65a7e78044997 100644
--- a/llvm/test/CodeGen/Mips/GlobalISel/legalizer/constants.mir
+++ b/llvm/test/CodeGen/Mips/GlobalISel/legalizer/constants.mir
@@ -54,8 +54,7 @@ body:             |
   bb.1.entry:
     ; MIPS32-LABEL: name: signed_i16
     ; MIPS32: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 -32768
-    ; MIPS32-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-    ; MIPS32-NEXT: $v0 = COPY [[COPY]](s32)
+    ; MIPS32-NEXT: $v0 = COPY [[C]](s32)
     ; MIPS32-NEXT: RetRA implicit $v0
     %0:_(s16) = G_CONSTANT i16 -32768
     %1:_(s32) = G_SEXT %0(s16)
@@ -71,8 +70,7 @@ body:             |
   bb.1.entry:
     ; MIPS32-LABEL: name: signed_i8
     ; MIPS32: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32...
[truncated]

@llvmbot
Member

llvmbot commented May 29, 2024

@llvm/pr-subscribers-backend-arm


@s-barannikov s-barannikov requested a review from topperc May 29, 2024 13:57
Comment on lines 217 to 218
if (KB && KB->computeNumSignBits(TruncSrc) >=
DstTy.getScalarSizeInBits() - SizeInBits + 1)
Contributor

x >= y + 1 -> x > y?
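(For reference, in the boolean case quoted above the bound simply requires TruncSrc to already be fully sign-extended: with DstTy = s32 and SizeInBits = 1, the check is computeNumSignBits(TruncSrc) >= 32 - 1 + 1 = 32, i.e. every bit is a sign bit, the value is 0 or -1, and the in-register sign extension from bit 0 would be a no-op.)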

@aemerson
Contributor

In the original review for 373c343 you'll see that there were multiple justifications stated for adding the combine to the artifact combiner. One of them was that it was a net compile time benefit and that the call to KB wouldn't be useless work most of the time.

In order to add this to the artifact combiner we'll need similar justification because strictly speaking it doesn't belong here.

@s-barannikov
Contributor Author

The justification is just the same. This eliminates redundant instructions after each instruction producing a boolean value (comparisons and long arithmetic). I thought it was the artifact combiner's job to eliminate the redundant extensions/truncations inserted by the legalizer. Is it not?

I don't have large benchmarks, but on what I have, this patch gives ~1.0% compile time speedup at -O0 and ~1.6% speedup at -O2. .text size is reduced by 2% to 4% depending on optimization level (there is a difference even at -O2 because a shl + ashr pair is not currently optimized by the generic combiner).

I didn't measure targets with zero-or-one booleans, but I don't expect the compile time to increase perceptibly because a trunc+sext pair is much less likely to occur for them at legalization time.
I did run time ninja check-llvm-codegen-amdgpu-globalisel though (a target with 776 G_SEXT_INREG in GlobalISel tests), and there was no difference in testing time. It might be worth noting that a call to KnownBits is limited by Depth == 2 at -O0.

@tobias-stadler
Contributor

I quickly ran AArch64 CTMark on this for O0 and O2: compile-time instruction count difference is below the noise floor, size..text is unchanged. So this doesn't seem to regress anything for ZeroOrOneBooleanContent Targets.

When I added the zext-combine, the main reasons were huge compile-time and O0 code-size improvements for ZeroOrOneBooleanContent Targets, so it makes sense that ZeroOrNegativeOneBooleanContent Targets would see similar improvements from this commit. I do agree that using KnownBits in the ArtifactCombiner is hacky, because it can't fully replace the post-legalize version of these KnownBits combines. However, I don't see another way to efficiently handle boolean legalization, so this still seems like the most practical solution to me. That said, I believe these 2 combines should remain the only ones using KnownBits in the ArtifactCombiner.

@aemerson
Contributor

aemerson commented May 30, 2024

The justification is just the same. This eliminates redundant instructions after each instruction producing a boolean value (comparisons and long arithmetic). I thought it was the artifact combiner's job to eliminate the redundant extensions/truncations inserted by the legalizer. Is it not?

The real reason the artifact combiner is there is that it's actually required in order for our legalization algorithm to terminate. We need to elide unlegalizable artifacts that cancel out in order for it to work at all. Other combines just tag along, and we should be careful about adding new ones. That said, if this is purely a mirror of the G_AND one for other boolean-format targets, then I think it's ok.
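A hypothetical illustration (not from this patch) of artifacts canceling out, e.g. on a target where only s32 is legal:

    %wide:_(s64) = G_ANYEXT %x(s32)
    %narrow:_(s32) = G_TRUNC %wide(s64)

Neither instruction has to be legalizable on its own: the artifact combiner folds the pair so that %narrow reuses %x directly, which is what allows the legalization algorithm to terminate.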

I don't have large benchmarks, but on what I have, this patch gives ~1.0% compile time speedup at -O0 and ~1.6% speedup at -O2. .text size is reduced by 2% to 4% depending on optimization level (there is a difference even at -O2 because a shl + ashr pair is not currently optimized by the generic combiner).

I didn't measure targets with zero-or-one booleans, but I don't expect the compile time to increase perceptibly because a trunc+sext pair is much less likely to occur for them at legalization time. I did run time ninja check-llvm-codegen-amdgpu-globalisel though (a target with 776 G_SEXT_INREG in GlobalISel tests), and there was no difference in testing time. It might be worth noting that a call to KnownBits is limited by Depth == 2 at -O0.

@s-barannikov
Contributor Author

The real reason the artifact combiner is there is because it's actually required in order for our legalization algorithm to terminate. We need to elide unlegalizable artifacts that cancel out in order for it to work at all.

I see, thanks.

@s-barannikov s-barannikov merged commit 3fee8b3 into llvm:main May 30, 2024
7 checks passed
@s-barannikov s-barannikov deleted the gisel/sext-of-trunc-artifact-combine branch May 30, 2024 09:40