Skip to content

DAG/GlobalISel: Set disjoint for or in copysign lowering #97057

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 28, 2024

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Jun 28, 2024

We masked out the sign bit from one value, and the non-sign bits
from the other so there should be no common bits set.

No idea how to test this on the DAG path, other than scraping
the debug logs. A few targets hit this path with f16 values, but
the resulting i16 ors get anyext promoted and lose the disjoint
flag. In the fp128 case, PPC gets further and the or loses the flag
somewhere else later. Adding a haveNoCommonBits assert shows this
works though.

@arsenm arsenm added llvm:globalisel llvm:SelectionDAG SelectionDAGISel as well labels Jun 28, 2024 — with Graphite App
Copy link
Contributor Author

arsenm commented Jun 28, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @arsenm and the rest of your teammates on Graphite Graphite

@llvmbot
Copy link
Member

llvmbot commented Jun 28, 2024

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-llvm-globalisel

Author: Matt Arsenault (arsenm)

Changes

We masked out the sign bit from one value, and the non-sign bits
from the other so there should be no common bits set.

No idea how to test this on the DAG path, other than scraping
the debug logs. A few targets hit this path with f16 values, but
the resulting i16 ors get anyext promoted and lose the disjoint
flag. In the fp128 case, PPC gets further and the or loses the flag
somewhere else later. Adding a haveNoCommonBits assert shows this
works though.


Patch is 76.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97057.diff

5 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+6-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-fcopysign.mir (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fcopysign.mir (+108-108)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-intrinsic-round.mir (+126-126)
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 7087265f335f9..975f19b8596b9 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -7210,6 +7210,10 @@ LegalizerHelper::lowerFCopySign(MachineInstr &MI) {
   // constants are a nan and -0.0, but the final result should preserve
   // everything.
   unsigned Flags = MI.getFlags();
+
+  // We masked the sign bit and the not-sign bit, so these are disjoint.
+  Flags |= MachineInstr::Disjoint;
+
   MIRBuilder.buildOr(Dst, And0, And1, Flags);
 
   MI.eraseFromParent();
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index dfc24f01eb112..d036a0285e571 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -1681,8 +1681,13 @@ SDValue SelectionDAGLegalize::ExpandFCOPYSIGN(SDNode *Node) const {
     SignBit = DAG.getNode(ISD::TRUNCATE, DL, MagVT, SignBit);
   }
 
+  SDNodeFlags Flags;
+  Flags.setDisjoint(true);
+
   // Store the part with the modified sign and convert back to float.
-  SDValue CopiedSign = DAG.getNode(ISD::OR, DL, MagVT, ClearedSign, SignBit);
+  SDValue CopiedSign =
+      DAG.getNode(ISD::OR, DL, MagVT, ClearedSign, SignBit, Flags);
+
   return modifySignAsInt(MagAsInt, DL, CopiedSign);
 }
 
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-fcopysign.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-fcopysign.mir
index dd794b7af9466..cfdf0900f2f06 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-fcopysign.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-fcopysign.mir
@@ -22,8 +22,8 @@ body:             |
     ; CHECK-NEXT: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[C1]](s32), [[C1]](s32)
     ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s32>) = G_AND [[BUILD_VECTOR]], [[BUILD_VECTOR3]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s32>) = G_AND [[BUILD_VECTOR1]], [[BUILD_VECTOR2]]
-    ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s32>) = G_OR [[AND]], [[AND1]]
-    ; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[OR]](<2 x s32>)
+    ; CHECK-NEXT: %6:_(<2 x s32>) = disjoint G_OR [[AND]], [[AND1]]
+    ; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES %6(<2 x s32>)
     ; CHECK-NEXT: %fcopysign:_(s32) = COPY [[UV]](s32)
     ; CHECK-NEXT: $s0 = COPY %fcopysign(s32)
     ; CHECK-NEXT: RET_ReallyLR implicit $s0
@@ -54,8 +54,8 @@ body:             |
     ; CHECK-NEXT: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[C1]](s64), [[C1]](s64)
     ; CHECK-NEXT: [[AND:%[0-9]+]]:_(<2 x s64>) = G_AND [[BUILD_VECTOR]], [[BUILD_VECTOR3]]
     ; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s64>) = G_AND [[BUILD_VECTOR1]], [[BUILD_VECTOR2]]
-    ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[AND]], [[AND1]]
-    ; CHECK-NEXT: [[UV:%[0-9]+]]:_(s64), [[UV1:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[OR]](<2 x s64>)
+    ; CHECK-NEXT: %6:_(<2 x s64>) = disjoint G_OR [[AND]], [[AND1]]
+    ; CHECK-NEXT: [[UV:%[0-9]+]]:_(s64), [[UV1:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES %6(<2 x s64>)
     ; CHECK-NEXT: %fcopysign:_(s64) = COPY [[UV]](s64)
     ; CHECK-NEXT: $d0 = COPY %fcopysign(s64)
     ; CHECK-NEXT: RET_ReallyLR implicit $d0
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fcopysign.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fcopysign.mir
index ac72bf1dbbb6f..60ccd20c095cd 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fcopysign.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fcopysign.mir
@@ -22,8 +22,8 @@ body: |
     ; SI-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 32767
     ; SI-NEXT: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; SI-NEXT: %4:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %4(s16)
     ; SI-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     ;
     ; VI-LABEL: name: test_copysign_s16_s16
@@ -37,8 +37,8 @@ body: |
     ; VI-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 32767
     ; VI-NEXT: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; VI-NEXT: %4:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %4(s16)
     ; VI-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     ;
     ; GFX9-LABEL: name: test_copysign_s16_s16
@@ -52,8 +52,8 @@ body: |
     ; GFX9-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 32767
     ; GFX9-NEXT: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; GFX9-NEXT: %4:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %4(s16)
     ; GFX9-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     %0:_(s32) = COPY $vgpr0
     %1:_(s32) = COPY $vgpr1
@@ -79,8 +79,8 @@ body: |
     ; SI-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2147483647
     ; SI-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C1]]
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; SI-NEXT: %2:_(s32) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: $vgpr0 = COPY %2(s32)
     ;
     ; VI-LABEL: name: test_copysign_s32_s32
     ; VI: liveins: $vgpr0, $vgpr1
@@ -91,8 +91,8 @@ body: |
     ; VI-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2147483647
     ; VI-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C1]]
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; VI-NEXT: %2:_(s32) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: $vgpr0 = COPY %2(s32)
     ;
     ; GFX9-LABEL: name: test_copysign_s32_s32
     ; GFX9: liveins: $vgpr0, $vgpr1
@@ -103,8 +103,8 @@ body: |
     ; GFX9-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2147483647
     ; GFX9-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C1]]
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; GFX9-NEXT: %2:_(s32) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: $vgpr0 = COPY %2(s32)
     %0:_(s32) = COPY $vgpr0
     %1:_(s32) = COPY $vgpr1
     %2:_(s32) = G_FCOPYSIGN %0, %1
@@ -126,8 +126,8 @@ body: |
     ; SI-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807
     ; SI-NEXT: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[C1]]
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; SI-NEXT: %2:_(s64) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: $vgpr0_vgpr1 = COPY %2(s64)
     ;
     ; VI-LABEL: name: test_copysign_s64_s64
     ; VI: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
@@ -138,8 +138,8 @@ body: |
     ; VI-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807
     ; VI-NEXT: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[C1]]
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; VI-NEXT: %2:_(s64) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: $vgpr0_vgpr1 = COPY %2(s64)
     ;
     ; GFX9-LABEL: name: test_copysign_s64_s64
     ; GFX9: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
@@ -150,8 +150,8 @@ body: |
     ; GFX9-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807
     ; GFX9-NEXT: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[C1]]
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; GFX9-NEXT: %2:_(s64) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: $vgpr0_vgpr1 = COPY %2(s64)
     %0:_(s64) = COPY $vgpr0_vgpr1
     %1:_(s64) = COPY $vgpr2_vgpr3
     %2:_(s64) = G_FCOPYSIGN %0, %1
@@ -176,8 +176,8 @@ body: |
     ; SI-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 32
     ; SI-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[ZEXT]], [[C2]](s32)
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; SI-NEXT: %2:_(s64) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: $vgpr0_vgpr1 = COPY %2(s64)
     ;
     ; VI-LABEL: name: test_copysign_s64_s32
     ; VI: liveins: $vgpr0_vgpr1, $vgpr2
@@ -191,8 +191,8 @@ body: |
     ; VI-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 32
     ; VI-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[ZEXT]], [[C2]](s32)
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; VI-NEXT: %2:_(s64) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: $vgpr0_vgpr1 = COPY %2(s64)
     ;
     ; GFX9-LABEL: name: test_copysign_s64_s32
     ; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -206,8 +206,8 @@ body: |
     ; GFX9-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 32
     ; GFX9-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[ZEXT]], [[C2]](s32)
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; GFX9-NEXT: %2:_(s64) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: $vgpr0_vgpr1 = COPY %2(s64)
     %0:_(s64) = COPY $vgpr0_vgpr1
     %1:_(s32) = COPY $vgpr2
     %2:_(s64) = G_FCOPYSIGN %0, %1
@@ -232,8 +232,8 @@ body: |
     ; SI-NEXT: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[COPY1]], [[C2]](s32)
     ; SI-NEXT: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64)
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[TRUNC]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; SI-NEXT: %2:_(s32) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: $vgpr0 = COPY %2(s32)
     ;
     ; VI-LABEL: name: test_copysign_s32_s64
     ; VI: liveins: $vgpr0, $vgpr1_vgpr2
@@ -247,8 +247,8 @@ body: |
     ; VI-NEXT: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[COPY1]], [[C2]](s32)
     ; VI-NEXT: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64)
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[TRUNC]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; VI-NEXT: %2:_(s32) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: $vgpr0 = COPY %2(s32)
     ;
     ; GFX9-LABEL: name: test_copysign_s32_s64
     ; GFX9: liveins: $vgpr0, $vgpr1_vgpr2
@@ -262,8 +262,8 @@ body: |
     ; GFX9-NEXT: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[COPY1]], [[C2]](s32)
     ; GFX9-NEXT: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64)
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[TRUNC]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; GFX9-NEXT: %2:_(s32) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: $vgpr0 = COPY %2(s32)
     %0:_(s32) = COPY $vgpr0
     %1:_(s64) = COPY $vgpr1_vgpr2
     %2:_(s32) = G_FCOPYSIGN %0, %1
@@ -289,8 +289,8 @@ body: |
     ; SI-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[COPY1]], [[C2]](s32)
     ; SI-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR]](s32)
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; SI-NEXT: %3:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %3(s16)
     ; SI-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     ;
     ; VI-LABEL: name: test_copysign_s16_s32
@@ -306,8 +306,8 @@ body: |
     ; VI-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[COPY1]], [[C2]](s32)
     ; VI-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR]](s32)
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; VI-NEXT: %3:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %3(s16)
     ; VI-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     ;
     ; GFX9-LABEL: name: test_copysign_s16_s32
@@ -323,8 +323,8 @@ body: |
     ; GFX9-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[COPY1]], [[C2]](s32)
     ; GFX9-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR]](s32)
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; GFX9-NEXT: %3:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %3(s16)
     ; GFX9-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     %0:_(s32) = COPY $vgpr0
     %1:_(s32) = COPY $vgpr1
@@ -353,8 +353,8 @@ body: |
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]]
     ; SI-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
     ; SI-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[SHL]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND2]]
-    ; SI-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; SI-NEXT: %3:_(s32) = disjoint G_OR [[AND]], [[AND2]]
+    ; SI-NEXT: $vgpr0 = COPY %3(s32)
     ;
     ; VI-LABEL: name: test_copysign_s32_s16
     ; VI: liveins: $vgpr0, $vgpr1
@@ -369,8 +369,8 @@ body: |
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]]
     ; VI-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
     ; VI-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[SHL]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND2]]
-    ; VI-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; VI-NEXT: %3:_(s32) = disjoint G_OR [[AND]], [[AND2]]
+    ; VI-NEXT: $vgpr0 = COPY %3(s32)
     ;
     ; GFX9-LABEL: name: test_copysign_s32_s16
     ; GFX9: liveins: $vgpr0, $vgpr1
@@ -385,8 +385,8 @@ body: |
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]]
     ; GFX9-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
     ; GFX9-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[SHL]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[AND2]]
-    ; GFX9-NEXT: $vgpr0 = COPY [[OR]](s32)
+    ; GFX9-NEXT: %3:_(s32) = disjoint G_OR [[AND]], [[AND2]]
+    ; GFX9-NEXT: $vgpr0 = COPY %3(s32)
     %0:_(s32) = COPY $vgpr0
     %1:_(s32) = COPY $vgpr1
     %2:_(s16) = G_TRUNC %1
@@ -414,8 +414,8 @@ body: |
     ; SI-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 48
     ; SI-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[AND1]], [[C3]](s32)
     ; SI-NEXT: [[AND2:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND2]]
-    ; SI-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; SI-NEXT: %3:_(s64) = disjoint G_OR [[AND]], [[AND2]]
+    ; SI-NEXT: $vgpr0_vgpr1 = COPY %3(s64)
     ;
     ; VI-LABEL: name: test_copysign_s64_s16
     ; VI: liveins: $vgpr0_vgpr1, $vgpr2
@@ -431,8 +431,8 @@ body: |
     ; VI-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 48
     ; VI-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[AND1]], [[C3]](s32)
     ; VI-NEXT: [[AND2:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND2]]
-    ; VI-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; VI-NEXT: %3:_(s64) = disjoint G_OR [[AND]], [[AND2]]
+    ; VI-NEXT: $vgpr0_vgpr1 = COPY %3(s64)
     ;
     ; GFX9-LABEL: name: test_copysign_s64_s16
     ; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -448,8 +448,8 @@ body: |
     ; GFX9-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 48
     ; GFX9-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[AND1]], [[C3]](s32)
     ; GFX9-NEXT: [[AND2:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND2]]
-    ; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[OR]](s64)
+    ; GFX9-NEXT: %3:_(s64) = disjoint G_OR [[AND]], [[AND2]]
+    ; GFX9-NEXT: $vgpr0_vgpr1 = COPY %3(s64)
     %0:_(s64) = COPY $vgpr0_vgpr1
     %1:_(s32) = COPY $vgpr2
     %2:_(s16) = G_TRUNC %1
@@ -476,8 +476,8 @@ body: |
     ; SI-NEXT: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[COPY1]], [[C2]](s32)
     ; SI-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR]](s64)
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; SI-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; SI-NEXT: %3:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %3(s16)
     ; SI-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     ;
     ; VI-LABEL: name: test_copysign_s16_s64
@@ -493,8 +493,8 @@ body: |
     ; VI-NEXT: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[COPY1]], [[C2]](s32)
     ; VI-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR]](s64)
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; VI-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; VI-NEXT: %3:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %3(s16)
     ; VI-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     ;
     ; GFX9-LABEL: name: test_copysign_s16_s64
@@ -510,8 +510,8 @@ body: |
     ; GFX9-NEXT: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[COPY1]], [[C2]](s32)
     ; GFX9-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR]](s64)
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
+    ; GFX9-NEXT: %3:_(s16) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT %3(s16)
     ; GFX9-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
     %0:_(s32) = COPY $vgpr0
     %1:_(s64) = COPY $vgpr1_vgpr2
@@ -543,8 +543,8 @@ body: |
     ; SI-NEXT: [[BITCAST1:%[0-9]+]]:_(<2 x s16>) = G_BITCAST [[OR1]](s32)
     ; SI-NEXT: [[AND:%[0-9]+]]:_(<2 x s16>) = G_AND [[COPY]], [[BITCAST1]]
     ; SI-NEXT: [[AND1:%[0-9]+]]:_(<2 x s16>) = G_AND [[COPY1]], [[BITCAST]]
-    ; SI-NEXT: [[OR2:%[0-9]+]]:_(<2 x s16>) = G_OR [[AND]], [[AND1]]
-    ; SI-NEXT: $vgpr0 = COPY [[OR2]](<2 x s16>)
+    ; SI-NEXT: %2:_(<2 x s16>) = disjoint G_OR [[AND]], [[AND1]]
+    ; SI-NEXT: $vgpr0 = COPY %2(<2 x s16>)
     ;
     ; VI-LABEL: name: test_copysign_v2s16_v2s16
     ; VI: liveins: $vgpr0, $vgpr1
@@ -562,8 +562,8 @@ body: |
     ; VI-NEXT: [[BITCAST1:%[0-9]+]]:_(<2 x s16>) = G_BITCAST [[OR1]](s32)
     ; VI-NEXT: [[AND:%[0-9]+]]:_(<2 x s16>) = G_AND [[COPY]], [[BITCAST1]]
     ; VI-NEXT: [[AND1:%[0-9]+]]:_(<2 x s16>) = G_AND [[COPY1]], [[BITCAST]]
-    ; VI-NEXT: [[OR2:%[0-9]+]]:_(<2 x s16>) = G_OR [[AND]], [[AND1]]
-    ; VI-NEXT: $vgpr0 = COPY [[OR2]](<2 x s16>)
+    ; VI-NEXT: %2:_(<2 x s16>) = disjoint G_OR [[AND]], [[AND1]]
+    ; VI-NEXT: $vgpr0 = COPY %2(<2 x s16>)
     ;
     ; GFX9-LABEL: name: test_copysign_v2s16_v2s16
     ; GFX9: liveins: $vgpr0, $vgpr1
@@ -576,8 +576,8 @@ body: |
     ; GFX9-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[C1]](s16), [[C1]](s16)
     ; GFX9-NEXT: [[AND:%[0-9]+]]:_(<2 x s16>) = G_AND [[COPY]], [[BUILD_VECTOR1]]
     ; GFX9-NEXT: [[AND1:%[0-9]+]]:_(<2 x s16>) = G_AND [[COPY1]], [[BUILD_VECTOR]]
-    ; GFX9-NEXT: [[OR:%[0-9]+]]:_(<2 x s16>) = G_OR [[AND]], [[AND1]]
-    ; GFX9-NEXT: $vgpr0 = COPY [[OR]](<2 x s16>)
+    ; GFX9-NEXT: %2:_(<2 x s16>) = disjoint G_OR [[AND]], [[AND1]]
+    ; GFX9-NEXT: $vgpr0 = COPY %2(<2 x s16>)
     %0:_(<2 x s16>) = COPY $vgpr0
     %1:_(<2 x s16>) = COPY $vgpr1
     %2:_(<2 x s16>) = G_FCOPYSIGN %0, %1
@@ -601,8 +601,8 @@ body: |
     ; SI-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[C1]](s32), [[C1]](s32)
     ; SI-NEXT: [[AND:%[0-9]+]]:_(<2 x s32>) = G_AND [[COPY]], [[...
[truncated]

We masked out the sign bit from one value, and the non-sign bits
from the other so there should be no common bits set.

No idea how to test this on the DAG path, other than scraping
the debug logs. A few targets hit this path with f16 values, but
the resulting i16 ors get anyext promoted and lose the disjoint
flag. In the fp128 case, PPC gets further and the or loses the flag
somewhere else later. Adding a haveNoCommonBits assert shows this
works though.
@arsenm arsenm force-pushed the users/arsenm/dag-set-disjoint-on-lowered-copysign branch from 2dbd199 to 67496c0 Compare June 28, 2024 18:35
@arsenm arsenm merged commit 2df2373 into main Jun 28, 2024
5 of 6 checks passed
@arsenm arsenm deleted the users/arsenm/dag-set-disjoint-on-lowered-copysign branch June 28, 2024 21:03
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
We masked out the sign bit from one value, and the non-sign bits
from the other so there should be no common bits set.

No idea how to test this on the DAG path, other than scraping
the debug logs. A few targets hit this path with f16 values, but
the resulting i16 ors get anyext promoted and lose the disjoint
flag. In the fp128 case, PPC gets further and the or loses the flag
somewhere else later. Adding a haveNoCommonBits assert shows this
works though.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants