-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[GlobalISel] Check whether G_CTLZ
is legal in matchUMulHToLShr
#126457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-globalisel Author: Shilei Tian (shiltian) ChangesWe need to check Full diff: https://github.com/llvm/llvm-project/pull/126457.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index 4648414cc46aef6..0dfbb91f2ac5437 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -5641,6 +5641,7 @@ bool CombinerHelper::matchUMulHToLShr(MachineInstr &MI) const {
Register RHS = MI.getOperand(2).getReg();
Register Dst = MI.getOperand(0).getReg();
LLT Ty = MRI.getType(Dst);
+ LLT RHSTy = MRI.getType(RHS);
LLT ShiftAmtTy = getTargetLowering().getPreferredShiftAmountTy(Ty);
auto MatchPow2ExceptOne = [&](const Constant *C) {
if (auto *CI = dyn_cast<ConstantInt>(C))
@@ -5649,7 +5650,10 @@ bool CombinerHelper::matchUMulHToLShr(MachineInstr &MI) const {
};
if (!matchUnaryPredicate(MRI, RHS, MatchPow2ExceptOne, false))
return false;
- return isLegalOrBeforeLegalizer({TargetOpcode::G_LSHR, {Ty, ShiftAmtTy}});
+ // We need to check both G_LSHR and G_CTLZ because the combine uses G_CTLZ to
+ // get log base 2, and it is not always legal for on a target.
+ return isLegalOrBeforeLegalizer({TargetOpcode::G_LSHR, {Ty, ShiftAmtTy}}) &&
+ isLegalOrBeforeLegalizer({TargetOpcode::G_CTLZ, {RHSTy, RHSTy}});
}
void CombinerHelper::applyUMulHToLShr(MachineInstr &MI) const {
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.mir
new file mode 100644
index 000000000000000..00ead74cb37bb84
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.mir
@@ -0,0 +1,23 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -run-pass=amdgpu-postlegalizer-combiner %s -o - | FileCheck %s
+
+---
+name: test
+tracksRegLiveness: true
+legalized: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+ ; CHECK-LABEL: name: test
+ ; CHECK: liveins: $vgpr0, $vgpr1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; CHECK-NEXT: [[UMULH:%[0-9]+]]:_(s32) = G_UMULH [[COPY]], [[C]]
+ ; CHECK-NEXT: $vgpr0 = COPY [[UMULH]](s32)
+ ; CHECK-NEXT: SI_RETURN implicit $vgpr0
+ %0:_(s32) = COPY $vgpr0
+ %1:_(s32) = G_CONSTANT i32 4
+ %2:_(s32) = G_UMULH %0:_, %1:_
+ $vgpr0 = COPY %2:_(s32)
+ SI_RETURN implicit $vgpr0
|
@llvm/pr-subscribers-backend-amdgpu Author: Shilei Tian (shiltian) ChangesWe need to check Full diff: https://github.com/llvm/llvm-project/pull/126457.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index 4648414cc46aef6..0dfbb91f2ac5437 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -5641,6 +5641,7 @@ bool CombinerHelper::matchUMulHToLShr(MachineInstr &MI) const {
Register RHS = MI.getOperand(2).getReg();
Register Dst = MI.getOperand(0).getReg();
LLT Ty = MRI.getType(Dst);
+ LLT RHSTy = MRI.getType(RHS);
LLT ShiftAmtTy = getTargetLowering().getPreferredShiftAmountTy(Ty);
auto MatchPow2ExceptOne = [&](const Constant *C) {
if (auto *CI = dyn_cast<ConstantInt>(C))
@@ -5649,7 +5650,10 @@ bool CombinerHelper::matchUMulHToLShr(MachineInstr &MI) const {
};
if (!matchUnaryPredicate(MRI, RHS, MatchPow2ExceptOne, false))
return false;
- return isLegalOrBeforeLegalizer({TargetOpcode::G_LSHR, {Ty, ShiftAmtTy}});
+ // We need to check both G_LSHR and G_CTLZ because the combine uses G_CTLZ to
+ // get log base 2, and it is not always legal for on a target.
+ return isLegalOrBeforeLegalizer({TargetOpcode::G_LSHR, {Ty, ShiftAmtTy}}) &&
+ isLegalOrBeforeLegalizer({TargetOpcode::G_CTLZ, {RHSTy, RHSTy}});
}
void CombinerHelper::applyUMulHToLShr(MachineInstr &MI) const {
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.mir
new file mode 100644
index 000000000000000..00ead74cb37bb84
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.mir
@@ -0,0 +1,23 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -run-pass=amdgpu-postlegalizer-combiner %s -o - | FileCheck %s
+
+---
+name: test
+tracksRegLiveness: true
+legalized: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1
+ ; CHECK-LABEL: name: test
+ ; CHECK: liveins: $vgpr0, $vgpr1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; CHECK-NEXT: [[UMULH:%[0-9]+]]:_(s32) = G_UMULH [[COPY]], [[C]]
+ ; CHECK-NEXT: $vgpr0 = COPY [[UMULH]](s32)
+ ; CHECK-NEXT: SI_RETURN implicit $vgpr0
+ %0:_(s32) = COPY $vgpr0
+ %1:_(s32) = G_CONSTANT i32 4
+ %2:_(s32) = G_UMULH %0:_, %1:_
+ $vgpr0 = COPY %2:_(s32)
+ SI_RETURN implicit $vgpr0
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add and end to end IR test where this reproduced
We need to check `G_CTLZ` because the combine uses `G_CTLZ` to get log base 2, and it is not always legal for on a target.
9760a8d
to
bf21d87
Compare
@@ -0,0 +1,98 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two interesting things about this test case:
- It is not reproducible w/o
-O0
argument - The
udiv
is actually a UB, and replacing either of its operand will make the crash go away (or the pattern not recognized such that the issue will not happen).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You would have an easier time reproducing this by directly emitting a multiply + shift by half bitwidth. it just happens one appears inside the div expansion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like this? it doesn't seem to be able to reproduce the issue.
define void @test(ptr %p, i32 %a, i32 %b) {
%mul = mul i32 %a, %b
%lshr = lshr i32 %mul, 16
store i32 %lshr, ptr %p, align 4
ret void
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want 64-bit multiply, but I don't see the general combine to form umulh from mul + shift
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I guess I'm gonna merge as it is. Didn't get the luck with 64 bit as well.
…lvm#126457) We need to check `G_CTLZ` because the combine uses `G_CTLZ` to get log base 2, and it is not always legal for on a target. Fixes SWDEV-512440.
We need to check
G_CTLZ
because the combine usesG_CTLZ
to get log base 2,and it is not always legal for on a target.
Fixes SWDEV-512440.