[CVP][LVI] Add support for InsertElementInst in LVI #99368
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms

Author: Rajat Bajpai (rajatbajpai)

Changes

Currently, the LVI analysis pass does not support the InsertElementInst vector instruction, so some optimization opportunities are missed. For example, an ICMP instruction whose vector operand is built from insertelement instructions with known element ranges could be folded, but currently is not.

This change adds InsertElementInst support to the LVI analysis pass to fix the motivating example.

Full diff: https://github.com/llvm/llvm-project/pull/99368.diff

2 Files Affected:
diff --git a/llvm/lib/Analysis/LazyValueInfo.cpp b/llvm/lib/Analysis/LazyValueInfo.cpp
index 92389f2896b8e..d28d4fa47fdae 100644
--- a/llvm/lib/Analysis/LazyValueInfo.cpp
+++ b/llvm/lib/Analysis/LazyValueInfo.cpp
@@ -428,6 +428,8 @@ class LazyValueInfoImpl {
std::optional<ValueLatticeElement> solveBlockValueIntrinsic(IntrinsicInst *II,
BasicBlock *BB);
std::optional<ValueLatticeElement>
+ solveBlockValueInsertElement(InsertElementInst *IEI, BasicBlock *BB);
+ std::optional<ValueLatticeElement>
solveBlockValueExtractValue(ExtractValueInst *EVI, BasicBlock *BB);
bool isNonNullAtEndOfBlock(Value *Val, BasicBlock *BB);
void intersectAssumeOrGuardBlockValueConstantRange(Value *Val,
@@ -657,6 +659,9 @@ LazyValueInfoImpl::solveBlockValueImpl(Value *Val, BasicBlock *BB) {
if (BinaryOperator *BO = dyn_cast<BinaryOperator>(BBI))
return solveBlockValueBinaryOp(BO, BB);
+ if (auto *IEI = dyn_cast<InsertElementInst>(BBI))
+ return solveBlockValueInsertElement(IEI, BB);
+
if (auto *EVI = dyn_cast<ExtractValueInst>(BBI))
return solveBlockValueExtractValue(EVI, BB);
@@ -1038,6 +1043,37 @@ LazyValueInfoImpl::solveBlockValueIntrinsic(IntrinsicInst *II, BasicBlock *BB) {
MetadataVal);
}
+std::optional<ValueLatticeElement>
+LazyValueInfoImpl::solveBlockValueInsertElement(InsertElementInst *IEI,
+                                                BasicBlock *BB) {
+  std::optional<ValueLatticeElement> OptEltVal =
+      getBlockValue(IEI->getOperand(1), BB, IEI);
+  if (!OptEltVal)
+    return std::nullopt;
+  ValueLatticeElement &EltVal = *OptEltVal;
+
+  if (auto *CV = dyn_cast<ConstantVector>(IEI->getOperand(0))) {
+    // Must be vector of integers. Merge these elements to create
+    // the range.
+    for (unsigned i = 0, e = CV->getNumOperands(); i != e; ++i) {
+      Constant *Elem = CV->getAggregateElement(i);
+      if (isa<PoisonValue>(Elem))
+        continue;
+      std::optional<ConstantRange> CR = getRangeFor(Elem, IEI, BB);
+      if (!CR)
+        return std::nullopt;
+      EltVal.mergeIn(ValueLatticeElement::getRange(*CR));
+    }
+  } else if (!isa<PoisonValue>(IEI->getOperand(0))) {
+    std::optional<ValueLatticeElement> OptVecResult =
+        solveBlockValueImpl(IEI->getOperand(0), BB);
+    if (!OptVecResult)
+      return std::nullopt;
+    EltVal.mergeIn(*OptVecResult);
+  }
+  return EltVal;
+}
+
std::optional<ValueLatticeElement>
LazyValueInfoImpl::solveBlockValueExtractValue(ExtractValueInst *EVI,
BasicBlock *BB) {
diff --git a/llvm/test/Transforms/CorrelatedValuePropagation/insertelement.ll b/llvm/test/Transforms/CorrelatedValuePropagation/insertelement.ll
new file mode 100644
index 0000000000000..769f431738342
--- /dev/null
+++ b/llvm/test/Transforms/CorrelatedValuePropagation/insertelement.ll
@@ -0,0 +1,76 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=correlated-propagation -S | FileCheck %s
+
+;; Check if ICMP instruction is constant folded or not.
+
+define void @test1(ptr addrspace(1) %out) {
+; CHECK-LABEL: define void @test1(
+; CHECK-SAME: ptr addrspace(1) [[OUT:%.*]]) {
+; CHECK-NEXT: [[CALL:%.*]] = call i32 @llvm.nvvm.read.ptx.sreg.tid.x(), !range [[RNG0:![0-9]+]]
+; CHECK-NEXT: [[UDIV_LHS_TRUNC:%.*]] = trunc i32 [[CALL]] to i16
+; CHECK-NEXT: [[UDIV1:%.*]] = udiv i16 [[UDIV_LHS_TRUNC]], 5
+; CHECK-NEXT: [[UDIV_ZEXT:%.*]] = zext i16 [[UDIV1]] to i32
+; CHECK-NEXT: [[ADD1:%.*]] = add nuw nsw i32 [[UDIV_ZEXT]], 768
+; CHECK-NEXT: [[ADD2:%.*]] = add nuw nsw i32 [[UDIV_ZEXT]], 896
+; CHECK-NEXT: [[IE1:%.*]] = insertelement <2 x i32> poison, i32 [[ADD1]], i64 0
+; CHECK-NEXT: [[IE2:%.*]] = insertelement <2 x i32> [[IE1]], i32 [[ADD2]], i64 1
+; CHECK-NEXT: [[EI1:%.*]] = extractelement <2 x i1> <i1 true, i1 true>, i64 0
+; CHECK-NEXT: [[EI2:%.*]] = extractelement <2 x i1> <i1 true, i1 true>, i64 1
+; CHECK-NEXT: [[ADDUP:%.*]] = add i1 [[EI1]], [[EI2]]
+; CHECK-NEXT: [[ADDUP_UPCAST:%.*]] = zext i1 [[ADDUP]] to i32
+; CHECK-NEXT: store i32 [[ADDUP_UPCAST]], ptr addrspace(1) [[OUT]], align 4
+; CHECK-NEXT: ret void
+;
+ %call = call i32 @llvm.nvvm.read.ptx.sreg.tid.x(), !range !1
+ %udiv = udiv i32 %call, 5
+ %add1 = add i32 %udiv, 768
+ %add2 = add i32 %udiv, 896
+ %ie1 = insertelement <2 x i32> poison, i32 %add1, i64 0
+ %ie2 = insertelement <2 x i32> %ie1, i32 %add2, i64 1
+ %icmp1 = icmp slt <2 x i32> %ie2, <i32 1024, i32 1024>
+ %ei1 = extractelement <2 x i1> %icmp1, i64 0
+ %ei2 = extractelement <2 x i1> %icmp1, i64 1
+ %addUp = add i1 %ei1, %ei2
+ %addUp.upcast = zext i1 %addUp to i32
+ store i32 %addUp.upcast, ptr addrspace(1) %out, align 4
+ ret void
+}
+
+
+;; Check if LVI is able to handle constant vector operands
+;; in InsertElementInst and CVP is able to fold ICMP instruction.
+
+define void @test2(ptr addrspace(1) %out) {
+; CHECK-LABEL: define void @test2(
+; CHECK-SAME: ptr addrspace(1) [[OUT:%.*]]) {
+; CHECK-NEXT: [[CALL:%.*]] = call i32 @llvm.nvvm.read.ptx.sreg.tid.x(), !range [[RNG0]]
+; CHECK-NEXT: [[UDIV_LHS_TRUNC:%.*]] = trunc i32 [[CALL]] to i16
+; CHECK-NEXT: [[UDIV1:%.*]] = udiv i16 [[UDIV_LHS_TRUNC]], 5
+; CHECK-NEXT: [[UDIV_ZEXT:%.*]] = zext i16 [[UDIV1]] to i32
+; CHECK-NEXT: [[ADD2:%.*]] = add nuw nsw i32 [[UDIV_ZEXT]], 896
+; CHECK-NEXT: [[IE1:%.*]] = insertelement <2 x i32> <i32 poison, i32 1>, i32 [[ADD2]], i64 0
+; CHECK-NEXT: [[EI1:%.*]] = extractelement <2 x i1> <i1 true, i1 true>, i64 0
+; CHECK-NEXT: [[EI2:%.*]] = extractelement <2 x i1> <i1 true, i1 true>, i64 1
+; CHECK-NEXT: [[ADDUP:%.*]] = add i1 [[EI1]], [[EI2]]
+; CHECK-NEXT: [[ADDUP_UPCAST:%.*]] = zext i1 [[ADDUP]] to i32
+; CHECK-NEXT: store i32 [[ADDUP_UPCAST]], ptr addrspace(1) [[OUT]], align 4
+; CHECK-NEXT: ret void
+;
+ %call = call i32 @llvm.nvvm.read.ptx.sreg.tid.x(), !range !1
+ %udiv = udiv i32 %call, 5
+ %add2 = add i32 %udiv, 896
+ %ie1 = insertelement <2 x i32> <i32 poison, i32 1>, i32 %add2, i64 0
+ %icmp1 = icmp slt <2 x i32> %ie1, <i32 1024, i32 1024>
+ %ei1 = extractelement <2 x i1> %icmp1, i64 0
+ %ei2 = extractelement <2 x i1> %icmp1, i64 1
+ %addUp = add i1 %ei1, %ei2
+ %addUp.upcast = zext i1 %addUp to i32
+ store i32 %addUp.upcast, ptr addrspace(1) %out, align 4
+ ret void
+}
+
+
+!1 = !{i32 0, i32 640}
+;.
+; CHECK: [[RNG0]] = !{i32 0, i32 640}
+;.
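As a worked illustration of why the `icmp slt ... 1024` in `@test1` and `@test2` can fold to `true`, the range arithmetic can be sketched in plain C++. This is only a sketch under stated assumptions: `Range`, `udivByConst`, `addConst`, `hull`, and `allBelow` are hypothetical helpers, not LLVM's `ConstantRange` API, and they model non-wrapping unsigned intervals only. All values involved are small and non-negative, so the unsigned comparison here agrees with the signed `slt` in the test.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::ConstantRange: a non-empty, non-wrapping
// half-open interval [Lo, Hi) of unsigned values.
struct Range {
  uint64_t Lo, Hi;
};

// Range of X / D for X in R and a constant divisor D > 0.
Range udivByConst(Range R, uint64_t D) {
  assert(D > 0 && R.Lo < R.Hi);
  return {R.Lo / D, (R.Hi - 1) / D + 1};
}

// Range of X + C for X in R, assuming the addition cannot overflow.
Range addConst(Range R, uint64_t C) { return {R.Lo + C, R.Hi + C}; }

// Smallest single interval covering both inputs (one range for all lanes).
Range hull(Range A, Range B) {
  return {std::min(A.Lo, B.Lo), std::max(A.Hi, B.Hi)};
}

// True if every value in R is strictly below Bound.
bool allBelow(Range R, uint64_t Bound) { return R.Hi <= Bound; }

bool test1Folds() {
  Range Tid{0, 640};                 // !range !{i32 0, i32 640}
  Range Udiv = udivByConst(Tid, 5);  // [0, 128)
  Range Add1 = addConst(Udiv, 768);  // [768, 896)
  Range Add2 = addConst(Udiv, 896);  // [896, 1024)
  return allBelow(hull(Add1, Add2), 1024); // lanes lie in [768, 1024)
}

bool test2Folds() {
  Range Add2 = addConst(udivByConst(Range{0, 640}, 5), 896); // [896, 1024)
  Range Lane1{1, 2}; // constant lane `i32 1`; the poison lane is skipped
  return allBelow(hull(Add2, Lane1), 1024); // lanes lie in [1, 1024)
}
```

Under these assumptions both `test1Folds()` and `test2Folds()` return `true`, which matches the `<i1 true, i1 true>` constants in the CHECK lines.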
✅ With the latest revision this PR passed the C/C++ code formatter.
LGTM
I will merge the changes once the pipeline is clean. Thank you for reviewing this change.
Currently, the LVI analysis pass does not support the InsertElementInst vector instruction, so some optimization opportunities are missed. For example, the ICMP instruction below could be folded, but it is not.

```
...
%ie1 = insertelement <2 x i32> poison, i32 10, i64 0
%ie2 = insertelement <2 x i32> %ie1, i32 20, i64 1
%icmp1 = icmp slt <2 x i32> %ie2, <i32 40, i32 40>
...
```

This change adds InsertElementInst support to the LVI analysis pass to fix the motivating example.
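The merging strategy in `solveBlockValueInsertElement` can be sketched outside LLVM as follows. The function starts from the value of the inserted scalar, then merges in either the non-poison lanes of a constant base vector or the value computed for a non-constant base. This is a rough model, not the patch itself: `Interval`, `VecValue`, `laneRange`, and `commitExampleFolds` are hypothetical names, and the comparison is assumed to be signed less-than as in the PR's tests.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical closed interval [Lo, Hi] of signed values.
struct Interval {
  int64_t Lo, Hi;
};

Interval merge(Interval A, Interval B) {
  return {std::min(A.Lo, B.Lo), std::max(A.Hi, B.Hi)};
}

// Toy vector value: a constant vector whose lanes are integers or poison
// (std::nullopt) when Base is null, otherwise an insertelement of a scalar
// with range Elt into *Base.
struct VecValue {
  std::vector<std::optional<int64_t>> ConstLanes;
  const VecValue *Base = nullptr;
  Interval Elt{0, 0};
};

// One interval covering every non-poison lane; empty if all lanes are poison.
std::optional<Interval> laneRange(const VecValue &V) {
  std::optional<Interval> Result;
  auto Add = [&](Interval I) { Result = Result ? merge(*Result, I) : I; };
  if (V.Base) {
    Add(V.Elt); // range of the inserted scalar
    if (std::optional<Interval> BaseRange = laneRange(*V.Base))
      Add(*BaseRange); // merge in the vector being inserted into
  } else {
    for (const std::optional<int64_t> &Lane : V.ConstLanes)
      if (Lane) // poison lanes contribute nothing
        Add({*Lane, *Lane});
  }
  return Result;
}

// Models: %ie1 = insert 10 into poison; %ie2 = insert 20 into %ie1;
// then asks whether every lane is < 40.
bool commitExampleFolds() {
  VecValue Poison{{std::nullopt, std::nullopt}};
  VecValue Ie1{{}, &Poison, {10, 10}};
  VecValue Ie2{{}, &Ie1, {20, 20}};
  std::optional<Interval> R = laneRange(Ie2); // [10, 20]
  return R && R->Hi < 40;
}
```

Like the patch, this sketch keeps a single range for all lanes instead of tracking each lane separately, which is coarser but sufficient to fold the comparison in the motivating example.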
The Linux build is failing, and the error does not seem to be related to my change. I'll wait for the setup to recover.
@rajatbajpai You can ignore the Linux build.
@nikic could you please merge this PR for me? I don't have permission to merge a PR. Thanks!
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/2815 Here is the relevant piece of the build log for the reference:
Hello, I bisected a crash back to this patch:

Reduced test case:
@g = external global i32
define <2 x i16> @test() {
  %ins = insertelement <2 x i16> poison, i16 ptrtoint (ptr @g to i16), i32 0
  ret <2 x i16> %ins
}
@mikaelholmen Should be fixed by 7a7a426.
That was quick. Thanks!
Thanks @mikaelholmen for reporting the issue, and @nikic for the quick fix.