Commit e1e20c0

[SLP]Fix bitwidth analysis for signed nodes, incoming into UITOFP nodes
If a signed node is an operand of a UITOFP node, the bitwidth analysis should use the minimum of the incoming bitwidth and the bitwidth of the UITOFP node. Fixes #129244
1 parent 56cc929 commit e1e20c0
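
To illustrate why the narrowing matters, here is a minimal scalar sketch of one vector lane. It assumes, as the updated CHECK lines in the test suggest, that the value is sign-extended from i32 to i64 before the unsigned conversion. The helper names `uitofpNarrow` and `uitofpAfterSext` are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `uitofp i32 %x to double`:
// the 32 bits of x are reinterpreted as an unsigned value.
double uitofpNarrow(int32_t x) {
  return static_cast<double>(static_cast<uint32_t>(x));
}

// Hypothetical model of `sext i32 %x to i64` followed by
// `uitofp i64 ... to double`: the sign bit is replicated into the
// upper 32 bits before the unsigned conversion.
double uitofpAfterSext(int32_t x) {
  int64_t wide = static_cast<int64_t>(x);                   // sext
  return static_cast<double>(static_cast<uint64_t>(wide));  // uitofp
}
```

For non-negative inputs the two helpers agree, so narrowing would be harmless. For `x = -1` the narrow form yields 4294967295.0, while the sign-extended form yields 18446744073709551616.0 (2^64 after rounding to double). That difference is why the analysis should not shrink a possibly negative operand of UITOFP below the width the conversion expects.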

2 files changed: +10 -1 lines changed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Lines changed: 8 additions & 0 deletions
@@ -18423,6 +18423,14 @@ void BoUpSLP::computeMinimumValueSizes() {
         return Known.isNonNegative();
       });
 
+    if (!IsKnownPositive && !IsTopRoot && E.UserTreeIndex &&
+        E.UserTreeIndex.UserTE->hasState() &&
+        E.UserTreeIndex.UserTE->getOpcode() == Instruction::UIToFP)
+      MaxBitWidth =
+          std::min(DL->getTypeSizeInBits(
+                       E.UserTreeIndex.UserTE->Scalars.front()->getType()),
+                   DL->getTypeSizeInBits(ScalarTy));
+
     // We first check if all the bits of the roots are demanded. If they're not,
     // we can truncate the roots to this narrower type.
     for (Value *Root : E.Scalars) {

llvm/test/Transforms/SLPVectorizer/X86/uitofp-with-signed-value-bitwidth.ll

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,8 @@ define i32 @test(ptr %d, i32 %0) {
 ; CHECK-NEXT: [[ENTRY:.*:]]
 ; CHECK-NEXT: [[TMP1:%.*]] = insertelement <2 x i32> poison, i32 [[TMP0]], i32 0
 ; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <2 x i32> [[TMP1]], <2 x i32> poison, <2 x i32> zeroinitializer
-; CHECK-NEXT: [[TMP3:%.*]] = uitofp <2 x i32> [[TMP2]] to <2 x double>
+; CHECK-NEXT: [[TMP8:%.*]] = sext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT: [[TMP3:%.*]] = uitofp <2 x i64> [[TMP8]] to <2 x double>
 ; CHECK-NEXT: [[TMP4:%.*]] = fdiv <2 x double> [[TMP3]], zeroinitializer
 ; CHECK-NEXT: [[TMP5:%.*]] = fcmp ogt <2 x double> [[TMP4]], zeroinitializer
 ; CHECK-NEXT: [[TMP6:%.*]] = extractelement <2 x i1> [[TMP5]], i32 1
