Commit f42b930

[SLP] Pessimistically handle unknown vector entries in SLP vectorizer (#75438)
SLP Vectorizer can discard vector entries at unknown positions. This example shows the behaviour: https://godbolt.org/z/or43EM594

The following instruction inserts an element at an unknown position:

```
%2 = insertelement <3 x i64> poison, i64 %value, i64 %position
```

The position depends on an argument that is unknown at compile time. After running SLP, no instruction referencing `%position` remains.

This happens because SLP vectorizes the two adds in the example. It then needs to merge the original vector with the new vector. Within `isUndefVector`, the SLP vectorizer constructs a bitmap indicating which elements of the original vector are poison values. It does this by walking the `insertelement` instructions. If it encounters an insert with a non-constant position, that insert is ignored. As a result, poison values are used for every entry that has no insert with a constant position.

However, as the position is unknown, the element could be anywhere. Therefore, I think it is only safe to assume none of the entries are poison values and to simply take them all over when constructing the `shufflevector` instruction.

This fixes #75437
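The bitmap logic described above can be sketched as a small standalone model. This is not the LLVM implementation: `Insert` and `poisonLanes` are hypothetical names, the `UseMask` handling of the real `isUndefVector` is left out, and the base vector is assumed to be `poison`, as in the example.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for one insertelement in the chain. Index is
// std::nullopt when the insert position is not a compile-time constant.
struct Insert {
  std::optional<std::size_t> Index;
};

// Returns one flag per lane; true means "this lane may be treated as poison".
// Assumes the chain starts from a poison base vector.
// PessimisticOnUnknown == false models the old behaviour (skip the insert),
// PessimisticOnUnknown == true models the behaviour after this commit.
std::vector<bool> poisonLanes(std::size_t NumLanes,
                              const std::vector<Insert> &Chain,
                              bool PessimisticOnUnknown) {
  std::vector<bool> Res(NumLanes, true);
  for (const Insert &I : Chain) {
    if (!I.Index) {
      if (PessimisticOnUnknown)
        // The element could be in any lane, so no lane is known to be poison.
        return std::vector<bool>(NumLanes, false);
      continue; // old behaviour: the insert is silently dropped
    }
    if (*I.Index < NumLanes)
      Res[*I.Index] = false; // a constant-index insert defines this lane
  }
  return Res;
}
```

For the example, the chain holds the single unknown-position insert. With `PessimisticOnUnknown` set to `false` the model reports all three lanes as poison, which is how the inserted value gets lost. With `true` it reports none, so all lanes of the original vector are kept.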
1 parent 4888218 commit f42b930

2 files changed: +29 -2 lines changed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Lines changed: 4 additions & 2 deletions
@@ -391,8 +391,10 @@ static SmallBitVector isUndefVector(const Value *V,
       if (isa<T>(II->getOperand(1)))
         continue;
       std::optional<unsigned> Idx = getInsertIndex(II);
-      if (!Idx)
-        continue;
+      if (!Idx) {
+        Res.reset();
+        return Res;
+      }
       if (*Idx < UseMask.size() && !UseMask.test(*Idx))
         Res.reset(*Idx);
     }
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=slp-vectorizer -S | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define <3 x i64> @ahyes(i64 %position, i64 %value) {
+; CHECK-LABEL: define <3 x i64> @ahyes(
+; CHECK-SAME: i64 [[POSITION:%.*]], i64 [[VALUE:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = insertelement <2 x i64> poison, i64 [[VALUE]], i32 0
+; CHECK-NEXT: [[TMP1:%.*]] = shufflevector <2 x i64> [[TMP0]], <2 x i64> poison, <2 x i32> zeroinitializer
+; CHECK-NEXT: [[TMP2:%.*]] = add <2 x i64> [[TMP1]], <i64 1, i64 2>
+; CHECK-NEXT: [[TMP3:%.*]] = insertelement <3 x i64> poison, i64 [[VALUE]], i64 [[POSITION]]
+; CHECK-NEXT: [[TMP4:%.*]] = shufflevector <2 x i64> [[TMP2]], <2 x i64> poison, <3 x i32> <i32 0, i32 1, i32 poison>
+; CHECK-NEXT: [[TMP5:%.*]] = shufflevector <3 x i64> [[TMP3]], <3 x i64> [[TMP4]], <3 x i32> <i32 3, i32 4, i32 2>
+; CHECK-NEXT: ret <3 x i64> [[TMP5]]
+;
+entry:
+  %0 = add i64 %value, 1
+  %1 = add i64 %value, 2
+  %2 = insertelement <3 x i64> poison, i64 %value, i64 %position
+  %3 = insertelement <3 x i64> %2, i64 %0, i64 0
+  %4 = insertelement <3 x i64> %3, i64 %1, i64 1
+  ret <3 x i64> %4
+}
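
To see why the expected output in the test preserves the original semantics, the scalar input and the vectorized output can be modelled lane by lane. The following is a rough sketch, not part of the commit: `Lane`, `Vec3`, `scalarResult` and `vectorizedResult` are hypothetical names, `std::nullopt` stands in for a poison lane, and an out-of-range `%position` is modelled as leaving the whole intermediate vector poison.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

using Lane = std::optional<std::int64_t>; // std::nullopt models a poison lane
using Vec3 = std::array<Lane, 3>;

// Models the scalar input of the test: %2, %3 and %4 in order.
Vec3 scalarResult(std::int64_t Value, std::uint64_t Position) {
  Vec3 V{};           // poison base vector
  if (Position < 3)   // out-of-range insert: modelled as all lanes poison
    V[Position] = Value;
  V[0] = Value + 1;   // %3 = insertelement ..., i64 %0, i64 0
  V[1] = Value + 2;   // %4 = insertelement ..., i64 %1, i64 1
  return V;
}

// Models the CHECK lines: TMP3, TMP4 and the final shuffle with mask <3, 4, 2>.
Vec3 vectorizedResult(std::int64_t Value, std::uint64_t Position) {
  Vec3 Tmp3{};        // insertelement <3 x i64> poison, VALUE, POSITION
  if (Position < 3)
    Tmp3[Position] = Value;
  Vec3 Tmp4{Value + 1, Value + 2, std::nullopt}; // widened result of the adds
  // shufflevector indexes into the concatenation of its two operands.
  const std::array<Lane, 6> Concat{Tmp3[0], Tmp3[1], Tmp3[2],
                                   Tmp4[0], Tmp4[1], Tmp4[2]};
  const std::array<std::size_t, 3> Mask{3, 4, 2};
  Vec3 Out{};
  for (std::size_t I = 0; I < Mask.size(); ++I)
    Out[I] = Concat[Mask[I]];
  return Out;
}
```

In this model the two functions agree for every position. Lanes 0 and 1 come from the vectorized adds, and lane 2 is taken from the vector that still carries the unknown-position insert, so the value survives when `%position` is 2.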
