
Commit ba818c4

[DAG] replaceStoreOfInsertLoad - don't fold if the inserted element is implicitly truncated
D152276 wasn't handling the case where the inserted element is implicitly truncated into the vector, resulting in an i1 element (implicitly truncated from i8) overwriting 8 bits instead of 1 bit.

This patch is intended to be merged into 17.x, so I've just disallowed any vector element vs inserted element type mismatch. Technically we could be more elegant and permit truncated stores (as long as the store is still byte sized), but the use cases for that are so limited I'd prefer to play it safe for now.

Candidate patch for llvm#64655 17.x merge

Differential Revision: https://reviews.llvm.org/D158366
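To make the failure mode concrete, here is a minimal sketch in plain C++ (not LLVM code). It assumes a `<8 x i1>` vector is packed into a single byte with lane i in bit i, and it adds a trailing byte so the stray write is observable. The names `Buffer`, `insertBitCorrect` and `insertBitMiscompiled` are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical memory model: byte 0 holds the packed <8 x i1> vector
// (lane i in bit i); byte 1 is unrelated neighbouring memory.
using Buffer = std::array<std::uint8_t, 2>;

// Intended semantics of load + insertelement + store: only bit `idx` of the
// vector byte changes, everything else is preserved.
inline Buffer insertBitCorrect(Buffer mem, unsigned idx, bool value) {
  assert(idx < 8 && "lane index out of range for <8 x i1>");
  const std::uint8_t mask = static_cast<std::uint8_t>(1u << idx);
  mem[0] = static_cast<std::uint8_t>((mem[0] & ~mask) | (value ? mask : 0));
  return mem;
}

// Roughly what the pre-patch fold amounted to in the test case: the inserted
// element has type i8, so the store is emitted as a whole byte at byte offset
// `idx`, instead of a single bit inside byte 0.
inline Buffer insertBitMiscompiled(Buffer mem, unsigned idx, bool value) {
  assert(idx < mem.size() && "byte offset outside the modelled buffer");
  mem[idx] = value ? 1 : 0;
  return mem;
}
```

Under these assumptions, calling both helpers with index 1 on a buffer of `{0x00, 0xAA}` shows the difference: the correct version sets bit 1 of byte 0 and leaves the neighbour alone, while the miscompiled version leaves byte 0 untouched and overwrites the neighbouring byte, which matches the old `movb $1, 1(%rdi)` output in the test.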
1 parent 69bd66b commit ba818c4

2 files changed: +14 -3 lines changed


llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Lines changed: 4 additions & 2 deletions
@@ -20508,9 +20508,11 @@ SDValue DAGCombiner::replaceStoreOfInsertLoad(StoreSDNode *ST) {
   SDValue Elt = Value.getOperand(1);
   SDValue Idx = Value.getOperand(2);
 
-  // If the element isn't byte sized then we can't compute an offset
+  // If the element isn't byte sized or is implicitly truncated then we can't
+  // compute an offset.
   EVT EltVT = Elt.getValueType();
-  if (!EltVT.isByteSized())
+  if (!EltVT.isByteSized() ||
+      EltVT != Value.getOperand(0).getValueType().getVectorElementType())
     return SDValue();
 
   auto *Ld = dyn_cast<LoadSDNode>(Value.getOperand(0));

llvm/test/CodeGen/X86/pr64655.ll

Lines changed: 10 additions & 1 deletion
@@ -41,7 +41,16 @@ define void @f(ptr %0) {
 ;
 ; AVX512-LABEL: f:
 ; AVX512: # %bb.0:
-; AVX512-NEXT: movb $1, 1(%rdi)
+; AVX512-NEXT: kmovb (%rdi), %k0
+; AVX512-NEXT: movb $-3, %al
+; AVX512-NEXT: kmovd %eax, %k1
+; AVX512-NEXT: kandb %k1, %k0, %k0
+; AVX512-NEXT: movb $1, %al
+; AVX512-NEXT: kmovd %eax, %k1
+; AVX512-NEXT: kshiftlb $7, %k1, %k1
+; AVX512-NEXT: kshiftrb $6, %k1, %k1
+; AVX512-NEXT: korb %k1, %k0, %k0
+; AVX512-NEXT: kmovb %k0, (%rdi)
 ; AVX512-NEXT: retq
   %2 = load <8 x i1>, ptr %0
   %3 = insertelement <8 x i1> %2, i1 true, i32 1
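
As a reading aid for the updated AVX512 check lines, the following sketch replays the mask arithmetic step by step in plain C++. It models each 8-bit mask register as a `std::uint8_t` and only the low 8 bits that the byte-sized `k` instructions operate on. The function name `emulateFixedSequence` is hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Replays the post-patch AVX512 sequence on a single byte that stands in for
// the <8 x i1> vector at (%rdi). Returns the byte that would be stored back.
inline std::uint8_t emulateFixedSequence(std::uint8_t mem) {
  std::uint8_t k0 = mem;                            // kmovb (%rdi), %k0
  std::uint8_t k1 = static_cast<std::uint8_t>(-3);  // movb $-3, %al; kmovd %eax, %k1 (0xFD)
  k0 = static_cast<std::uint8_t>(k0 & k1);          // kandb %k1, %k0, %k0: clear bit 1
  k1 = 1;                                           // movb $1, %al; kmovd %eax, %k1
  k1 = static_cast<std::uint8_t>(k1 << 7);          // kshiftlb $7, %k1, %k1 (0x80)
  k1 = static_cast<std::uint8_t>(k1 >> 6);          // kshiftrb $6, %k1, %k1 (0x02)
  k0 = static_cast<std::uint8_t>(k0 | k1);          // korb %k1, %k0, %k0: set bit 1
  return k0;                                        // kmovb %k0, (%rdi)
}
```

In this model the sequence is a read-modify-write that clears and then sets bit 1 of the loaded byte, so the other seven lanes and all neighbouring memory are left unchanged. For example, an input of `0x5D` yields `0x5F`.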
