Commit b4bb394

RKSimon authored and tru committed
[DAG] replaceStoreOfInsertLoad - don't fold if the inserted element is implicitly truncated
D152276 did not handle the case where the inserted element is implicitly truncated into the vector, resulting in an i1 element (implicitly truncated from i8) overwriting 8 bits instead of 1 bit.

This patch is intended to be merged into 17.x, so I've just disallowed any vector element vs. inserted element type mismatch. Technically we could be more elegant and permit truncated stores (as long as the store is still byte sized), but the use cases for that are so limited that I'd prefer to play it safe for now.

Candidate patch for llvm#64655 17.x merge

Differential Revision: https://reviews.llvm.org/D158366

(cherry picked from commit ba818c4)
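The failure mode described in the commit message can be illustrated with a small standalone model. This is only a sketch, not LLVM code: `Memory`, `storeAsByteElement`, and `insertBit` are hypothetical names, and the layout (an `<8 x i1>` vector packed into one byte, with element i in bit i) is an assumption chosen to match the x86 test in this commit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model: byte 0 holds the packed <8 x i1> vector (element i is
// assumed to live in bit i); byte 1 is unrelated neighbouring memory.
using Memory = std::array<std::uint8_t, 2>;

// Models the faulty fold: the inserted element is treated as a byte-sized i8,
// so the store lands at byte offset Idx * 1 and overwrites a full 8 bits.
// Idx must be 0 or 1 in this two-byte model.
inline Memory storeAsByteElement(Memory Mem, unsigned Idx, bool Bit) {
  Mem[Idx] = Bit ? 1 : 0;
  return Mem;
}

// Models the intended semantics: only bit Idx of the packed vector changes.
// Idx must be in the range 0..7.
inline Memory insertBit(Memory Mem, unsigned Idx, bool Bit) {
  const std::uint8_t Mask = static_cast<std::uint8_t>(1u << Idx);
  Mem[0] = static_cast<std::uint8_t>((Mem[0] & ~Mask) | (Bit ? Mask : 0));
  return Mem;
}
```

In this model, inserting `true` at index 1 into `{0x00, 0xAA}` with `storeAsByteElement` leaves the vector byte unchanged and clobbers the neighbouring byte, while `insertBit` sets only bit 1 of the vector byte.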
1 parent e1e4603 commit b4bb394

2 files changed, +14 -3 lines changed

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Lines changed: 4 additions & 2 deletions
@@ -20492,9 +20492,11 @@ SDValue DAGCombiner::replaceStoreOfInsertLoad(StoreSDNode *ST) {
   SDValue Elt = Value.getOperand(1);
   SDValue Idx = Value.getOperand(2);
 
-  // If the element isn't byte sized then we can't compute an offset
+  // If the element isn't byte sized or is implicitly truncated then we can't
+  // compute an offset.
   EVT EltVT = Elt.getValueType();
-  if (!EltVT.isByteSized())
+  if (!EltVT.isByteSized() ||
+      EltVT != Value.getOperand(0).getValueType().getVectorElementType())
     return SDValue();
 
   auto *Ld = dyn_cast<LoadSDNode>(Value.getOperand(0));
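The new guard can be sketched outside of LLVM as a simple predicate. This is a simplified model under stated assumptions: `SimpleVT` and `mayFoldInsertedElement` are hypothetical stand-ins, and a type is reduced to its bit width, whereas the real `EVT` comparison also distinguishes types of equal width (for example integer vs. floating point).

```cpp
#include <cassert>

// Hypothetical stand-in for a value type, reduced to its width in bits.
struct SimpleVT {
  unsigned Bits;

  // Non-zero width that is a whole number of bytes.
  bool isByteSized() const { return Bits != 0 && Bits % 8 == 0; }

  bool operator!=(const SimpleVT &Other) const { return Bits != Other.Bits; }
};

// Mirrors the shape of the patched condition: bail out (return false) when
// the inserted element is not byte sized, or when its type differs from the
// vector's element type, i.e. it would be implicitly truncated on insertion.
inline bool mayFoldInsertedElement(SimpleVT EltVT, SimpleVT VecEltVT) {
  if (!EltVT.isByteSized() || EltVT != VecEltVT)
    return false;
  return true;
}
```

With this model, the case from the bug report (an i8 element inserted into a vector of i1) is rejected by the type mismatch even though the inserted element itself is byte sized.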

llvm/test/CodeGen/X86/pr64655.ll

Lines changed: 10 additions & 1 deletion
@@ -41,7 +41,16 @@ define void @f(ptr %0) {
 ;
 ; AVX512-LABEL: f:
 ; AVX512: # %bb.0:
-; AVX512-NEXT: movb $1, 1(%rdi)
+; AVX512-NEXT: kmovb (%rdi), %k0
+; AVX512-NEXT: movb $-3, %al
+; AVX512-NEXT: kmovd %eax, %k1
+; AVX512-NEXT: kandb %k1, %k0, %k0
+; AVX512-NEXT: movb $1, %al
+; AVX512-NEXT: kmovd %eax, %k1
+; AVX512-NEXT: kshiftlb $7, %k1, %k1
+; AVX512-NEXT: kshiftrb $6, %k1, %k1
+; AVX512-NEXT: korb %k1, %k0, %k0
+; AVX512-NEXT: kmovb %k0, (%rdi)
 ; AVX512-NEXT: retq
   %2 = load <8 x i1>, ptr %0
   %3 = insertelement <8 x i1> %2, i1 true, i32 1
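To see why the longer AVX512 sequence is the correct one, the updated check lines can be traced as plain 8-bit arithmetic. The function below is an illustrative emulation only: `emulateMaskSequence` is a hypothetical name, and each mask register is modelled as a `std::uint8_t`, which ignores the upper bits that the byte-wide mask instructions do not use.

```cpp
#include <cassert>
#include <cstdint>

// Emulates the updated AVX512 check lines on the single byte that holds the
// <8 x i1> vector. Each step is annotated with the instruction it models.
inline std::uint8_t emulateMaskSequence(std::uint8_t Loaded) {
  std::uint8_t K0 = Loaded;                        // kmovb (%rdi), %k0
  std::uint8_t K1 = static_cast<std::uint8_t>(-3); // movb $-3, %al; kmovd %eax, %k1 (0xFD)
  K0 = static_cast<std::uint8_t>(K0 & K1);         // kandb: clear bit 1
  K1 = 1;                                          // movb $1, %al; kmovd %eax, %k1
  K1 = static_cast<std::uint8_t>(K1 << 7);         // kshiftlb $7: 0x80
  K1 = static_cast<std::uint8_t>(K1 >> 6);         // kshiftrb $6: 0x02
  K0 = static_cast<std::uint8_t>(K0 | K1);         // korb: set bit 1
  return K0;                                       // kmovb %k0, (%rdi)
}
```

Only bit 1 of the loaded byte changes, which matches inserting `i1 true` at index 1. The removed `movb $1, 1(%rdi)` instead wrote a whole byte at offset 1 and left the byte holding the vector untouched.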
