Commit 807b708

[X86][SSE41] Non-temporal loads shouldn't be folded if it can be avoided (PR32743)
Missed SSE41 non-temporal load case in previous commit

Differential Revision: https://reviews.llvm.org/D33728

llvm-svn: 304722
1 parent 4ef096b commit 807b708

2 files changed

+258
-98
lines changed

llvm/lib/Target/X86/X86InstrFragmentsSIMD.td

Lines changed: 6 additions & 2 deletions
@@ -743,9 +743,13 @@ def alignedloadv8i64 : PatFrag<(ops node:$ptr),
 // allows unaligned accesses, match any load, though this may require
 // setting a feature bit in the processor (on startup, for example).
 // Opteron 10h and later implement such a feature.
+// Avoid non-temporal aligned loads on supported targets.
 def memop : PatFrag<(ops node:$ptr), (load node:$ptr), [{
-  return Subtarget->hasSSEUnalignedMem()
-    || cast<LoadSDNode>(N)->getAlignment() >= 16;
+  return (Subtarget->hasSSEUnalignedMem() ||
+          cast<LoadSDNode>(N)->getAlignment() >= 16) &&
+         (!Subtarget->hasSSE41() ||
+          !(cast<LoadSDNode>(N)->getAlignment() >= 16 &&
+            cast<LoadSDNode>(N)->isNonTemporal()));
 }]>;

 // 128-bit memop pattern fragments
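
The new `memop` predicate can be read as "the load is foldable as before, unless it is a 16-byte-aligned non-temporal load on an SSE4.1 target". The sketch below models that condition as a standalone C++14 function so the cases can be checked in isolation. It is not code from the commit: `LoadInfo` and `memopMatches` are hypothetical names, and the struct fields stand in for the `Subtarget` and `LoadSDNode` queries used in the real predicate.

```cpp
// Hypothetical stand-in for the queries the memop predicate makes.
struct LoadInfo {
  bool hasSSEUnalignedMem; // models Subtarget->hasSSEUnalignedMem()
  bool hasSSE41;           // models Subtarget->hasSSE41()
  unsigned alignment;      // models LoadSDNode::getAlignment(), in bytes
  bool isNonTemporal;      // models LoadSDNode::isNonTemporal()
};

// Returns true when the load may be folded into an SSE instruction's
// memory operand, mirroring the boolean structure of the patched predicate.
constexpr bool memopMatches(const LoadInfo &L) {
  // Same 16-byte alignment test the predicate uses in both clauses.
  const bool aligned = L.alignment >= 16;
  // Original condition: the target tolerates unaligned SSE memory
  // operands, or the load is sufficiently aligned.
  const bool foldable = L.hasSSEUnalignedMem || aligned;
  // New exclusion: an aligned non-temporal load on an SSE4.1 target is
  // left unfolded. By De Morgan this equals the patch's
  // (!hasSSE41 || !(aligned && nonTemporal)) clause, negated.
  const bool keepNonTemporal = L.hasSSE41 && aligned && L.isNonTemporal;
  return foldable && !keepNonTemporal;
}

// Aligned non-temporal load on SSE4.1: no longer folded.
static_assert(!memopMatches({false, true, 16, true}), "");
// Same load without the non-temporal hint: still folded.
static_assert(memopMatches({false, true, 16, false}), "");
// Pre-SSE4.1 target: the non-temporal hint does not block folding.
static_assert(memopMatches({false, false, 16, true}), "");
// Under-aligned load on a target without unaligned SSE memory operands.
static_assert(!memopMatches({false, true, 8, true}), "");
```

Note that an under-aligned non-temporal load on a target with `hasSSEUnalignedMem()` still matches in this model, because the exclusion only applies when the alignment is at least 16 bytes.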
