Commit 2b0e80c

vchuravy authored and giordano committed
[X86] Use fence(seq_cst) in IdempotentRMWIntoFencedLoad (llvm#126521)
This extends the optimization to cases where the subtarget lacks MFENCE (`!hasMFence`) or the sync scope is SyncScope SingleThread, by emitting `fence seq_cst` instead of calling `llvm.x86.sse2.mfence` directly. (cherry picked from commit ed6bde9)
1 parent 6ca3b70 commit 2b0e80c

3 files changed, 665 additions and 129 deletions

llvm/lib/Target/X86/X86ISelLowering.cpp

Lines changed: 3 additions & 14 deletions
@@ -31788,21 +31788,10 @@ X86TargetLowering::lowerIdempotentRMWIntoFencedLoad(AtomicRMWInst *AI) const {
   // otherwise, we might be able to be more aggressive on relaxed idempotent
   // rmw. In practice, they do not look useful, so we don't try to be
   // especially clever.
-  if (SSID == SyncScope::SingleThread)
-    // FIXME: we could just insert an ISD::MEMBARRIER here, except we are at
-    // the IR level, so we must wrap it in an intrinsic.
-    return nullptr;
-
-  if (!Subtarget.hasMFence())
-    // FIXME: it might make sense to use a locked operation here but on a
-    // different cache-line to prevent cache-line bouncing. In practice it
-    // is probably a small win, and x86 processors without mfence are rare
-    // enough that we do not bother.
-    return nullptr;
 
-  Function *MFence =
-      llvm::Intrinsic::getOrInsertDeclaration(M, Intrinsic::x86_sse2_mfence);
-  Builder.CreateCall(MFence, {});
+  // Use `fence seq_cst` over `llvm.x64.sse2.mfence` here to get the correct
+  // lowering for SSID == SyncScope::SingleThread and !hasMFence
+  Builder.CreateFence(AtomicOrdering::SequentiallyConsistent, SSID);
 
   // Finally we can emit the atomic load.
   LoadInst *Loaded = Builder.CreateAlignedLoad(
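
For readers less familiar with the transformation this function performs, here is a minimal source-level sketch in standard C++. It is an analogy, not code from the commit: the function names are hypothetical, and the exact ordering LLVM puts on the emitted load is not shown in this hunk, so `seq_cst` on the loads is an assumption. An idempotent RMW (for example, OR with 0) leaves memory unchanged, so it can be expressed as a fence followed by an atomic load. Emitting a generic fence rather than an MFENCE call leaves instruction selection to the backend, which is what makes the single-thread and no-MFENCE cases workable.

```cpp
#include <atomic>
#include <cassert>

// Idempotent RMW: OR-ing with 0 never changes the stored value, so the
// operation matters only for its ordering and for the value it returns.
inline int read_via_idempotent_rmw(std::atomic<int> &x) {
    return x.fetch_or(0, std::memory_order_seq_cst);
}

// Source-level analogue of the rewritten form: a sequentially consistent
// fence followed by an atomic load. (Assumption: seq_cst on the load.)
inline int read_via_fenced_load(std::atomic<int> &x) {
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return x.load(std::memory_order_seq_cst);
}

// Rough analogue of the SyncScope::SingleThread case: a signal fence
// orders accesses only with respect to a signal handler running on the
// same thread, so it constrains the compiler without needing a hardware
// barrier instruction.
inline int read_via_signal_fenced_load(std::atomic<int> &x) {
    std::atomic_signal_fence(std::memory_order_seq_cst);
    return x.load(std::memory_order_seq_cst);
}

// Hypothetical helper: all three forms return the same value and leave
// the variable unmodified.
inline void self_check() {
    std::atomic<int> x{42};
    assert(read_via_idempotent_rmw(x) == 42);
    assert(read_via_fenced_load(x) == 42);
    assert(read_via_signal_fenced_load(x) == 42);
    assert(x.load() == 42);
}
```

The sketch only shows that the forms agree on the value read; it does not demonstrate the memory-ordering guarantees, which depend on how the backend lowers each fence.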
