Commit f4cca3f

carlobertolli authored and ronlieb committed
Emit fast FP atomics for gfx942. This should not include atomic compare.
Change-Id: I7851050301401ed24652dc975afe7066dd5f03b8
1 parent df9121a commit f4cca3f

1 file changed, +4 -3 lines changed

clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp

Lines changed: 4 additions & 3 deletions
@@ -4278,9 +4278,10 @@ llvm::Value *CGOpenMPRuntimeGPU::getXteamRedSum(
 bool CGOpenMPRuntimeGPU::supportFastFPAtomics() {
   CudaArch Arch = getCudaArch(CGM);
   switch (Arch) {
-  case CudaArch::GFX90a:
-    return true;
-  default:
+  case CudaArch::GFX90a:
+  case CudaArch::GFX942:
+    return true;
+  default:
     break;
   }
   return false;
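
The change above relies on stacked `case` labels sharing one `return true`. A minimal standalone sketch of that pattern follows. It is not the clang source: `Arch` is a hypothetical stand-in for `CudaArch`, and `AtomicKind` and `useFastFPAtomic` are assumed names that only illustrate the commit message's note that atomic compare should stay off the fast path.

```cpp
#include <cassert>

// Hypothetical stand-in for clang's CudaArch enum (only a few values).
enum class Arch { GFX908, GFX90a, GFX942, Other };

// Hypothetical classification of atomic operations, for illustration only.
enum class AtomicKind { Update, Compare };

// Same shape as supportFastFPAtomics() after this commit:
// both case labels fall through to a single `return true`.
inline bool supportFastFPAtomics(Arch A) {
  switch (A) {
  case Arch::GFX90a:
  case Arch::GFX942:
    return true;
  default:
    break;
  }
  return false;
}

// Assumed caller-side gate: atomic compare never takes the fast path,
// even on an architecture that supports fast FP atomics.
inline bool useFastFPAtomic(Arch A, AtomicKind K) {
  return K != AtomicKind::Compare && supportFastFPAtomics(A);
}
```

Under these assumptions, `useFastFPAtomic(Arch::GFX942, AtomicKind::Update)` is true, while the same call with `AtomicKind::Compare` is false. Where the real code generator applies the atomic compare exclusion is not shown in this diff.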
