Commit fffe9a1

[clang][FE][Cuda] Fix a sm90a cuda arch define check in TargetInfo (#12885)
The original upstream commit, [llvm-project/commit/631c6e8](llvm/llvm-project@631c6e8), conditionally defines `__CUDA_ARCH_FEAT_SM90_ALL` separately from `__CUDA_ARCH__`, but we broke this by turning it into an either-or decision in an if-else block. As a result, we were not setting the definitions correctly for sm90a in upstream's `clang -x cuda` execution mode. I believe this slipped in as an incorrectly resolved merge during an upstream pulldown.
1 parent f26d984 commit fffe9a1

1 file changed: +2 -2 lines changed

clang/lib/Basic/Targets/NVPTX.cpp

Lines changed: 2 additions & 2 deletions
@@ -284,10 +284,10 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
 
     if (Opts.SYCLIsDevice) {
       Builder.defineMacro("__SYCL_CUDA_ARCH__", CUDAArchCode);
-    } else if (GPU == CudaArch::SM_90a) {
-      Builder.defineMacro("__CUDA_ARCH_FEAT_SM90_ALL", "1");
     } else {
       Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);
+      if (GPU == CudaArch::SM_90a)
+        Builder.defineMacro("__CUDA_ARCH_FEAT_SM90_ALL", "1");
     }
   }
 }
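
To make the behaviour change in the diff above easier to see, here is a minimal standalone sketch of the two control flows. It is not the clang code: `MacroSet`, `definesBeforeFix`, and `definesAfterFix` are hypothetical names introduced only for illustration, the two booleans stand in for `Opts.SYCLIsDevice` and `GPU == CudaArch::SM_90a`, and only macro names (not values) are tracked.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the macro builder: records macro names only.
using MacroSet = std::set<std::string>;

// Control flow before the fix: the sm_90a branch is an alternative to the
// __CUDA_ARCH__ branch, so a non-SYCL sm_90a compile never gets __CUDA_ARCH__.
MacroSet definesBeforeFix(bool syclIsDevice, bool isSM90a) {
  MacroSet macros;
  if (syclIsDevice) {
    macros.insert("__SYCL_CUDA_ARCH__");
  } else if (isSM90a) {
    macros.insert("__CUDA_ARCH_FEAT_SM90_ALL");
  } else {
    macros.insert("__CUDA_ARCH__");
  }
  return macros;
}

// Control flow after the fix: __CUDA_ARCH__ is always defined on the
// non-SYCL path, and the sm_90a feature macro is added on top of it.
MacroSet definesAfterFix(bool syclIsDevice, bool isSM90a) {
  MacroSet macros;
  if (syclIsDevice) {
    macros.insert("__SYCL_CUDA_ARCH__");
  } else {
    macros.insert("__CUDA_ARCH__");
    if (isSM90a)
      macros.insert("__CUDA_ARCH_FEAT_SM90_ALL");
  }
  return macros;
}
```

Under these assumptions, `definesBeforeFix(false, true)` yields only `__CUDA_ARCH_FEAT_SM90_ALL`, while `definesAfterFix(false, true)` yields both `__CUDA_ARCH__` and `__CUDA_ARCH_FEAT_SM90_ALL`. The SYCL device path is the same in both versions.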
