[NVPTX] Add -march=general option to mirror default configuration #85222
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-clang
Author: Yichen Yan (oraluben)
Changes
This PR adds -march=generic support for the NVPTX backend.
With this PR, users can explicitly request the default CUDA architecture. This default is regularly updated, and the most recent configuration as of commit ab202aa sets it to sm_52.
This PR does not address any compatibility issues between different CUDA versions.
Full diff: https://github.com/llvm/llvm-project/pull/85222.diff
2 Files Affected:
diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp b/clang/lib/Driver/ToolChains/Cuda.cpp
index c6007d3cfab864..4cb98f9f28963c 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -750,8 +750,8 @@ NVPTXToolChain::TranslateArgs(const llvm::opt::DerivedArgList &Args,
     if (!llvm::is_contained(*DAL, A))
       DAL->append(A);
 
-  // TODO: We should accept 'generic' as a valid architecture.
-  if (!DAL->hasArg(options::OPT_march_EQ) && OffloadKind != Action::OFK_None) {
+  if ((!DAL->hasArg(options::OPT_march_EQ) && OffloadKind != Action::OFK_None) ||
+      (DAL->getLastArgValue(options::OPT_march_EQ) == "generic")) {
     DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
                       CudaArchToString(CudaArch::CudaDefault));
   } else if (DAL->getLastArgValue(options::OPT_march_EQ) == "native") {
diff --git a/clang/test/Driver/cuda-cross-compiling.c b/clang/test/Driver/cuda-cross-compiling.c
index 086840accebe7f..e5aeca8300f85c 100644
--- a/clang/test/Driver/cuda-cross-compiling.c
+++ b/clang/test/Driver/cuda-cross-compiling.c
@@ -32,10 +32,15 @@
//
// RUN: %clang -target nvptx64-nvidia-cuda -march=sm_61 -### %s 2>&1 \
// RUN: | FileCheck -check-prefix=ARGS %s
+// RUN: %clang -target nvptx64-nvidia-cuda -march=generic -### %s 2>&1 \
+// RUN: | FileCheck -check-prefix=GENERIC %s
// ARGS: -cc1" "-triple" "nvptx64-nvidia-cuda" "-S" {{.*}} "-target-cpu" "sm_61" "-target-feature" "+ptx{{[0-9]+}}" {{.*}} "-o" "[[PTX:.+]].s"
// ARGS-NEXT: ptxas{{.*}}"-m64" "-O0" "--gpu-name" "sm_61" "--output-file" "[[CUBIN:.+]].cubin" "[[PTX]].s" "-c"
// ARGS-NEXT: nvlink{{.*}}"-o" "a.out" "-arch" "sm_61" {{.*}} "[[CUBIN]].cubin"
+// GENERIC: -cc1" "-triple" "nvptx64-nvidia-cuda" "-S" {{.*}} "-target-cpu" "sm_52" "-target-feature" "+ptx{{[0-9]+}}" {{.*}} "-o" "[[PTX:.+]].s"
+// GENERIC-NEXT: ptxas{{.*}}"-m64" "-O0" "--gpu-name" "sm_52" "--output-file" "[[CUBIN:.+]].cubin" "[[PTX]].s" "-c"
+// GENERIC-NEXT: nvlink{{.*}}"-o" "a.out" "-arch" "sm_52" {{.*}} "[[CUBIN]].cubin"
//
// Test the generated arguments to the CUDA binary utils when targeting NVPTX.
@@ -85,6 +90,6 @@
// MISSING: error: Must pass in an explicit nvptx64 gpu architecture to 'nvlink'
// RUN: %clang -target nvptx64-nvidia-cuda -flto -c %s -### 2>&1 \
-// RUN: | FileCheck -check-prefix=GENERIC %s
+// RUN: | FileCheck -check-prefix=COMPILE %s
-// GENERIC-NOT: -cc1" "-triple" "nvptx64-nvidia-cuda" {{.*}} "-target-cpu"
+// COMPILE-NOT: -cc1" "-triple" "nvptx64-nvidia-cuda" {{.*}} "-target-cpu"
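For readers skimming the diff, the condition in this revision can be modeled outside the driver. The following is a minimal sketch, not the driver code itself: resolveArchFirstRevision, kDefaultArch, and the offloading flag are hypothetical stand-ins for the last -march= value handling, CudaArchToString(CudaArch::CudaDefault), and OffloadKind != Action::OFK_None respectively. The value sm_52 is taken from the GENERIC check lines in the test, and the native branch is left out.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for CudaArchToString(CudaArch::CudaDefault).
// "sm_52" is the value the GENERIC check lines in the test expect.
constexpr const char *kDefaultArch = "sm_52";

// Simplified model of the condition in this revision of the patch.
// `march` is the last -march= value (empty if the flag was not passed);
// `offloading` stands in for OffloadKind != Action::OFK_None.
std::optional<std::string>
resolveArchFirstRevision(const std::optional<std::string> &march,
                         bool offloading) {
  const bool isGeneric = march.has_value() && *march == "generic";
  // Either no -march while offloading, or an explicit -march=generic:
  // both paths select the default architecture.
  if ((!march.has_value() && offloading) || isGeneric)
    return std::string(kDefaultArch);
  // Any other explicit value (or no value at all) is left untouched.
  return march;
}
```

Under this model, -march=generic always resolves to a concrete architecture, so a -target-cpu is still emitted in that case.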
✅ With the latest revision this PR passed the C/C++ code formatter.
Thanks for looking at this. When the user compiles with -march=xyz, it introduces a lot of subtarget-specific metadata into the output IR. The purpose of the original patch was to keep -target-cpu unset in cases where -march=xyz was not passed in. The expected semantics here are that, given -march=sm_52 -march=generic, the later -march=generic overrides -march=sm_52 and results in no -target-cpu being set, just as if you didn't pass -march at all.
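The semantics described in this comment can be sketched as a small standalone function. This is only an illustration under stated assumptions: targetCpuFor is a hypothetical name, the vector stands in for the -march= values in command-line order, and nothing here is the actual driver implementation.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: returns the value that would be forwarded as
// -target-cpu, or std::nullopt when -target-cpu should stay unset.
// `marchValues` holds every -march= value in command-line order.
std::optional<std::string>
targetCpuFor(const std::vector<std::string> &marchValues) {
  // No -march at all: leave -target-cpu unset.
  if (marchValues.empty())
    return std::nullopt;
  // The last -march on the command line wins.
  const std::string &last = marchValues.back();
  // "generic" behaves as if -march had never been passed.
  if (last == "generic")
    return std::nullopt;
  return last;
}
```

With this model, {"sm_52", "generic"} yields no target CPU, while {"generic", "sm_52"} yields sm_52, because only the final value is considered.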
cb795dd to be93832
Co-authored-by: Joseph Huber <[email protected]>
be93832 to b0ae86c
FWIW I think you can kind of do this with |
73c9e0f to 60a8c03
LG, thanks for the patch.
Co-authored-by: Joseph Huber <[email protected]>
Thanks, I'll merge it once it passes CI.
@oraluben Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors and then tested. Please check whether any reported problems have been caused by your change specifically. If your change does cause a problem, it may be reverted, or you can revert it yourself. If you don't get any reports, no action is required from you. Your changes are working as expected, well done!
This PR adds -march=generic support for the NVPTX backend. This fulfills a TODO introduced in #79873.
With this PR, users can explicitly request the "default" CUDA architecture, which ensures that no specific architecture is set.
This PR does not address any compatibility issues between different CUDA versions.