
Commit 6dab69f

[SYCL][CUDA] Return invalid subgroup size warning (#6183)
This is a solution to #6103 for the CUDA case only. The HIP AMD case still needs to be considered, as discussed here: #6103 (comment).

CUDA currently supports only one subgroup (warp) size, 32, for all devices. This PR introduces a solution to #6103 appropriate for backends that support only a single subgroup size: if the optional kernel attribute reqd_sub_group_size() is used with the supported subgroup size, the code compiles and behaves as the programmer intends. If reqd_sub_group_size() is used with an incompatible subgroup size, a warning is emitted at compile time, such as:

reqd-sub-group-size-cuda.cpp:12:73: warning: attribute argument 8 is invalid and will be ignored; CUDA requires sub_group size 32 [-Wcuda-compat]
h.single_task<class invalid_kernel>([=] [[sycl::reqd_sub_group_size(8)]] {});
^

Signed-off-by: JackAKirk [email protected]
1 parent 467df22 commit 6dab69f
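
The rule described in the commit message can be sketched in isolation as a small standalone C++17 function. This is only an illustration of the decision logic, not the clang implementation: `checkReqdSubGroupSize`, its `targetIsNVPTX` parameter, and the `kCudaSubGroupSize` constant are hypothetical names introduced here, and the sketch does not model the earlier check in Sema that rejects non-positive arguments with an error.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper (not part of clang): mirrors the rule from the commit
// message. On an NVPTX target, any requested sub-group size other than 32
// yields a warning message; otherwise no diagnostic is produced.
inline std::optional<std::string>
checkReqdSubGroupSize(bool targetIsNVPTX, std::int64_t requestedSize) {
  // Assumption taken from the commit message: CUDA supports only size 32.
  constexpr std::int64_t kCudaSubGroupSize = 32;
  if (targetIsNVPTX && requestedSize != kCudaSubGroupSize) {
    return "attribute argument " + std::to_string(requestedSize) +
           " is invalid and will be ignored; CUDA requires sub_group size 32";
  }
  // Supported size, or a non-NVPTX target: nothing to report.
  return std::nullopt;
}
```

Under these assumptions, `checkReqdSubGroupSize(true, 8)` yields the warning text quoted above, while `checkReqdSubGroupSize(true, 32)` and `checkReqdSubGroupSize(false, 8)` yield no diagnostic.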

3 files changed: +28 -0 lines changed


clang/include/clang/Basic/DiagnosticSemaKinds.td

Lines changed: 4 additions & 0 deletions
@@ -3288,6 +3288,10 @@ def err_attribute_argument_is_zero : Error<
 def warn_attribute_argument_n_negative : Warning<
   "%0 attribute parameter %1 is negative and will be ignored">,
   InGroup<CudaCompat>;
+def warn_reqd_sub_group_attribute_cuda_n_32
+    : Warning<"attribute argument %0 is invalid and will be ignored; CUDA "
+              "requires sub_group size 32">,
+      InGroup<CudaCompat>;
 def err_property_function_in_objc_container : Error<
   "use of Objective-C property in function nested in Objective-C "
   "container not supported, move function outside its container">;

clang/lib/Sema/SemaDeclAttr.cpp

Lines changed: 4 additions & 0 deletions
@@ -3886,6 +3886,10 @@ void Sema::AddIntelReqdSubGroupSize(Decl *D, const AttributeCommonInfo &CI,
           << CI << /*positive*/ 0;
       return;
     }
+    if (Context.getTargetInfo().getTriple().isNVPTX() && ArgVal != 32) {
+      Diag(E->getExprLoc(), diag::warn_reqd_sub_group_attribute_cuda_n_32)
+          << ArgVal.getSExtValue();
+    }
 
     // Check to see if there's a duplicate attribute with different values
     // already applied to the declaration.
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+// RUN: %clang_cc1 -fsycl-is-device -triple nvptx -internal-isystem %S/Inputs -std=c++2b -verify %s
+//
+// This tests that a warning is returned when a sub group size other than 32 is
+// requested in the CUDA backend via the reqd_sub_group_size() kernel attribute.
+#include "sycl.hpp"
+
+int main() {
+
+  sycl::queue Q;
+
+  Q.submit([&](sycl::handler &h) {
+    h.single_task<class invalid_kernel>([=] [[sycl::reqd_sub_group_size(8)]] {}); // expected-warning {{attribute argument 8 is invalid and will be ignored; CUDA requires sub_group size 32}}
+  });
+
+  Q.submit([&](sycl::handler &h) {
+    h.single_task<class valid_kernel>([=] [[sycl::reqd_sub_group_size(32)]] {});
+  });
+
+  return 0;
+}
