Skip to content

[SYCL] Implement SYCL 2020 spec functionality: no propagation from functions to the caller #3836

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 36 additions & 8 deletions clang/include/clang/Basic/AttrDocs.td
Original file line number Diff line number Diff line change
Expand Up @@ -415,6 +415,10 @@ def SYCLSimdDocs : Documentation {
The compiler may decide to compile such functions using different optimization
and code generation pipeline. Also, this attribute is used to distinguish
ESIMD private globals from regular SYCL global variables.

In SYCL 1.2.1 mode, the ``intel::sycl_explicit_simd`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.
}];
}

Expand Down Expand Up @@ -2443,8 +2447,9 @@ lambda capture, or function object member, of the callable to which the
attribute was applied. This effect is equivalent to annotating restrict on
**all** kernel pointer arguments in an OpenCL or SPIR-V kernel.

If ``intel::kernel_args_restrict`` is applied to a function called from a device
kernel, the attribute is not ignored and it is propagated to the kernel.
In SYCL 1.2.1 mode, the ``intel::kernel_args_restrict`` attribute is propagated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this risk breaking code, or was the old documentation wrong in SYCL 2020 mode? Same question applies to the other documentation instances where we go from always propagating to sometimes propagating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this risk breaking code, or was the old documentation wrong in SYCL 2020 mode? Same question applies to the other documentation instances where we go from always propagating to sometimes propagating.

Not sure i understand your question correctly.

The current PR changes the existing behavior. Only breaking part happens here - propagation with SYCL 1.2.1 and no propagation with SYCL 2020 mode when the attribute is applied to a function from a device kernel. so old documentation was wrong in SYCL 2020 mode.

from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

The attribute forms an unchecked assertion, in that implementations
do not need to check/confirm the pre-condition in any way. If a user applies
Expand Down Expand Up @@ -2482,8 +2487,10 @@ def SYCLIntelNumSimdWorkItemsAttrDocs : Documentation {
let Content = [{
Applies to a device function/lambda function. Indicates the number of work
items that should be processed in parallel. Valid values are positive integers.
If ``intel::num_simd_work_items`` is applied to a function called from a
device kernel, the attribute is not ignored and it is propagated to the kernel.

In SYCL 1.2.1 mode, the ``intel::num_simd_work_items`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

Expand Down Expand Up @@ -2633,6 +2640,11 @@ allows the Y and Z arguments to be optional. If not provided by the user, the
value of Y and Z defaults to 1. See section 5.8.1 Kernel Attributes for more
details.

In SYCL 1.2.1 mode, the ``intel::reqd_work_group_size`` or
``sycl::reqd_work_group_size`` attribute is propagated from the function it is
applied to onto the kernel which calls the function. In SYCL 2020 mode, the
attribute is not propagated to the kernel.

.. code-block:: c++

[[sycl::reqd_work_group_size(4, 4, 4)]] void foo() {}
Expand Down Expand Up @@ -2773,8 +2785,10 @@ Applies to a device function/lambda function. Indicates the maximum dimensions
of a work group. Values must be positive integers. This is similar to
reqd_work_group_size, but allows work groups that are smaller or equal to the
specified sizes.
If ``intel::max_work_group_size`` is applied to a function called from a
device kernel, the attribute is not ignored and it is propagated to the kernel.

In SYCL 1.2.1 mode, the ``intel::max_work_group_size`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

Expand Down Expand Up @@ -2805,8 +2819,10 @@ Applies to a device function/lambda function or function call operator (of a
function object). Indicates the largest valid global work dimension that will be
accepted when running the kernel on a device. Valid values are integers in a
range of [0, 3].
If ``intel::max_global_work_dim`` is applied to a function called from a
device kernel, the attribute is not ignored and it is propagated to the kernel.

In SYCL 1.2.1 mode, the ``intel::max_global_work_dim`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

Expand Down Expand Up @@ -2863,6 +2879,10 @@ device operation, guiding the FPGA backend to insert the appropriate number of
registers to break-up the combinational logic circuit, and thereby controlling
the length of the longest combinational path.

In SYCL 1.2.1 mode, the ``intel::scheduler_target_fmax_mhz`` attribute is
propagated from the function it is applied to onto the kernel which calls the
function. In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

[[intel::scheduler_target_fmax_mhz(4)]] void foo() {}
Expand Down Expand Up @@ -2893,6 +2913,10 @@ function object). If 1, compiler doesn't use the global work offset values for
the device function. Valid values are 0 and 1. If used without argument, value
of 1 is set implicitly.

In SYCL 1.2.1 mode, the ``intel::no_global_work_offset`` attribute is
propagated from the function it is applied to onto the kernel which calls the
function. In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

[[intel::no_global_work_offset]]
Expand Down Expand Up @@ -4567,6 +4591,10 @@ In addition to device functions, the required sub-group size attribute may also
be specified in the definition of a named functor object and lambda functions,
as in the examples below:

In SYCL 1.2.1 mode, the ``intel::reqd_sub_group_size`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

class Functor
Expand Down
8 changes: 8 additions & 0 deletions clang/lib/Sema/SemaDeclAttr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3369,6 +3369,14 @@ static void handleIntelNamedSubGroupSize(Sema &S, Decl *D,
SizeType)) {
S.Diag(Loc, diag::warn_attribute_type_not_supported) << AL << SizeStr;
}

// If the [[intel::named_sub_group_size]] attribute spelling is used in
// SYCL 2017 mode, we want to diagnose it as being an ignored attribute.
Comment on lines +3373 to +3374
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the docs may need to be updated for this attribute, as they currently imply there's a mode other than SYCL 2020 mode: https://github.com/intel/llvm/blob/sycl/clang/include/clang/Basic/AttrDocs.td#L4613

Also, this should be expressed in Attr.td in a LangOpts clause.

if (S.LangOpts.getSYCLVersion() == LangOptions::SYCL_2017) {
S.Diag(AL.getLoc(), diag::warn_attribute_ignored) << AL;
return;
}

D->addAttr(IntelNamedSubGroupSizeAttr::Create(S.Context, SizeType, AL));
}

Expand Down
55 changes: 37 additions & 18 deletions clang/lib/Sema/SemaSYCL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -343,15 +343,43 @@ static void collectSYCLAttributes(Sema &S, FunctionDecl *FD,
if (!FD->hasAttrs())
return;

llvm::copy_if(FD->getAttrs(), std::back_inserter(Attrs), [](Attr *A) {
// FIXME: Make this list self-adapt as new SYCL attributes are added.
return isa<IntelReqdSubGroupSizeAttr, IntelNamedSubGroupSizeAttr,
ReqdWorkGroupSizeAttr, SYCLIntelKernelArgsRestrictAttr,
SYCLIntelNumSimdWorkItemsAttr,
SYCLIntelSchedulerTargetFmaxMhzAttr,
SYCLIntelMaxWorkGroupSizeAttr, SYCLIntelMaxGlobalWorkDimAttr,
SYCLIntelNoGlobalWorkOffsetAttr, SYCLSimdAttr>(A);
});
// Attributes that should be propagated from device functions to a kernel
// in SYCL 1.2.1.
if (S.getASTContext().getLangOpts().getSYCLVersion() <
LangOptions::SYCL_2020) {
llvm::copy_if(FD->getAttrs(), std::back_inserter(Attrs), [](Attr *A) {
// FIXME: Make this list self-adapt as new SYCL attributes are added.
return isa<IntelReqdSubGroupSizeAttr, SYCLIntelKernelArgsRestrictAttr,
ReqdWorkGroupSizeAttr, SYCLIntelNumSimdWorkItemsAttr,
SYCLIntelSchedulerTargetFmaxMhzAttr,
SYCLIntelMaxWorkGroupSizeAttr, SYCLIntelMaxGlobalWorkDimAttr,
SYCLIntelNoGlobalWorkOffsetAttr, SYCLSimdAttr>(A);
});
// Attributes that should not be propagated from device functions to a
// kernel in SYCL 1.2.1.
if (DirectlyCalled) {
llvm::copy_if(FD->getAttrs(), std::back_inserter(Attrs), [](Attr *A) {
return isa<SYCLIntelLoopFuseAttr, SYCLIntelFPGAMaxConcurrencyAttr,
SYCLIntelFPGADisableLoopPipeliningAttr,
SYCLIntelFPGAInitiationIntervalAttr>(A);
});
}
} else {
// Attributes that should not be propagated from device functions to a
// kernel in SYCL 2020.
if (DirectlyCalled) {
llvm::copy_if(FD->getAttrs(), std::back_inserter(Attrs), [](Attr *A) {
return isa<
SYCLIntelFPGAMaxConcurrencyAttr,
SYCLIntelFPGADisableLoopPipeliningAttr, SYCLSimdAttr,
SYCLIntelKernelArgsRestrictAttr, ReqdWorkGroupSizeAttr,
SYCLIntelNumSimdWorkItemsAttr, SYCLIntelSchedulerTargetFmaxMhzAttr,
SYCLIntelNoGlobalWorkOffsetAttr, SYCLIntelMaxWorkGroupSizeAttr,
IntelReqdSubGroupSizeAttr, SYCLIntelMaxGlobalWorkDimAttr,
IntelNamedSubGroupSizeAttr, SYCLIntelFPGAInitiationIntervalAttr>(A);
});
}
Comment on lines +367 to +381
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we can get rid of this entire branch because in 2020 mode, it seems like none of the attributes propagate anyway (or did I get that wrong)?

Copy link
Contributor

@elizabethandrews elizabethandrews Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we can simplify this code somehow but this branch collects attributes to be applied to device functions and kernel itself. The attributes applied to kernel still need to be collected and applied irrespective of SYCL version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @AaronBallman. i was wrong. We need this entire else branch in SYCL2020 mode since we still need to copy the attributes for DirectlyCalled = TRUE. I have no better idea about how we can avoid duplicating the codes here.

Copy link
Contributor

@AaronBallman AaronBallman Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am starting to think more and more that it's past time to table generate this logic rather than continuing to struggle to maintain these lists manually. Do I understand correctly that we really have one collection need with three (maybe four) modes: never propagate, always propagate, only propagate when directly called (possibly with a diagnostic), but the mode may differ based on the language options? If so, we could perhaps try to design a tablegen feature to do this. e.g., something along the lines of

>>let SYCLKernelPropagationBehavior = [SYCLKernelPropMode<SYCL2020, NeverPropagate>, // SYCL 2020 behavior
                                     SYCLKernelPropMode<SYCL2017, AlwaysPropagate>] // SYCL 2017 behavior

(Where SYCL2020 and SYCL2017 are new LangOpt definitions we add to Attr.td and the *Propagate are enumerations we define.)

Thinking out loud: we'd generate a function named static bool isAttributeCollected(const Attr *A, const LanguageOptions &LangOpts, bool IsDirectlyCalled); that returns whether an attribute should be collected or not. We'd have to collect all of the SYCLKernelPropagationBehavior objects in Attr.td so that we could group the language mode checks together in the resulting generated file. What we'd generate would effectively look like:

>>if (LangOpts.getSYCLVersion() == SYCL_2017) {
  if (isa<large generated list of attributes here>(A) && IsDirectlyCalled) // SYCL 2017, propagate if directly called
    return true;
  if (isa<large generated list of attributes here>(A)) // SYCL 2017, always propagate
    return true;
>>}
>>if (LangOpts.getSYCLVersion() >= SYCL_2020) {
  if (isa<large generated list of attributes here>(A) && IsDirectlyCalled) // SYCL 2020, propagate if directly called
    return true;
  if (isa<large generated list of attributes here>(A)) // SYCL 2020, always propagate
    return true;
}
return false;

where each of the generated lists of attributes in the isa<> checks are based off the propagation enumeration from Attr.td.

If we want to include diagnostics in the logic, I think we'd return an enumeration rather than a boolean and let the caller figure out what diagnostic to emit, whether to drop the attribute, etc. But given that we only have one of those, we may just want to handle that case specially.

Then, collectSYCLAttributes() will defer most of the logic to the generated isAttributeCollected(), but can still house any custom logic we need (like for diagnostics).

WDYT? (Note, there may be tweaks needed to the idea -- this was designed somewhat off-the-cuff, so if you have a better idea of how to express this in Attr.td, we should definitely explore it.)

As for whether this is a separate task or done as part of this one... I'm on the fence. It's a bit separable, but at the same time, it'd implement the main point to this review so it seems reasonable to just do it here.

Thanks @AaronBallman for the tablegen design. I agree with you that it seems reasonable to do this here. I did not have a chance to look into the new design yesterday. I will take a look at this today and will follow up with you about this for any question.

}

// Allow the kernel attribute "use_stall_enable_clusters" only on lambda
// functions and function objects called directly from a kernel.
Expand All @@ -366,15 +394,6 @@ static void collectSYCLAttributes(Sema &S, FunctionDecl *FD,
FD->dropAttr<SYCLIntelUseStallEnableClustersAttr>();
}
}

// Attributes that should not be propagated from device functions to a kernel.
if (DirectlyCalled) {
llvm::copy_if(FD->getAttrs(), std::back_inserter(Attrs), [](Attr *A) {
return isa<SYCLIntelLoopFuseAttr, SYCLIntelFPGAMaxConcurrencyAttr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these be added to the DirectlyCalled list in L362?

Copy link
Contributor Author

@smanna12 smanna12 Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attributes directly apply on kernel functor/lambda in SYCL2020 modes, so i did not add them in L362.
SYCLIntelFPGAMaxConcurrencyAttr,
SYCLIntelFPGADisableLoopPipeliningAttr,
SYCLIntelFPGAInitiationIntervalAttr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These attributes are available prior to SYCL2020 right? Shouldn't they apply for earlier versions as well? I think this patch changes existing behavior for these attributes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These attributes are available prior to SYCL2020 right? Shouldn't they apply for earlier versions as well? I think this patch changes existing behavior for these attributes.

All attributes were added recently. they were added after SYCl2020 spec release. I think they should not apply in SYCL2017 modes.

#3388
#3441

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how its expected to work but unless the extensions these attributes support are limited to SYCL2020, or these attributes are documented to work only in SYCL 2020, we probably should not be changing this behavior for earlier versions of SYCL. @AaronBallman please let us know your thoughts here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if these attributes are not supported for SYCL 2017, shouldn't we be diagnosing it?

The diagnostic seems reasonable to me.

Copy link
Contributor

@elizabethandrews elizabethandrews Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please wait for @AaronBallman's input. In my opinion it is confusing/strange to have individual attributes behave differently in different versions of SYCL spec, but I guess we are doing that with this change anyway. I guess the question is more - should we change existing behavior for these attributes in SYCL 2017

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion it is confusing/strange to have individual attributes behave differently in different versions of SYCL spec,

I agree with you, @elizabethandrews.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the delayed response -- having power issues at the house.

I'm not sure how its expected to work but unless the extensions these attributes support are limited to SYCL2020, or these attributes are documented to work only in SYCL 2020, we probably should not be changing this behavior for earlier versions of SYCL. @AaronBallman please let us know your thoughts here.

Agreed. I think the route we want to go is:

  • When a new attribute is added to the intel namespace, we support it in all the language modes where it has valid semantics. The semantics of the attribute can do whatever is most sensible for that given language mode, but once the semantics are set and the attribute has been released in the wild, the semantics should not change except to become more permissive (e.g., we shouldn't change the semantics such that code breaks, but we should be fine to allow the attribute to be used in ways that used to produce an error).
  • When a new attribute is added to the sycl namespace, we support it in all the language modes where it has valid semantics, but we diagnose use of a new feature in the older modes as an extension. The semantics of the attribute have to follow what's specified by the SYCL spec. If we think some semantics are going to cause implementation concerns for us, we need to talk to the SYCL spec authors about how to resolve it on a case-by-case basis.
    ** Note: community typically also adds a compatibility warning in the newer mode so people who want their code to remain compatible with older language standards can do so. If we didn't support SYCL 1.2.1, we could skip this diagnostic, but from talks with @kbsmith-intel, it sounds like SYCL 1.2.1 support is still mandatory and so the future compat warning should also be added.
  • When the attribute does not have valid semantics in a given mode (regardless of what vendor namespace the attribute is in), we should ignore the attribute with a diagnostic to let the user know it's being ignored. The only exception to this rule are SYCL attributes that are ignored in host mode but not device mode.

I guess the question is more - should we change existing behavior for these attributes in SYCL 2017

I don't think we should change existing behaviors -- that runs too much risk of silently breaking user code. However, it's also not clear to much just how much implementation effort it is to retain the old behavior in each case and whether the old behavior was sensible or not in SYCL 1.2.1. My reading of the 1.2.1 spec suggests that only the vec_type_hint, work_group_size_hint, and reqd_work_group_size attributes are propagated and none of the rest of them are. I get this from (emphasis added by me for clarity):

The vec_type_hint, work_group_size_hint and reqd_work_group_size kernel attributes in OpenCL C
apply to kernel functions, but this is not syntactically possible in SYCL. In SYCL, these attributes are legal on
device functions and their specification is propagated down to any caller of those device functions, such that the
kernel attributes are the sum of all the kernel attributes of all device functions called.

That said, I have no idea if this is an accurate understanding of the SYCL spec.

Copy link
Contributor Author

@smanna12 smanna12 Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AaronBallman and @elizabethandrews. I have added the attributes below in SYCL 1.2.1 mode. The semantic is same in all modes.

SYCLIntelFPGAMaxConcurrencyAttr,
SYCLIntelFPGADisableLoopPipeliningAttr,
SYCLIntelFPGAInitiationIntervalAttr

SYCLIntelFPGADisableLoopPipeliningAttr,
SYCLIntelFPGAInitiationIntervalAttr>(A);
});
}
}

class DiagDeviceFunction : public RecursiveASTVisitor<DiagDeviceFunction> {
Expand Down
1 change: 1 addition & 0 deletions clang/test/CodeGenSYCL/disable_loop_pipelining.cpp
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -sycl-std=2017 -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -sycl-std=2020 -emit-llvm -o - %s | FileCheck %s

#include "sycl.hpp"
Expand Down
1 change: 1 addition & 0 deletions clang/test/CodeGenSYCL/initiation_interval.cpp
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -sycl-std=2017 -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -sycl-std=2020 -emit-llvm -o - %s | FileCheck %s

#include "sycl.hpp"
Expand Down
21 changes: 20 additions & 1 deletion clang/test/CodeGenSYCL/intel-fpga-no-global-work-offset.cpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - -sycl-std=2017 -DSYCL2017 %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - -sycl-std=2020 -DSYCL2020 %s

#include "sycl.hpp"

Expand All @@ -19,6 +20,8 @@ class Functor {
template <int N>
[[intel::no_global_work_offset(N)]] void func() {}

[[intel::no_global_work_offset(1)]] void func1() {}

int main() {
q.submit([&](handler &h) {
Foo boo;
Expand All @@ -30,12 +33,26 @@ int main() {
h.single_task<class kernel_name3>(
[]() [[intel::no_global_work_offset(0)]]{});

// Test class template argument.
Functor<1> f;
h.single_task<class kernel_name4>(f);

#if defined(SYCL2017)
// Test template argument with propagated function attribute.
h.single_task<class kernel_name5>([]() {
func<1>();
});

// Test attribute is propagated.
h.single_task<class kernel_name6>(
[]() { func1(); });
#endif // SYCL2017

#if defined(SYCL2020)
// Test attribute is not propagated.
h.single_task<class kernel_name7>(
[]() { func1(); });
#endif //SYCL2020
});
return 0;
}
Expand All @@ -45,5 +62,7 @@ int main() {
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name3"() #0 {{.*}} ![[NUM4:[0-9]+]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name4"() #0 {{.*}} !no_global_work_offset ![[NUM5]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name5"() #0 {{.*}} !no_global_work_offset ![[NUM5]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name6"() #0 {{.*}} !no_global_work_offset ![[NUM5]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name7"() #0 {{.*}} ![[NUM5]]
// CHECK-NOT: ![[NUM4]] = !{i32 0}
// CHECK: ![[NUM5]] = !{}
22 changes: 21 additions & 1 deletion clang/test/CodeGenSYCL/intel-max-global-work-dim.cpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - -sycl-std=2017 -DSYCL2017 %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - -sycl-std=2020 -DSYCL2020 %s

#include "sycl.hpp"

Expand All @@ -19,6 +20,8 @@ class Functor {
template <int N>
[[intel::max_global_work_dim(N)]] void func() {}

[[intel::max_global_work_dim(2)]] void func1() {}

int main() {
q.submit([&](handler &h) {
Foo boo;
Expand All @@ -27,12 +30,26 @@ int main() {
h.single_task<class kernel_name2>(
[]() [[intel::max_global_work_dim(2)]]{});

// Test class template argument.
Functor<2> f;
h.single_task<class kernel_name3>(f);

#if defined(SYCL2017)
// Test template argument with propagated function attribute.
h.single_task<class kernel_name4>([]() {
func<2>();
});

// Test attribute is propagated.
h.single_task<class kernel_name5>(
[]() { func1(); });
#endif // SYCL2017

#if defined(SYCL2020)
// Test attribute is not propagated.
h.single_task<class kernel_name6>(
[]() { func1(); });
#endif //SYCL2020
});
return 0;
}
Expand All @@ -41,5 +58,8 @@ int main() {
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name2"() #0 {{.*}} !max_global_work_dim ![[NUM2:[0-9]+]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name3"() #0 {{.*}} !max_global_work_dim ![[NUM2]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name4"() #0 {{.*}} !max_global_work_dim ![[NUM2]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name5"() #0 {{.*}} !max_global_work_dim ![[NUM2]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name6"() #0 {{.*}} ![[NUM0:[0-9]+]]
// CHECK: ![[NUM1]] = !{i32 1}
// CHECK: ![[NUM2]] = !{i32 2}
// CHECK: ![[NUM0]] = !{}
22 changes: 21 additions & 1 deletion clang/test/CodeGenSYCL/intel-max-work-group-size.cpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - -sycl-std=2017 -DSYCL2017 %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -emit-llvm -o - -sycl-std=2020 -DSYCL2020 %s

#include "sycl.hpp"

Expand All @@ -24,6 +25,8 @@ class Functor {
template <int N, int N1, int N2>
[[intel::max_work_group_size(N, N1, N2)]] void func() {}

[[intel::max_work_group_size(10, 10, 10)]] void func1() {}

int main() {
q.submit([&](handler &h) {
Foo boo;
Expand All @@ -35,12 +38,26 @@ int main() {
Bar bar;
h.single_task<class kernel_name3>(bar);

// Test class template argument.
Functor<2, 2, 2> f;
h.single_task<class kernel_name4>(f);

#if defined(SYCL2017)
// Test template argument with propagated function attribute.
h.single_task<class kernel_name5>([]() {
func<4, 4, 4>();
});

// Test attribute is propagated.
h.single_task<class kernel_name6>(
[]() { func1(); });
#endif // SYCL2017

#if defined(SYCL2020)
// Test attribute is not propagated.
h.single_task<class kernel_name7>(
[]() { func1(); });
#endif //SYCL2020
});
return 0;
}
Expand All @@ -50,8 +67,11 @@ int main() {
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name3"() #0 {{.*}} !max_work_group_size ![[NUM6:[0-9]+]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name4"() #0 {{.*}} !max_work_group_size ![[NUM2:[0-9]+]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name5"() #0 {{.*}} !max_work_group_size ![[NUM4:[0-9]+]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name6"() #0 {{.*}} !max_work_group_size ![[NUM10:[0-9]+]]
// CHECK: define {{.*}}spir_kernel void @{{.*}}kernel_name7"() #0 {{.*}} ![[NUM0:[0-9]+]]
// CHECK: ![[NUM1]] = !{i32 1, i32 1, i32 1}
// CHECK: ![[NUM8]] = !{i32 8, i32 8, i32 8}
// CHECK: ![[NUM6]] = !{i32 6, i32 3, i32 1}
// CHECK: ![[NUM2]] = !{i32 2, i32 2, i32 2}
// CHECK: ![[NUM4]] = !{i32 4, i32 4, i32 4}
// CHECK: ![[NUM0]] = !{}
2 changes: 1 addition & 1 deletion clang/test/CodeGenSYCL/intel-restrict.cpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
// RUN: %clang_cc1 -fsycl-is-device %s -emit-llvm -triple spir64-unknown-unknown-sycldevice -o - | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device %s -emit-llvm -sycl-std=2017 -triple spir64-unknown-unknown-sycldevice -o - | FileCheck %s

template <typename name, typename Func>
__attribute__((sycl_kernel)) void kernel(const Func &kernelFunc) {
Expand Down
2 changes: 1 addition & 1 deletion clang/test/CodeGenSYCL/loop_fuse_device.cpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -sycl-std=2017 -emit-llvm -o - %s | FileCheck %s

#include "sycl.hpp"

Expand Down
2 changes: 1 addition & 1 deletion clang/test/CodeGenSYCL/loop_fuse_ind_device.cpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -sycl-std=2017 -triple spir64-unknown-unknown-sycldevice -emit-llvm -o - %s | FileCheck %s

#include "sycl.hpp"

Expand Down
1 change: 1 addition & 0 deletions clang/test/CodeGenSYCL/max-concurrency.cpp
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -triple spir64-unknown-unknown-sycldevice -disable-llvm-passes -sycl-std=2017 -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -disable-llvm-passes -triple spir64-unknown-unknown-sycldevice -sycl-std=2020 -emit-llvm -o - %s | FileCheck %s

#include "sycl.hpp"
Expand Down
Loading