-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][FPGA][NFC] Fix num_simd_work_items and reqd_work_group_size argument check #3337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
baa3542
00deaee
e5d1479
602ef39
bf958c6
ca86bb5
5dd9786
895e3bc
579ad6f
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2418,26 +2418,44 @@ device kernel, the attribute is not ignored and it is propagated to the kernel. | |
[[intel::num_simd_work_items(N)]] void operator()() const {} | ||
}; | ||
|
||
If the`` intel::reqd_work_group_size`` or ``cl::reqd_work_group_size`` | ||
attribute is specified on a declaration along with a | ||
intel::num_simd_work_items attribute, the work group size attribute | ||
argument (the first argument) must be evenly divisible by the argument specified | ||
If the ``intel::reqd_work_group_size`` attribute is specified on a declaration | ||
along with a ``intel::num_simd_work_items`` attribute, the work group size | ||
(the last argument) must be evenly divisible by the argument specified | ||
in the ``intel::num_simd_work_items`` attribute. | ||
|
||
.. code-block:: c++ | ||
|
||
struct func { | ||
[[intel::num_simd_work_items(4)]] | ||
[[intel::reqd_work_group_size(64, 64, 64)]] | ||
[[intel::reqd_work_group_size(5, 5, 64)]] | ||
void operator()() const {} | ||
}; | ||
|
||
struct bar { | ||
[[intel::reqd_work_group_size(64, 64, 64)]] | ||
[[intel::reqd_work_group_size(5, 5, 64)]] | ||
[[intel::num_simd_work_items(4)]] | ||
void operator()() const {} | ||
}; | ||
|
||
If the ``cl::reqd_work_group_size`` or ``__attribute__((reqd_work_group_size()))`` | ||
attribute is specified on a declaration along with a ``intel::num_simd_work_items`` | ||
attribute, the work group size (the first argument) must be evenly divisible by | ||
the argument specified in the ``intel::num_simd_work_items`` attribute. | ||
GNU and ``cl::reqd_work_group_size`` spelling of ReqdWorkGroupSizeAttr maps to | ||
the OpenCL semantic. The first and last argument are only swapped for the Intel | ||
attributes in SYCL and not the OpenCL ones. | ||
|
||
.. code-block:: c++ | ||
|
||
struct func1 { | ||
[[intel::num_simd_work_items(4)]] | ||
[[cl::reqd_work_group_size(4, 5, 5)]] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When possible, do not pick 2 numbers the same. |
||
void operator()() const {} | ||
}; | ||
|
||
[[intel::num_simd_work_items(2)]] | ||
__attribute__((reqd_work_group_size(4,3,3))) void bar(); | ||
|
||
}]; | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3152,11 +3152,28 @@ static void handleWorkGroupSize(Sema &S, Decl *D, const ParsedAttr &AL) { | |
return; | ||
ZDimExpr = ZDim.get(); | ||
|
||
// If the declaration has an [[intel::num_simd_work_items()]] attribute, | ||
// check to see if the first argument of [[cl::reqd_work_group_size()]] | ||
// or __attribute__((reqd_work_group_size)) attribute and last argument | ||
// of [intel::reqd_work_group_size()]] attribute can be evenly divided by | ||
// the num_simd_work_items attribute. GNU and [[cl::reqd_work_group_size]] | ||
// spelling of ReqdWorkGroupSizeAttr maps to the OpenCL | ||
// semantics. First and last argument are only swapped for the Intel | ||
// atributes in SYCL and not the OpenCL ones. | ||
if (const auto *A = D->getAttr<SYCLIntelNumSimdWorkItemsAttr>()) { | ||
int64_t NumSimdWorkItems = | ||
A->getValue()->getIntegerConstantExpr(Ctx)->getSExtValue(); | ||
|
||
if (XDimVal.getZExtValue() % NumSimdWorkItems != 0) { | ||
bool UsesOpenCLArgOrdering = | ||
((AL.getSyntax() == AttributeCommonInfo::AS_CXX11 || | ||
AL.getSyntax() == AttributeCommonInfo::AS_C2x) && | ||
AL.hasScope() && AL.getScopeName()->isStr("cl")) || | ||
AL.getSyntax() == AttributeCommonInfo::AS_GNU; | ||
|
||
unsigned WorkGroupSize = UsesOpenCLArgOrdering ? XDimVal.getZExtValue() | ||
: ZDimVal.getZExtValue(); | ||
|
||
if (WorkGroupSize % NumSimdWorkItems != 0) { | ||
S.Diag(A->getLocation(), diag::err_sycl_num_kernel_wrong_reqd_wg_size) | ||
<< A << AL; | ||
S.Diag(AL.getLoc(), diag::note_conflicting_attribute); | ||
|
@@ -3302,14 +3319,24 @@ void Sema::AddSYCLIntelNumSimdWorkItemsAttr(Decl *D, | |
return; | ||
} | ||
} | ||
|
||
// If the declaration has an [[intel::reqd_work_group_size]] attribute, | ||
// check to see if the first argument can be evenly divided by the | ||
// num_simd_work_items attribute. | ||
// If the declaration has a [[intel::reqd_work_group_size()]] | ||
// attribute, check to see if the last argument can be evenly | ||
// divided by the num_simd_work_items attribute. | ||
// If the declaration has a __attribute__((reqd_work_group_size)) | ||
// or [[cl::reqd_work_group_size()]] attribute, check to see if the | ||
// first argument can be evenly divided by the num_simd_work_items | ||
// attribute. GNU and [[cl::reqd_work_group_size()]] spelling of | ||
// ReqdWorkGroupSizeAttr maps to the OpenCL semantics. First and last | ||
// argument are only swapped for the Intel atributes in SYCL and not | ||
// the OpenCL ones. | ||
if (const auto *DeclAttr = D->getAttr<ReqdWorkGroupSizeAttr>()) { | ||
Optional<llvm::APSInt> XDimVal = DeclAttr->getXDimVal(Context); | ||
Optional<llvm::APSInt> ZDimVal = DeclAttr->getZDimVal(Context); | ||
AaronBallman marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
if (*XDimVal % ArgVal != 0) { | ||
llvm::APSInt WorkGroupSize = | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We might have a testing hole here with a bug hiding in it -- won't this wind up with
|
||
DeclAttr->usesOpenCLArgOrdering() ? *XDimVal : *ZDimVal; | ||
|
||
if (WorkGroupSize % ArgVal != 0) { | ||
Diag(CI.getLoc(), diag::err_sycl_num_kernel_wrong_reqd_wg_size) | ||
<< CI << DeclAttr; | ||
Diag(DeclAttr->getLocation(), diag::note_conflicting_attribute); | ||
|
@@ -3335,6 +3362,32 @@ SYCLIntelNumSimdWorkItemsAttr *Sema::MergeSYCLIntelNumSimdWorkItemsAttr( | |
return nullptr; | ||
} | ||
} | ||
// If the declaration has a [[intel::reqd_work_group_size()]] | ||
// attribute, check to see if the last argument can be evenly | ||
// divided by the num_simd_work_items attribute. | ||
// If the declaration has a __attribute__((reqd_work_group_size)) | ||
// or [[cl::reqd_work_group_size()]] attribute, check to see if the | ||
// first argument can be evenly divided by the num_simd_work_items | ||
// attribute. GNU and [[cl::reqd_work_group_size()]] spelling of | ||
// ReqdWorkGroupSizeAttr maps to the OpenCL semantics. First and last | ||
// argument are only swapped for the Intel atributes in SYCL and not | ||
// the OpenCL ones. | ||
if (const auto *DeclAttr = D->getAttr<ReqdWorkGroupSizeAttr>()) { | ||
const auto *MergeExpr = dyn_cast<ConstantExpr>(A.getValue()); | ||
const auto *DeclXExpr = dyn_cast<ConstantExpr>(DeclAttr->getXDim()); | ||
const auto *DeclZExpr = dyn_cast<ConstantExpr>(DeclAttr->getZDim()); | ||
|
||
llvm::APSInt WorkGroupSize = DeclAttr->usesOpenCLArgOrdering() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similar here -- I think this could crash because the work group size or the merge expression hasn't been evaluated yet (so we'd dereference a null pointer). I think on something like this:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried couple of examples below, no crash happens with current patch:
Am i missing something here? Did you mean to test something different examples here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This would crash. I think you meant to test this example. The crash happens here because the work group size or the merge expression hasn't been evaluated yet for existing implementation of reqd_work_group_size attribute( plan to refactor this attribute later - outside of this PR) template <int X, int Y, int Z> template <int X, int Y, int Z> int main() { There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's possible my test cases weren't quite right. The cases I'm worried about are when either the arg expression from (Note how we're checking for a null |
||
? DeclXExpr->getResultAsAPSInt() | ||
: DeclZExpr->getResultAsAPSInt(); | ||
|
||
if (WorkGroupSize % MergeExpr->getResultAsAPSInt() != 0) { | ||
Diag(A.getLoc(), diag::err_sycl_num_kernel_wrong_reqd_wg_size) | ||
<< &A << DeclAttr; | ||
Diag(DeclAttr->getLocation(), diag::note_conflicting_attribute); | ||
return nullptr; | ||
} | ||
} | ||
return ::new (Context) | ||
SYCLIntelNumSimdWorkItemsAttr(Context, A, A.getValue()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more clear if we rearranged the changes to the docs. My recommendation is:
The docs for
reqd_work_group_size
is where most of these details should live. Those docs should be updated to explain what the three arguments are and that the argument which specifies the "work group size" differs depending on which spelling of the attribute is used. If possible, we should put an anchor in the docs for the work group size so that later links can direct users to that specific anchor.The documentation that refers to
reqd_work_group_size
(like here) can then say "If thereqd_work_group_size
attribute is specified on a declaration along with aintel::num_simd_work_items
attribute, the requested work group size argument must be evenly divisible by the argument specified in thenum_simd_work_items
attribute.This way, we keep all the "we made bad decisions about the argument ordering" words are all in one place (instead of repeated) and can refer to the oddity from whatever other attributes have this sort of interaction with the work group size.