-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][FPGA][NFC] Fix num_simd_work_items and reqd_work_group_size argument check #3275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… check This patch fixes intel#3252. According to https://www.intel.com/content/www/us/en/programmable/documentation/mwh1391807965224.html#mwh1391807939093 (section : 5.2.13. Specifying Number of SIMD Work-Items): Important: Introduce the num_simd_work_items attribute in conjunction with the reqd_work_group_size attribute. The num_simd_work_items attribute you specify must evenly divide the work-group size you specify for the reqd_work_group_size attribute. Based on discussion, the requirement applies only for X argument of reqd_work_group_size attribute. Signed-off-by: Soumi Manna <[email protected]>
Signed-off-by: Soumi Manna <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks Aaron. |
Can i get a review from @premanandrao / @elizabethandrews? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay with the changes. I just noticed a minor change in the documentation, but I am fine if that is done at a more appropriate time later.
Signed-off-by: Soumi Manna <[email protected]>
Signed-off-by: Soumi Manna <[email protected]>
@premanandrao / @elizabethandrews , could you please review the PR before merge conflicts happen again? |
@@ -2421,8 +2421,8 @@ device kernel, the attribute is not ignored and it is propagated to the kernel. | |||
If the`` intel::reqd_work_group_size`` or ``cl::reqd_work_group_size`` | |||
attribute is specified on a declaration along with a | |||
intel::num_simd_work_items attribute, the work group size attribute | |||
arguments must all be evenly divisible by the argument specified in | |||
the ``intel::num_simd_work_items`` attribute. | |||
argument (the first argument) must be evenly divisible by the argument specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For SYCL code isn't this supposed to be the last argument? From the bug report, I thought this attribute is specified as reqd_work_group_size(Z,Y,X)
where X is the last argument and must be evenly divisible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be X - first argument and must be evenly divisible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please add a short comment at the beginning of the test to indicate what is being tested.
Signed-off-by: Soumi Manna <[email protected]>
Thank you, I have added comments and updated tests (used different arguments values for X and Z to avoid confusion here). |
Signed-off-by: Soumi Manna <[email protected]>
…gument check The order of the arguments to reqd_work_group_size in SYCL and OpenCL are reversed. In the OpenCL syntax, reqd_work_group_size(X,Y,Z), X is the index that increments the fastest, followed by Y, then Z. If we preserve that definition of X, Y, and Z, then in SYCL the attribute is specified as reqd_work_group_size(Z,Y,X). This gets more complicated in the case where fewer than three arguments are provided. In such a case it's not the "X" argument that is omitted (by the definitions of X, Y, and Z above) it's the Z dimension that's omitted. So the two argument version has the syntax reqd_work_group_size(Y,X). Similarly, the one argument variant has syntax reqd_work_group_size(X). This patches fixes the problem on intel#3275 where we checked the wrong argument. In SYCL, the num_simd_work_items must evenly divide the last argument to reqd_work_group_size regardless of how many arguments req_work_group_size has. Signed-off-by: Soumi Manna <[email protected]>
…t check (#3728) The arguments to reqd_work_group_size are ordered based on which index increments the fastest. In OpenCL, the first argument is the index that increments the fastest, and in SYCL, the last argument is the index that increments the fastest. In OpenCL, all three arguments are required. In SYCL, the attribute accepts either one, two, or three arguments; in each form, the last (or only) argument is the index that increments fastest. The number of arguments passed to the attribute must match the dimensionality of the kernel the attribute is applied to. If the reqd_work_group_size attribute is specified on a declaration along with num_simd_work_items, the required work group size specified by num_simd_work_items attribute must evenly divide the index that increments fastest in the reqd_work_group_size attribute. This patches fixes the problem on #3275 where we checked the wrong argument for SYCL and OpenCL attributes. Signed-off-by: Soumi Manna <[email protected]>
This patch fixes #3252.
According to https://www.intel.com/content/www/us/en/programmable/documentation/mwh1391807965224.html#mwh1391807939093 (section : 5.2.13. Specifying Number of SIMD Work-Items):
Important: Introduce the num_simd_work_items attribute in conjunction with the reqd_work_group_size attribute. The num_simd_work_items attribute you specify must evenly divide the work-group size you specify for the reqd_work_group_size attribute.
Based on discussion, the requirement applies only for X argument of reqd_work_group_size attribute.
Signed-off-by: Soumi Manna [email protected]