[SYCL] Do not attach reqd_work_group_size info when multiple are detected #13523

jzc · 2024-04-22T20:09:43Z

No description provided.



dm-vodopyanov · 2024-04-23T15:21:00Z

sycl/test-e2e/Regression/no-split-reqd-wg-size.cpp

+  q.submit([&](handler &cgh) {
+     cgh.parallel_for<class testNDRange>(
+         NDRange,
+         [=](nd_item<2> it) [[sycl::reqd_work_group_size(WGSIZE, WGSIZE)]] {});


Not sure exactly, but can sycl::reqd_work_group_size(WGSIZE, WGSIZE) and sycl::reqd_work_group_size(WGSIZE) be moved to a template or function parameter (with the dimension moved to a template parameter), so that the same code is not duplicated in both kernel_launch_* functions?

Unfortunately, it seems attributes cannot accept parameter packs, so the best we can do is the macro version:

#define TEST(...) \
  { \
    range globalRange(__VA_ARGS__); \
    range localRange(__VA_ARGS__); \
    nd_range NDRange(globalRange, localRange); \
    q.parallel_for(NDRange, \
                   [=](auto) [[sycl::reqd_work_group_size(__VA_ARGS__)]] {}); \
  }

> Unfortunately, it seems attributes cannot accept parameter packs

We should create a tracker for that, because it is possible to add such support, and we have already done so for other attributes.

Created an issue: #13686

AlexeySachkov · 2024-04-23T15:24:38Z

llvm/lib/SYCLLowerIR/SYCLDeviceRequirements.cpp

+  if (MultipleReqdWGSize)
+    Reqs.ReqdWorkGroupSize.reset();


There should be a comment explaining this. My understanding is that, strictly speaking, we expect only one value of reqd_work_group_size metadata, because the per-optional-kernel-feature device code split is supposed to have happened before this point.

However, there is an exception when device code split is disabled, which causes kernels with different reqd_work_group_size requirements to be bundled together. I think that ideally we want to assert here that device code split is disabled and that otherwise MultipleReqdWGSize is false, but I'm not sure if we have access to that knowledge here.

> However, there is an exception when device code split is disabled, which causes kernels with different reqd_work_group_size requirements to be bundled together. I think that ideally we want to assert here that device code split is disabled and that otherwise MultipleReqdWGSize is false, but I'm not sure if we have access to that knowledge here.

I agree, but yea, the device code split mode information is not present in this function. If we want to go that far, I think it makes sense to add it as a parameter to the function.

LU-JOHN · 2024-05-07T21:19:27Z

llvm/lib/SYCLLowerIR/SYCLDeviceRequirements.cpp

@@ -64,6 +65,8 @@ llvm::computeDeviceRequirements(const module_split::ModuleDesc &MD) {
            ExtractUnsignedIntegerFromMDNodeOperand(MDN, I));
      if (!Reqs.ReqdWorkGroupSize.has_value())
        Reqs.ReqdWorkGroupSize = NewReqdWorkGroupSize;
+      if (Reqs.ReqdWorkGroupSize != NewReqdWorkGroupSize)


On line 61 can we add a check for !MultipleReqdWGSize? There is no point in checking again if we already know multiple WG sizes are required.

Also can we call Reqs.ReqdWorkGroupSize.reset() after line 69 to keep all the code related to ReqdWorkGroupSize together?
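
For illustration, the restructuring suggested here might look like the following minimal sketch. It is a simplified, standalone model and not the actual code in SYCLDeviceRequirements.cpp: the `WGSize` alias and the `collectReqdWorkGroupSize` helper are hypothetical, and each kernel's metadata is modelled as an optional vector instead of an LLVM metadata node.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one kernel's reqd_work_group_size metadata.
// An empty optional means the kernel carries no such attribute.
using WGSize = std::vector<std::uint32_t>;

// Returns the single work group size shared by all annotated kernels, or
// std::nullopt when no kernel is annotated or when two kernels disagree.
std::optional<WGSize>
collectReqdWorkGroupSize(const std::vector<std::optional<WGSize>> &Kernels) {
  std::optional<WGSize> Result;
  bool MultipleReqdWGSize = false;
  for (const auto &K : Kernels) {
    // Skip unannotated kernels, and stop comparing once a conflict is known.
    if (!K || MultipleReqdWGSize)
      continue;
    if (!Result)
      Result = *K;
    else if (*Result != *K)
      MultipleReqdWGSize = true;
  }
  // Keep the reset next to the rest of the ReqdWorkGroupSize handling.
  if (MultipleReqdWGSize)
    Result.reset();
  return Result;
}
```

Note that in this model a module containing one annotated kernel and one unannotated kernel still reports a size, because no conflict between two values is ever observed.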

LU-JOHN · 2024-05-07T21:42:39Z

Are we allowed to discard a requirement just because they are contradictory? From https://intel.github.io/llvm-docs/design/OptionalDeviceFeatures.html:

> For a kernel that is decorated with the [[sycl::reqd_work_group_size(W)]] or [[sycl::reqd_sub_group_size(S)]] attribute, the exception must be thrown if the device does not support the work group size W or the sub-group size S.

We can't honor this requirement if we discard reqd_work_group_size.

AlexeySachkov · 2024-05-14T08:22:06Z

> Are we allowed to discard a requirement just because they are contradictory? From https://intel.github.io/llvm-docs/design/OptionalDeviceFeatures.html:
>
> > For a kernel that is decorated with the [[sycl::reqd_work_group_size(W)]] or [[sycl::reqd_sub_group_size(S)]] attribute, the exception must be thrown if the device does not support the work group size W or the sub-group size S.
>
> We can't honor this requirement if we discard reqd_work_group_size.

The situation in which we discard that metadata, and therefore lose the ability to emit that error, can only happen when a user explicitly specifies the non-standard -fsycl-device-code-split=off. We have not claimed to be fully conformant with the SYCL specification with that flag.

Essentially this is a trade-off between user experience and being conformant. The user experience problem was that we also have a check that the local size passed to parallel_for matches the size attached to the kernel as an attribute. Since we record the attribute on a per-device-image basis, assuming that it is the same for all kernels, this caused false alarms that fully prevented users from launching any kernels. The disabled device code split path is essentially the default for FPGA devices, and therefore we decided to go this way.
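
The false alarm described above can be illustrated with a small model. This is only a sketch under stated assumptions: `DeviceImage` and `localSizeAccepted` are hypothetical names, and the real runtime check is more involved than a single comparison.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

using WGSize = std::vector<std::uint32_t>;

// Hypothetical model of a device image: a single recorded size is assumed
// to apply to every kernel bundled into the image.
struct DeviceImage {
  std::optional<WGSize> ReqdWorkGroupSize;
};

// Simplified launch-time check: the local size must match the recorded
// size whenever one is recorded.
bool localSizeAccepted(const DeviceImage &Img, const WGSize &LocalSize) {
  return !Img.ReqdWorkGroupSize.has_value() ||
         *Img.ReqdWorkGroupSize == LocalSize;
}
```

With two kernels requiring sizes (8) and (4, 4) bundled into one image, recording (8) for the whole image rejects a valid launch of the second kernel with local size (4, 4). Leaving the recorded size empty lets both launches through, at the cost of losing the check.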

…16236)

There was a bug (#13523) where a kernel couldn't be launched when `-fsycl-device-code-split=off` was used and multiple kernels with different required work group sizes were present. That bug was fixed by ensuring that the required work group size metadata is not attached to the device image when multiple required work group sizes are detected in a single module.

However, a similar case was not fixed by that PR; it is now demonstrated in the new test no-split-reqd-wg-size-2.cpp. The issue occurs when there is a single kernel with a required work group size and another kernel without one. In this case, the module doesn't contain multiple required work group sizes, so the required work group size metadata is still attached. As a result, the runtime cannot launch the kernel that has no required work group size.

This PR removes the logic that ensures the metadata is not attached when there are multiple required work group sizes, and instead adds logic that ensures the metadata is not attached when the split mode is `SPLIT_NONE`. This covers the old cases from the previous PR and the new case in this PR.
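
A minimal sketch of the split-mode-based rule described in this commit message, under stated assumptions: the `SplitMode` enum and the `reqdWorkGroupSizeForImage` helper are hypothetical, only `SPLIT_NONE` is named in the text, and the second enumerator is included purely for contrast.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

using WGSize = std::vector<std::uint32_t>;

// Illustrative split modes; only SPLIT_NONE appears in the text above.
enum class SplitMode { SPLIT_PER_KERNEL, SPLIT_NONE };

// Decides which required work group size, if any, is attached to the image.
// With no device code split, kernels with and without a required size may
// share an image, so nothing is attached at all.
std::optional<WGSize>
reqdWorkGroupSizeForImage(const std::vector<std::optional<WGSize>> &Kernels,
                          SplitMode Mode) {
  if (Mode == SplitMode::SPLIT_NONE)
    return std::nullopt;
  // Otherwise the split is assumed to have grouped kernels by requirement,
  // so the first annotated kernel is taken to speak for the whole image.
  for (const auto &K : Kernels)
    if (K)
      return K;
  return std::nullopt;
}
```

Under this rule, an image holding one annotated and one unannotated kernel gets no recorded size when the split is off, which is the case the earlier multiple-size detection missed.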

[SYCL] Do not attach reqd_work_group_size info when multiple are dete…

f39488c

…cted

jzc requested review from a team as code owners April 22, 2024 20:09

jzc requested a review from dm-vodopyanov April 22, 2024 20:09

jzc had a problem deploying to WindowsCILock April 22, 2024 20:51 — with GitHub Actions Failure

jzc had a problem deploying to WindowsCILock April 22, 2024 21:32 — with GitHub Actions Failure

Fix logic

65248c6

jzc temporarily deployed to WindowsCILock April 23, 2024 14:54 — with GitHub Actions Inactive

dm-vodopyanov reviewed Apr 23, 2024


AlexeySachkov reviewed Apr 23, 2024


jzc temporarily deployed to WindowsCILock April 23, 2024 15:40 — with GitHub Actions Inactive

jzc added 3 commits April 23, 2024 13:13

Tidy up test

9620b2e

Use core.hpp and unsupport hip

087ae06

Add comment

50887d9

jzc temporarily deployed to WindowsCILock April 24, 2024 14:06 — with GitHub Actions Inactive

jzc temporarily deployed to WindowsCILock April 24, 2024 14:45 — with GitHub Actions Inactive

AlexeySachkov approved these changes May 6, 2024


LU-JOHN reviewed May 7, 2024


AlexeySachkov requested a review from dm-vodopyanov May 14, 2024 08:22

dm-vodopyanov approved these changes May 15, 2024


dm-vodopyanov merged commit 6934bcf into intel:sycl May 21, 2024

AlexeySachkov mentioned this pull request May 23, 2024

[Doc] Add Mar'24 Release Notes #13879

Merged

jzc mentioned this pull request Dec 2, 2024

[SYCL] Fix a bug when using no device split and reqd_work_group_size #16236

Merged
