-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Do not attach reqd_work_group_size info when multiple are detected #13523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -38,6 +38,7 @@ static llvm::StringRef ExtractStringFromMDNodeOperand(const MDNode *N, | |
SYCLDeviceRequirements | ||
llvm::computeDeviceRequirements(const module_split::ModuleDesc &MD) { | ||
SYCLDeviceRequirements Reqs; | ||
bool MultipleReqdWGSize = false; | ||
// Process all functions in the module | ||
for (const Function &F : MD.getModule()) { | ||
if (auto *MDN = F.getMetadata("sycl_used_aspects")) { | ||
|
@@ -64,6 +65,8 @@ llvm::computeDeviceRequirements(const module_split::ModuleDesc &MD) { | |
ExtractUnsignedIntegerFromMDNodeOperand(MDN, I)); | ||
if (!Reqs.ReqdWorkGroupSize.has_value()) | ||
Reqs.ReqdWorkGroupSize = NewReqdWorkGroupSize; | ||
if (Reqs.ReqdWorkGroupSize != NewReqdWorkGroupSize) | ||
MultipleReqdWGSize = true; | ||
} | ||
|
||
if (auto *MDN = F.getMetadata("sycl_joint_matrix")) { | ||
|
@@ -99,6 +102,14 @@ llvm::computeDeviceRequirements(const module_split::ModuleDesc &MD) { | |
assert(*Reqs.SubGroupSize == static_cast<uint32_t>(MDValue)); | ||
} | ||
} | ||
|
||
// Usually, we would only expect one ReqdWGSize, as the module passed to | ||
// this function would be split according to that. However, when splitting | ||
// is disabled, this cannot be guaranteed. In this case, we reset the value, | ||
// which makes so that no value is reqd_work_group_size data is attached in | ||
// in the device image. | ||
if (MultipleReqdWGSize) | ||
Reqs.ReqdWorkGroupSize.reset(); | ||
Comment on lines
+111
to
+112
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There should be a comment explaining this. My understanding is that strictly speaking, we expect only one value of However, there is an exception when device code split is disabled, which causes kernels with different There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree, but yea, the device code split mode information is not present in this function. If we want to go that far, I think it makes sense to add it as a parameter to the function. |
||
return Reqs; | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
// This test checks that with -fsycl-device-code-split=off, kernels | ||
// with different reqd_work_group_size dimensions can be launched. | ||
|
||
// RUN: %{build} -fsycl -fsycl-device-code-split=off -o %t.out | ||
jzc marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// RUN: %{run} %t.out | ||
|
||
// UNSUPPORTED: hip | ||
|
||
#include <sycl/detail/core.hpp> | ||
|
||
using namespace sycl; | ||
|
||
#define TEST(...) \ | ||
{ \ | ||
range globalRange(__VA_ARGS__); \ | ||
range localRange(__VA_ARGS__); \ | ||
nd_range NDRange(globalRange, localRange); \ | ||
q.parallel_for(NDRange, \ | ||
[=](auto) [[sycl::reqd_work_group_size(__VA_ARGS__)]] {}); \ | ||
} | ||
|
||
int main(int argc, char **argv) { | ||
queue q; | ||
TEST(4); | ||
TEST(4, 5); | ||
TEST(4, 5, 6); | ||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On line 61 can we add a check for !MultipleReqdWGSize? There is no point in checking again if we already know multiple WG sizes are required.
Also can we call Reqs.ReqdWorkGroupSize.reset() after line 69 to keep all the code related to ReqdWorkGroupSize together?