-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Fix a bug when using no device split and reqd_work_group_size #16236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -152,7 +152,8 @@ std::optional<T> getKernelSingleEltMetadata(const Function &Func, | |||
|
|||
PropSetRegTy computeModuleProperties(const Module &M, | |||
const EntryPointSet &EntryPoints, | |||
const GlobalBinImageProps &GlobProps) { | |||
const GlobalBinImageProps &GlobProps, | |||
module_split::IRSplitMode SplitMode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would prefer we didnt add another argument to this function, can we instead use module metadata like we do for for sycl/esimd split (saveSplitInformationAsMetadata
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I just added a way to remove properties from the PropertySetRegistry and removed it in sycl-post-link.cpp
// remove the required work group size metadata in this case. | ||
if (SplitMode == module_split::SPLIT_NONE) | ||
PropSet.remove(PropSetRegTy::SYCL_DEVICE_REQUIREMENTS, | ||
"reqd_work_group_size_uint64_t"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we make reqd_work_group_size_uint64_t
a constant somewhere so we dont have any typo problems in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of nits.
// When the split mode is none, the required work group size will be added | ||
// to the whole module, which will make the runtime unable to | ||
// launch the other kernels in the module that have different | ||
// required work group sizes or no requried work group sizes. So we need to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// required work group sizes or no requried work group sizes. So we need to | |
// required work group sizes or no required work group sizes. So we need to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feedback is addressed, will leave technical review to others
// This test checks that with -fsycl-device-code-split=off, kernels | ||
// with different reqd_work_group_size dimensions can be launched. | ||
|
||
// RUN: %{build} -fsycl -fsycl-device-code-split=off -o %t.out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: %{build}
includes -fsycl
Incorporates recent changes to `sycl-post-link` into the RTC-specific version: - #16729 - #16236 - #17211 --------- Signed-off-by: Julian Oppermann <[email protected]>
There was a bug (#13523) where a kernel couldn't be launched when
-fsycl-device-code-split=off
was used and multiple kernels with different required work group sizes were present. This issue was fixed by ensuring that the required work group size metadata is not attached to the device image when multiple required work group sizes are detected in a single module.However, there was a similar but related case that was not fixed by that PR, which is now demonstrated in the new test no-split-reqd-wg-size-2.cpp. This issue occurs when there is a single kernel with a required work group size and another kernel without one. In this case, the module doesn't contain multiple required work group sizes, so the required work group size metadata is still attached. As a result of the metadata being attached, the runtime cannot launch the kernel without a required work group size.
This PR removes the logic of ensuring metadata is not attached when there are multiple required work group sizes, and instead adds logic that ensures the metadata is not attached when the split mode is
SPLIT_NONE
. This covers the old cases from the previous PR and the new case in this PR.