Skip to content

[SYCL] Fix a bug when using no device split and reqd_work_group_size #16236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Dec 6, 2024

Conversation

jzc
Copy link
Contributor

@jzc jzc commented Dec 2, 2024

There was a bug (#13523) where a kernel couldn't be launched when -fsycl-device-code-split=off was used and multiple kernels with different required work group sizes were present. This issue was fixed by ensuring that the required work group size metadata is not attached to the device image when multiple required work group sizes are detected in a single module.

However, there was a similar but related case that was not fixed by that PR, which is now demonstrated in the new test no-split-reqd-wg-size-2.cpp. This issue occurs when there is a single kernel with a required work group size and another kernel without one. In this case, the module doesn't contain multiple required work group sizes, so the required work group size metadata is still attached. As a result of the metadata being attached, the runtime cannot launch the kernel without a required work group size.

This PR removes the logic of ensuring metadata is not attached when there are multiple required work group sizes, and instead adds logic that ensures the metadata is not attached when the split mode is SPLIT_NONE. This covers the old cases from the previous PR and the new case in this PR.

@jzc jzc requested review from a team as code owners December 2, 2024 19:53
@jzc jzc requested a review from maarquitos14 December 2, 2024 19:53
@@ -152,7 +152,8 @@ std::optional<T> getKernelSingleEltMetadata(const Function &Func,

PropSetRegTy computeModuleProperties(const Module &M,
const EntryPointSet &EntryPoints,
const GlobalBinImageProps &GlobProps) {
const GlobalBinImageProps &GlobProps,
module_split::IRSplitMode SplitMode) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would prefer we didnt add another argument to this function, can we instead use module metadata like we do for for sycl/esimd split (saveSplitInformationAsMetadata)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I just added a way to remove properties from the PropertySetRegistry and removed it in sycl-post-link.cpp

// remove the required work group size metadata in this case.
if (SplitMode == module_split::SPLIT_NONE)
PropSet.remove(PropSetRegTy::SYCL_DEVICE_REQUIREMENTS,
"reqd_work_group_size_uint64_t");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we make reqd_work_group_size_uint64_t a constant somewhere so we dont have any typo problems in the future?

@jzc jzc temporarily deployed to WindowsCILock December 2, 2024 20:57 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock December 2, 2024 23:41 — with GitHub Actions Inactive
Copy link
Contributor

@maarquitos14 maarquitos14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of nits.

// When the split mode is none, the required work group size will be added
// to the whole module, which will make the runtime unable to
// launch the other kernels in the module that have different
// required work group sizes or no requried work group sizes. So we need to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// required work group sizes or no requried work group sizes. So we need to
// required work group sizes or no required work group sizes. So we need to

@jzc jzc temporarily deployed to WindowsCILock December 4, 2024 20:13 — with GitHub Actions Inactive
Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feedback is addressed, will leave technical review to others

// This test checks that with -fsycl-device-code-split=off, kernels
// with different reqd_work_group_size dimensions can be launched.

// RUN: %{build} -fsycl -fsycl-device-code-split=off -o %t.out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: %{build} includes -fsycl

@jzc jzc temporarily deployed to WindowsCILock December 4, 2024 22:19 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock December 5, 2024 15:55 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock December 5, 2024 18:16 — with GitHub Actions Inactive
@AlexeySachkov AlexeySachkov merged commit 9012e6d into intel:sycl Dec 6, 2024
13 checks passed
sommerlukas pushed a commit that referenced this pull request Mar 14, 2025
Incorporates recent changes to `sycl-post-link` into the RTC-specific
version:
- #16729
- #16236
- #17211

---------

Signed-off-by: Julian Oppermann <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants