Skip to content
This repository was archived by the owner on Mar 28, 2023. It is now read-only.

[SYCL] Change uses of info::kernel_device_specific::max_sub_group_size #1266

Merged

Conversation

steffenlarsen
Copy link

SYCL 2020 changes the signature of the info query for info::kernel_device_specific::max_sub_group_size. This commit adjusts tests accordingly.

SYCL 2020 changes the signature of the info query for
info::kernel_device_specific::max_sub_group_size. This commit adjusts
tests accordingly.

Signed-off-by: Larsen, Steffen <[email protected]>
Comment on lines 31 to 36
WriteAcc[0][SG.get_group_linear_id()][SG.get_local_linear_id()] =
SG.leader();
WriteAcc[1][SG.get_group_linear_id()][SG.get_local_linear_id()] =
SG.get_group_linear_range();
WriteAcc[2][SG.get_group_linear_id()][SG.get_local_linear_id()] =
SG.get_local_linear_range();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we at least outline these get_*_linear_id to a local variables please?

Or even better, I'd change the order of dimensions so that we could do something like (if possible):

  auto PerWI = WriteAcc[SG.get_group_linear_id()][SG.get_local_linear_id()];
  PerWI[0] = ;
  PerWI[1] = ;
  ...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required a small change to the layout of the data, but I agree that this looks much better.

Res = Kernel.get_info<info::kernel_device_specific::max_sub_group_size>(
Device);
bool Expected =
std::find(sg_sizes.begin(), sg_sizes.end(), Res) != sg_sizes.end();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should ensure that it's actually the maximum element of the range.

Suggested change
std::find(sg_sizes.begin(), sg_sizes.end(), Res) != sg_sizes.end();
bool Expected = (Res == *std::max_element(sg_sizes.begin(), sg_sizes.end());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree.

info::device::sub_group_sizes is the list of all sub-group sizes supported by the device, whereas info::kernel_device_specific::max_sub_group_size is the maximum sub-group size supported by a specific kernel. The kernel-specific number may be lower than the device's maximum sub-group size (e.g. if the kernel uses a feature that only works with certain sub-group sizes), so I think the current test is correct.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've realized that after reading the PR in intel/llvm but forgot to return to here. I think we need to add a comment here explaining just that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment has been added.

Comment on lines 27 to 28
// Get sub-group size once.
if (SG.get_group_id() == 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're getting it here group_size times, aren't we?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I think this should be:

Suggested change
// Get sub-group size once.
if (SG.get_group_id() == 0)
// Get sub-group size once.
if (SG.get_group_id() == 0 && SG.get_local_id() == 0)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, absolutely! Went with item.get_global_linear_id() which should enforce it being done once.

SG.get_group_linear_range();
WriteAcc[2][SG.get_group_linear_id()][SG.get_local_linear_id()] =
SG.get_local_linear_range();
});
});
}

sycl::host_accessor HostAcc{Buf, sycl::read_only};

const size_t MaxNumSubgroups = 32 / SubgroupSize;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this called MaxNumSubgroups and not just NumSubgroups? Since all the work-groups have to be the same size, I think they must all have the same number of sub-groups too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I have renamed it.

Signed-off-by: Larsen, Steffen <[email protected]>
Pennycook
Pennycook previously approved these changes Sep 15, 2022
Co-authored-by: aelovikov-intel <[email protected]>
@steffenlarsen steffenlarsen merged commit 47a169f into intel:intel Sep 26, 2022
myler pushed a commit to myler/llvm-test-suite that referenced this pull request Mar 22, 2023
intel#1266)

SYCL 2020 changes the signature of the info query for
info::kernel_device_specific::max_sub_group_size. This commit adjusts
tests accordingly.

Signed-off-by: Larsen, Steffen <[email protected]>
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
intel/llvm-test-suite#1266)

SYCL 2020 changes the signature of the info query for
info::kernel_device_specific::max_sub_group_size. This commit adjusts
tests accordingly.

Signed-off-by: Larsen, Steffen <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants