-
Notifications
You must be signed in to change notification settings - Fork 130
[SYCL] Change uses of info::kernel_device_specific::max_sub_group_size #1266
[SYCL] Change uses of info::kernel_device_specific::max_sub_group_size #1266
Conversation
SYCL 2020 changes the signature of the info query for info::kernel_device_specific::max_sub_group_size. This commit adjusts tests accordingly. Signed-off-by: Larsen, Steffen <[email protected]>
WriteAcc[0][SG.get_group_linear_id()][SG.get_local_linear_id()] = | ||
SG.leader(); | ||
WriteAcc[1][SG.get_group_linear_id()][SG.get_local_linear_id()] = | ||
SG.get_group_linear_range(); | ||
WriteAcc[2][SG.get_group_linear_id()][SG.get_local_linear_id()] = | ||
SG.get_local_linear_range(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we at least outline these get_*_linear_id
to a local variables please?
Or even better, I'd change the order of dimensions so that we could do something like (if possible):
auto PerWI = WriteAcc[SG.get_group_linear_id()][SG.get_local_linear_id()];
PerWI[0] = ;
PerWI[1] = ;
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required a small change to the layout of the data, but I agree that this looks much better.
Res = Kernel.get_info<info::kernel_device_specific::max_sub_group_size>( | ||
Device); | ||
bool Expected = | ||
std::find(sg_sizes.begin(), sg_sizes.end(), Res) != sg_sizes.end(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should ensure that it's actually the maximum element of the range.
std::find(sg_sizes.begin(), sg_sizes.end(), Res) != sg_sizes.end(); | |
bool Expected = (Res == *std::max_element(sg_sizes.begin(), sg_sizes.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree.
info::device::sub_group_sizes
is the list of all sub-group sizes supported by the device, whereas info::kernel_device_specific::max_sub_group_size
is the maximum sub-group size supported by a specific kernel. The kernel-specific number may be lower than the device's maximum sub-group size (e.g. if the kernel uses a feature that only works with certain sub-group sizes), so I think the current test is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've realized that after reading the PR in intel/llvm but forgot to return to here. I think we need to add a comment here explaining just that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment has been added.
// Get sub-group size once. | ||
if (SG.get_group_id() == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're getting it here group_size
times, aren't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I think this should be:
// Get sub-group size once. | |
if (SG.get_group_id() == 0) | |
// Get sub-group size once. | |
if (SG.get_group_id() == 0 && SG.get_local_id() == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely! Went with item.get_global_linear_id()
which should enforce it being done once.
SG.get_group_linear_range(); | ||
WriteAcc[2][SG.get_group_linear_id()][SG.get_local_linear_id()] = | ||
SG.get_local_linear_range(); | ||
}); | ||
}); | ||
} | ||
|
||
sycl::host_accessor HostAcc{Buf, sycl::read_only}; | ||
|
||
const size_t MaxNumSubgroups = 32 / SubgroupSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this called MaxNumSubgroups
and not just NumSubgroups
? Since all the work-groups have to be the same size, I think they must all have the same number of sub-groups too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I have renamed it.
Signed-off-by: Larsen, Steffen <[email protected]>
9c0449c
Co-authored-by: aelovikov-intel <[email protected]>
intel#1266) SYCL 2020 changes the signature of the info query for info::kernel_device_specific::max_sub_group_size. This commit adjusts tests accordingly. Signed-off-by: Larsen, Steffen <[email protected]>
intel/llvm-test-suite#1266) SYCL 2020 changes the signature of the info query for info::kernel_device_specific::max_sub_group_size. This commit adjusts tests accordingly. Signed-off-by: Larsen, Steffen <[email protected]>
SYCL 2020 changes the signature of the info query for info::kernel_device_specific::max_sub_group_size. This commit adjusts tests accordingly.