-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Improve handling of large reqd_work_group_size values #10620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
uint64_t ReqdWGSizeAllDimsTotal = 1; | ||
std::vector<uint64_t> ReqdWGSizeVec; | ||
size_t ReqdWGSizeAllDimsTotal = 1; | ||
std::vector<size_t> ReqdWGSizeVec; | ||
int Dims = 0; | ||
while (!ReqdWGSize.empty()) { | ||
// The reqd_work_group_size data is stored as uint32_t's, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should be updated that only old binaries store reqd_work_group_size
as 32-bit.
We should also put TODO
s to both runtime and the compiler to remove that extra _size_t
property once ABI break is allowed again
sycl/test-e2e/OptionalKernelFeatures/large-reqd-work-group-size.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm/
part LGTM.
@intel/sycl-language-enabling-triage: FYI
sycl/test-e2e/OptionalKernelFeatures/large-reqd-work-group-size.cpp
Outdated
Show resolved
Hide resolved
sycl/test-e2e/OptionalKernelFeatures/large-reqd-work-group-size.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@jzc can you please fix the failure in post-commit? https://github.com/intel/llvm/actions/runs/5819408460/job/15777743360 |
Post commit started to failed on 25c3666 (#10620) for macOS build because: ``` error: no viable overloaded '=' ReqdWorkGroupSize = NewReqdWorkGroupSize; note: candidate function not viable: no known conversion from 'llvm::SmallVector<size_t, 3>' (aka 'SmallVector<unsigned long, 3>') to 'const std::optional<llvm::SmallVector<unsigned long long, 3>>' for 1st argument _LIBCPP_INLINE_VISIBILITY optional& operator=(const optional&) = default;
…0620) When a kernel with a large (unsupported) work-group size is submitted, the backend would fail upon trying build the program, and throw the wrong exception (compile_program_error with `errc::build`, but it should just be a `sycl::exception` with `errc::kernel_not_supported` according to [5.8.1. Kernel attributes](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:kernel.attributes)). https://github.com/intel/llvm/blob/b65969014f001f9730349a5caad5c2b85c9bc378/sycl/source/detail/program_manager/program_manager.cpp#L749 This PR adds a check before building to validate the `reqd_work_group_size`. Additionally, another bug in `sycl-post-link` was fixed. The bug occurred when a kernel was decorated with a multi-dimensional required work-group size and two dimensions had the same value. Due to how `sycl-post-link` worked, the attached metadata on the device image would include fewer dimensions than specified. (e.g. If the decorated with `reqd_work_group_size(16, 16)`, then the device image would appear as if decorated by `reqd_work_group_size(16)`.)
When a kernel with a large (unsupported) work-group size is submitted, the backend would fail upon trying build the program, and throw the wrong exception (compile_program_error with
errc::build
, but it should just be asycl::exception
witherrc::kernel_not_supported
according to 5.8.1. Kernel attributes).llvm/sycl/source/detail/program_manager/program_manager.cpp
Line 749 in b659690
This PR adds a check before building to validate the
reqd_work_group_size
.Additionally, another bug in
sycl-post-link
was fixed. The bug occurred when a kernel was decorated with a multi-dimensional required work-group size and two dimensions had the same value. Due to howsycl-post-link
worked, the attached metadata on the device image would include fewer dimensions than specified. (e.g. If the decorated withreqd_work_group_size(16, 16)
, then the device image would appear as if decorated byreqd_work_group_size(16)
.)