-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Add implementation of kernel_bundle. Part 2 #3287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Add implementation of kernel_bundle. Part 2 #3287
Conversation
The patch adds implementation of basic API and data structures. Error handling will added in a separate patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review part I
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review part 2. Still have to go through program_manager.cpp and below.
The clang-format check doesn't agree with clang-format on my machine. Suggest ignoring this fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clang-format check doesn't agree with clang-format on my machine. Suggest ignoring this fail.
I suggest aligning your local clang-format with the one we use in the check then.
detail::createSyclObjFromImpl<sycl::kernel_id>(KernleIDImpl); | ||
|
||
// Insert new element keeping MKernelIDs sorted. | ||
auto It = std::lower_bound(MKernelIDs.begin(), MKernelIDs.end(), KernelID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I think pushing all the elements into the vector and then sorting it once should be more efficient than inserting each one while keeping the vector sorted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do measurements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that it might depends a lot on the real number of kernels.
Perhaps with SYCL generated by ML frameworks could have millions of kernels... ;-)
But for "normal" code and a few kernels?
Anyway, the std::sort
path should be fast enough too for just a few kernels...
@@ -0,0 +1,96 @@ | |||
// RUN: %clangxx -fsycl -fsycl-device-code-split=per_kernel -fsycl-targets=%sycl_triple %s -o %t.out | |||
// RUN: %t.out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to post-commit testing this test doesn't work on the host device, so we either should move it to on-device
directory or fix execution on the host device.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention of the spec, of course, is that these APIs would work on all devices.
The patch adds implementation of basic API and data structures.
Error handling will added in a separate patch.