-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Support multiple call operators in kernel functor #8525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@intel/dpcpp-cfe-reviewers I would like some early feedback on the PR while I update the test. Thanks! |
@intel/dpcpp-cfe-reviewers ping |
Also, please pay attention to pre-commit testing, it seems something is wrong with it. Maybe merging sycl back to your branch will help. |
ping @intel/dpcpp-cfe-reviewers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok, but I would wait for approval from other codeowners too.
ping @intel/dpcpp-cfe-reviewers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@intel/llvm-gatekeepers Can you please merge this PR. Thanks! |
A named function object that is used to define a SYCL kernel, can have more than one definition of function call operator
'operator()()'
.For such cases, prior to this change, the compiler front end always returned the first available 'operator()()' call found in the functor, regardless of whether that was the one the user invoked from within the kernel invocation function.
This resulted in incorrect kernel invocation and incorrect attribute collection ( if the 'operator()()' calls had attributes applied on them).
This is best illustrated by the following example:
In the above example, the
paralle_for
kernel invocation function has arange
argument, in addition to the functor.In the simplified DPC++ header implementation of parallel_for below,
KernelFunc() is invoked with 'id' as an argument because 'parallel_for' invocation has a 'range' passed to it.
However, for the example above, the CFE returns
'operator()()'
instead of'void operator()(sycl::id<1> id) const'
.This results in incorrect kernel being invoked as well as losing the
intel::reqd_sub_group_size(4)]
attribute, the user intended to use.The root of the kernel is the function marked with
“[[clang::sycl_kernel]]”
and the user’s kernel is “KernelFunc
”, and any C++ attributes the user declared for the kernel are on that function.In the new implementation, we use
RecursiveASTVisitor
Visitor to traverse the nodes from the root of the kernel function (e.g. “kernel_parallel_for”) and find the version of“operator()()”
that is called by“KernelFunc()”
. There will only be one call to“KernelFunc()”
in that call graph because the DPC++ headers are structured such that the user’s kernel function is only called once.This will result in the correct kernel/operator()() call being invoked by the CFE.