-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] implement no_offset property for accessor_property_list class #4920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@intel/llvm-reviewers-runtime friendly ping |
Co-authored-by: vladimirlaz <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Co-authored-by: Steffen Larsen <[email protected]>
a1cef53
@HabKaffee, it looks like post-commit build has failed? Could you please prepare fix ASAP? |
@vladimirlaz, yes, sure, I'm working on it |
// RUN: %clangxx -fsycl-device-only -fsycl-early-optimizations -fsycl-dead-args-optimization -D__SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING__ -S -emit-llvm -o - %s | FileCheck %s | ||
|
||
#include <CL/sycl.hpp> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have E2E test in https://github.com/intel/llvm-test-suite as well. Are we planning to add it there? Basic requirement for the test to check:
- It doesn't crash during AOT and JIT compilation and execution on different devices (CPU/GPU/FPGA) (I assume, later the IR will be translated to SPIR-V and we need to check that this SPIR-V is acceptable for all of the mentioned devices (I'm sure, that it will be accepted, but still));
- Basic math operations with 'no-offsetted' accessors are having the correct result.
In this PR, if a kernel pointer argument comes from a global accessor, we generate a new metadata(kernel_arg_runtime_aligned) to the kernel to indicate that this pointer has runtime allocated alignment. If this information is available to the FPGA backend and if the accessor has no offset (e.g. through the user applying the [no_offset](#4920) property to their accessor), improvements to area of loads and stores can be made by using aligned LSUs. Without this enhancement we will continue to experience excess area. The SYCL spec already guarantees that accessors are aligned to some runtime-specific alignment. So the user's source doesn't need to change to provide the backend with this guarantee, we simply need to allow this information to propagate to the backend. Current IR implementation for kernel pointer argument from accessor looks like: `define dso_local spir_kernel void @_ZTSZ4mainE8kernel_A(i32 addrspace(1)* %_arg_, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_1, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_2, %"struct.cl::sycl::id"* byval(%"struct.cl::sycl::id") align 4 %_arg_3) #0 !kernel_arg_buffer_location !4 {` The new implementation will look like: `define dso_local spir_kernel void @_ZTSZ4mainE8kernel_A(i32 addrspace(1)* %_arg_, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_1, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_2, %"struct.cl::sycl::id"* byval(%"struct.cl::sycl::id") align 4 %_arg_3) #0 !kernel_arg_buffer_location !4 !kernel_arg_runtime_aligned !5 {` `!5 = !{i1 true, i1 false, i1 false, i1 false}` The metadata is applied to the kernel but really carries data about the kernel’s arguments. The first element of the metadata maps to the first kernel argument, the second to the second and so on. For this particular metadata the request is that the value of any metadata element is 'true' for any kernel arguments that corresponds to the base pointer of an accessor and 'false' otherwise. Accessors are handled specially by the frontend (because they are marked with sycl_special_class) and when a user captures an accessor in their SYCL kernel the FE splits up the single accessor into 4 separate kernel arguments. The first of those 4 arguments is a pointer and is the base pointer of the accessor. That pointer is known to have runtime-specific alignment and thus the element of the kernel-arg-runtime metadata that corresponds to that argument will have a value of “true”.
no_offset property allows to reduce the amount of kernel arguments
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/accessor_properties/SYCL_ONEAPI_accessor_properties.asciidoc