Skip to content

[SYCL] implement no_offset property for accessor_property_list class #4920

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 12, 2021
Merged

[SYCL] implement no_offset property for accessor_property_list class #4920

merged 4 commits into from
Nov 12, 2021

Conversation

HabKaffee
Copy link
Contributor

@HabKaffee HabKaffee requested a review from a team as a code owner November 9, 2021 10:47
@alexbatashev
Copy link
Contributor

@intel/llvm-reviewers-runtime friendly ping

vladimirlaz
vladimirlaz previously approved these changes Nov 11, 2021
steffenlarsen
steffenlarsen previously approved these changes Nov 11, 2021
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@HabKaffee HabKaffee dismissed stale reviews from steffenlarsen and vladimirlaz via a1cef53 November 11, 2021 18:04
@vladimirlaz vladimirlaz merged commit 308e5ad into intel:sycl Nov 12, 2021
@vladimirlaz
Copy link
Contributor

@HabKaffee, it looks like post-commit build has failed? Could you please prepare fix ASAP?

@HabKaffee
Copy link
Contributor Author

@vladimirlaz, yes, sure, I'm working on it

// RUN: %clangxx -fsycl-device-only -fsycl-early-optimizations -fsycl-dead-args-optimization -D__SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING__ -S -emit-llvm -o - %s | FileCheck %s

#include <CL/sycl.hpp>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have E2E test in https://github.com/intel/llvm-test-suite as well. Are we planning to add it there? Basic requirement for the test to check:

  1. It doesn't crash during AOT and JIT compilation and execution on different devices (CPU/GPU/FPGA) (I assume, later the IR will be translated to SPIR-V and we need to check that this SPIR-V is acceptable for all of the mentioned devices (I'm sure, that it will be accepted, but still));
  2. Basic math operations with 'no-offsetted' accessors are having the correct result.

bader pushed a commit that referenced this pull request Jan 18, 2022
In this PR, if a kernel pointer argument comes from a global accessor, we generate a new metadata(kernel_arg_runtime_aligned) to the kernel to indicate that this pointer has runtime allocated alignment.

If this information is available to the FPGA backend and if the accessor has no offset (e.g. through the user applying the [no_offset](#4920) property to their accessor), improvements to area of loads and stores can be made by using aligned LSUs.  Without this enhancement we will continue to experience excess area.

The SYCL spec already guarantees that accessors are aligned to some runtime-specific alignment.  So the user's source doesn't need to change to provide the backend with this guarantee, we simply need to allow this information to propagate to the backend.


Current IR implementation for kernel pointer argument from accessor looks like:

`define dso_local spir_kernel void @_ZTSZ4mainE8kernel_A(i32 addrspace(1)* %_arg_, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_1, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_2, %"struct.cl::sycl::id"* byval(%"struct.cl::sycl::id") align 4 %_arg_3) #0 !kernel_arg_buffer_location !4 {`


The new implementation will look like:
`define dso_local spir_kernel void @_ZTSZ4mainE8kernel_A(i32 addrspace(1)* %_arg_, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_1, %"struct.cl::sycl::range"* byval(%"struct.cl::sycl::range") align 4 %_arg_2, %"struct.cl::sycl::id"* byval(%"struct.cl::sycl::id") align 4 %_arg_3) #0 !kernel_arg_buffer_location !4 !kernel_arg_runtime_aligned !5 {`

`!5 = !{i1 true, i1 false, i1 false, i1 false}`

The metadata is applied to the kernel but really carries data about the kernel’s arguments.  
The first element of the metadata maps to the first kernel argument, the second to the second and so on.  For this particular metadata the request is that the value of any metadata element is 'true' for any kernel arguments that corresponds to the base pointer of an accessor and 'false' otherwise.  

Accessors are handled specially by the frontend (because they are marked with sycl_special_class) and when a user captures an accessor in their SYCL kernel the FE splits up the single accessor into 4 separate kernel arguments.  The first of those 4 arguments is a pointer and is the base pointer of the accessor.  That pointer is known to have runtime-specific alignment and thus the element of the kernel-arg-runtime metadata that corresponds to that argument will have a value of “true”.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants