-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][ESIMD] Remove wrapping buffer objects into images. #2746
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ABI is broken. Not sure if we can merge any patches with ABI changes now based on https://github.com/intel/llvm/blob/sycl/sycl/doc/ABIPolicyGuide.md#changing-abi.
@kbobrovs, can this patch be re-written without changes in ABI?
// a global accessor is always wrapped into a 1d image buffer to enable | ||
// surface index-based addressing. | ||
void __init(OCLImage1dBufferTy ImgBuf) { ImageBuffer = ImgBuf; } | ||
const ConcreteASPtrType getNativeImageObj() const { return MData; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having duplicated API is redundant. We can use getQualifiedPtr
instead of getNativeImageObj
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is used in ESIMD APIs. Let's rename/refactor as a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a TODO
b19d7a8
to
a689365
Compare
@dm-vodopyanov, I removed the ABI breaking change - please take a look. |
021fe9a
to
055f859
Compare
Collapsed all commits into the 4 - 2xFE, RT and the LowerESIMD. That's how they should be committed (w/o further squashing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any chance to merge patches for different components separately? It would simplify "git blaming" in the future.
cl::sycl::accessor<int, 1, cl::sycl::access::mode::read_write> accessorA; | ||
cl::sycl::accessor<int, 1, cl::sycl::access::mode::read> accessorB; | ||
|
||
cl::sycl::kernel_single_task<class esimd_kernel>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please use queue::submit and handler::single_task from the mock header instead? And include the mock header as system header using -internal-isystem
. Like here https://github.com/intel/llvm/blob/sycl/clang/test/CodeGenSYCL/stall_enable.cpp
We need to follow single style in FE test and this style must be as close as possible to the way how we use real sycl headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see single style in FE tests, so I picked one. But OK, I will try to rework as you suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see single style in FE tests, so I picked on. But OK, I will try to rework as you suggest.
Yes, we don't have single style yet. This problem is already captured, and I remember that someone was working on it, but it is not finished.
|
||
cl::sycl::kernel_single_task<class esimd_kernel>( | ||
[=]() __attribute__((sycl_explicit_simd)) { | ||
accessorA.use(val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please also try other kinds of arguments (like POD for example) and check that all metadata is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a nit? The main point of the test is checking accessors argument translation and MD generation. PODs are no different from other types in this regard - and int val
should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was to check that we don't generate something unnecessary for non-accessor types in case if it can confuse underlying backend. And I probably missed that you already have captured int, so yes, this comment is more a nit.
// CHECK: i32 "VCArgumentDesc" "VCArgumentIOKind"="0" "VCArgumentKind"="0" %_arg_1, | ||
// CHECK: i32 addrspace(1)* nocapture "VCArgumentDesc"="buffer_t" "VCArgumentIOKind"="0" "VCArgumentKind"="2" %_arg_3) | ||
// --- Attributes | ||
// CHECK: {{.*}} !kernel_arg_accessor_ptr ![[ACC_PTR_ATTR:[0-9]+]] !sycl_explicit_simd !{{[0-9]+}} {{.*}}{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please check that __init_esimd
is called instead of regular __init
in case of ESIMD kernel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically proven by the fact that an accessor is translated into a single pointer arg (as in __init_esimd) rather than 4 (as in __init). I'll see if I can make direct check. Do you have suggestions, BTW?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember exactly, but I was under impression that we don't enable early optimizations for ESIMD mode, am I right?
In case if IR is not optimized, I would just check that there are two calls to __init_esimd function
, i.e. (the following LLVM IR may contain errors):
CHECK: call spir_func void @{{.*}}__init_esimd{{.*}}(%"class{{.*}}cl::sycl::accessor" addrspace(4)* {{.*}}, i32 addrspace(1)* {{.*}}) /* two times */
Or you can even check which argument of the kernel is passed like here
// CHECK: [[GEP2:%[a-zA-Z0-9_]+]] = getelementptr inbounds %"class._ZTSZ4mainE3$_0.anon", %"class._ZTSZ4mainE3$_0.anon"* [[KERNEL_OBJ]], i32 0, i32 0 |
In case if IR is optimized for ESIMD mode, we can check presence of __init_esimd
in non-optimized IR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thanks. Will try
// -- ESIMD Lambda kernel. | ||
template <typename ID, typename F> | ||
void kernelA(F f) __attribute__((sycl_kernel)) { | ||
f(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, please use queue::submit and single_task from the mock header included like system one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
that's the intent. I would like this PR to be committed w/o squashing - as 4 patches. |
This attribute is to be used by the FE to mark ESIMD kernel arguments originating from buffer accessors, which is then translated to BE-specific metadata needed for correct code generation. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
35c05fa
to
012a0ef
Compare
- Use 'esimd_acc_ptr' attribute to mark ESIMD kernel pointer arguments originating from accessors - Use '__init_esimd' initializer in FE for ESIMD kernel accessors. - Add 'isESIMD()' function to the integration header for the RT to be able to distinguish ESIMD kernels from normal ones for the purpose of proper accessor arg setting. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
012a0ef
to
61d87bb
Compare
@Fznamznon - comments are addressed, please take a look. Had to force-push to maintain the proper patch structure, hope this is OK. |
@DenisBakhvalov, please review/approve the LowerESIMD part |
@dm-vodopyanov - could you please restore your approval? |
@kbobrovs, I see a few other PR with similar changes. Could you resolve duplication by closing PRs, which are not intended for merging, please? Make sure that all comments are addressed. I have a couple of comments WRT LLVM pass changes (mentioned in #2748):
|
@bader - I closed #2747 and #2748. #2699 is not part of this one and still needs to be worked on/merged separately.
There actually is, but it is part of SYCL CodeGen: clang/test/CodeGenSYCL/esimd-accessor-ptr-md.cpp.
sorry, missed that one. The question is:
yes. Only pointer arguments originating from accessors have 1 in the corresponding place in kernel_arg_accessor_ptr metadata
I think all are addressed - reviewers all approved. Do you have some specific comment in mind (except the above one)? |
079fe1e
61d87bb
to
079fe1e
Compare
@bader - test is added, please take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except one nit.
; 'kernel_arg_accessor_ptr' metadata. Particularly, that it generates additional | ||
; vector of per-argument metadata (accessible from "genx.kernels" top-level | ||
; metadata node): | ||
; - for those arguments having '1' in the corresponding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code seems to allow any non-zero value. Please, either update the comment or add an assert (or some other check) that FE emitted expected value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implemented the nit
…ents. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
BE is now able to accept buffers instead of image1d buffers, so remove the w/a wrapping. This patch also gets rid of majority of __SYCL_EXPLICIT_SIMD__ macro checks in the API code, paving a way for ESIMD/SYCL kernel co-existence. Details: - Use new int header feature KernelInfo<>::isESIMD() to properly set accessor args - Add __init_esimd accessor methods to be used by FE with ESIMD kernels - Some ESIMD-related stuff is now unsed - can't be removed now not to break ABI. TODOs added. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
079fe1e
to
cb221b6
Compare
@bader, please do not squash when merging |
All other merge options are disabled for this project. |
That's unfortunate. Those PRs won't pass testing. Can we change this setting BTW? |
/SPIRV-LLVM-Translator/lib/SPIRV/SPIRVRegularizeLLVM.cpp:535:15: warning: implicit conversion turns string literal into bool: 'const char[39]' to 'bool' [-Wstring-conversion] 535 | assert(!"Cache controls must decorate a pointer"); | ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/include/assert.h:93:27: note: expanded from macro 'assert' 93 | (static_cast <bool> (expr) \ | ^~~~ The original code is not wrong but `false &&` is only a few characters more and does the same thing without warning. Original commit: KhronosGroup/SPIRV-LLVM-Translator@573e951a3207fe9
This PR removes buffer object wrapping into images, which was used as a SIMD BE workaround and caused a number of problems (e.g. scatter/gather of 1- and 2- byte values worked incorrectly in some cases). It also paves a way for ESIMD/SYCL kernel co-existence in the same source. There are 3 main parts
Runtime
in ESIMD kernels.
FE
originating from buffer accessors, which is then translated to BE-specific metadata needed for correct code generation.
purpose of proper accessor arg setting
LowerESIMD