Skip to content

[SYCL][ESIMD] Remove wrapping buffer objects into images. #2746

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 17, 2020

Conversation

kbobrovs
Copy link
Contributor

@kbobrovs kbobrovs commented Nov 9, 2020

This PR removes buffer object wrapping into images, which was used as a SIMD BE workaround and caused a number of problems (e.g. scatter/gather of 1- and 2- byte values worked incorrectly in some cases). It also paves a way for ESIMD/SYCL kernel co-existence in the same source. There are 3 main parts

Runtime

  • Remove the workaround wrapping.
  • Gets rid of majority of SYCL_EXPLICIT_SIMD macro checks in the API code
  • Use the new integration header feature - KernelInfo<>::isESIMD() to properly set accessor arguments
    in ESIMD kernels.
  • Add __init_esimd accessor methods added for FE to pick them up instead of __init for ESIMD kernels, as init sequence is different.

FE

  • Add 'esimd_acc_ptr' parameter attribute to FE. This attribute is to be used by the FE to mark ESIMD kernel arguments
    originating from buffer accessors, which is then translated to BE-specific metadata needed for correct code generation.
  • Use 'esimd_acc_ptr' attribute to mark ESIMD kernel pointer arguments originating from accessors
  • Use '__init_esimd' initializer in FE for ESIMD kernel accessors.
  • Add 'isESIMD()' function to the integration header for the RT to be able to distinguish ESIMD kernels from normal ones for the
    purpose of proper accessor arg setting

LowerESIMD

  • Add 'buffer_t' metadata for accessor kernel arguments and mark it as surface for proper code gen by ESIMD BE.

Copy link
Contributor

@dm-vodopyanov dm-vodopyanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ABI is broken. Not sure if we can merge any patches with ABI changes now based on https://github.com/intel/llvm/blob/sycl/sycl/doc/ABIPolicyGuide.md#changing-abi.
@kbobrovs, can this patch be re-written without changes in ABI?

// a global accessor is always wrapped into a 1d image buffer to enable
// surface index-based addressing.
void __init(OCLImage1dBufferTy ImgBuf) { ImageBuffer = ImgBuf; }
const ConcreteASPtrType getNativeImageObj() const { return MData; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having duplicated API is redundant. We can use getQualifiedPtr instead of getNativeImageObj.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is used in ESIMD APIs. Let's rename/refactor as a separate PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a TODO

@kbobrovs
Copy link
Contributor Author

@dm-vodopyanov, I removed the ABI breaking change - please take a look.
I had to rebase atop other patches to have a chance to pass testing. Please review
[SYCL][ESIMD] RT: remove wrapping buffer objects into images. … eca13ef
and later

@kbobrovs kbobrovs force-pushed the drop_image_wrap_rt branch 4 times, most recently from 021fe9a to 055f859 Compare November 14, 2020 08:31
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Nov 14, 2020

Collapsed all commits into the 4 - 2xFE, RT and the LowerESIMD. That's how they should be committed (w/o further squashing)
I think I addressed all review comments.
@Fznamznon - please check/approve the FE (6fb2ce1, 06dd08c)
@DenisBakhvalov - please check/approve the LowerESIMD (873d4a6)
@dm-vodopyanov - please check/approve the RT (055f859)

@kbobrovs kbobrovs changed the title [SYCL][ESIMD] RT: remove wrapping buffer objects into images. [SYCL][ESIMD] Remove wrapping buffer objects into images. Nov 14, 2020
dm-vodopyanov
dm-vodopyanov previously approved these changes Nov 16, 2020
Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance to merge patches for different components separately? It would simplify "git blaming" in the future.

cl::sycl::accessor<int, 1, cl::sycl::access::mode::read_write> accessorA;
cl::sycl::accessor<int, 1, cl::sycl::access::mode::read> accessorB;

cl::sycl::kernel_single_task<class esimd_kernel>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please use queue::submit and handler::single_task from the mock header instead? And include the mock header as system header using -internal-isystem . Like here https://github.com/intel/llvm/blob/sycl/clang/test/CodeGenSYCL/stall_enable.cpp

We need to follow single style in FE test and this style must be as close as possible to the way how we use real sycl headers.

Copy link
Contributor Author

@kbobrovs kbobrovs Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see single style in FE tests, so I picked one. But OK, I will try to rework as you suggest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see single style in FE tests, so I picked on. But OK, I will try to rework as you suggest.

Yes, we don't have single style yet. This problem is already captured, and I remember that someone was working on it, but it is not finished.


cl::sycl::kernel_single_task<class esimd_kernel>(
[=]() __attribute__((sycl_explicit_simd)) {
accessorA.use(val);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also try other kinds of arguments (like POD for example) and check that all metadata is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a nit? The main point of the test is checking accessors argument translation and MD generation. PODs are no different from other types in this regard - and int val should be enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was to check that we don't generate something unnecessary for non-accessor types in case if it can confuse underlying backend. And I probably missed that you already have captured int, so yes, this comment is more a nit.

// CHECK: i32 "VCArgumentDesc" "VCArgumentIOKind"="0" "VCArgumentKind"="0" %_arg_1,
// CHECK: i32 addrspace(1)* nocapture "VCArgumentDesc"="buffer_t" "VCArgumentIOKind"="0" "VCArgumentKind"="2" %_arg_3)
// --- Attributes
// CHECK: {{.*}} !kernel_arg_accessor_ptr ![[ACC_PTR_ATTR:[0-9]+]] !sycl_explicit_simd !{{[0-9]+}} {{.*}}{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please check that __init_esimd is called instead of regular __init in case of ESIMD kernel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically proven by the fact that an accessor is translated into a single pointer arg (as in __init_esimd) rather than 4 (as in __init). I'll see if I can make direct check. Do you have suggestions, BTW?

Copy link
Contributor

@Fznamznon Fznamznon Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember exactly, but I was under impression that we don't enable early optimizations for ESIMD mode, am I right?
In case if IR is not optimized, I would just check that there are two calls to __init_esimd function, i.e. (the following LLVM IR may contain errors):

CHECK: call spir_func void @{{.*}}__init_esimd{{.*}}(%"class{{.*}}cl::sycl::accessor" addrspace(4)* {{.*}}, i32 addrspace(1)* {{.*}}) /* two times */

Or you can even check which argument of the kernel is passed like here

// CHECK: [[GEP2:%[a-zA-Z0-9_]+]] = getelementptr inbounds %"class._ZTSZ4mainE3$_0.anon", %"class._ZTSZ4mainE3$_0.anon"* [[KERNEL_OBJ]], i32 0, i32 0
.

In case if IR is optimized for ESIMD mode, we can check presence of __init_esimd in non-optimized IR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks. Will try

Comment on lines 8 to 12
// -- ESIMD Lambda kernel.
template <typename ID, typename F>
void kernelA(F f) __attribute__((sycl_kernel)) {
f();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, please use queue::submit and single_task from the mock header included like system one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@kbobrovs
Copy link
Contributor Author

Is there any chance to merge patches for different components separately? It would simplify "git blaming" in the future.

that's the intent. I would like this PR to be committed w/o squashing - as 4 patches.

This attribute is to be used by the FE to mark ESIMD kernel arguments
originating from buffer accessors, which is then translated to BE-specific
metadata needed for correct code generation.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
- Use 'esimd_acc_ptr' attribute to mark ESIMD kernel pointer
  arguments originating from accessors
- Use '__init_esimd' initializer in FE for ESIMD kernel accessors.
- Add 'isESIMD()' function to the integration header for the RT to
  be able to distinguish ESIMD kernels from normal ones for the
  purpose of proper accessor arg setting.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@kbobrovs
Copy link
Contributor Author

@Fznamznon - comments are addressed, please take a look. Had to force-push to maintain the proper patch structure, hope this is OK.
@dm-vodopyanov, please re-approve the RT part.

Fznamznon
Fznamznon previously approved these changes Nov 16, 2020
@kbobrovs
Copy link
Contributor Author

@DenisBakhvalov, please review/approve the LowerESIMD part

DenisBakhvalov
DenisBakhvalov previously approved these changes Nov 16, 2020
@kbobrovs
Copy link
Contributor Author

@dm-vodopyanov - could you please restore your approval?
@bader, can you please take a look and merge if no objections?

dm-vodopyanov
dm-vodopyanov previously approved these changes Nov 17, 2020
@bader
Copy link
Contributor

bader commented Nov 17, 2020

@dm-vodopyanov - could you please restore your approval?
@bader, can you please take a look and merge if no objections?

@kbobrovs, I see a few other PR with similar changes. Could you resolve duplication by closing PRs, which are not intended for merging, please? Make sure that all comments are addressed.
In particular: #2699 and #2748.

I have a couple of comments WRT LLVM pass changes (mentioned in #2748):

  1. There is no regression test for LLVM pass
  2. I don't quite get the protocol between FE and LLVM pass and I hoped that we can simplify the logic in the pass. Could you answer this question, please?

@kbobrovs
Copy link
Contributor Author

@bader - I closed #2747 and #2748. #2699 is not part of this one and still needs to be worked on/merged separately.

There is no regression test for LLVM pass

There actually is, but it is part of SYCL CodeGen: clang/test/CodeGenSYCL/esimd-accessor-ptr-md.cpp.
Will add one more to llvm/test/SYCLLowerIR

I don't quite get the protocol between FE and LLVM pass and I hoped that we can simplify the logic in the pass. Could you answer this question, please?

sorry, missed that one. The question is:

Is it possible to have a kernel_arg_accessor_ptr metadata with 0 value?

yes. Only pointer arguments originating from accessors have 1 in the corresponding place in kernel_arg_accessor_ptr metadata

Make sure that all comments are addressed.

I think all are addressed - reviewers all approved. Do you have some specific comment in mind (except the above one)?

@kbobrovs
Copy link
Contributor Author

@bader - test is added, please take a look

Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except one nit.

; 'kernel_arg_accessor_ptr' metadata. Particularly, that it generates additional
; vector of per-argument metadata (accessible from "genx.kernels" top-level
; metadata node):
; - for those arguments having '1' in the corresponding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code seems to allow any non-zero value. Please, either update the comment or add an assert (or some other check) that FE emitted expected value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implemented the nit

BE is now able to accept buffers instead of image1d buffers, so remove the
w/a wrapping. This patch also gets rid of majority of __SYCL_EXPLICIT_SIMD__
macro checks in the API code, paving a way for ESIMD/SYCL kernel
co-existence.

Details:
- Use new int header feature KernelInfo<>::isESIMD() to properly set
  accessor args
- Add __init_esimd accessor methods to be used by FE with ESIMD kernels
- Some ESIMD-related stuff is now unsed - can't be removed now not to break
  ABI. TODOs added.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@kbobrovs kbobrovs requested a review from bader November 17, 2020 15:50
@kbobrovs
Copy link
Contributor Author

@bader, please do not squash when merging

@bader
Copy link
Contributor

bader commented Nov 17, 2020

@bader, please do not squash when merging

All other merge options are disabled for this project.
The only way to avoid squash is to create a separate PR for each commit.

@kbobrovs
Copy link
Contributor Author

All other merge options are disabled for this project.
The only way to avoid squash is to create a separate PR for each commit.

That's unfortunate. Those PRs won't pass testing. Can we change this setting BTW?
Otherwise, let's squash then, as we don't seem to have a choice. The description can be used as the commit message.

@bader bader merged commit d2d20d6 into intel:sycl Nov 17, 2020
@kbobrovs kbobrovs deleted the drop_image_wrap_rt branch November 17, 2020 17:02
jsji pushed a commit that referenced this pull request Oct 11, 2024
/SPIRV-LLVM-Translator/lib/SPIRV/SPIRVRegularizeLLVM.cpp:535:15: warning: implicit conversion turns string literal into bool: 'const char[39]' to 'bool' [-Wstring-conversion]
  535 |       assert(!"Cache controls must decorate a pointer");
      |              ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/assert.h:93:27: note: expanded from macro 'assert'
   93 |      (static_cast <bool> (expr)                                         \
      |                           ^~~~

The original code is not wrong but `false &&` is only a few
characters more and does the same thing without warning.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@573e951a3207fe9
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants