-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][ESIMD] Implement IR pass to lower C++ ESIMD intrinsics. #1881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me, just minor comments. And currently build is failing with this error:
CMake Error at cmake/modules/AddLLVM.cmake:646 (add_dependencies):
The dependency target "LLVMGenXIntrinsics" of target "LLVMSYCLLowerIR" does
not exist.
Coule you please fix this.
@@ -0,0 +1,39 @@ | |||
//===-- LowerESIMD.cpp - lower Explicit SIMD (ESIMD) constructs -----------===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please fix the header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
; RUN: opt < %s -LowerESIMD -S | FileCheck %s | ||
|
||
target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64" | ||
target triple = "spir64-unknown-windows-sycldevice" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass uses information about triple, so probably linux should also be tested, not sure if there will be any difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
llvm/lib/SYCLLowerIR/LowerESIMD.cpp
Outdated
// Add this kernel to the root. | ||
Kernels->addOperand(MDNode::get(Ctx, MDArgs)); | ||
F.addFnAttr("oclrt", "1"); | ||
// F.setDLLStorageClass(llvm::GlobalValue::DLLExportStorageClass); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably you forgot to remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
llvm/lib/SYCLLowerIR/LowerESIMD.cpp
Outdated
|
||
// Iterate argument list to gather argument kinds and generate argument | ||
// descriptors. | ||
for (auto AI = F.arg_begin(), AE = F.arg_end(); AI != AE; ++AI) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to have: range-based for loop over F.args().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
Right. As I wrote in the description, this is pending (other people are working on this already):
|
Oh, I see, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few notes and comments, mostly about StringRef
usage.
In general, I have no objections, but I'm not sure that this approach is the best one. Have you considered using existing clang infrastructure for built-ins?
I think it would be better (or at least more reliable than parsing and comparing mangled names) to use it.
I imagine that some SYCL header file will just call a built-in function, which will be then mapped to corresponding intrinsic and no pass would be needed at all. I cannot say for sure that amount of code would be less than in this pass, but it would look more aligned with existing clang stuff, for sure.
A few links:
llvm/lib/SYCLLowerIR/LowerESIMD.cpp
Outdated
if (!Name.startswith(ESIMD_INTRIN_PREF0)) | ||
continue; | ||
// now skip the digits | ||
StringRef Name1 = Name.substr(std::strlen(ESIMD_INTRIN_PREF0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure why do you need Name1
here if you can just re-assign result of substr
or drop_while
back to Name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll see if I can simplify
Yes we have. Big difference with e.g. AMD GPU is that we compile with SPIRV target, not with Gen-specific target. SPIRV target can't have Gen-specific intrinsics enabled easily and non-intrusively. Another difference is that there is no Gen back-end, and no Gen TargetMachine, which also complicates things. |
AFAIK, you can have not only backend-specific built-ins, but also lang-specific or even lib-specific built-ins. Also, I don't fully understand why do you need actual backend or |
Can you please please be more specific about your suggestion and why it is better than what's implemented? We can discuss this as a potential future alternative and add a TODO (if we agree that it is something worth doing). So far:
What I said is that absence of TargetMachine will (well, is likely to) complicate implementation of this approach. You showed x86 and AMD as an example, but similar approach for Gen won't work w/o target machine. |
Please note that I'm not against going with the current approach, I'm just trying to understand whether we can implement the same in a way which is better aligned with existing stuff in clang & llvm (and potentially more reliable: dealing with mangler/demangler is evil)
Why so? I don't understand why emitting call to an intrinsic in
Overall, I see that we could declare several new clang built-ins, which will be lowered into corresponding GenX intrinsics. Header files should be updated to call these new built-ins. Main advantages of that approach are the following:
I don't see any problems with this
I may miss something, so correct me if something is wrong, missing, or more complex than I think: My understanding is that you are converting call to some function into intrinsic, based on some rules: for example:
Here first argument of the intrinsic is the first argument of a call, the second argument of the intrinsic is the third template argument of a called function, etc. I think that absolutely the same can be expressed via existing clang built-ins infrastructure. If you know particular types which could be used in
There is no way to represent any arbitrary LLVM intrinsic in SPIR-V: this is not LLVM IR representation or generation problem, this is SPIR-V specification design. If you want to have more LLVM IR intrinsics represented in SPIR-V, then you need to have a SPIR-V spec updated or SPIR-V spec extension prepared or SPIR-V extended instruction set prepared.
I think this patch implements a conversion from DPC++ methods/functions into LLVM IR intrinsics, but it has nothing to do with SPIR-V. |
llvm/lib/SYCLLowerIR/LowerESIMD.cpp
Outdated
@@ -1027,7 +1026,7 @@ static void translateESIMDIntrinsicCall(CallInst &CI) { | |||
Function *F = CI.getCalledFunction(); | |||
StringRef MnglName = F->getName(); | |||
const char *MnglNameCStr = MnglName.data(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable is probably became unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
understand whether we can implement the same in a way which is better aligned with existing stuff in clang & llvm (and potentially more reliable: dealing with mangler/demangler is evil)
The word "better" and "evil" are highly subjective in this case :) I have the opposite opinion, for example. That's OK. I would suggest that you file an issue after we merge this patch and put your design sketches there. We definitely need more people to discuss what's the best here if there are doubts. Does this sound good to you?
There is no way to represent any arbitrary LLVM intrinsic in SPIR-V:
As far as I can see --spirv-allow-unknown-intrinsics
serves pretty well for this purpose. What exactly do you mean?
I think this patch implements a conversion from DPC++ methods/functions into LLVM IR intrinsics
right
, but it has nothing to do with SPIR-V.
It has. This is done for the SPIRV target which means I can't use device-specific built-ins like AMDGPU built-ins, which are enabled for specific target. So SPIRV is essential here.
Actually my main doubt/objection about using clang is not really complications related to hacking built-ins mechanism for our purposes. It is the need to introduce a lot of new intrinsics derived from combinations of element type, vector length and some other parameters. Not only I'd have to introduce those, I'd also have to write lowering for each, of course.
Overall, I think this kind of debate should better happen in the dedicated issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why so? I don't understand why emitting call to an intrinsic in CGBuiltin.cpp based on some descriptor from Builtins.td requires some TargetMachine and why it won't work for Gen.
Do you mean Builtins.def? This is target-neutral list of built-ins. It is not a good place for Gen intrinsics. Also I don't think e.g. ARM built-ins can be used with targets other than ARM. So I'd have to follow the same mechanism as employed for BuiltinsARM.def
. I haven't digged deeper into this, but all the target-specific built-ins precedents I'm aware of are for real BEs with target machines and code gen infra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable is probably became unused
yep, I haven't yet pushed the fixed variant, just committed you suggestions. Will do shortly
09560e1
to
572d874
Compare
Had to rebase and force-push. |
This workaround is to avoid compilation error when building on Windows with MSVC++ 2017. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
The pass transforms *__esimd_* Itanium - mangled C++ intrinsics to genx.*style parseable by the ESIMD - capable SPIRV translator. Authors: Konstantin S Bobrovsky <[email protected]> Gang Chen <[email protected]> Wei Pan Denis Bakhvalov <[email protected]> Anton Sidorenko <[email protected]> Kaiyu Chen <[email protected]> Pratik Ashar <[email protected]> Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Co-authored-by: Alexey Sachkov <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
…erIR. SYCLLowerIR is the only user of vc-intrinsics. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Had to rebase again to fix another problem in ItaniumDemangler.h. |
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I already said, I have no objections. Different approaches can be discussed and evaluated later, if needed
@@ -445,6 +445,14 @@ class EnableIfAttr : public Node { | |||
} | |||
}; | |||
|
|||
#ifdef _MSC_VER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... is this a GitHub bug or this code will be added again?
It should be already in place - #2021.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to make compilation and testing for this PR pass. I will rebase and this will go away.
DO NOT REVIEW THE FIRST COMMIT (covered by #1880)
The pass transforms _esimd Itanium - mangled C++ intrinsics to
genx.*style parseable by the ESIMD - capable SPIRV translator.
This pass depends on the vc-intrinsics repo (GenXIntrinsics component), which is not part of the build yet - in progress.
Authors:
Konstantin S Bobrovsky [email protected]
Gang Chen [email protected]
Wei Pan
Denis Bakhvalov [email protected]
Anton Sidorenko [email protected]
Kaiyu Chen [email protected]
Pratik Ashar [email protected]
Signed-off-by: Konstantin S Bobrovsky [email protected]