-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][ESIMD]Limit propagation of ESIMD attributes #7191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Complementary test PR: intel/llvm-test-suite#1352 |
#define DEBUG_TYPE "LowerESIMDKernelAttrs" | ||
|
||
using namespace llvm; | ||
|
||
namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allocator is used in other places - please factor this out to a separate source to avoid code duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -30,7 +63,28 @@ SYCLFixupESIMDKernelWrapperMDPass::run(Module &M, ModuleAnalysisManager &MAM) { | |||
sycl::utils::traverseCallgraphUp( | |||
&F, | |||
[&](Function *GraphNode) { | |||
if (!llvm::esimd::isESIMD(*GraphNode)) { | |||
if (!llvm::esimd::isESIMD(*GraphNode) && | |||
llvm::esimd::isKernel(*GraphNode)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This filtering leaves out SYCL_EXTERNAL functions, which does not seem correct.
if (!Name.contains("sycl::_V1::detail::RoundedRangeKernel<")) { | ||
return; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the GraphNode->setMetadata...
call below is reached only when GraphNode is a RoundedRangeKernel, which means only RoundedRangeKernel gets the marker, all others are left out. Previous logic was that all functions were marked. This needs additional investigation. I'll also give this a thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally this pass was introduced to resolve a problem of mixing ESIMD and NonESIMD kernels and back then the issue was determined as
The immediate cause of the issue is that ESIMD kernel is wrapped with RoundedRangeKernel and sycl_explicit_esimd attribute is not propagated to the wrapper class. As a result ModuleSplitter in sycl-post-link creates a spurious split for the wrapper and later creates a spurious non-esimd module and pulls called functions into it. As a result non-esimd module would contain esimd functions and later fails to link. It looks like that for named ESIMD kernels the attributes are properly propagated and therefore the issue doesn't occur.
(#6557)
This bug basically says that marking all functions that call ESIMD functions is also incorrect as it results in a crash in some circumstances. Therefore the fix here is to limit marking to RoundedRangeKernel which was the original source of the problem. So in this sense the fix does what it is expected to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What this patch does is it excludes all the functions in the caller chain from given ESIMD function F up to RoundedRangeKernel from marking them as ESIMD, instead, it only marks the kernel. Previously all the function in the call chain were marked.
Now, this patch is supposed to fix internal LLVM error on some function met in the IR:
Assertion `isa(Val) && "cast() argument of incompatible type!"' failed.
So
- All the functions in a call chain ending at ESIMD function must be ESIMD functions, so not marking them is incorrect.
- I think the patch just masks the problem by skipping marking the problematic function (which in turn can spawn other problems).
I think we need to come up with a patch that fixes the root cause of the above assert.
@@ -0,0 +1,53 @@ | |||
//===--------- Allocator.h - Allocator for demangler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should've expressed my proposal clearer, sorry. This is just implementation detail of demangling code, not a general purpose service, so I would very much prefer this to reside in ESIMDUtils.cpp
/ESIMDUtils.hpp
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
|
||
using namespace llvm::esimd; | ||
bool isAddressArgumentInvokeSIMD(const CallInst *CI) { | ||
constexpr char INVOKE_SIMD_PREF[] = "_Z33__regcall3____builtin_invoke_simd"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please factor out. There is another user:
llvm/lib/SYCLLowerIR/LowerInvokeSimd.cpp:constexpr char INVOKE_SIMD_PREF[] = "_Z33__regcall3____builtin_invoke_simd";
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
|
||
namespace llvm { | ||
namespace sycl { | ||
namespace utils { | ||
|
||
using namespace llvm::esimd; | ||
bool isAddressArgumentInvokeSIMD(const CallInst *CI) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool isAddressArgumentInvokeSIMD(const CallInst *CI) { | |
bool isInvokeSimdBuiltinCall(const CallInst *CI) { |
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
return false; | ||
} | ||
|
||
bool filterFunctionPointer(Value *address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment what this function does, what it returns.
I also thought the deduceFunction
from LowerInvokeSimd.cpp could be used directly (factored out first, of course).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that deduceFunction is doing the opposite: i.e. given an argument of invoke_simd, it tries to find the actual function that invoke_simd tries to call.
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
@@ -43,6 +101,13 @@ void traverseCallgraphUp(llvm::Function *F, CallGraphNodeAction ActionF, | |||
} else { | |||
// ... non-call is OK - add using function to the worklist | |||
if (auto *I = dyn_cast<Instruction>(FCall)) { | |||
if (auto *SI = dyn_cast<StoreInst>(I)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we talked, such filtering within traverseCallgraphUp
should be opaque, and filtering function should be passed as parameter rather than hardcoded, as there are multiple users of traverseCallgraphUp
, but we need this invoke_simd behavior only for attribute markup. Somthing like
void traverseCallgraphUp(llvm::Function *F, CallGraphNodeAction ActionF,
SmallPtrSetImpl<Function *> &FunctionsVisited,
bool ErrorOnNonCallUse, std::function<bool(const Instruction *I, const Function *F)> && FuncUseFilter);
and the LowerESIMDKernelAttrs pass would invoke the traverseCallgraphUp
with proper FuncUseFilter argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. However, I believe we are going to encounter similar problems in other cases where traverseCallgraphUp is used with invoke_simd and therefore embedding the filtering functionality to traverseCallgraphUp would make more sense
void traverseCallgraphUp(llvm::Function *F, CallGraphNodeAction NodeF, | ||
SmallPtrSetImpl<Function *> &Visited, | ||
bool ErrorOnNonCallUse); | ||
void traverseCallgraphUp( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment how functionFilter parameter is used.
#include "llvm/Demangle/Demangle.h" | ||
#include "llvm/Demangle/ItaniumDemangle.h" | ||
#include "llvm/IR/Module.h" | ||
#include "llvm/Pass.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these still needed?
#define DEBUG_TYPE "LowerESIMDKernelAttrs" | ||
|
||
using namespace llvm; | ||
using namespace llvm::esimd; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using namespace llvm::esimd; |
@@ -37,7 +39,7 @@ SYCLFixupESIMDKernelWrapperMDPass::run(Module &M, ModuleAnalysisManager &MAM) { | |||
Modified = true; | |||
} | |||
}, | |||
false); | |||
false, filterFunctionPointer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
false, filterFunctionPointer); | |
false, llvm::esimd::filterFunctionPointer); |
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
@@ -43,6 +47,13 @@ void traverseCallgraphUp(llvm::Function *F, CallGraphNodeAction ActionF, | |||
} else { | |||
// ... non-call is OK - add using function to the worklist | |||
if (auto *I = dyn_cast<Instruction>(FCall)) { | |||
if (auto *SI = dyn_cast<StoreInst>(I)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment why this special branch is needed if (auto *SI = dyn_cast<StoreInst>(I))
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
@@ -43,6 +47,13 @@ void traverseCallgraphUp(llvm::Function *F, CallGraphNodeAction ActionF, | |||
} else { | |||
// ... non-call is OK - add using function to the worklist | |||
if (auto *I = dyn_cast<Instruction>(FCall)) { | |||
if (auto *SI = dyn_cast<StoreInst>(I)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The StoreInst check should be moved into the filter as it has nothing to do with this generic traversal functionality. That's why I suggested std::function<bool(const Instruction *I, const Function *F)>
, not std::function<bool(const Value *)>
if (auto *SI = dyn_cast<StoreInst>(I)) { | |
if (!functionFilter(I, CurF)) { | |
continue; | |
} |
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
Value *addr = SI->getPointerOperand(); | ||
if (!functionFilter(addr)) { | ||
continue; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Value *addr = SI->getPointerOperand(); | |
if (!functionFilter(addr)) { | |
continue; | |
} | |
} |
// Tracks the use of a function pointer being stored in a memory. | ||
// Returns false if the function pointer is used as an argument for invoke_simd | ||
// function call, true otherwise. | ||
bool filterFunctionPointer(Value *address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Since this function remains quite specific and is not reused anywhere, it should be moved into the place of use - LowerESIMDKernelAttrs.cpp.
- Please improve comment. It is not clear what is memory, what is function pointer, what is the incoming argument.
- I added a note below that this should have different signature.
if (address == nullptr) { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: LLVM Coding Standard says don’t use braces on simple single-statement bodies of if/else/loop statements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always thought this is not strict - we prefer to omit the braces to avoid unnecessary line noise
.
if (F && F->getName().startswith(esimd::INVOKE_SIMD_PREF)) { | ||
return true; | ||
} | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (F && F->getName().startswith(esimd::INVOKE_SIMD_PREF)) { | |
return true; | |
} | |
return false; | |
return F && F->getName().startswith(esimd::INVOKE_SIMD_PREF); |
// if the instruction is to store address of a function, check if it is later | ||
// used by InvokeSimd. | ||
if (auto *SI = dyn_cast<StoreInst>(I)) { | ||
const Value *addr = SI->getPointerOperand(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const Value *addr = SI->getPointerOperand(); | |
const Value *Addr = SI->getPointerOperand(); |
@@ -73,10 +72,6 @@ ModulePass *llvm::createSYCLLowerInvokeSimdPass() { | |||
|
|||
namespace { | |||
// TODO support lambda and functor overloads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to move this TODO
as well or is it unrelated to the removed code below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO is unrelated to the removed code
// Checks the use of a function address being stored in a memory. | ||
// Returns false if the function address is used as an argument for | ||
// invoke_simd function call, true otherwise. | ||
bool checkFunctionAddressUse(const Value *address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment still not updated
BTW, why not just inline it into filterInvokeSimdUse
, since it is not general purpose function anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It uses recursion to follow the use of a function. filterInvokeSimdUse
was introduced to take StoreInst
check out of generic traverseCallgraphUp
and inlining will complicate the recursion.
llvm/lib/SYCLLowerIR/SYCLUtils.cpp
Outdated
@@ -43,6 +47,10 @@ void traverseCallgraphUp(llvm::Function *F, CallGraphNodeAction ActionF, | |||
} else { | |||
// ... non-call is OK - add using function to the worklist | |||
if (auto *I = dyn_cast<Instruction>(FCall)) { | |||
if (!functionFilter(I)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function*
should be part of the interface, as we are deciding whether a use of particular function should be filtered or not. What if there are multiple function pointers used by I
?
@@ -25,8 +25,8 @@ bool isInvokeSimdBuiltinCall(const CallInst *CI) { | |||
return F && F->getName().startswith(esimd::INVOKE_SIMD_PREF); | |||
} | |||
|
|||
// Checks the use of a function address being stored in a memory. | |||
// Returns false if the function address is used as an argument for | |||
// Checks the use of a function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no "function" per se in any of the parameters. So the comment is not clear.
Co-authored-by: kbobrovs <[email protected]>
// used by InvokeSimd. | ||
if (auto *SI = dyn_cast<StoreInst>(I)) { | ||
const Value *Addr = SI->getPointerOperand(); | ||
return checkFunctionAddressUse(Addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return checkFunctionAddressUse(Addr); | |
return !isInvokeSimdArgumentSource(Addr, F); |
// location is used diectly or indirectly as an argument for invoke_simd. | ||
// Returns false if the function is used as an argument for | ||
// invoke_simd function call, true otherwise. | ||
bool checkFunctionAddressUse(const Value *address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool checkFunctionAddressUse(const Value *address) { | |
bool isInvokeSimdArgumentSource(const Value *address) { |
// invoke_simd function call, true otherwise. | ||
bool checkFunctionAddressUse(const Value *address) { | ||
if (address == nullptr) | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return true; | |
return false; |
} else if (const auto *CI = dyn_cast<CallInst>(V)) { | ||
// if __builtin_invoke_simd uses the pointer, do not traverse the function | ||
if (isInvokeSimdBuiltinCall(CI)) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return false; | |
return true; |
// location is used diectly or indirectly as an argument for invoke_simd. | ||
// Returns false if the function is used as an argument for | ||
// invoke_simd function call, true otherwise. | ||
bool checkFunctionAddressUse(const Value *address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into the code deeper, I think this part should be reworked. The overall approach can be:
Loads = find all loads from `address`
for (Load : Loads) {
Uses = collectUsesSkipThroughCasts(Load);
for (Use : Uses) {
if (Use is parameter to invoke_simd) {
return true;
}
}
return false;
}
Current logic is very different. We also don't need recursion, so this can be inlined intp the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the suggested rework is a simplification of the current approach and we are going to miss even more cases that we may currently miss.
For example what if we store the value to a different address
, we are not going to track this new address and are going to miss the valid code flow that may lead to invoke_simd. The purpose of recursion was to track such chain of load or load/store manipulations especially for -O0 when no optimizations that eliminate redundant loads/stores are applied.
For example following code will fail:
FuncType SIMD_CALLEE_PTR = SIMD_CALLEE;
....
FuncType SIMD_CALLEE_PTR1 = SIMD_CALLEE_PTR;
....
invoke_simd(sg, SIMD_CALLEE_PTR1 , ...);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think original checkFunctionAddressUse
is borrowed from https://github.com/intel/llvm/blob/sycl/llvm/lib/SYCLLowerIR/LowerInvokeSimd.cpp#L102, whose purpose is quite different from what we need here. collectPossibleStoredVals
is used to conservatively detect if optimization is possible, checkFunctionAddressUse
must tell with 100% reliability that given address:
- either contains only given function address or cannot contain it at all, anything in between should be an assert (as we lose reliability in this case)
- if it contains only given function address, then the value stored (the function pointer) either can't appear as an argument of __builtin_invoke_simd.
In general case this is not possible, e.g. when user program uses some complex data flow on the function pointer before calling invoke_simd.
The more I think of this, the more I'm getting inclined to stop adding a call graph edge on any use of a function other than call.
In our case, if a user function can call ESIMD function via a pointer, we should require that this function is marked with SYCL_ESIMD_FUNCTION explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow the conclusion. The problem that this fix is trying to fix is to mark all callers of an ESIMD function up the chain with ESIMD until it reaches either kernel or invoke_simd. So it is invoked only for functions explicitly marked as SYCL_ESIMD_FUNCTION. If we allow function pointers as arguments for invoke_simd we would need some data flow analysis, similar to what checkFunctionAddressUse does. Yes it has its limitations and doesn't cover all the cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we talked, the idea is:
- When propagating sycl_explicit_simd attribute we never go up through non-call use of function pointers, which automatically covers invoke_simd
- Users must explicitly add sycl_explicit_simd attribute if function calls a sycl_explicit_simd function through pointer.
Co-authored-by: kbobrovs <[email protected]>
Co-authored-by: kbobrovs <[email protected]>
Co-authored-by: kbobrovs <[email protected]>
Co-authored-by: kbobrovs <[email protected]>
hip fails are due to driver fail.
I've rebooted runner. |
@@ -68,5 +74,35 @@ void collectUsesLookThroughCastsAndZeroGEPs(const Value *V, | |||
/// in it. Returns nullptr if failed to do so. | |||
Type *getVectorTyOrNull(StructType *STy); | |||
|
|||
// Simplest possible implementation of an allocator for the Itanium demangler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change (and other ones related to demangling) is no longer needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon second thought, it is good to have the allocator defined in shared header, as others may need to use it too.
@AlexeySachkov, do you approve? |
My comments were mostly minor about coding style, I don't have objections against merging this PR, feel free to go ahead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formal approval (I forgot that there are two files owned by dpcpp-tools-reviewers)
No description provided.