-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL] sycl-post-link changes to support invoke_simd. #6160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
return EntryPointGroups; | ||
} | ||
|
||
// For device global variables with the 'device_image_scope' property, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function was just moved out of the anonymous namespace into ModuleSplitterBase
to make it callable from the sycl-post-link.cpp
. It was also renamed to verifyNoCrossModuleDeviceGlobalUsage
to reflect the purpose more closely.
@@ -424,3 +366,213 @@ std::unique_ptr<ModuleSplitterBase> module_split::getSplitterByMode( | |||
else | |||
return std::make_unique<ModuleCopier>(std::move(M), std::move(Groups)); | |||
} | |||
|
|||
void ModuleSplitterBase::verifyNoCrossModuleDeviceGlobalUsage() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a rename+move of checkImageScopedDeviceGlobals
function.
@@ -675,8 +806,10 @@ int main(int argc, char **argv) { | |||
" '-split=auto' mode automatically selects the best way of splitting\n" | |||
" kernels into modules based on some heuristic.\n" | |||
" The '-split' option is compatible with '-split-esimd'. In this case,\n" | |||
" first input module will be split into SYCL and ESIMD modules. Then\n" | |||
" both modules will be further split according to the '-split' option.\n" | |||
" first input module will be split according to the '-split' option\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-split is now applied before -split-esimd, hence the update
@@ -706,14 +839,16 @@ int main(int argc, char **argv) { | |||
|
|||
bool DoSplit = SplitMode.getNumOccurrences() > 0; | |||
bool DoSplitEsimd = SplitEsimd.getNumOccurrences() > 0; | |||
bool DoLowerEsimd = LowerEsimd.getNumOccurrences() > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-lower-esimd is now a first-class action - i.e. it can be the only action requested from sycl-post-link, w/o causing option processing error.
OutputFilename = (Twine(sys::path::stem(InputFilename)) + S).str(); | ||
} | ||
|
||
std::unique_ptr<util::SimpleTable> Table = processInputModule(std::move(M)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Table generation was "homogenised" and consolidated in processInputModule
// Do invoke_simd processing before splitting because this: | ||
// - saves processing time (the pass is run once, even though on larger IR) | ||
// - doing it before SYCL/ESIMD splitting is required for correctness | ||
Modified |= lowerInvokeSimd(*M); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new addition for invoke_simd
processing. This t-form was dormant in the code base until now.
Modified |= Splitter->totalSplits() > 1; | ||
|
||
if (DeviceGlobals) | ||
Splitter->verifyNoCrossModuleDeviceGlobalUsage(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check was done internally in the splitter, which required unnecessary DeviceGloabls option propagation, now it is moved here.
|
||
TableFiles processOneModule(std::unique_ptr<Module> M, bool IsEsimd, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was refactored and in-lined into processInputModule for better clarity:
- table generation was consolidated and simplified
CanReuseInputModule
andIROutputOnly
processing logic also simplified
Splitter->verifyNoCrossModuleDeviceGlobalUsage(); | ||
|
||
// Proceed with top-level splitting. | ||
while (Splitter->hasMoreSplits()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
processInputModule
now has two nested splitters - SYCL/ESIMD is nested within scoped one.
@@ -10,27 +10,27 @@ target triple = "spir64-unknown-linux" | |||
|
|||
$_Z3barIiET_S0_ = comdat any | |||
|
|||
; CHECK-TU0-NOT: @{{.*}}GV{{.*}} | |||
; CHECK-TU1: @{{.*}}GV{{.*}} = internal addrspace(1) constant [1 x i32] [i32 42], align 4 | |||
; CHECK-TU1-NOT: @{{.*}}GV{{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all test changes are related to re-numeration of output files - E.g. ESIMD kernels appeared in files generated earlier (with smaller numeric index), now they will appear in files generated later, etc.
#ifdef _NDEBUG | ||
#define DUMP_ENTRY_POINTS(args...) | ||
#else | ||
constexpr int DebugPostLink = 0; | ||
|
||
#define DUMP_ENTRY_POINTS(...) \ | ||
if (DebugPostLink > 0) { \ | ||
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \ | ||
} | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have 2 suggestions for this code.
- If someone needs to see debug output, the one have to change "DebugPostLink = 0" with "DebugPostLink = 1;" and re-build. When don't need it anymore, need to cancel change and re-build again.
Wouldn't it be more convenient to have a macro?
#ifdef _NDEBUG | |
#define DUMP_ENTRY_POINTS(args...) | |
#else | |
constexpr int DebugPostLink = 0; | |
#define DUMP_ENTRY_POINTS(...) \ | |
if (DebugPostLink > 0) { \ | |
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \ | |
} | |
#endif | |
#if defined(_NDEBUG) || !defined(DEBUG_POST_LINK) | |
#define DUMP_ENTRY_POINTS(args...) | |
#else | |
#define DUMP_ENTRY_POINTS(...) \ | |
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \ | |
#endif | |
- Why not to have a normal C++ function, something like this?
#ifdef _NDEBUG | |
#define DUMP_ENTRY_POINTS(args...) | |
#else | |
constexpr int DebugPostLink = 0; | |
#define DUMP_ENTRY_POINTS(...) \ | |
if (DebugPostLink > 0) { \ | |
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \ | |
} | |
#endif | |
template <typename... T> | |
inline void debugEntryPoints(T... Args) { | |
#if !defined(_NDEBUG) && defined(DEBUG_POST_LINK) | |
llvm::module_split::dumpEntryPoints(Args...); | |
#endif | |
} | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be more convenient to have a macro?
I don't think it is more convenient. Adding a macro to the LLVM build requires re-configuring - this seems a lot more disturbing.
Why not to have a normal C++ function, something like this?
I don't see advantages in having this as a function. DUMP_ENTRY_POINTS
in all caps is intuitively understandable that this is a macro call, and has the same efforts on the call site as debugEntryPoints
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
had to re-base to re-test with fresh top-of-trunk
- Split into SYCL and ESIMD callgraph is now obligatory, as these kinds of functions undergo different set of IR t-forms. If no SYCL/ESIMD splitting was requested on the command line, then the tool clones functions shared between the two callgraphs and then links them back together. This is future default for the compiler driver, as for invoke_simd to be correctly handled by the BE, both kinds of code must be in the same module. - Logic of the output file table generation refactored to be more straightforward. - ModuleDesc extended to accomodate properties and support entry point (Function object) replacement with new entry with the same name. - Make --split-esimd a first-class action of sycl-post-link - if passed alone, the tool will no longer complain that no actions are requested. - Fixed bug in ModuleSplitter.cpp/extractSubModule where no actual replacement of entry point Functions with cloned ones happenned.
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
ded3946
to
d067250
Compare
}; | ||
|
||
using EntryPointGroupVec = std::vector<EntryPointGroup>; | ||
|
||
struct ModuleDesc { | ||
enum class ESIMDStatus { SYCL_ONLY, ESIMD_ONLY, SYCL_AND_ESIMD }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment: IMO, the name of this enum is a bit confusing. Wouldn't it be more clear to name it something like SplitModuleStatus or ModuleStatus?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I'm thinking about something like SyclEsimdSplitStatus
} | ||
|
||
bool isESIMD() const { return Props.HasEsimd == ESIMDStatus::ESIMD_ONLY; } | ||
bool isSYCL() const { return Props.HasEsimd == ESIMDStatus::SYCL_ONLY; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'HasEsimd == ESIMDStatus::SYCL_ONLY' is a bit awkward, wouldn't it be better to say something like:
'Props.ModuleFeatures == ModuleStatus::SYCL_ONLY' ?
} | ||
|
||
bool isESIMD() const { return Props.HasEsimd == ESIMDStatus::ESIMD_ONLY; } | ||
bool isSYCL() const { return Props.HasEsimd == ESIMDStatus::SYCL_ONLY; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'HasEsimd == ESIMDStatus::SYCL_ONLY' is a bit awkward, wouldn't it be better to say something like:
'Props.ModuleFeatures == ModuleStatus::SYCL_ONLY' ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree about ESIMDStatus
"awkwardness". "ModuleFeatures" is too general IMO.
|
||
// @param MD Module descriptor to save | ||
// @param IRFilename filename of already avaialble IR component. If not empty, | ||
// IR component saving is skipped, and this file name as recorded as such in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: is recorded as
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix
LGTM overall. Some minor changes proposed. Thanks |
}; | ||
|
||
using EntryPointGroupVec = std::vector<EntryPointGroup>; | ||
|
||
struct ModuleDesc { | ||
enum class ESIMDStatus { SYCL_ONLY, ESIMD_ONLY, SYCL_AND_ESIMD }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment: IMO, the name of this enum is a bit confusing. Wouldn't it be more clear to name it something like SplitModuleStatus or ModuleStatus?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks
@kbobrovs, @v-klochkov, please handle post-commit fail: https://github.com/intel/llvm/runs/6694744289?check_suite_focus=true |
@pvchupin, I'm on it |
} | ||
// Verify entry points: | ||
for (const auto *F : entries()) { | ||
const bool IsESIMDFunction = isESIMDFunction(*F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider to add "(void)IsESIMDFunction;" or wrap this line under macro like #ifdef DEBUG to avoid unused variable warning/error.
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
|
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
of functions undergo different set of IR t-forms. If no SYCL/ESIMD
splitting was requested on the command line, then the tool clones
functions shared between the two callgraphs and then links them back
together. This is future default for the compiler driver, as for
invoke_simd to be correctly handled by the BE, both kinds of code
must be in the same module.
straightforward.
(Function object) replacement with new entry with the same name.
alone, the tool will no longer complain that no actions are requested.
replacement of entry point Functions with cloned ones happenned.