Skip to content

[SYCL] sycl-post-link changes to support invoke_simd. #6160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 1, 2022

Conversation

kbobrovs
Copy link
Contributor

  • Split into SYCL and ESIMD callgraph is now obligatory, as these kinds
    of functions undergo different set of IR t-forms. If no SYCL/ESIMD
    splitting was requested on the command line, then the tool clones
    functions shared between the two callgraphs and then links them back
    together. This is future default for the compiler driver, as for
    invoke_simd to be correctly handled by the BE, both kinds of code
    must be in the same module.
  • Logic of the output file table generation refactored to be more
    straightforward.
  • ModuleDesc extended to accomodate properties and support entry point
    (Function object) replacement with new entry with the same name.
  • Make --split-esimd a first-class action of sycl-post-link - if passed
    alone, the tool will no longer complain that no actions are requested.
  • Fixed bug in ModuleSplitter.cpp/extractSubModule where no actual
    replacement of entry point Functions with cloned ones happenned.

return EntryPointGroups;
}

// For device global variables with the 'device_image_scope' property,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was just moved out of the anonymous namespace into ModuleSplitterBase to make it callable from the sycl-post-link.cpp. It was also renamed to verifyNoCrossModuleDeviceGlobalUsage to reflect the purpose more closely.

@@ -424,3 +366,213 @@ std::unique_ptr<ModuleSplitterBase> module_split::getSplitterByMode(
else
return std::make_unique<ModuleCopier>(std::move(M), std::move(Groups));
}

void ModuleSplitterBase::verifyNoCrossModuleDeviceGlobalUsage() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a rename+move of checkImageScopedDeviceGlobals function.

@@ -675,8 +806,10 @@ int main(int argc, char **argv) {
" '-split=auto' mode automatically selects the best way of splitting\n"
" kernels into modules based on some heuristic.\n"
" The '-split' option is compatible with '-split-esimd'. In this case,\n"
" first input module will be split into SYCL and ESIMD modules. Then\n"
" both modules will be further split according to the '-split' option.\n"
" first input module will be split according to the '-split' option\n"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-split is now applied before -split-esimd, hence the update

@@ -706,14 +839,16 @@ int main(int argc, char **argv) {

bool DoSplit = SplitMode.getNumOccurrences() > 0;
bool DoSplitEsimd = SplitEsimd.getNumOccurrences() > 0;
bool DoLowerEsimd = LowerEsimd.getNumOccurrences() > 0;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-lower-esimd is now a first-class action - i.e. it can be the only action requested from sycl-post-link, w/o causing option processing error.

OutputFilename = (Twine(sys::path::stem(InputFilename)) + S).str();
}

std::unique_ptr<util::SimpleTable> Table = processInputModule(std::move(M));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Table generation was "homogenised" and consolidated in processInputModule

// Do invoke_simd processing before splitting because this:
// - saves processing time (the pass is run once, even though on larger IR)
// - doing it before SYCL/ESIMD splitting is required for correctness
Modified |= lowerInvokeSimd(*M);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new addition for invoke_simd processing. This t-form was dormant in the code base until now.

Modified |= Splitter->totalSplits() > 1;

if (DeviceGlobals)
Splitter->verifyNoCrossModuleDeviceGlobalUsage();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check was done internally in the splitter, which required unnecessary DeviceGloabls option propagation, now it is moved here.


TableFiles processOneModule(std::unique_ptr<Module> M, bool IsEsimd,
Copy link
Contributor Author

@kbobrovs kbobrovs May 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was refactored and in-lined into processInputModule for better clarity:

  • table generation was consolidated and simplified
  • CanReuseInputModule and IROutputOnly processing logic also simplified

Splitter->verifyNoCrossModuleDeviceGlobalUsage();

// Proceed with top-level splitting.
while (Splitter->hasMoreSplits()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

processInputModule now has two nested splitters - SYCL/ESIMD is nested within scoped one.

@@ -10,27 +10,27 @@ target triple = "spir64-unknown-linux"

$_Z3barIiET_S0_ = comdat any

; CHECK-TU0-NOT: @{{.*}}GV{{.*}}
; CHECK-TU1: @{{.*}}GV{{.*}} = internal addrspace(1) constant [1 x i32] [i32 42], align 4
; CHECK-TU1-NOT: @{{.*}}GV{{.*}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all test changes are related to re-numeration of output files - E.g. ESIMD kernels appeared in files generated earlier (with smaller numeric index), now they will appear in files generated later, etc.

@kbobrovs kbobrovs marked this pull request as ready for review May 19, 2022 07:23
@kbobrovs kbobrovs requested a review from a team as a code owner May 19, 2022 07:23
@kbobrovs kbobrovs requested a review from v-klochkov May 19, 2022 07:23
Comment on lines +64 to +74
#ifdef _NDEBUG
#define DUMP_ENTRY_POINTS(args...)
#else
constexpr int DebugPostLink = 0;

#define DUMP_ENTRY_POINTS(...) \
if (DebugPostLink > 0) { \
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \
}
#endif

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have 2 suggestions for this code.

  1. If someone needs to see debug output, the one have to change "DebugPostLink = 0" with "DebugPostLink = 1;" and re-build. When don't need it anymore, need to cancel change and re-build again.
    Wouldn't it be more convenient to have a macro?
Suggested change
#ifdef _NDEBUG
#define DUMP_ENTRY_POINTS(args...)
#else
constexpr int DebugPostLink = 0;
#define DUMP_ENTRY_POINTS(...) \
if (DebugPostLink > 0) { \
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \
}
#endif
#if defined(_NDEBUG) || !defined(DEBUG_POST_LINK)
#define DUMP_ENTRY_POINTS(args...)
#else
#define DUMP_ENTRY_POINTS(...) \
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \
#endif
  1. Why not to have a normal C++ function, something like this?
Suggested change
#ifdef _NDEBUG
#define DUMP_ENTRY_POINTS(args...)
#else
constexpr int DebugPostLink = 0;
#define DUMP_ENTRY_POINTS(...) \
if (DebugPostLink > 0) { \
llvm::module_split::dumpEntryPoints(__VA_ARGS__); \
}
#endif
template <typename... T>
inline void debugEntryPoints(T... Args) {
#if !defined(_NDEBUG) && defined(DEBUG_POST_LINK)
llvm::module_split::dumpEntryPoints(Args...);
#endif
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more convenient to have a macro?

I don't think it is more convenient. Adding a macro to the LLVM build requires re-configuring - this seems a lot more disturbing.

Why not to have a normal C++ function, something like this?

I don't see advantages in having this as a function. DUMP_ENTRY_POINTS in all caps is intuitively understandable that this is a macro call, and has the same efforts on the call site as debugEntryPoints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had to re-base to re-test with fresh top-of-trunk

kbobrovs added 6 commits May 19, 2022 15:43
- Split into SYCL and ESIMD callgraph is now obligatory, as these kinds
  of functions undergo different set of IR t-forms. If no SYCL/ESIMD
  splitting was requested on the command line, then the tool clones
  functions shared between the two callgraphs and then links them back
  together. This is future default for the compiler driver, as for
  invoke_simd to be correctly handled by the BE, both kinds of code
  must be in the same module.
- Logic of the output file table generation refactored to be more
  straightforward.
- ModuleDesc extended to accomodate properties and support entry point
  (Function object) replacement with new entry with the same name.
- Make --split-esimd a first-class action of sycl-post-link - if passed
  alone, the tool will no longer complain that no actions are requested.
- Fixed bug in ModuleSplitter.cpp/extractSubModule where no actual
  replacement of entry point Functions with cloned ones happenned.
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
v-klochkov
v-klochkov previously approved these changes May 24, 2022
};

using EntryPointGroupVec = std::vector<EntryPointGroup>;

struct ModuleDesc {
enum class ESIMDStatus { SYCL_ONLY, ESIMD_ONLY, SYCL_AND_ESIMD };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment: IMO, the name of this enum is a bit confusing. Wouldn't it be more clear to name it something like SplitModuleStatus or ModuleStatus?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I'm thinking about something like SyclEsimdSplitStatus

}

bool isESIMD() const { return Props.HasEsimd == ESIMDStatus::ESIMD_ONLY; }
bool isSYCL() const { return Props.HasEsimd == ESIMDStatus::SYCL_ONLY; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'HasEsimd == ESIMDStatus::SYCL_ONLY' is a bit awkward, wouldn't it be better to say something like:
'Props.ModuleFeatures == ModuleStatus::SYCL_ONLY' ?

}

bool isESIMD() const { return Props.HasEsimd == ESIMDStatus::ESIMD_ONLY; }
bool isSYCL() const { return Props.HasEsimd == ESIMDStatus::SYCL_ONLY; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'HasEsimd == ESIMDStatus::SYCL_ONLY' is a bit awkward, wouldn't it be better to say something like:
'Props.ModuleFeatures == ModuleStatus::SYCL_ONLY' ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about ESIMDStatus "awkwardness". "ModuleFeatures" is too general IMO.


// @param MD Module descriptor to save
// @param IRFilename filename of already avaialble IR component. If not empty,
// IR component saving is skipped, and this file name as recorded as such in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: is recorded as

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix

@asudarsa
Copy link
Contributor

LGTM overall. Some minor changes proposed. Thanks

};

using EntryPointGroupVec = std::vector<EntryPointGroup>;

struct ModuleDesc {
enum class ESIMDStatus { SYCL_ONLY, ESIMD_ONLY, SYCL_AND_ESIMD };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment: IMO, the name of this enum is a bit confusing. Wouldn't it be more clear to name it something like SplitModuleStatus or ModuleStatus?

Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks

@kbobrovs kbobrovs merged commit 8c7bb45 into intel:sycl Jun 1, 2022
@kbobrovs kbobrovs deleted the invoke_simd_post_link branch June 1, 2022 17:38
@pvchupin
Copy link
Contributor

pvchupin commented Jun 1, 2022

@kbobrovs
Copy link
Contributor Author

kbobrovs commented Jun 2, 2022

}
// Verify entry points:
for (const auto *F : entries()) {
const bool IsESIMDFunction = isESIMDFunction(*F);
Copy link
Contributor

@HaohaiWen HaohaiWen Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider to add "(void)IsESIMDFunction;" or wrap this line under macro like #ifdef DEBUG to avoid unused variable warning/error.

kbobrovs added a commit to kbobrovs/llvm that referenced this pull request Jun 2, 2022
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Jun 2, 2022

@kbobrovs, @v-klochkov, please handle post-commit fail: https://github.com/intel/llvm/runs/6694744289?check_suite_focus=true

@pvchupin, I'm on it

@pvchupin - the fix is created: #6231

pvchupin pushed a commit that referenced this pull request Jun 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants