Skip to content

[SYCL] Optimize SYCL framework functions with -O0 #7376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 34 commits into from
Feb 14, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
acd84e0
[SYCL] Emit "sycl-framework" metadata attached to funcions/methods in…
maksimsab Nov 15, 2022
0e222f5
[SYCL] Add -fsycl-optimize-framework command line option in Clang Dri…
maksimsab Nov 17, 2022
9cb70d7
[SYCL] Add SYCLLowerIR/IR/PassAdaptors.h
maksimsab Dec 7, 2022
41bbeb7
[SYCL] Add a SYCL framework optimization pipeline to LLVM O0 default …
maksimsab Dec 7, 2022
8b66cf2
[SYCL] fix code review remarks
maksimsab Dec 9, 2022
ad2f3cd
[SYCL] Use sycl::kernel_single_task mock in sycl-framework-metadata1.cpp
maksimsab Dec 19, 2022
6bc843a
Merge branch 'sycl' into add_new_debug_mode
maksimsab Dec 19, 2022
060da0e
[SYCL] Fix RemoveFuncAttrsFromSYCLFrameworkFuncs pass
maksimsab Dec 19, 2022
1c73c82
[SYCL] Fix ModuleToSYCLFrameworkFunctionPassAdaptor
maksimsab Jan 2, 2023
a55eb69
[SYCL] Support fsycl-optimize-framework in clang_cl Driver.
maksimsab Jan 2, 2023
3d274c6
Merge branch 'sycl' into add_new_debug_mode
maksimsab Jan 2, 2023
2f4c38a
[SYCL] Fix code style and test
maksimsab Jan 3, 2023
c5b90e8
[SYCL] Adjust SROA usage to upstream
maksimsab Jan 3, 2023
c330040
[SYCL] remove test
maksimsab Jan 3, 2023
22a85f3
[SYCL] add marshalling for fsycl_optimize_framework option
maksimsab Jan 10, 2023
8be1488
[SYCL] add AddFuncAttrsFromSYCLFrameworkFuncs pass
maksimsab Jan 11, 2023
fbdc0be
[SYCL] replace fsycl-optimize-framework with
maksimsab Jan 13, 2023
ad46aa3
Merge branch 'sycl' into add_new_debug_mode
maksimsab Jan 13, 2023
b3b866e
[SYCL] add Inline passes to pipeline
maksimsab Jan 20, 2023
15d364c
[SYCL] fix code style
maksimsab Jan 20, 2023
9e1d8c8
[SYCL] remove leftover and correct option's description
maksimsab Jan 25, 2023
56eb938
Merge branch 'sycl' into add_new_debug_mode
maksimsab Jan 26, 2023
71975d2
[SYCL] simplify optimization pipeline
maksimsab Jan 26, 2023
febec47
[SYCL] remove !sycl-framework metadata
maksimsab Jan 27, 2023
0ab1fe8
[SYCL] fix SYCLFrameworkOptimization.h description
maksimsab Jan 27, 2023
ebd116f
[SYCL] fix code style
maksimsab Jan 27, 2023
2648088
[SYCL] fix CodeGenSYCL tests
maksimsab Jan 27, 2023
1670049
[SYCL] add comment and fix a typo
maksimsab Jan 30, 2023
77eba39
[SYCL] return noinline,optnone attrs back more accurately
maksimsab Feb 1, 2023
ba2fe26
[SYCL] adjust test to error message
maksimsab Feb 1, 2023
cc077eb
[SYCL] use regex for matching function attributes in tests
maksimsab Feb 1, 2023
524966b
[SYCL] remove AddDebugFuncsAttrs pass
maksimsab Feb 6, 2023
7b4391f
[SYCL] remove useless assertion
maksimsab Feb 6, 2023
91f6d96
[SYCL] correct test description
maksimsab Feb 8, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.def
Original file line number Diff line number Diff line change
Expand Up @@ -515,6 +515,10 @@ CODEGENOPT(CtorDtorReturnThis, 1, 0)
/// Whether to disable the standard optimization pipeline for the SYCL device compiler.
CODEGENOPT(DisableSYCLEarlyOpts, 1, 0)

/// Optimize SYCL Framework functions. These are functions
/// which do not contain "user" code.
CODEGENOPT(OptimizeSYCLFramework, 1, 0)

#undef CODEGENOPT
#undef ENUM_CODEGENOPT
#undef VALUE_CODEGENOPT
2 changes: 2 additions & 0 deletions clang/include/clang/Basic/DiagnosticDriverKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,8 @@ def warn_drv_sycl_target_missing : Warning<
InGroup<SyclTarget>;
def err_drv_no_rdc_sycl_target_missing : Error<
"linked binaries do not contain expected '%0' target; found targets: '%1', this is not supported with '-fno-sycl-rdc'">;
def err_drv_fsycl_wrong_optimization_options : Error<
"-fsycl-optimize-non-user-code option can be used only in conjunction with %0">;
def err_drv_multiple_target_with_forced_target : Error<
"multiple target usage with '%0' is not supported with '%1'">;
def err_drv_failed_to_deduce_target_from_arch : Error<
Expand Down
4 changes: 4 additions & 0 deletions clang/include/clang/Driver/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -3004,6 +3004,10 @@ def fsycl_max_parallel_jobs_EQ : Joined<["-"], "fsycl-max-parallel-link-jobs=">,
"or AOT compilation of each device image.">;
def : Flag<["-"], "fsycl-rdc">, Flags<[CoreOption]>, Alias<fgpu_rdc>;
def : Flag<["-"], "fno-sycl-rdc">, Flags<[CoreOption]>, Alias<fno_gpu_rdc>;
def fsycl_optimize_non_user_code : Flag<["-"], "fsycl-optimize-non-user-code">,
Flags<[CC1Option, CoreOption]>, MarshallingInfoFlag<CodeGenOpts<"OptimizeSYCLFramework">>,
HelpText<"Option used in conjunction with -O0 to "
"optimize SYCL framework utility functions and leave user's kernel code unoptimized. (experimental)">;
def fsyntax_only : Flag<["-"], "fsyntax-only">,
Flags<[NoXarchOption,CoreOption,CC1Option,FC1Option,FlangOption]>, Group<Action_Group>,
HelpText<"Run the preprocessor, parser and semantic analysis stages">;
Expand Down
3 changes: 3 additions & 0 deletions clang/lib/CodeGen/BackendUtil.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -850,6 +850,9 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
// Only enable CGProfilePass when using integrated assembler, since
// non-integrated assemblers don't recognize .cgprofile section.
PTO.CallGraphProfile = !CodeGenOpts.DisableIntegratedAS;
// Enable a custom optimization pipeline for non-user SYCL code.
PTO.OptimizeSYCLFramework =
CodeGenOpts.OptimizeSYCLFramework && !CodeGenOpts.DisableLLVMPasses;

LoopAnalysisManager LAM;
FunctionAnalysisManager FAM;
Expand Down
23 changes: 23 additions & 0 deletions clang/lib/CodeGen/CodeGenModule.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2155,6 +2155,23 @@ CodeGenModule::GetOrCreateRTTIProxyGlobalVariable(llvm::Constant *Addr) {
return FTRTTIProxy;
}

/// Function checks whether given DeclContext contains a topmost
/// namespace with name "sycl"
static bool checkIfDeclaredInSYCLNamespace(const Decl *D) {
const DeclContext *DC = D->getDeclContext()->getEnclosingNamespaceContext();
const auto *ND = dyn_cast<NamespaceDecl>(DC);
if (!ND)
return false;

while (const DeclContext *Parent = ND->getParent()) {
if (!isa<NamespaceDecl>(Parent))
break;
ND = cast<NamespaceDecl>(Parent);
}

return ND && ND->getName() == "sycl";
}

void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
llvm::Function *F) {
llvm::AttrBuilder B(F->getContext());
Expand Down Expand Up @@ -2277,6 +2294,12 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,

F->addFnAttrs(B);

if (getLangOpts().SYCLIsDevice && getCodeGenOpts().OptimizeSYCLFramework &&
checkIfDeclaredInSYCLNamespace(D)) {
F->removeFnAttr(llvm::Attribute::OptimizeNone);
F->removeFnAttr(llvm::Attribute::NoInline);
}

unsigned alignment = D->getMaxAlignment() / Context.getCharWidth();
if (alignment)
F->setAlignment(llvm::Align(alignment));
Expand Down
12 changes: 12 additions & 0 deletions clang/lib/Driver/ToolChains/Clang.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5023,6 +5023,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,

// Forward -fsycl-default-sub-group-size if in SYCL mode.
Args.AddLastArg(CmdArgs, options::OPT_fsycl_default_sub_group_size);

if (Args.hasArg(options::OPT_fsycl_optimize_non_user_code)) {
const Arg *OArg = Args.getLastArg(options::OPT_O_Group);
if (!OArg || !OArg->getOption().matches(options::OPT_O0)) {
bool isCLMode = C.getDriver().IsCLMode();
// Linux and Windows have different debug options.
const StringRef Option = isCLMode ? "-Od" : "-O0";
D.Diag(diag::err_drv_fsycl_wrong_optimization_options) << Option;
}

CmdArgs.push_back("-fsycl-optimize-non-user-code");
}
}

if (IsSYCL) {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
// Test checks that noinline and optnone function's attributes aren't attached
// to functions whose topmost namespace is sycl.

// RUN: %clang_cc1 -fsycl-is-device -O0 -fsycl-optimize-non-user-code -internal-isystem %S/Inputs -triple spir64-unknown-unknown -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s

#include "sycl.hpp"

// Check that kernel marked with noinline and optnone func attrs.
// CHECK: spir_kernel {{.*}} #[[KERNEL_ATTRS:[0-9]+]]

// Check that user code contain noinline and optnone func attrs.
// CHECK: define {{.*}} @_Z3foov() #[[FOO_ATTRS:[0-9]+]]
int foo() {
return 123;
}

// Check that all functions on sycl::* namespace do not contain
// noinline and optnone func attrs.
namespace sycl {
// CHECK: define {{.*}} @_ZN4sycl4bar1Ev() #[[BAR1_ATTRS:[0-9]+]]
void bar1() {}

namespace V1 {
// bar1 and bar2 have common function attrs
// CHECK: define {{.*}} @_ZN4sycl2V14bar2Ev() #[[BAR1_ATTRS]]
void bar2() {}
}
}

// Check that V1::sycl::* functions do not contain noinline and optnone
// func attrs since topmost namespace is V1 instead of sycl.
namespace V1 {
namespace sycl {
// foo and bar3 have common function attrs
// CHECK: define {{.*}} @_ZN2V14sycl4bar3Ev() #[[FOO_ATTRS]]
void bar3() {}
}
}

// Check attributes
// CHECK-DAG: attributes #[[KERNEL_ATTRS]] = {{.*}} {{noinline|optnone}} {{.*}} {{noinline|optnone}}
// CHECK-DAG: attributes #[[FOO_ATTRS]] = {{.*}} noinline {{.*}} optnone
// CHECK-NOT: attributes #[[BAR1_ATTRS]] = {{.*}} {{noinline|optnone}}

int main() {
sycl::kernel_single_task<class kernel>([]() {
foo();
sycl::bar1();
sycl::V1::bar2();
V1::sycl::bar3();
});
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
// Test checks that noinline and optnone function's attributes aren't attached
// to functions whose topmost namespace is not sycl.

// RUN: %clang_cc1 -fsycl-is-device -O0 -fsycl-optimize-non-user-code -triple spir64-unknown-unknown -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s

// Check that kernel contains noinline and optnone func attrs.
// CHECK: define {{.*}} @_ZTSZ4mainE6kernel() #[[KERNEL_ATTRS:[0-9]+]]

// Check that 'anonymous namespace'::sycl::* functions contains
// noinline and optnone func attrs since topmost namespace is anonymous
// instead of sycl.
namespace {
namespace sycl {
// CHECK: define {{.*}} @_ZN12_GLOBAL__N_14sycl4bar4Ev() #[[BAR4_ATTRS:[0-9]+]]
void bar4() {}
}
}

template <typename name, typename Func>
__attribute__((sycl_kernel)) void kernel_single_task(const Func &func) {
func();
}

// CHECK: attributes #[[KERNEL_ATTRS]] = {{.*}} noinline {{.*}} optnone
// CHECK: attributes #[[BAR4_ATTRS]] = {{.*}} noinline {{.*}} optnone

int main() {
kernel_single_task<class kernel>([]() {
sycl::bar4();
});
}
20 changes: 20 additions & 0 deletions clang/test/Driver/sycl-optimize-non-user-code.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
// RUN: %clangxx -fsycl -O0 -### %s 2>&1 | FileCheck %s -check-prefix=NO-OPT-CHECK
// NO-OPT-CHECK-NOT: fsycl-optimize-non-user-code

// RUN: %clangxx -fsycl -O0 -fsycl-optimize-non-user-code -### %s 2>&1 | FileCheck %s -check-prefix=OPT-CHECK
// OPT-CHECK: fsycl-optimize-non-user-code

// RUN: not %clangxx -fsycl -O1 -fsycl-optimize-non-user-code %s 2>&1 | FileCheck %s -check-prefix=CHECK-ERROR
// RUN: not %clangxx -fsycl -fsycl-optimize-non-user-code %s 2>&1 | FileCheck %s -check-prefix=CHECK-ERROR
// CHECK-ERROR: error: -fsycl-optimize-non-user-code option can be used only in conjunction with -O0

// Check cases for Microsoft Windows Driver.
// RUN: %clang_cl -fsycl -Od -### %s 2>&1 | FileCheck %s -check-prefix=NO-OPT-WIN-CHECK
// NO-OPT-WIN-CHECK-NOT: fsycl-optimize-non-user-code

// RUN: %clang_cl -fsycl -Od -fsycl-optimize-non-user-code -### %s 2>&1 | FileCheck %s -check-prefix=OPT-WIN-CHECK
// OPT-WIN-CHECK: fsycl-optimize-non-user-code

// RUN: not %clang_cl -fsycl -O1 -fsycl-optimize-non-user-code %s 2>&1 | FileCheck %s -check-prefix=CHECK-WIN-ERROR
// RUN: not %clang_cl -fsycl -fsycl-optimize-non-user-code %s 2>&1 | FileCheck %s -check-prefix=CHECK-WIN-ERROR
// CHECK-WIN-ERROR: error: -fsycl-optimize-non-user-code option can be used only in conjunction with -Od
8 changes: 8 additions & 0 deletions llvm/include/llvm/Passes/PassBuilder.h
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,10 @@ class PipelineTuningOptions {
// analyses after various module->function or cgscc->function adaptors in the
// default pipelines.
bool EagerlyInvalidateAnalyses;

/// Tuning option to enable a subset of optimizations in O0 optimization
/// mode for non-user SYCL code.
bool OptimizeSYCLFramework = false;
};

/// This class provides access to building LLVM's passes.
Expand Down Expand Up @@ -298,6 +302,10 @@ class PassBuilder {
ModulePassManager buildO0DefaultPipeline(OptimizationLevel Level,
bool LTOPreLink = false);

/// Constructs a optimization pipeline of a SYCL framework part of code
/// and appends it to the given MPM.
void addDefaultSYCLFrameworkOptimizationPipeline(ModulePassManager &MPM);

/// Build the default `AAManager` with the default alias analysis pipeline
/// registered.
///
Expand Down
9 changes: 9 additions & 0 deletions llvm/lib/Passes/PassBuilderPipelines.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1985,9 +1985,18 @@ ModulePassManager PassBuilder::buildO0DefaultPipeline(OptimizationLevel Level,

MPM.addPass(createModuleToFunctionPassAdaptor(AnnotationRemarksPass()));

if (PTO.OptimizeSYCLFramework)
addDefaultSYCLFrameworkOptimizationPipeline(MPM);

return MPM;
}

void PassBuilder::addDefaultSYCLFrameworkOptimizationPipeline(
ModulePassManager &MPM) {
MPM.addPass(
buildInlinerPipeline(OptimizationLevel::O2, ThinOrFullLTOPhase::None));
}

AAManager PassBuilder::buildDefaultAAPipeline() {
AAManager AA;

Expand Down