-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[OpenMP] Remove register_requires
global constructor
#80460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir-openmp @llvm/pr-subscribers-clang-codegen Author: Joseph Huber (jhuber6) ChangesSummary: This patch changes the approach to no longer use global constructors, This function removes support for the old It is worth noting that the current Patch is 2.09 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80460.diff 173 Files Affected:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 4855e7410a015..a7b72df6d9f89 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -10100,44 +10100,6 @@ bool CGOpenMPRuntime::markAsGlobalTarget(GlobalDecl GD) {
return !AlreadyEmittedTargetDecls.insert(D).second;
}
-llvm::Function *CGOpenMPRuntime::emitRequiresDirectiveRegFun() {
- // If we don't have entries or if we are emitting code for the device, we
- // don't need to do anything.
- if (CGM.getLangOpts().OMPTargetTriples.empty() ||
- CGM.getLangOpts().OpenMPSimd || CGM.getLangOpts().OpenMPIsTargetDevice ||
- (OMPBuilder.OffloadInfoManager.empty() &&
- !HasEmittedDeclareTargetRegion && !HasEmittedTargetRegion))
- return nullptr;
-
- // Create and register the function that handles the requires directives.
- ASTContext &C = CGM.getContext();
-
- llvm::Function *RequiresRegFn;
- {
- CodeGenFunction CGF(CGM);
- const auto &FI = CGM.getTypes().arrangeNullaryFunction();
- llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(FI);
- std::string ReqName = getName({"omp_offloading", "requires_reg"});
- RequiresRegFn = CGM.CreateGlobalInitOrCleanUpFunction(FTy, ReqName, FI);
- CGF.StartFunction(GlobalDecl(), C.VoidTy, RequiresRegFn, FI, {});
- // TODO: check for other requires clauses.
- // The requires directive takes effect only when a target region is
- // present in the compilation unit. Otherwise it is ignored and not
- // passed to the runtime. This avoids the runtime from throwing an error
- // for mismatching requires clauses across compilation units that don't
- // contain at least 1 target region.
- assert((HasEmittedTargetRegion || HasEmittedDeclareTargetRegion ||
- !OMPBuilder.OffloadInfoManager.empty()) &&
- "Target or declare target region expected.");
- CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
- CGM.getModule(), OMPRTL___tgt_register_requires),
- llvm::ConstantInt::get(
- CGM.Int64Ty, OMPBuilder.Config.getRequiresFlags()));
- CGF.FinishFunction();
- }
- return RequiresRegFn;
-}
-
void CGOpenMPRuntime::emitTeamsCall(CodeGenFunction &CGF,
const OMPExecutableDirective &D,
SourceLocation Loc,
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h
index b01b39abd1606..c3206427b143e 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -1407,10 +1407,6 @@ class CGOpenMPRuntime {
/// \param GD Global to scan.
virtual bool emitTargetGlobal(GlobalDecl GD);
- /// Creates and returns a registration function for when at least one
- /// requires directives was used in the current module.
- llvm::Function *emitRequiresDirectiveRegFun();
-
/// Creates all the offload entries in the current compilation unit
/// along with the associated metadata.
void createOffloadEntriesAndInfoMetadata();
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index c63e4ecc3dcba..d6d75efbbb2a8 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -836,10 +836,6 @@ void CodeGenModule::Release() {
AddGlobalCtor(CudaCtorFunction);
}
if (OpenMPRuntime) {
- if (llvm::Function *OpenMPRequiresDirectiveRegFun =
- OpenMPRuntime->emitRequiresDirectiveRegFun()) {
- AddGlobalCtor(OpenMPRequiresDirectiveRegFun, 0);
- }
OpenMPRuntime->createOffloadEntriesAndInfoMetadata();
OpenMPRuntime->clear();
}
diff --git a/clang/test/OpenMP/bug60602.cpp b/clang/test/OpenMP/bug60602.cpp
index 2fbfdfde07a0c..3ecc70cab778a 100644
--- a/clang/test/OpenMP/bug60602.cpp
+++ b/clang/test/OpenMP/bug60602.cpp
@@ -569,10 +569,3 @@ int kernel_within_loop(int *a, int *b, int N, int num_iters) {
// CHECK: omp.precond.end:
// CHECK-NEXT: ret void
//
-//
-// CHECK-LABEL: define internal void @.omp_offloading.requires_reg
-// CHECK-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK-NEXT: entry:
-// CHECK-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_codegen.cpp b/clang/test/OpenMP/distribute_codegen.cpp
index e3b43002a0518..31ec6ff911905 100644
--- a/clang/test/OpenMP/distribute_codegen.cpp
+++ b/clang/test/OpenMP/distribute_codegen.cpp
@@ -1037,13 +1037,6 @@ int fint(void) { return ftemplate<int>(); }
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR3:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@_Z23without_schedule_clausePfS_S_S_
// CHECK3-SAME: (ptr noundef [[A:%.*]], ptr noundef [[B:%.*]], ptr noundef [[C:%.*]], ptr noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -1953,13 +1946,6 @@ int fint(void) { return ftemplate<int>(); }
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR3:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK17-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z23without_schedule_clausePfS_S_S__l56
// CHECK17-SAME: (ptr noalias noundef [[DYN_PTR:%.*]], ptr noundef [[A:%.*]], ptr noundef [[B:%.*]], ptr noundef [[C:%.*]], ptr noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK17-NEXT: entry:
diff --git a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
index 361e26bc2984c..800a002e43968 100644
--- a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
@@ -304,13 +304,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -476,13 +469,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -794,7 +780,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1148,13 +1134,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -1464,7 +1443,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1815,10 +1794,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
index e005de30e14d1..772372076e947 100644
--- a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
@@ -291,13 +291,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -460,13 +453,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -797,7 +783,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1169,13 +1155,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -1504,7 +1483,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1873,10 +1852,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
index 7bdc4c5ab21a7..95adefa8020f6 100644
--- a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
@@ -2538,13 +2538,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -4265,13 +4258,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -8886,13 +8872,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -13404,10 +13383,3 @@ int main() {
// CHECK11: omp.precond.end:
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
index 9f900facc6a54..46c115e40e435 100644
--- a/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
@@ -504,13 +504,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -759,13 +752,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK8-LABEL: define {{[^@]+}}@main
// CHECK8-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK8-NEXT: entry:
@@ -1207,7 +1193,7 @@ int main() {
//
//
// CHECK8-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK8-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK8-SAME: () #[[ATTR1]] comdat {
// CHECK8-NEXT: entry:
// CHECK8-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK8-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1685,13 +1671,6 @@ int main() {
// CHECK8-NEXT: ret void
//
//
-// CHECK8-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK8-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK8-NEXT: entry:
-// CHECK8-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK8-NEXT: ret void
-//
-//
// CHECK10-LABEL: define {{[^@]+}}@main
// CHECK10-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK10-NEXT: entry:
@@ -2127,7 +2106,7 @@ int main() {
//
//
// CHECK10-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK10-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK10-SAME: () #[[ATTR1]] comdat {
// CHECK10-NEXT: entry:
// CHECK10-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK10-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -2598,10 +2577,3 @@ int main() {
// CHECK10-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK10-NEXT: ret void
//
-//
-// CHECK10-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK10-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK10-NEXT: entry:
-// CHECK10-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK10-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
index 83c9f504ccaca..846e7beb5d92f 100644
--- a/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
@@ -1609,10 +1609,3 @@ int main() {
// CHECK1-NEXT: call void @__kmpc_for_static_fini(ptr @[[GLOB1]], i32 [[TMP3]])
// CHECK1-NEXT: ret void
//
-//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR5:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
index 8c44a1e71ae79..aa981f606cc87 100644
--- a/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
@@ -462,13 +462,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -734,13 +727,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -1219,7 +1205,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1733,13 +1719,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -2212,7 +2191,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -2719,10 +2698,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
index 9f769ca2886fe..5d9244268d554 100644
--- a/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
@@ -112,7 +112,7 @@ int main() {
// CHECK1-NEXT: store i32 0, ptr [[RETVAL]], align 4
// CHECK1-NEXT: call void @_ZN1SC1El(ptr noundef nonnull align 8 dereferenceable(24) [[S]], i64 noundef 0)
// CHECK1-NEXT: [[CALL:%.*]] = invoke noundef signext i8 @_ZN1ScvcEv(ptr noundef nonnull align 8 dereferenceable(24) [[S]])
-// CHECK1-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
+// CHECK1-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
// CHECK1: invoke.cont:
// CHECK1-NEXT: store i8 [[CALL]], ptr [[A]], align 1
// CHECK1-NEXT: [[TMP0:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
@@ -145,16 +145,16 @@ int main() {
// CHECK1-NEXT: [[TMP14:%.*]] = icmp ne i32 [[TMP13]], 0
// CHECK1-NEXT: br i1 [[TMP14]], label [[OMP_OFFLOAD_FAILED:%.*]], label [[OMP_OFFLOAD_CONT:%.*]]
// CHECK1: omp_offload.failed:
-// CHECK1-NEXT: call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l68() #[[ATTR4:[0-9]+]]
+// CHECK1-NEXT: call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l68() #[[ATTR3:[0-9]+]]
// CHECK1-NEXT: br label [[OMP_OFFLOAD_CONT]]
// CHECK1: lpad:
// CHECK1-NEXT: [[TMP15:%.*]] = landingpad { ptr, i32 }
-// CHECK1-NEXT: cleanup
+// CHECK1-NEXT: cleanup
// CHECK1-NEXT: [[TMP16:%.*]] = extractvalue { ptr, i32 } [[TMP15]], 0
// CHECK1-NEXT: store ptr [[TMP16]], ptr [[EXN_SLOT]], align 8
// CHECK1-NEXT: [[TMP17:%.*]] = extractvalue { ptr, i32 } [[TMP15]], 1
// CHECK1-NEXT: store...
[truncated]
|
@llvm/pr-subscribers-flang-openmp Author: Joseph Huber (jhuber6) ChangesSummary: This patch changes the approach to no longer use global constructors, This function removes support for the old It is worth noting that the current Patch is 2.09 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80460.diff 173 Files Affected:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 4855e7410a015..a7b72df6d9f89 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -10100,44 +10100,6 @@ bool CGOpenMPRuntime::markAsGlobalTarget(GlobalDecl GD) {
return !AlreadyEmittedTargetDecls.insert(D).second;
}
-llvm::Function *CGOpenMPRuntime::emitRequiresDirectiveRegFun() {
- // If we don't have entries or if we are emitting code for the device, we
- // don't need to do anything.
- if (CGM.getLangOpts().OMPTargetTriples.empty() ||
- CGM.getLangOpts().OpenMPSimd || CGM.getLangOpts().OpenMPIsTargetDevice ||
- (OMPBuilder.OffloadInfoManager.empty() &&
- !HasEmittedDeclareTargetRegion && !HasEmittedTargetRegion))
- return nullptr;
-
- // Create and register the function that handles the requires directives.
- ASTContext &C = CGM.getContext();
-
- llvm::Function *RequiresRegFn;
- {
- CodeGenFunction CGF(CGM);
- const auto &FI = CGM.getTypes().arrangeNullaryFunction();
- llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(FI);
- std::string ReqName = getName({"omp_offloading", "requires_reg"});
- RequiresRegFn = CGM.CreateGlobalInitOrCleanUpFunction(FTy, ReqName, FI);
- CGF.StartFunction(GlobalDecl(), C.VoidTy, RequiresRegFn, FI, {});
- // TODO: check for other requires clauses.
- // The requires directive takes effect only when a target region is
- // present in the compilation unit. Otherwise it is ignored and not
- // passed to the runtime. This avoids the runtime from throwing an error
- // for mismatching requires clauses across compilation units that don't
- // contain at least 1 target region.
- assert((HasEmittedTargetRegion || HasEmittedDeclareTargetRegion ||
- !OMPBuilder.OffloadInfoManager.empty()) &&
- "Target or declare target region expected.");
- CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
- CGM.getModule(), OMPRTL___tgt_register_requires),
- llvm::ConstantInt::get(
- CGM.Int64Ty, OMPBuilder.Config.getRequiresFlags()));
- CGF.FinishFunction();
- }
- return RequiresRegFn;
-}
-
void CGOpenMPRuntime::emitTeamsCall(CodeGenFunction &CGF,
const OMPExecutableDirective &D,
SourceLocation Loc,
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h
index b01b39abd1606..c3206427b143e 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -1407,10 +1407,6 @@ class CGOpenMPRuntime {
/// \param GD Global to scan.
virtual bool emitTargetGlobal(GlobalDecl GD);
- /// Creates and returns a registration function for when at least one
- /// requires directives was used in the current module.
- llvm::Function *emitRequiresDirectiveRegFun();
-
/// Creates all the offload entries in the current compilation unit
/// along with the associated metadata.
void createOffloadEntriesAndInfoMetadata();
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index c63e4ecc3dcba..d6d75efbbb2a8 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -836,10 +836,6 @@ void CodeGenModule::Release() {
AddGlobalCtor(CudaCtorFunction);
}
if (OpenMPRuntime) {
- if (llvm::Function *OpenMPRequiresDirectiveRegFun =
- OpenMPRuntime->emitRequiresDirectiveRegFun()) {
- AddGlobalCtor(OpenMPRequiresDirectiveRegFun, 0);
- }
OpenMPRuntime->createOffloadEntriesAndInfoMetadata();
OpenMPRuntime->clear();
}
diff --git a/clang/test/OpenMP/bug60602.cpp b/clang/test/OpenMP/bug60602.cpp
index 2fbfdfde07a0c..3ecc70cab778a 100644
--- a/clang/test/OpenMP/bug60602.cpp
+++ b/clang/test/OpenMP/bug60602.cpp
@@ -569,10 +569,3 @@ int kernel_within_loop(int *a, int *b, int N, int num_iters) {
// CHECK: omp.precond.end:
// CHECK-NEXT: ret void
//
-//
-// CHECK-LABEL: define internal void @.omp_offloading.requires_reg
-// CHECK-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK-NEXT: entry:
-// CHECK-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_codegen.cpp b/clang/test/OpenMP/distribute_codegen.cpp
index e3b43002a0518..31ec6ff911905 100644
--- a/clang/test/OpenMP/distribute_codegen.cpp
+++ b/clang/test/OpenMP/distribute_codegen.cpp
@@ -1037,13 +1037,6 @@ int fint(void) { return ftemplate<int>(); }
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR3:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@_Z23without_schedule_clausePfS_S_S_
// CHECK3-SAME: (ptr noundef [[A:%.*]], ptr noundef [[B:%.*]], ptr noundef [[C:%.*]], ptr noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -1953,13 +1946,6 @@ int fint(void) { return ftemplate<int>(); }
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR3:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK17-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z23without_schedule_clausePfS_S_S__l56
// CHECK17-SAME: (ptr noalias noundef [[DYN_PTR:%.*]], ptr noundef [[A:%.*]], ptr noundef [[B:%.*]], ptr noundef [[C:%.*]], ptr noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK17-NEXT: entry:
diff --git a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
index 361e26bc2984c..800a002e43968 100644
--- a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
@@ -304,13 +304,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -476,13 +469,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -794,7 +780,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1148,13 +1134,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -1464,7 +1443,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1815,10 +1794,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
index e005de30e14d1..772372076e947 100644
--- a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
@@ -291,13 +291,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -460,13 +453,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -797,7 +783,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1169,13 +1155,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -1504,7 +1483,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1873,10 +1852,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
index 7bdc4c5ab21a7..95adefa8020f6 100644
--- a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
@@ -2538,13 +2538,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -4265,13 +4258,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -8886,13 +8872,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -13404,10 +13383,3 @@ int main() {
// CHECK11: omp.precond.end:
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
index 9f900facc6a54..46c115e40e435 100644
--- a/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
@@ -504,13 +504,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -759,13 +752,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK8-LABEL: define {{[^@]+}}@main
// CHECK8-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK8-NEXT: entry:
@@ -1207,7 +1193,7 @@ int main() {
//
//
// CHECK8-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK8-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK8-SAME: () #[[ATTR1]] comdat {
// CHECK8-NEXT: entry:
// CHECK8-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK8-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1685,13 +1671,6 @@ int main() {
// CHECK8-NEXT: ret void
//
//
-// CHECK8-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK8-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK8-NEXT: entry:
-// CHECK8-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK8-NEXT: ret void
-//
-//
// CHECK10-LABEL: define {{[^@]+}}@main
// CHECK10-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK10-NEXT: entry:
@@ -2127,7 +2106,7 @@ int main() {
//
//
// CHECK10-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK10-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK10-SAME: () #[[ATTR1]] comdat {
// CHECK10-NEXT: entry:
// CHECK10-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK10-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -2598,10 +2577,3 @@ int main() {
// CHECK10-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK10-NEXT: ret void
//
-//
-// CHECK10-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK10-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK10-NEXT: entry:
-// CHECK10-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK10-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
index 83c9f504ccaca..846e7beb5d92f 100644
--- a/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
@@ -1609,10 +1609,3 @@ int main() {
// CHECK1-NEXT: call void @__kmpc_for_static_fini(ptr @[[GLOB1]], i32 [[TMP3]])
// CHECK1-NEXT: ret void
//
-//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR5:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
index 8c44a1e71ae79..aa981f606cc87 100644
--- a/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
@@ -462,13 +462,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -734,13 +727,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -1219,7 +1205,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1733,13 +1719,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -2212,7 +2191,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -2719,10 +2698,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
index 9f769ca2886fe..5d9244268d554 100644
--- a/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
@@ -112,7 +112,7 @@ int main() {
// CHECK1-NEXT: store i32 0, ptr [[RETVAL]], align 4
// CHECK1-NEXT: call void @_ZN1SC1El(ptr noundef nonnull align 8 dereferenceable(24) [[S]], i64 noundef 0)
// CHECK1-NEXT: [[CALL:%.*]] = invoke noundef signext i8 @_ZN1ScvcEv(ptr noundef nonnull align 8 dereferenceable(24) [[S]])
-// CHECK1-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
+// CHECK1-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
// CHECK1: invoke.cont:
// CHECK1-NEXT: store i8 [[CALL]], ptr [[A]], align 1
// CHECK1-NEXT: [[TMP0:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
@@ -145,16 +145,16 @@ int main() {
// CHECK1-NEXT: [[TMP14:%.*]] = icmp ne i32 [[TMP13]], 0
// CHECK1-NEXT: br i1 [[TMP14]], label [[OMP_OFFLOAD_FAILED:%.*]], label [[OMP_OFFLOAD_CONT:%.*]]
// CHECK1: omp_offload.failed:
-// CHECK1-NEXT: call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l68() #[[ATTR4:[0-9]+]]
+// CHECK1-NEXT: call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l68() #[[ATTR3:[0-9]+]]
// CHECK1-NEXT: br label [[OMP_OFFLOAD_CONT]]
// CHECK1: lpad:
// CHECK1-NEXT: [[TMP15:%.*]] = landingpad { ptr, i32 }
-// CHECK1-NEXT: cleanup
+// CHECK1-NEXT: cleanup
// CHECK1-NEXT: [[TMP16:%.*]] = extractvalue { ptr, i32 } [[TMP15]], 0
// CHECK1-NEXT: store ptr [[TMP16]], ptr [[EXN_SLOT]], align 8
// CHECK1-NEXT: [[TMP17:%.*]] = extractvalue { ptr, i32 } [[TMP15]], 1
// CHECK1-NEXT: store...
[truncated]
|
@llvm/pr-subscribers-clang Author: Joseph Huber (jhuber6) ChangesSummary: This patch changes the approach to no longer use global constructors, This function removes support for the old It is worth noting that the current Patch is 2.09 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80460.diff 173 Files Affected:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 4855e7410a015..a7b72df6d9f89 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -10100,44 +10100,6 @@ bool CGOpenMPRuntime::markAsGlobalTarget(GlobalDecl GD) {
return !AlreadyEmittedTargetDecls.insert(D).second;
}
-llvm::Function *CGOpenMPRuntime::emitRequiresDirectiveRegFun() {
- // If we don't have entries or if we are emitting code for the device, we
- // don't need to do anything.
- if (CGM.getLangOpts().OMPTargetTriples.empty() ||
- CGM.getLangOpts().OpenMPSimd || CGM.getLangOpts().OpenMPIsTargetDevice ||
- (OMPBuilder.OffloadInfoManager.empty() &&
- !HasEmittedDeclareTargetRegion && !HasEmittedTargetRegion))
- return nullptr;
-
- // Create and register the function that handles the requires directives.
- ASTContext &C = CGM.getContext();
-
- llvm::Function *RequiresRegFn;
- {
- CodeGenFunction CGF(CGM);
- const auto &FI = CGM.getTypes().arrangeNullaryFunction();
- llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(FI);
- std::string ReqName = getName({"omp_offloading", "requires_reg"});
- RequiresRegFn = CGM.CreateGlobalInitOrCleanUpFunction(FTy, ReqName, FI);
- CGF.StartFunction(GlobalDecl(), C.VoidTy, RequiresRegFn, FI, {});
- // TODO: check for other requires clauses.
- // The requires directive takes effect only when a target region is
- // present in the compilation unit. Otherwise it is ignored and not
- // passed to the runtime. This avoids the runtime from throwing an error
- // for mismatching requires clauses across compilation units that don't
- // contain at least 1 target region.
- assert((HasEmittedTargetRegion || HasEmittedDeclareTargetRegion ||
- !OMPBuilder.OffloadInfoManager.empty()) &&
- "Target or declare target region expected.");
- CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
- CGM.getModule(), OMPRTL___tgt_register_requires),
- llvm::ConstantInt::get(
- CGM.Int64Ty, OMPBuilder.Config.getRequiresFlags()));
- CGF.FinishFunction();
- }
- return RequiresRegFn;
-}
-
void CGOpenMPRuntime::emitTeamsCall(CodeGenFunction &CGF,
const OMPExecutableDirective &D,
SourceLocation Loc,
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h
index b01b39abd1606..c3206427b143e 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -1407,10 +1407,6 @@ class CGOpenMPRuntime {
/// \param GD Global to scan.
virtual bool emitTargetGlobal(GlobalDecl GD);
- /// Creates and returns a registration function for when at least one
- /// requires directives was used in the current module.
- llvm::Function *emitRequiresDirectiveRegFun();
-
/// Creates all the offload entries in the current compilation unit
/// along with the associated metadata.
void createOffloadEntriesAndInfoMetadata();
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index c63e4ecc3dcba..d6d75efbbb2a8 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -836,10 +836,6 @@ void CodeGenModule::Release() {
AddGlobalCtor(CudaCtorFunction);
}
if (OpenMPRuntime) {
- if (llvm::Function *OpenMPRequiresDirectiveRegFun =
- OpenMPRuntime->emitRequiresDirectiveRegFun()) {
- AddGlobalCtor(OpenMPRequiresDirectiveRegFun, 0);
- }
OpenMPRuntime->createOffloadEntriesAndInfoMetadata();
OpenMPRuntime->clear();
}
diff --git a/clang/test/OpenMP/bug60602.cpp b/clang/test/OpenMP/bug60602.cpp
index 2fbfdfde07a0c..3ecc70cab778a 100644
--- a/clang/test/OpenMP/bug60602.cpp
+++ b/clang/test/OpenMP/bug60602.cpp
@@ -569,10 +569,3 @@ int kernel_within_loop(int *a, int *b, int N, int num_iters) {
// CHECK: omp.precond.end:
// CHECK-NEXT: ret void
//
-//
-// CHECK-LABEL: define internal void @.omp_offloading.requires_reg
-// CHECK-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK-NEXT: entry:
-// CHECK-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_codegen.cpp b/clang/test/OpenMP/distribute_codegen.cpp
index e3b43002a0518..31ec6ff911905 100644
--- a/clang/test/OpenMP/distribute_codegen.cpp
+++ b/clang/test/OpenMP/distribute_codegen.cpp
@@ -1037,13 +1037,6 @@ int fint(void) { return ftemplate<int>(); }
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR3:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@_Z23without_schedule_clausePfS_S_S_
// CHECK3-SAME: (ptr noundef [[A:%.*]], ptr noundef [[B:%.*]], ptr noundef [[C:%.*]], ptr noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -1953,13 +1946,6 @@ int fint(void) { return ftemplate<int>(); }
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR3:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK17-LABEL: define {{[^@]+}}@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z23without_schedule_clausePfS_S_S__l56
// CHECK17-SAME: (ptr noalias noundef [[DYN_PTR:%.*]], ptr noundef [[A:%.*]], ptr noundef [[B:%.*]], ptr noundef [[C:%.*]], ptr noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK17-NEXT: entry:
diff --git a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
index 361e26bc2984c..800a002e43968 100644
--- a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
@@ -304,13 +304,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -476,13 +469,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -794,7 +780,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1148,13 +1134,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -1464,7 +1443,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1815,10 +1794,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
index e005de30e14d1..772372076e947 100644
--- a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
@@ -291,13 +291,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -460,13 +453,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -797,7 +783,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1169,13 +1155,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -1504,7 +1483,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1873,10 +1852,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
index 7bdc4c5ab21a7..95adefa8020f6 100644
--- a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
@@ -2538,13 +2538,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -4265,13 +4258,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -8886,13 +8872,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -13404,10 +13383,3 @@ int main() {
// CHECK11: omp.precond.end:
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
index 9f900facc6a54..46c115e40e435 100644
--- a/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp
@@ -504,13 +504,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -759,13 +752,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK8-LABEL: define {{[^@]+}}@main
// CHECK8-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK8-NEXT: entry:
@@ -1207,7 +1193,7 @@ int main() {
//
//
// CHECK8-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK8-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK8-SAME: () #[[ATTR1]] comdat {
// CHECK8-NEXT: entry:
// CHECK8-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK8-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1685,13 +1671,6 @@ int main() {
// CHECK8-NEXT: ret void
//
//
-// CHECK8-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK8-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK8-NEXT: entry:
-// CHECK8-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK8-NEXT: ret void
-//
-//
// CHECK10-LABEL: define {{[^@]+}}@main
// CHECK10-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK10-NEXT: entry:
@@ -2127,7 +2106,7 @@ int main() {
//
//
// CHECK10-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK10-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK10-SAME: () #[[ATTR1]] comdat {
// CHECK10-NEXT: entry:
// CHECK10-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK10-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -2598,10 +2577,3 @@ int main() {
// CHECK10-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK10-NEXT: ret void
//
-//
-// CHECK10-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK10-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK10-NEXT: entry:
-// CHECK10-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK10-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
index 83c9f504ccaca..846e7beb5d92f 100644
--- a/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
@@ -1609,10 +1609,3 @@ int main() {
// CHECK1-NEXT: call void @__kmpc_for_static_fini(ptr @[[GLOB1]], i32 [[TMP3]])
// CHECK1-NEXT: ret void
//
-//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR5:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
index 8c44a1e71ae79..aa981f606cc87 100644
--- a/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp
@@ -462,13 +462,6 @@ int main() {
// CHECK1-NEXT: ret void
//
//
-// CHECK1-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK1-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK1-NEXT: entry:
-// CHECK1-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK1-NEXT: ret void
-//
-//
// CHECK3-LABEL: define {{[^@]+}}@main
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK3-NEXT: entry:
@@ -734,13 +727,6 @@ int main() {
// CHECK3-NEXT: ret void
//
//
-// CHECK3-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK3-SAME: () #[[ATTR4:[0-9]+]] {
-// CHECK3-NEXT: entry:
-// CHECK3-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK3-NEXT: ret void
-//
-//
// CHECK9-LABEL: define {{[^@]+}}@main
// CHECK9-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK9-NEXT: entry:
@@ -1219,7 +1205,7 @@ int main() {
//
//
// CHECK9-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK9-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK9-SAME: () #[[ATTR1]] comdat {
// CHECK9-NEXT: entry:
// CHECK9-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK9-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -1733,13 +1719,6 @@ int main() {
// CHECK9-NEXT: ret void
//
//
-// CHECK9-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK9-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK9-NEXT: entry:
-// CHECK9-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK9-NEXT: ret void
-//
-//
// CHECK11-LABEL: define {{[^@]+}}@main
// CHECK11-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK11-NEXT: entry:
@@ -2212,7 +2191,7 @@ int main() {
//
//
// CHECK11-LABEL: define {{[^@]+}}@_Z5tmainIiET_v
-// CHECK11-SAME: () #[[ATTR5:[0-9]+]] comdat {
+// CHECK11-SAME: () #[[ATTR1]] comdat {
// CHECK11-NEXT: entry:
// CHECK11-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK11-NEXT: [[TEST:%.*]] = alloca [[STRUCT_S_0:%.*]], align 4
@@ -2719,10 +2698,3 @@ int main() {
// CHECK11-NEXT: [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 4
// CHECK11-NEXT: ret void
//
-//
-// CHECK11-LABEL: define {{[^@]+}}@.omp_offloading.requires_reg
-// CHECK11-SAME: () #[[ATTR6:[0-9]+]] {
-// CHECK11-NEXT: entry:
-// CHECK11-NEXT: call void @__tgt_register_requires(i64 1)
-// CHECK11-NEXT: ret void
-//
diff --git a/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
index 9f769ca2886fe..5d9244268d554 100644
--- a/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
@@ -112,7 +112,7 @@ int main() {
// CHECK1-NEXT: store i32 0, ptr [[RETVAL]], align 4
// CHECK1-NEXT: call void @_ZN1SC1El(ptr noundef nonnull align 8 dereferenceable(24) [[S]], i64 noundef 0)
// CHECK1-NEXT: [[CALL:%.*]] = invoke noundef signext i8 @_ZN1ScvcEv(ptr noundef nonnull align 8 dereferenceable(24) [[S]])
-// CHECK1-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
+// CHECK1-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
// CHECK1: invoke.cont:
// CHECK1-NEXT: store i8 [[CALL]], ptr [[A]], align 1
// CHECK1-NEXT: [[TMP0:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
@@ -145,16 +145,16 @@ int main() {
// CHECK1-NEXT: [[TMP14:%.*]] = icmp ne i32 [[TMP13]], 0
// CHECK1-NEXT: br i1 [[TMP14]], label [[OMP_OFFLOAD_FAILED:%.*]], label [[OMP_OFFLOAD_CONT:%.*]]
// CHECK1: omp_offload.failed:
-// CHECK1-NEXT: call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l68() #[[ATTR4:[0-9]+]]
+// CHECK1-NEXT: call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l68() #[[ATTR3:[0-9]+]]
// CHECK1-NEXT: br label [[OMP_OFFLOAD_CONT]]
// CHECK1: lpad:
// CHECK1-NEXT: [[TMP15:%.*]] = landingpad { ptr, i32 }
-// CHECK1-NEXT: cleanup
+// CHECK1-NEXT: cleanup
// CHECK1-NEXT: [[TMP16:%.*]] = extractvalue { ptr, i32 } [[TMP15]], 0
// CHECK1-NEXT: store ptr [[TMP16]], ptr [[EXN_SLOT]], align 8
// CHECK1-NEXT: [[TMP17:%.*]] = extractvalue { ptr, i32 } [[TMP15]], 1
// CHECK1-NEXT: store...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
6b102a9
to
ae8a117
Compare
@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy &Device) { | |||
Entry.size) != OFFLOAD_SUCCESS) | |||
REPORT("Failed to write symbol for USM %s\n", Entry.name); | |||
} | |||
} else { | |||
} else if (Entry.addr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now we don't have a "default" else branch here. Is it not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, could you explain in a comment why Entry.addr is used here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just hacking around the offloading entires. I wanted to use the same struct and handling for convenience, but right now there's no basic flag for what the kind of the thing is. Previously we just had functions and globals, if the size is zero it's a global. I added a third thing which can also be checked if the address is nullptr currently. I want to overhaul this in the future but I'm hoping to get this stuff working first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the address is not-null, then it's a kernel waiting to be registered. It's a little hacky but in the current construction a kernel will always have a stub address pass in by the compiler. So one of these "requires" entries will have the addr be nullptr and the size be zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check for size > 0 too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is that too restrictive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that already done in the if
above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it will only enter this branch if size is 0 and then if the address is not nullptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll make the offloading entries less dumb in the future. I don't like how for working w/ CUDA and stuff we need cuda_offloading_entries
and whatnot as well. I've been planning on merging them and simply having a field for the language kind and stuff. Been planning on doing that for awhile.
Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a small breakage of the OpenMP MLIR dialect from one of these changes. It should be trivial to address.
@@ -6872,35 +6883,6 @@ void OpenMPIRBuilder::loadOffloadInfoMetadata(StringRef HostFilePath) { | |||
loadOffloadInfoMetadata(*M.get()); | |||
} | |||
|
|||
Function *OpenMPIRBuilder::createRegisterRequires(StringRef Name) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this function breaks the compilation of the OpenMP MLIR dialect, as it's used there (mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp: convertRequiresAttr()
).
My understanding is that creating this function would no longer be necessary, so the solution to that should be to remove convertRequiresAttr()
and replace the call to it in OpenMPDialectLLVMIRTranslationInterface::amendOperation()
to return success()
in the same file. Flang should already pick up your other changes, so REQUIRES information should still work in Fortran after this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up. Do you know if there will be any other changes required to make the requires information use the "new" format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flang already picks up your changes to createOffloadEntriesAndInfoMetadata()
and it appears to be putting the corresponding flags into a structure, though I'm not sure it's the right one.
I ran flang-new -fc1 -emit-llvm -fopenmp flang/test/Lower/OpenMP/requires.f90 -o -
for a simple test with a single target region and I'm seeing two offloading entries being produced, which I'm not sure whether it's expected behavior. It looks like only one of them has the requires flags and the other one is the one that's properly linked with the kernel.
@.omp_offloading.entry_name = internal unnamed_addr constant [43 x i8] c"__omp_offloading_10307_d2a215c__QQmain_l12\00"
@.omp_offloading.entry.__omp_offloading_10307_d2a215c__QQmain_l12 = weak constant %struct.__tgt_offload_entry { ptr @.__omp_offloading_10307_d2a215c__QQmain_l12.region_id, ptr @.omp_offloading.entry_name, i64 0, i32 0, i32 0 }, section "omp_offloading_entries", align 1
@.omp_offloading.entry_name.1 = internal unnamed_addr constant [1 x i8] zeroinitializer
@.omp_offloading.entry. = weak constant %struct.__tgt_offload_entry { ptr null, ptr @.omp_offloading.entry_name.1, i64 0, i32 22, i32 10 }, section "omp_offloading_entries", align 1
...
%14 = call i32 @__tgt_target_kernel(ptr @1, i64 -1, i32 -1, i32 0, ptr @.__omp_offloading_10307_d2a215c__QQmain_l12.region_id, ptr %kernel_args)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks a little weird, the i32
value is 22
which I don't know what that corresponds to with the flags. The value is then 10
which would be two requires flags merged together maybe? 22
is 16, 4, and 2, so I don't know why all those flags would be set at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test itself has "requires unified_shared_memory reverse_offload" (it originally just checks that the flags are passed to MLIR), so that's where the "10" is coming from. The "22" I don't know what it's supposed represent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I encoded the fact that this is a "requires" entry into the global with 16
. Realistically it shouldn't make a difference in the runtime since as long as 16
is set and the address is null it will get counted as a requires entry. It's just a little weird that it wouldn't just be 16. I'll double check. Changing these entries is also on the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a very obvious problem. I mixed up hexadecimal and did 0x16
when I meant either 0x10
or 16
. Fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, then the "16" bit just happened to be part of the number that was being set. Glad to hear this wasn't some unexpected interaction specific to Flang.
ae8a117
to
9535513
Compare
9535513
to
28e7af9
Compare
Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot, thank you.
2fa7b81
to
1e5d2dc
Compare
Summary: Currently, OpenMP handles the `omp requires` clause by emitting a global constructor into the runtime for every translation unit that requires it. However, this is not a great solution because it prevents us from having a defined order in which the runtime is accessed and used. This patch changes the approach to no longer use global constructors, but to instead group the flag with the other offloading entires that we already handle. This has the effect of still registering each flag per requires TU, but now we have a single constructor that handles everything. This function removes support for the old `__tgt_register_requires` and replaces it with a warning message. We just had a recent release, and the OpenMP policy for the past four releases since we switched to LLVM is that we do not provide strict backwards compatibility between major LLVM releases now that the library is versioned. This means that a user will need to recompile if they have an old binary that relied on `register_requires` having the old behavior. It is important that we actively deprecate this, as otherwise it would not solve the problem of having no defined init and shutdown order for `libomptarget`. The problem of `libomptarget` not having a define init and shutdown order cascades into a lot of other issues so I have a strong incentive to be rid of it. It is worth noting that the current `__tgt_offload_entry` only has space for a 32-bit integer here. I am planning to overhaul these at some point as well.
1e5d2dc
to
76039ed
Compare
Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460
Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460 Fix decrement
Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460 Fix decrement
Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on #80460
Summary: Currently, OpenMP handles the `omp requires` clause by emitting a global constructor into the runtime for every translation unit that requires it. However, this is not a great solution because it prevents us from having a defined order in which the runtime is accessed and used. This patch changes the approach to no longer use global constructors, but to instead group the flag with the other offloading entires that we already handle. This has the effect of still registering each flag per requires TU, but now we have a single constructor that handles everything. This function removes support for the old `__tgt_register_requires` and replaces it with a warning message. We just had a recent release, and the OpenMP policy for the past four releases since we switched to LLVM is that we do not provide strict backwards compatibility between major LLVM releases now that the library is versioned. This means that a user will need to recompile if they have an old binary that relied on `register_requires` having the old behavior. It is important that we actively deprecate this, as otherwise it would not solve the problem of having no defined init and shutdown order for `libomptarget`. The problem of `libomptarget` not having a define init and shutdown order cascades into a lot of other issues so I have a strong incentive to be rid of it. It is worth noting that the current `__tgt_offload_entry` only has space for a 32-bit integer here. I am planning to overhaul these at some point as well. Change-Id: If8eb1fbb0eed26ab65fbea9d490196f35d05f1d8
) Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460 Change-Id: I70815457fab9b5d68db8e48b3b5e1c75951c05f5
Summary:
Currently, OpenMP handles the
omp requires
clause by emitting a globalconstructor into the runtime for every translation unit that requires
it. However, this is not a great solution because it prevents us from
having a defined order in which the runtime is accessed and used.
This patch changes the approach to no longer use global constructors,
but to instead group the flag with the other offloading entires that we
already handle. This has the effect of still registering each flag per
requires TU, but now we have a single constructor that handles
everything.
This function removes support for the old
__tgt_register_requires
andreplaces it with a warning message. We just had a recent release, and
the OpenMP policy for the past four releases since we switched to LLVM
is that we do not provide strict backwards compatibility between major
LLVM releases now that the library is versioned. This means that a user
will need to recompile if they have an old binary that relied on
register_requires
having the old behavior. It is important that weactively deprecate this, as otherwise it would not solve the problem of
having no defined init and shutdown order for
libomptarget
. Theproblem of
libomptarget
not having a define init and shutdown ordercascades into a lot of other issues so I have a strong incentive to be
rid of it.
It is worth noting that the current
__tgt_offload_entry
only has spacefor a 32-bit integer here. I am planning to overhaul these at some point
as well.