Skip to content

[Flang][OpenMp] Fix to threadprivate not working with host-association. #74966

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 12, 2024

Conversation

harishch4
Copy link
Contributor

This patch considers host-associated variables to generate threadprivate Ops.

Fixes: #60763

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Dec 10, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 10, 2023

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-openmp

Author: None (harishch4)

Changes

This patch considers host-associated variables to generate threadprivate Ops.

Fixes: #60763


Full diff: https://github.com/llvm/llvm-project/pull/74966.diff

4 Files Affected:

  • (modified) flang/lib/Lower/HostAssociations.cpp (+12-2)
  • (modified) flang/lib/Lower/OpenMP.cpp (+14-3)
  • (added) flang/test/Lower/OpenMP/threadprivate-host-association-2.f90 (+44)
  • (added) flang/test/Lower/OpenMP/threadprivate-host-association.f90 (+42)
diff --git a/flang/lib/Lower/HostAssociations.cpp b/flang/lib/Lower/HostAssociations.cpp
index a62f7a7e99b6ff..e2a403f111301f 100644
--- a/flang/lib/Lower/HostAssociations.cpp
+++ b/flang/lib/Lower/HostAssociations.cpp
@@ -16,6 +16,7 @@
 #include "flang/Lower/ConvertVariable.h"
 #include "flang/Lower/PFTBuilder.h"
 #include "flang/Lower/SymbolMap.h"
+#include "flang/Lower/OpenMP.h"
 #include "flang/Optimizer/Builder/Character.h"
 #include "flang/Optimizer/Builder/FIRBuilder.h"
 #include "flang/Optimizer/Builder/Todo.h"
@@ -508,7 +509,10 @@ void Fortran::lower::HostAssociations::addSymbolsToBind(
          "must be initially empty");
   this->hostScope = &hostScope;
   for (const auto *s : symbols)
-    if (Fortran::lower::symbolIsGlobal(*s)) {
+    // GlobalOp are created for non-global threadprivate variable,
+    //  so considering them as globals.
+    if (Fortran::lower::symbolIsGlobal(*s) ||
+        (*s).test(Fortran::semantics::Symbol::Flag::OmpThreadprivate)) {
       // The ultimate symbol is stored here so that global symbols from the
       // host scope can later be searched in this set.
       globalSymbols.insert(&s->GetUltimate());
@@ -556,9 +560,15 @@ void Fortran::lower::HostAssociations::internalProcedureBindings(
     for (auto &hostVariable : pft::getScopeVariableList(*hostScope))
       if ((hostVariable.isAggregateStore() && hostVariable.isGlobal()) ||
           (hostVariable.hasSymbol() &&
-           globalSymbols.contains(&hostVariable.getSymbol().GetUltimate())))
+           globalSymbols.contains(&hostVariable.getSymbol().GetUltimate()))) {
         Fortran::lower::instantiateVariable(converter, hostVariable, symMap,
                                             storeMap);
+        // Generate threadprivate Op for host associated variables.
+        if (hostVariable.hasSymbol() &&
+            hostVariable.getSymbol().test(
+                Fortran::semantics::Symbol::Flag::OmpThreadprivate))
+          Fortran::lower::genThreadprivateOp(converter, hostVariable);
+      }
   }
   if (tupleSymbols.empty())
     return;
diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index eeba87fcd15116..ba07d015de39cb 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -1996,9 +1996,12 @@ static void threadPrivatizeVars(Fortran::lower::AbstractConverter &converter,
   };
 
   llvm::SetVector<const Fortran::semantics::Symbol *> threadprivateSyms;
+  // Considering Host associated symbols as well.
   converter.collectSymbolSet(
       eval, threadprivateSyms,
-      Fortran::semantics::Symbol::Flag::OmpThreadprivate);
+      Fortran::semantics::Symbol::Flag::OmpThreadprivate,
+                             /*collectSymbols=*/true,
+                             /*collectHostAssociatedSymbols=*/true);
   std::set<Fortran::semantics::SourceName> threadprivateSymNames;
 
   // For a COMMON block, the ThreadprivateOp is generated for itself instead of
@@ -3528,8 +3531,16 @@ void Fortran::lower::genThreadprivateOp(
     // variable in main program, and it has implicit SAVE attribute. Take it as
     // with SAVE attribute, so to create GlobalOp for it to simplify the
     // translation to LLVM IR.
-    fir::GlobalOp global = globalInitialization(converter, firOpBuilder, sym,
-                                                var, currentLocation);
+    fir::GlobalOp global;
+    // Avoids performing multiple globalInitializations.
+    auto module = converter.getModuleOp();
+    std::string globalName = converter.mangleName(sym);
+    if (module.lookupSymbol<fir::GlobalOp>(globalName)) {
+      global = module.lookupSymbol<fir::GlobalOp>(globalName);
+    } else {
+      global = globalInitialization(converter, firOpBuilder, sym, var,
+                                    currentLocation);
+    }
 
     mlir::Value symValue = firOpBuilder.create<fir::AddrOfOp>(
         currentLocation, global.resultType(), global.getSymbol());
diff --git a/flang/test/Lower/OpenMP/threadprivate-host-association-2.f90 b/flang/test/Lower/OpenMP/threadprivate-host-association-2.f90
new file mode 100644
index 00000000000000..e6f9a69333da1b
--- /dev/null
+++ b/flang/test/Lower/OpenMP/threadprivate-host-association-2.f90
@@ -0,0 +1,44 @@
+! This test checks lowering of OpenMP Threadprivate Directive.
+! Test for threadprivate variable in host association.
+
+!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s
+
+!CHECK: func.func @_QQmain() attributes {fir.bindc_name = "main"} {
+!CHECK:   %[[A:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFEa"}
+!CHECK:   %[[A_DECL:.*]]:2 = hlfir.declare %[[A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   %[[A_ADDR:.*]] = fir.address_of(@_QFEa) : !fir.ref<i32>
+!CHECK:   %[[TP_A:.*]] = omp.threadprivate %[[A_ADDR]] : !fir.ref<i32> -> !fir.ref<i32>
+!CHECK:   %[[TP_A_DECL:.*]]:2 = hlfir.declare %[[TP_A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   fir.call @_QFPsub() fastmath<contract> : () -> ()
+!CHECK:   return
+!CHECK: }
+!CHECK: func.func @_QFPsub() attributes {fir.internal_proc} {
+!CHECK:   %[[A:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFEa"}
+!CHECK:   %[[A_DECL:.*]]:2 = hlfir.declare %[[A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   %[[A_ADDR:.*]] = fir.address_of(@_QFEa) : !fir.ref<i32>
+!CHECK:   %[[TP_A:.*]] = omp.threadprivate %[[A_ADDR]] : !fir.ref<i32> -> !fir.ref<i32>
+!CHECK:   %[[TP_A_DECL:.*]]:2 = hlfir.declare %[[TP_A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   omp.parallel {
+!CHECK:     %[[PAR_TP_A:.*]] = omp.threadprivate %[[A_ADDR]] : !fir.ref<i32> -> !fir.ref<i32>
+!CHECK:     %[[PAR_TP_A_DECL:.*]]:2 = hlfir.declare %[[PAR_TP_A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:     %{{.*}} = fir.load %[[PAR_TP_A_DECL]]#0 : !fir.ref<i32>
+!CHECK:     omp.terminator
+!CHECK:   }
+!CHECK:   return
+!CHECK: }
+!CHECK: fir.global internal @_QFEa : i32 {
+!CHECK:   %[[A:.*]] = fir.undefined i32
+!CHECK:   fir.has_value %[[A]] : i32
+!CHECK: }
+
+program main
+   integer :: a
+   !$omp threadprivate(a)
+   call sub()
+contains
+   subroutine sub()
+      !$omp parallel
+      print *, a
+      !$omp end parallel
+   end
+end
diff --git a/flang/test/Lower/OpenMP/threadprivate-host-association.f90 b/flang/test/Lower/OpenMP/threadprivate-host-association.f90
new file mode 100644
index 00000000000000..bb6d58b0ff2189
--- /dev/null
+++ b/flang/test/Lower/OpenMP/threadprivate-host-association.f90
@@ -0,0 +1,42 @@
+! This test checks lowering of OpenMP Threadprivate Directive.
+! Test for threadprivate variable in host association.
+
+!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s
+
+!CHECK: func.func @_QQmain() attributes {fir.bindc_name = "main"} {
+!CHECK:   %[[A:.*]] = fir.address_of(@_QFEa) : !fir.ref<i32>
+!CHECK:   %[[A_DECL:.*]]:2 = hlfir.declare %[[A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   %[[TP_A:.*]] = omp.threadprivate %[[A_DECL]]#1 : !fir.ref<i32> -> !fir.ref<i32>
+!CHECK:   %[[TP_A_DECL:.*]]:2 = hlfir.declare %[[TP_A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   fir.call @_QFPsub() fastmath<contract> : () -> ()
+!CHECK:   return
+!CHECK: }
+!CHECK: func.func @_QFPsub() attributes {fir.internal_proc} {
+!CHECK:   %[[A:.*]] = fir.address_of(@_QFEa) : !fir.ref<i32>
+!CHECK:   %[[A_DECL:.*]]:2 = hlfir.declare %[[A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   %[[TP_A:.*]] = omp.threadprivate %[[A_DECL]]#1 : !fir.ref<i32> -> !fir.ref<i32>
+!CHECK:   %[[TP_A_DECL:.*]]:2 = hlfir.declare %[[TP_A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:   omp.parallel {
+!CHECK:     %[[PAR_TP_A:.*]] = omp.threadprivate %[[A_DECL]]#1 : !fir.ref<i32> -> !fir.ref<i32>
+!CHECK:     %[[PAR_TP_A_DECL:.*]]:2 = hlfir.declare %[[PAR_TP_A]] {uniq_name = "_QFEa"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:     %{{.*}} = fir.load %[[PAR_TP_A_DECL]]#0 : !fir.ref<i32>
+!CHECK:     omp.terminator
+!CHECK:   }
+!CHECK:   return
+!CHECK: }
+!CHECK: fir.global internal @_QFEa : i32 {
+!CHECK:   %[[A:.*]] = fir.zero_bits i32
+!CHECK:   fir.has_value %[[A]] : i32
+!CHECK: }
+
+program main
+   integer, save :: a
+   !$omp threadprivate(a)
+   call sub()
+contains
+   subroutine sub()
+      !$omp parallel
+      print *, a
+      !$omp end parallel
+   end
+end

Copy link

github-actions bot commented Dec 10, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@harishch4
Copy link
Contributor Author

Ping for review.

eval, threadprivateSyms,
Fortran::semantics::Symbol::Flag::OmpThreadprivate);
// Considering Host associated symbols as well.
converter.collectSymbolSet(eval, threadprivateSyms,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you once check how does this patch behave once #72510 is merged? If there are some discrepancies, one of these patches would need modifications. Reason being, so far, only default clause lowering was using the two boolean parameters in collectSymbolSet. Now we are extending that to threadprivate, so I am curious how that turns out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested your changes with mine and everything seems to be working fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

// Avoids performing multiple globalInitializations.
auto module = converter.getModuleOp();
std::string globalName = converter.mangleName(sym);
if (module.lookupSymbol<fir::GlobalOp>(globalName)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can drop enclosing curly braces?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@harishch4
Copy link
Contributor Author

@kiranchandramohan @NimishMishra Requesting for review.

Copy link
Contributor

@NimishMishra NimishMishra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It might be good to add details to the patch summary on the exact approach we took here. Mainly that (1) non-global threadprivate vars are added to globalSymbols (since they have implicit SAVE attribute on them), and that (2) during lowering, we skip globalOp creation if there already exists one (this happens in the change you have in OpenMP.cpp).

Currently, we are explicitly creating a threadprivate op in HostAssociations.cpp, and I am not the best person to say whether we are intermixing the two things more than we would like to. Please wait for @kiranchandramohan or someone else to comment on this before merging.

@@ -508,7 +509,10 @@ void Fortran::lower::HostAssociations::addSymbolsToBind(
"must be initially empty");
this->hostScope = &hostScope;
for (const auto *s : symbols)
if (Fortran::lower::symbolIsGlobal(*s)) {
// GlobalOp are created for non-global threadprivate variable,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to bring in the discussion on implicit SAVE here as well:

// Non-global variable which can be in threadprivate directive must be one
. Technically, an implicit SAVE attribute is what we consider, thereby requiring GlobalOp for non-global threadprivate vars.

@harishch4 harishch4 force-pushed the tp-host-association branch from 80d29e9 to 521dff8 Compare March 9, 2024 17:22
@harishch4
Copy link
Contributor Author

@kiranchandramohan Requesting for review.

@harishch4 harishch4 requested a review from agozillon March 9, 2024 17:26
Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Thanks for the fix. In future we should move this generation of omp.threadprivate as a pass in MLIR. May be mark the global threadprivate variables with another OpenMP dialect operation and then go through all the places where the global variable is used and generate omp.threadprivate operation.

@harishch4
Copy link
Contributor Author

@kiranchandramohan @NimishMishra Thank you for reviewing and providing valuable suggestions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[flang][OpenMP] Threadprivate variable in host-association
4 participants