Skip to content

[flang][Lower][OpenMP] try to avoid using init mold argument #125900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 6, 2025

Conversation

tblah
Copy link
Contributor

@tblah tblah commented Feb 5, 2025

Unfortunately we still have a lot of cases like
!fir.box<!fir.array<10xi32>> where we read dimensions from the mold in case there are non-default lower bounds stored inside the box. I will address this in the next patch.

@tblah tblah requested review from ergawy and luporl February 5, 2025 18:01
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Feb 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 5, 2025

@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)

Changes

Unfortunately we still have a lot of cases like
!fir.box<!fir.array<10xi32>> where we read dimensions from the mold in case there are non-default lower bounds stored inside the box. I will address this in the next patch.


Full diff: https://github.com/llvm/llvm-project/pull/125900.diff

2 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/PrivateReductionUtils.cpp (+32-11)
  • (modified) flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 (-1)
diff --git a/flang/lib/Lower/OpenMP/PrivateReductionUtils.cpp b/flang/lib/Lower/OpenMP/PrivateReductionUtils.cpp
index d41564da207112d..951293b133677d3 100644
--- a/flang/lib/Lower/OpenMP/PrivateReductionUtils.cpp
+++ b/flang/lib/Lower/OpenMP/PrivateReductionUtils.cpp
@@ -277,7 +277,8 @@ class PopulateInitAndCleanupRegionsHelper {
   /// allocatedPrivVarArg: The allocation for the private
   ///                      variable.
   /// moldArg:             The original variable.
-  /// loadedMoldArg:       The original variable, loaded.
+  /// loadedMoldArg:       The original variable, loaded. Access via
+  ///                      getLoadedMoldArg().
   mlir::Value scalarInitValue, allocatedPrivVarArg, moldArg, loadedMoldArg;
 
   /// The first block in the init region.
@@ -321,6 +322,14 @@ class PopulateInitAndCleanupRegionsHelper {
   void initAndCleanupUnboxedDerivedType(bool needsInitialization);
 
   fir::IfOp handleNullAllocatable();
+
+  // Do this lazily so that we don't load it when it is not used.
+  inline mlir::Value getLoadedMoldArg() {
+    if (loadedMoldArg)
+      return loadedMoldArg;
+    loadedMoldArg = builder.loadIfRef(loc, moldArg);
+    return loadedMoldArg;
+  }
 };
 
 } // namespace
@@ -333,7 +342,7 @@ void PopulateInitAndCleanupRegionsHelper::initBoxedPrivatePointer(
   // we need a shape with the right rank so that the embox op is lowered
   // to an llvm struct of the right type. This returns nullptr if the types
   // aren't right.
-  mlir::Value shape = generateZeroShapeForRank(builder, loc, loadedMoldArg);
+  mlir::Value shape = generateZeroShapeForRank(builder, loc, moldArg);
   // Just incase, do initialize the box with a null value
   mlir::Value null = builder.createNullConstant(loc, boxTy.getEleTy());
   mlir::Value nullBox;
@@ -355,7 +364,7 @@ void PopulateInitAndCleanupRegionsHelper::initBoxedPrivatePointer(
 /// }
 /// omp.yield %box_alloca
 fir::IfOp PopulateInitAndCleanupRegionsHelper::handleNullAllocatable() {
-  mlir::Value addr = builder.create<fir::BoxAddrOp>(loc, loadedMoldArg);
+  mlir::Value addr = builder.create<fir::BoxAddrOp>(loc, getLoadedMoldArg());
   mlir::Value isNotAllocated = builder.genIsNullAddr(loc, addr);
   fir::IfOp ifOp = builder.create<fir::IfOp>(loc, isNotAllocated,
                                              /*withElseRegion=*/true);
@@ -391,7 +400,7 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxedScalar(
       loc, valType, valAlloc, /*shape=*/mlir::Value{},
       /*slice=*/mlir::Value{}, lenParams);
   initializeIfDerivedTypeBox(
-      builder, loc, box, loadedMoldArg, needsInitialization,
+      builder, loc, box, getLoadedMoldArg(), needsInitialization,
       /*isFirstPrivate=*/kind == DeclOperationKind::FirstPrivate);
   fir::StoreOp lastOp =
       builder.create<fir::StoreOp>(loc, box, allocatedPrivVarArg);
@@ -410,7 +419,7 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxedArray(
     fir::BaseBoxType boxTy, bool needsInitialization) {
   bool isAllocatableOrPointer =
       mlir::isa<fir::HeapType, fir::PointerType>(boxTy.getEleTy());
-  getLengthParameters(builder, loc, loadedMoldArg, lenParams);
+  getLengthParameters(builder, loc, getLoadedMoldArg(), lenParams);
 
   fir::IfOp ifUnallocated{nullptr};
   if (isAllocatableOrPointer) {
@@ -419,7 +428,7 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxedArray(
   }
 
   // Create the private copy from the initial fir.box:
-  hlfir::Entity source = hlfir::Entity{loadedMoldArg};
+  hlfir::Entity source = hlfir::Entity{getLoadedMoldArg()};
 
   // Special case for (possibly allocatable) arrays of polymorphic types
   // e.g. !fir.class<!fir.heap<!fir.array<?x!fir.type<>>>>
@@ -463,8 +472,9 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxedArray(
   // Put the temporary inside of a box:
   // hlfir::genVariableBox doesn't handle non-default lower bounds
   mlir::Value box;
-  fir::ShapeShiftOp shapeShift = getShapeShift(builder, loc, loadedMoldArg);
-  mlir::Type boxType = loadedMoldArg.getType();
+  fir::ShapeShiftOp shapeShift =
+      getShapeShift(builder, loc, getLoadedMoldArg());
+  mlir::Type boxType = getLoadedMoldArg().getType();
   if (mlir::isa<fir::BaseBoxType>(temp.getType()))
     // the box created by the declare form createTempFromMold is missing
     // lower bounds info
@@ -480,7 +490,7 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxedArray(
     builder.create<hlfir::AssignOp>(loc, scalarInitValue, box);
 
   initializeIfDerivedTypeBox(
-      builder, loc, box, loadedMoldArg, needsInitialization,
+      builder, loc, box, getLoadedMoldArg(), needsInitialization,
       /*isFirstPrivate=*/kind == DeclOperationKind::FirstPrivate);
 
   builder.create<fir::StoreOp>(loc, box, allocatedPrivVarArg);
@@ -553,8 +563,7 @@ void PopulateInitAndCleanupRegionsHelper::populateByRefInitAndCleanupRegions() {
     builder.setInsertionPointToEnd(initBlock);
 
     // TODO: don't do this unless it is needed
-    loadedMoldArg = builder.loadIfRef(loc, moldArg);
-    getLengthParameters(builder, loc, loadedMoldArg, lenParams);
+    getLengthParameters(builder, loc, getLoadedMoldArg(), lenParams);
 
     if (isPrivatization(kind) &&
         mlir::isa<fir::PointerType>(boxTy.getEleTy())) {
@@ -604,4 +613,16 @@ void Fortran::lower::omp::populateByRefInitAndCleanupRegions(
       converter, loc, argType, scalarInitValue, allocatedPrivVarArg, moldArg,
       initBlock, cleanupRegion, kind, sym);
   helper.populateByRefInitAndCleanupRegions();
+
+  // Often we load moldArg to check something (e.g. length parameters, shape)
+  // but then those answers can be gotten statically without accessing the
+  // runtime value and so the only remaining use is a dead load. These loads can
+  // force us to insert additional barriers and so should be avoided where
+  // possible.
+  if (moldArg.hasOneUse()) {
+    mlir::Operation *user = *moldArg.getUsers().begin();
+    if (auto load = mlir::dyn_cast<fir::LoadOp>(user))
+      if (load.use_empty())
+        load.erase();
+  }
 }
diff --git a/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 b/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90
index 1dc345c11568c49..f39ac9199e8bd23 100644
--- a/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90
+++ b/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90
@@ -42,7 +42,6 @@ subroutine delayed_privatization_lenparams(length)
 ! CHECK-LABEL: omp.private {type = firstprivate}
 ! CHECK-SAME: @[[PRIVATIZER_SYM:.*]] : [[TYPE:!fir.box<!fir.ptr<i32>>]] init {
 ! CHECK-NEXT: ^bb0(%[[PRIV_ARG:.*]]: !fir.ref<[[TYPE]]>, %[[PRIV_ALLOC:.*]]: !fir.ref<[[TYPE]]>):
-! CHECK-NEXT:   %[[ARG:.*]] = fir.load %[[PRIV_ARG]]
 ! CHECK-NEXT:   %[[NULL:.*]] = fir.zero_bits !fir.ptr<i32>
 ! CHECK-NEXT:   %[[INIT:.*]] = fir.embox %[[NULL]] : (!fir.ptr<i32>) -> !fir.box<!fir.ptr<i32>>
 ! CHECK-NEXT:   fir.store %[[INIT]] to %[[PRIV_ALLOC]] : !fir.ref<!fir.box<!fir.ptr<i32>>>

Copy link
Member

@ergawy ergawy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks!

Base automatically changed from users/tblah/init-region-cleanup-0 to main February 6, 2025 10:25
Unfortunately we still have a lot of cases like
!fir.box<!fir.array<10xi32>> where we read dimensions from the mold in
case there are non-default lower bounds stored inside the box. I will
address this in the next patch.
@tblah tblah force-pushed the users/tblah/init-region-cleanup-1 branch from 5efc2c5 to 7dcea4c Compare February 6, 2025 10:26
Copy link
Contributor

@luporl luporl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +617 to +627
// Often we load moldArg to check something (e.g. length parameters, shape)
// but then those answers can be gotten statically without accessing the
// runtime value and so the only remaining use is a dead load. These loads can
// force us to insert additional barriers and so should be avoided where
// possible.
if (moldArg.hasOneUse()) {
mlir::Operation *user = *moldArg.getUsers().begin();
if (auto load = mlir::dyn_cast<fir::LoadOp>(user))
if (load.use_empty())
load.erase();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this answers my question in the other PR :)
I should have started with this one.

@tblah tblah merged commit d5c6072 into main Feb 6, 2025
8 checks passed
@tblah tblah deleted the users/tblah/init-region-cleanup-1 branch February 6, 2025 14:27
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…5900)

Unfortunately we still have a lot of cases like
!fir.box<!fir.array<10xi32>> where we read dimensions from the mold in
case there are non-default lower bounds stored inside the box. I will
address this in the next patch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants