Skip to content

[flang][OpenMP][Lower] lower array subscripts for task depend #132994

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 1, 2025

Conversation

tblah
Copy link
Contributor

@tblah tblah commented Mar 25, 2025

The OpenMP standard says that all dependencies in the same set of inter-dependent tasks must be non-overlapping. This simplification means that the OpenMP only needs to keep track of the base addresses of dependency variables. This can be seen in kmp_taskdeps.cpp, which stores task dependency information in a hash table, using the base address as a key.

This patch generates a rebox operation to slice boxed arrays, but only the box data address is used for the task dependency. The extra box is optimized away by LLVM at O3.

Vector subscripts are TODO (I will address in my next patch).

This also fixes a bug for ordinary subscripts when the symbol was mapped to a box:

Fixes #132647

The OpenMP standard says that all dependencies in the same set of
inter-dependent tasks must be non-overlapping. This simplification means
that the OpenMP only needs to keep track of the base addresses of
dependency variables. This can be seen in kmp_taskdeps.cpp, which stores
task dependency information in a hash table, using the base address as a
key.

This patch generates a rebox operation to slice boxed arrays, but only
the box data address is used for the task dependency. The extra box is
optimized away by LLVM at O3.

Vector subscripts are TODO (I will address in my next patch).

This also fixes a bug for ordinary subscripts when the symbol was mapped
to a box:

Fixes llvm#132647
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Mar 25, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 25, 2025

@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)

Changes

The OpenMP standard says that all dependencies in the same set of inter-dependent tasks must be non-overlapping. This simplification means that the OpenMP only needs to keep track of the base addresses of dependency variables. This can be seen in kmp_taskdeps.cpp, which stores task dependency information in a hash table, using the base address as a key.

This patch generates a rebox operation to slice boxed arrays, but only the box data address is used for the task dependency. The extra box is optimized away by LLVM at O3.

Vector subscripts are TODO (I will address in my next patch).

This also fixes a bug for ordinary subscripts when the symbol was mapped to a box:

Fixes #132647


Full diff: https://github.com/llvm/llvm-project/pull/132994.diff

6 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+41-6)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+2-1)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+15-12)
  • (added) flang/test/Lower/OpenMP/Todo/depend-clause-vector-subscript-array-section.f90 (+11)
  • (added) flang/test/Lower/OpenMP/task-depend-array-section.f90 (+51)
  • (modified) flang/test/Lower/OpenMP/task.f90 (+8)
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index c66fd46767b86..2610d977fb908 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -14,6 +14,7 @@
 #include "Clauses.h"
 #include "Utils.h"
 
+#include "flang/Lower/ConvertExprToHLFIR.h"
 #include "flang/Lower/PFTBuilder.h"
 #include "flang/Parser/tools.h"
 #include "flang/Semantics/tools.h"
@@ -808,7 +809,21 @@ bool ClauseProcessor::processCopyprivate(
   return hasCopyPrivate;
 }
 
-bool ClauseProcessor::processDepend(mlir::omp::DependClauseOps &result) const {
+template <typename T>
+static bool isVectorSubscript(const evaluate::Expr<T> &expr) {
+  if (std::optional<evaluate::DataRef> dataRef{evaluate::ExtractDataRef(expr)})
+    if (const auto *arrayRef = std::get_if<evaluate::ArrayRef>(&dataRef->u))
+      for (const evaluate::Subscript &subscript : arrayRef->subscript())
+        if (std::holds_alternative<evaluate::IndirectSubscriptIntegerExpr>(
+                subscript.u))
+          if (subscript.Rank() > 0)
+            return true;
+  return false;
+}
+
+bool ClauseProcessor::processDepend(lower::SymMap &symMap,
+                                    lower::StatementContext &stmtCtx,
+                                    mlir::omp::DependClauseOps &result) const {
   auto process = [&](const omp::clause::Depend &clause,
                      const parser::CharBlock &) {
     using Depend = omp::clause::Depend;
@@ -830,18 +845,38 @@ bool ClauseProcessor::processDepend(mlir::omp::DependClauseOps &result) const {
 
     for (const omp::Object &object : objects) {
       assert(object.ref() && "Expecting designator");
+      mlir::Value dependVar;
 
       if (evaluate::ExtractSubstring(*object.ref())) {
         TODO(converter.getCurrentLocation(),
              "substring not supported for task depend");
       } else if (evaluate::IsArrayElement(*object.ref())) {
-        TODO(converter.getCurrentLocation(),
-             "array sections not supported for task depend");
+        // Array Section
+        SomeExpr expr = *object.ref();
+        if (isVectorSubscript(expr))
+          TODO(converter.getCurrentLocation(),
+               "Vector subscripted array section for task dependency");
+
+        hlfir::EntityWithAttributes entity = convertExprToHLFIR(
+            converter.getCurrentLocation(), converter, expr, symMap, stmtCtx);
+        dependVar = entity.getBase();
+      } else {
+        semantics::Symbol *sym = object.sym();
+        dependVar = converter.getSymbolAddress(*sym);
       }
 
-      semantics::Symbol *sym = object.sym();
-      const mlir::Value variable = converter.getSymbolAddress(*sym);
-      result.dependVars.push_back(variable);
+      // The openmp dialect doesn't know what to do with boxes (and it would
+      // break layering to teach it about them). The dependency variable can be
+      // a box because it was an array section or because the original symbol
+      // was mapped to a box.
+      // Getting the address of the box data is okay because all the runtime
+      // ultimately cares about is the base address of the array.
+      if (fir::isa_box_type(dependVar.getType())) {
+        fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+        dependVar = builder.create<fir::BoxAddrOp>(
+            converter.getCurrentLocation(), dependVar);
+      }
+      result.dependVars.push_back(dependVar);
     }
   };
 
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index aa203689ab560..6b1f7a31c7aac 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -104,7 +104,8 @@ class ClauseProcessor {
   bool processCopyin() const;
   bool processCopyprivate(mlir::Location currentLocation,
                           mlir::omp::CopyprivateClauseOps &result) const;
-  bool processDepend(mlir::omp::DependClauseOps &result) const;
+  bool processDepend(lower::SymMap &symMap, lower::StatementContext &stmtCtx,
+                     mlir::omp::DependClauseOps &result) const;
   bool
   processEnter(llvm::SmallVectorImpl<DeclareTargetCapturePair> &result) const;
   bool processIf(omp::clause::If::DirectiveNameModifier directiveName,
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index b8ef18860e9b4..5bf806c58e6e0 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1669,15 +1669,15 @@ static void genSingleClauses(lower::AbstractConverter &converter,
 
 static void genTargetClauses(
     lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx,
-    lower::StatementContext &stmtCtx, lower::pft::Evaluation &eval,
-    const List<Clause> &clauses, mlir::Location loc,
-    mlir::omp::TargetOperands &clauseOps,
+    lower::SymMap &symTable, lower::StatementContext &stmtCtx,
+    lower::pft::Evaluation &eval, const List<Clause> &clauses,
+    mlir::Location loc, mlir::omp::TargetOperands &clauseOps,
     llvm::SmallVectorImpl<const semantics::Symbol *> &hasDeviceAddrSyms,
     llvm::SmallVectorImpl<const semantics::Symbol *> &isDevicePtrSyms,
     llvm::SmallVectorImpl<const semantics::Symbol *> &mapSyms) {
   ClauseProcessor cp(converter, semaCtx, clauses);
   cp.processBare(clauseOps);
-  cp.processDepend(clauseOps);
+  cp.processDepend(symTable, stmtCtx, clauseOps);
   cp.processDevice(stmtCtx, clauseOps);
   cp.processHasDeviceAddr(stmtCtx, clauseOps, hasDeviceAddrSyms);
   if (!hostEvalInfo.empty()) {
@@ -1728,11 +1728,12 @@ static void genTargetDataClauses(
 
 static void genTargetEnterExitUpdateDataClauses(
     lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx,
-    lower::StatementContext &stmtCtx, const List<Clause> &clauses,
-    mlir::Location loc, llvm::omp::Directive directive,
+    lower::SymMap &symTable, lower::StatementContext &stmtCtx,
+    const List<Clause> &clauses, mlir::Location loc,
+    llvm::omp::Directive directive,
     mlir::omp::TargetEnterExitUpdateDataOperands &clauseOps) {
   ClauseProcessor cp(converter, semaCtx, clauses);
-  cp.processDepend(clauseOps);
+  cp.processDepend(symTable, stmtCtx, clauseOps);
   cp.processDevice(stmtCtx, clauseOps);
   cp.processIf(directive, clauseOps);
 
@@ -1746,12 +1747,13 @@ static void genTargetEnterExitUpdateDataClauses(
 
 static void genTaskClauses(lower::AbstractConverter &converter,
                            semantics::SemanticsContext &semaCtx,
+                           lower::SymMap &symTable,
                            lower::StatementContext &stmtCtx,
                            const List<Clause> &clauses, mlir::Location loc,
                            mlir::omp::TaskOperands &clauseOps) {
   ClauseProcessor cp(converter, semaCtx, clauses);
   cp.processAllocate(clauseOps);
-  cp.processDepend(clauseOps);
+  cp.processDepend(symTable, stmtCtx, clauseOps);
   cp.processFinal(stmtCtx, clauseOps);
   cp.processIf(llvm::omp::Directive::OMPD_task, clauseOps);
   cp.processMergeable(clauseOps);
@@ -2194,8 +2196,8 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
   mlir::omp::TargetOperands clauseOps;
   llvm::SmallVector<const semantics::Symbol *> mapSyms, isDevicePtrSyms,
       hasDeviceAddrSyms;
-  genTargetClauses(converter, semaCtx, stmtCtx, eval, item->clauses, loc,
-                   clauseOps, hasDeviceAddrSyms, isDevicePtrSyms, mapSyms);
+  genTargetClauses(converter, semaCtx, symTable, stmtCtx, eval, item->clauses,
+                   loc, clauseOps, hasDeviceAddrSyms, isDevicePtrSyms, mapSyms);
 
   DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval,
                            /*shouldCollectPreDeterminedSymbols=*/
@@ -2415,7 +2417,7 @@ static OpTy genTargetEnterExitUpdateDataOp(
   }
 
   mlir::omp::TargetEnterExitUpdateDataOperands clauseOps;
-  genTargetEnterExitUpdateDataClauses(converter, semaCtx, stmtCtx,
+  genTargetEnterExitUpdateDataClauses(converter, semaCtx, symTable, stmtCtx,
                                       item->clauses, loc, directive, clauseOps);
 
   return firOpBuilder.create<OpTy>(loc, clauseOps);
@@ -2428,7 +2430,8 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
           ConstructQueue::const_iterator item) {
   lower::StatementContext stmtCtx;
   mlir::omp::TaskOperands clauseOps;
-  genTaskClauses(converter, semaCtx, stmtCtx, item->clauses, loc, clauseOps);
+  genTaskClauses(converter, semaCtx, symTable, stmtCtx, item->clauses, loc,
+                 clauseOps);
 
   if (!enableDelayedPrivatization)
     return genOpWithBody<mlir::omp::TaskOp>(
diff --git a/flang/test/Lower/OpenMP/Todo/depend-clause-vector-subscript-array-section.f90 b/flang/test/Lower/OpenMP/Todo/depend-clause-vector-subscript-array-section.f90
new file mode 100644
index 0000000000000..f3bd58c8c559a
--- /dev/null
+++ b/flang/test/Lower/OpenMP/Todo/depend-clause-vector-subscript-array-section.f90
@@ -0,0 +1,11 @@
+! RUN: %not_todo_cmd bbc -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s
+! RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s
+
+! CHECK: Vector subscripted array section for task dependency
+subroutine vectorSubscriptArraySection(array, indices)
+  integer :: array(:)
+  integer :: indices(:)
+
+  !$omp task depend (in: array(indices))
+  !$omp end task
+end subroutine
diff --git a/flang/test/Lower/OpenMP/task-depend-array-section.f90 b/flang/test/Lower/OpenMP/task-depend-array-section.f90
new file mode 100644
index 0000000000000..b364a5e06a29c
--- /dev/null
+++ b/flang/test/Lower/OpenMP/task-depend-array-section.f90
@@ -0,0 +1,51 @@
+! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s
+
+subroutine knownShape(array)
+  integer :: array(10)
+
+  !$omp task depend(in: array(2:8))
+  !$omp end task
+end subroutine
+
+! CHECK-LABEL:   func.func @_QPknownshape(
+! CHECK-SAME:                            %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !fir.ref<!fir.array<10xi32>> {fir.bindc_name = "array"}) {
+! CHECK:           %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope
+! CHECK:           %[[VAL_2:.*]] = arith.constant 10 : index
+! CHECK:           %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1>
+! CHECK:           %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_0]](%[[VAL_3]]) dummy_scope %[[VAL_1]] {uniq_name = "_QFknownshapeEarray"} : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>, !fir.dscope) -> (!fir.ref<!fir.array<10xi32>>, !fir.ref<!fir.array<10xi32>>)
+! CHECK:           %[[VAL_5:.*]] = arith.constant 2 : index
+! CHECK:           %[[VAL_6:.*]] = arith.constant 8 : index
+! CHECK:           %[[VAL_7:.*]] = arith.constant 1 : index
+! CHECK:           %[[VAL_8:.*]] = arith.constant 7 : index
+! CHECK:           %[[VAL_9:.*]] = fir.shape %[[VAL_8]] : (index) -> !fir.shape<1>
+! CHECK:           %[[VAL_10:.*]] = hlfir.designate %[[VAL_4]]#0 (%[[VAL_5]]:%[[VAL_6]]:%[[VAL_7]])  shape %[[VAL_9]] : (!fir.ref<!fir.array<10xi32>>, index, index, index, !fir.shape<1>) -> !fir.ref<!fir.array<7xi32>>
+! CHECK:           omp.task depend(taskdependin -> %[[VAL_10]] : !fir.ref<!fir.array<7xi32>>) {
+! CHECK:             omp.terminator
+! CHECK:           }
+! CHECK:           return
+! CHECK:         }
+
+
+subroutine assumedShape(array)
+  integer :: array(:)
+
+  !$omp task depend(in: array(2:8:2))
+  !$omp end task
+end subroutine
+
+! CHECK-LABEL:   func.func @_QPassumedshape(
+! CHECK-SAME:                              %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !fir.box<!fir.array<?xi32>> {fir.bindc_name = "array"}) {
+! CHECK:           %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope
+! CHECK:           %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]] {uniq_name = "_QFassumedshapeEarray"} : (!fir.box<!fir.array<?xi32>>, !fir.dscope) -> (!fir.box<!fir.array<?xi32>>, !fir.box<!fir.array<?xi32>>)
+! CHECK:           %[[VAL_3:.*]] = arith.constant 2 : index
+! CHECK:           %[[VAL_4:.*]] = arith.constant 8 : index
+! CHECK:           %[[VAL_5:.*]] = arith.constant 2 : index
+! CHECK:           %[[VAL_6:.*]] = arith.constant 4 : index
+! CHECK:           %[[VAL_7:.*]] = fir.shape %[[VAL_6]] : (index) -> !fir.shape<1>
+! CHECK:           %[[VAL_8:.*]] = hlfir.designate %[[VAL_2]]#0 (%[[VAL_3]]:%[[VAL_4]]:%[[VAL_5]])  shape %[[VAL_7]] : (!fir.box<!fir.array<?xi32>>, index, index, index, !fir.shape<1>) -> !fir.box<!fir.array<4xi32>>
+! CHECK:           %[[VAL_9:.*]] = fir.box_addr %[[VAL_8]] : (!fir.box<!fir.array<4xi32>>) -> !fir.ref<!fir.array<4xi32>>
+! CHECK:           omp.task depend(taskdependin -> %[[VAL_9]] : !fir.ref<!fir.array<4xi32>>) {
+! CHECK:             omp.terminator
+! CHECK:           }
+! CHECK:           return
+! CHECK:         }
diff --git a/flang/test/Lower/OpenMP/task.f90 b/flang/test/Lower/OpenMP/task.f90
index 393801997aebc..7e1e49dad9797 100644
--- a/flang/test/Lower/OpenMP/task.f90
+++ b/flang/test/Lower/OpenMP/task.f90
@@ -158,6 +158,14 @@ subroutine task_depend_multi_task()
   !$omp end task
 end subroutine task_depend_multi_task
 
+subroutine task_depend_box(array)
+  integer :: array(:)
+  !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %{{.*}} : (!fir.box<!fir.array<?xi32>>) -> !fir.ref<!fir.array<?xi32>>
+  !CHECK: omp.task depend(taskdependin -> %[[BOX_ADDR]] : !fir.ref<!fir.array<?xi32>>)
+  !$omp task depend(in: array)
+  !$omp end task
+end subroutine
+
 !===============================================================================
 ! `private` clause
 !===============================================================================

// was mapped to a box.
// Getting the address of the box data is okay because all the runtime
// ultimately cares about is the base address of the array.
if (fir::isa_box_type(dependVar.getType())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about an allocatable array?

Copy link
Contributor Author

@tblah tblah Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard says that using an unallocated allocatable is undefined behavior (same for unassociated pointers). So I don't think this needs any special handling here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying it is OK to use the address of the descriptor and not the real address of the array for this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that would not be okay. As I understand it, this will use the address of the data pointed to by the descriptor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will probably need to unwrap the ref or dyn_cast_ptrEleTy otherwise I think this code will only work for the box type (assumed shape?) and not fir.ref<fir.box.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh I see what you mean now. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a891346

Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please wait a day in case the other folks working in this area has comments.

Is there shared code with target dependencies? Would we need tests there?

Just as a side point. An alternative way to express dependencies is to use tokens like in async dialect.

Copy link
Contributor

@NimishMishra NimishMishra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix.

@@ -2428,7 +2430,8 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
ConstructQueue::const_iterator item) {
lower::StatementContext stmtCtx;
Copy link
Contributor Author

@tblah tblah Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While working on the next patch in this series I've noticed that this stmtCtx seems not to be scoped correctly. If any cleanup code is added to it, it will emit that cleanup code when it is destroyed. Unfortunately, when this goes out of scope at the end of genTaskOp the builder insertion point is inside of the task operation. It would make more sense to do the cleanup after the end of the task operation to avoid adding unnecessary shared variables (this might be only a theoretical bug because I haven't yet found an example where any cleanup code is still present by the time we reach FIR.)

I have a proof of concept fix which I will finish off on Monday.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to understand the problem here. This stmtCtx is used in managing and cleaning up intermediates while generating expressions of certain clauses. At the time the cleanup code is emitted, the insertion point is inside the task's region. Is it after the operations in task's region are generated, or before that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this on the PR for this change. Hopefully the example will make it clearer: #133891 (comment)

@tblah tblah merged commit e17d864 into llvm:main Apr 1, 2025
11 checks passed
tblah added a commit that referenced this pull request Apr 1, 2025
The statement context is used for lowering clauses for openmp
operations using generalised helpers from flang lowering. The statement
context stores closures which generate code for cleaning up temporary
values generated by the lowering helper. These closures are run when the
statement construct is destroyed. Keeping the statement context local to
the clause or operation being lowered without any special handling was
not correct because any cleanup code would be generated at the insertion
point when that statement context went out of scope (which would in
general be inside of the newly created container operation). It would be
better to generate the cleanup code after the newly created operation
(clause processing is synchronous even for deferred tasks).

Currently supported clauses are mostly populated with simple scalar values
that require no cleanup. Even the simple array sections added by #132994
needed no cleanup because indexing the right values of the array did not
create any temporaries. Supporting array sections with vector indexing
will generate hlfir.destroy operations for cleanup. This patch fixes
where those will be created. Those hlfir.destroy operations don't
generate any FIR (or LLVM) code, but the issue still exists
theoretically.

I wasn't able to find any clauses which have any cleanup to use to test
this PR. It is probably NFC for the current lowering. This will be
tested in the PR adding vector subscripting of array sections.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request Apr 2, 2025
…32994)

The OpenMP standard says that all dependencies in the same set of
inter-dependent tasks must be non-overlapping. This simplification means
that the OpenMP only needs to keep track of the base addresses of
dependency variables. This can be seen in kmp_taskdeps.cpp, which stores
task dependency information in a hash table, using the base address as a
key.

This patch generates a rebox operation to slice boxed arrays, but only
the box data address is used for the task dependency. The extra box is
optimized away by LLVM at O3.

Vector subscripts are TODO (I will address in my next patch).

This also fixes a bug for ordinary subscripts when the symbol was mapped
to a box:

Fixes llvm#132647
tblah added a commit that referenced this pull request Apr 8, 2025
…133891)

The statement context is used for lowering clauses for openmp operations
using generalised helpers from flang lowering. The statement context
stores closures which generate code for cleaning up temporary values
generated by the lowering helper. These closures are run when the
statement construct is destroyed. Keeping the statement context local to
the clause or operation being lowered without any special handling was
not correct because any cleanup code would be generated at the insertion
point when that statement context went out of scope (which would in
general be inside of the newly created container operation). It would be
better to generate the cleanup code after the newly created operation
(clause processing is synchronous even for deferred tasks).

Currently supported clauses are mostly populated with simple scalar
values that require no cleanup. Even the simple array sections added by
#132994 needed no cleanup because indexing the right values of the array
did not create any temporaries. Supporting array sections with vector
indexing will generate hlfir.destroy operations for cleanup. This patch
fixes where those will be created. Those hlfir.destroy operations don't
generate any FIR (or LLVM) code, but the issue still exists
theoretically.

I wasn't able to find any clauses which have any cleanup to use to test
this PR. It is probably NFC for the current lowering. This will be
tested in [the PR adding vector subscripting of array
sections](#133892).
tblah added a commit to tblah/llvm-project that referenced this pull request May 8, 2025
tblah added a commit that referenced this pull request May 8, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Flang][OpenMP] Compilation error of combining select type construct and task construct with depend clause
4 participants