-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[flang][OpenMP] Rewrite omp.loop
to semantically equivalent ops
#115443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-flang-openmp Author: Kareem Ergawy (ergawy) ChangesIntroduces a new conversion pass that rewrites Parent PR: #114219, only the latest commit is relevant to this PR. Patch is 42.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115443.diff 20 Files Affected:
diff --git a/flang/include/flang/Common/OpenMP-utils.h b/flang/include/flang/Common/OpenMP-utils.h
new file mode 100644
index 00000000000000..7dbb0f612b19cd
--- /dev/null
+++ b/flang/include/flang/Common/OpenMP-utils.h
@@ -0,0 +1,68 @@
+//===-- include/flang/Common/OpenMP-utils.h --------------------*- C++ -*-====//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef FORTRAN_COMMON_OPENMP_UTILS_H_
+#define FORTRAN_COMMON_OPENMP_UTILS_H_
+
+#include "flang/Semantics/symbol.h"
+
+#include "mlir/IR/Builders.h"
+#include "mlir/IR/Value.h"
+
+#include "llvm/ADT/ArrayRef.h"
+
+namespace Fortran::openmp::common {
+/// Structure holding the information needed to create and bind entry block
+/// arguments associated to a single clause.
+struct EntryBlockArgsEntry {
+ llvm::ArrayRef<const Fortran::semantics::Symbol *> syms;
+ llvm::ArrayRef<mlir::Value> vars;
+
+ bool isValid() const {
+ // This check allows specifying a smaller number of symbols than values
+ // because in some case cases a single symbol generates multiple block
+ // arguments.
+ return syms.size() <= vars.size();
+ }
+};
+
+/// Structure holding the information needed to create and bind entry block
+/// arguments associated to all clauses that can define them.
+struct EntryBlockArgs {
+ EntryBlockArgsEntry inReduction;
+ EntryBlockArgsEntry map;
+ EntryBlockArgsEntry priv;
+ EntryBlockArgsEntry reduction;
+ EntryBlockArgsEntry taskReduction;
+ EntryBlockArgsEntry useDeviceAddr;
+ EntryBlockArgsEntry useDevicePtr;
+
+ bool isValid() const {
+ return inReduction.isValid() && map.isValid() && priv.isValid() &&
+ reduction.isValid() && taskReduction.isValid() &&
+ useDeviceAddr.isValid() && useDevicePtr.isValid();
+ }
+
+ auto getSyms() const {
+ return llvm::concat<const Fortran::semantics::Symbol *const>(
+ inReduction.syms, map.syms, priv.syms, reduction.syms,
+ taskReduction.syms, useDeviceAddr.syms, useDevicePtr.syms);
+ }
+
+ auto getVars() const {
+ return llvm::concat<const mlir::Value>(inReduction.vars, map.vars,
+ priv.vars, reduction.vars, taskReduction.vars, useDeviceAddr.vars,
+ useDevicePtr.vars);
+ }
+};
+
+mlir::Block *genEntryBlock(
+ mlir::OpBuilder &builder, const EntryBlockArgs &args, mlir::Region ®ion);
+} // namespace Fortran::openmp::common
+
+#endif // FORTRAN_COMMON_OPENMP_UTILS_H_
diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.td b/flang/include/flang/Optimizer/OpenMP/Passes.td
index c070bc22ff20cc..bb5ad4b4e0f7bb 100644
--- a/flang/include/flang/Optimizer/OpenMP/Passes.td
+++ b/flang/include/flang/Optimizer/OpenMP/Passes.td
@@ -50,4 +50,16 @@ def FunctionFilteringPass : Pass<"omp-function-filtering"> {
];
}
+def GenericLoopConversionPass
+ : Pass<"omp-generic-loop-conversion", "mlir::func::FuncOp"> {
+ let summary = "Converts OpenMP generic `loop` direcitve to semantically "
+ "equivalent OpenMP ops";
+ let description = [{
+ Rewrites `loop` ops to their semantically equivalent nest of ops. The
+ rewrite depends on the nesting/combination structure of the `loop` op
+ within its surrounding context as well as its `bind` clause value.
+ }];
+ let dependentDialects = ["mlir::omp::OpenMPDialect"];
+}
+
#endif //FORTRAN_OPTIMIZER_OPENMP_PASSES
diff --git a/flang/lib/Common/CMakeLists.txt b/flang/lib/Common/CMakeLists.txt
index be72391847f3dd..de6bea396f3cbe 100644
--- a/flang/lib/Common/CMakeLists.txt
+++ b/flang/lib/Common/CMakeLists.txt
@@ -40,9 +40,13 @@ add_flang_library(FortranCommon
default-kinds.cpp
idioms.cpp
LangOptions.cpp
+ OpenMP-utils.cpp
Version.cpp
${version_inc}
LINK_COMPONENTS
Support
+
+ LINK_LIBS
+ MLIRIR
)
diff --git a/flang/lib/Common/OpenMP-utils.cpp b/flang/lib/Common/OpenMP-utils.cpp
new file mode 100644
index 00000000000000..32df2be01e5484
--- /dev/null
+++ b/flang/lib/Common/OpenMP-utils.cpp
@@ -0,0 +1,47 @@
+//===-- include/flang/Common/OpenMP-utils.cpp ------------------*- C++ -*-====//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Common/OpenMP-utils.h"
+
+#include "mlir/IR/OpDefinition.h"
+
+namespace Fortran::openmp::common {
+mlir::Block *genEntryBlock(mlir::OpBuilder &builder, const EntryBlockArgs &args,
+ mlir::Region ®ion) {
+ assert(args.isValid() && "invalid args");
+ assert(region.empty() && "non-empty region");
+
+ llvm::SmallVector<mlir::Type> types;
+ llvm::SmallVector<mlir::Location> locs;
+ unsigned numVars = args.inReduction.vars.size() + args.map.vars.size() +
+ args.priv.vars.size() + args.reduction.vars.size() +
+ args.taskReduction.vars.size() + args.useDeviceAddr.vars.size() +
+ args.useDevicePtr.vars.size();
+ types.reserve(numVars);
+ locs.reserve(numVars);
+
+ auto extractTypeLoc = [&types, &locs](llvm::ArrayRef<mlir::Value> vals) {
+ llvm::transform(vals, std::back_inserter(types),
+ [](mlir::Value v) { return v.getType(); });
+ llvm::transform(vals, std::back_inserter(locs),
+ [](mlir::Value v) { return v.getLoc(); });
+ };
+
+ // Populate block arguments in clause name alphabetical order to match
+ // expected order by the BlockArgOpenMPOpInterface.
+ extractTypeLoc(args.inReduction.vars);
+ extractTypeLoc(args.map.vars);
+ extractTypeLoc(args.priv.vars);
+ extractTypeLoc(args.reduction.vars);
+ extractTypeLoc(args.taskReduction.vars);
+ extractTypeLoc(args.useDeviceAddr.vars);
+ extractTypeLoc(args.useDevicePtr.vars);
+
+ return builder.createBlock(®ion, {}, types, locs);
+}
+} // namespace Fortran::openmp::common
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index e768c1cbc0784a..0191669e100b45 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -98,6 +98,25 @@ genAllocateClause(lower::AbstractConverter &converter,
genObjectList(objects, converter, allocateOperands);
}
+static mlir::omp::ClauseBindKindAttr
+genBindKindAttr(fir::FirOpBuilder &firOpBuilder,
+ const omp::clause::Bind &clause) {
+ mlir::omp::ClauseBindKind bindKind;
+ switch (clause.v) {
+ case omp::clause::Bind::Binding::Teams:
+ bindKind = mlir::omp::ClauseBindKind::Teams;
+ break;
+ case omp::clause::Bind::Binding::Parallel:
+ bindKind = mlir::omp::ClauseBindKind::Parallel;
+ break;
+ case omp::clause::Bind::Binding::Thread:
+ bindKind = mlir::omp::ClauseBindKind::Thread;
+ break;
+ }
+ return mlir::omp::ClauseBindKindAttr::get(firOpBuilder.getContext(),
+ bindKind);
+}
+
static mlir::omp::ClauseProcBindKindAttr
genProcBindKindAttr(fir::FirOpBuilder &firOpBuilder,
const omp::clause::ProcBind &clause) {
@@ -204,6 +223,15 @@ static void convertLoopBounds(lower::AbstractConverter &converter,
// ClauseProcessor unique clauses
//===----------------------------------------------------------------------===//
+bool ClauseProcessor::processBind(mlir::omp::BindClauseOps &result) const {
+ if (auto *clause = findUniqueClause<omp::clause::Bind>()) {
+ fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+ result.bindKind = genBindKindAttr(firOpBuilder, *clause);
+ return true;
+ }
+ return false;
+}
+
bool ClauseProcessor::processCollapse(
mlir::Location currentLocation, lower::pft::Evaluation &eval,
mlir::omp::LoopRelatedClauseOps &result,
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index f34121c70d0b44..772e5415d2da27 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -53,6 +53,7 @@ class ClauseProcessor {
: converter(converter), semaCtx(semaCtx), clauses(clauses) {}
// 'Unique' clauses: They can appear at most once in the clause list.
+ bool processBind(mlir::omp::BindClauseOps &result) const;
bool
processCollapse(mlir::Location currentLocation, lower::pft::Evaluation &eval,
mlir::omp::LoopRelatedClauseOps &result,
diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp
index 46caafeef8e4a8..ff4af737b27fdb 100644
--- a/flang/lib/Lower/OpenMP/Clauses.cpp
+++ b/flang/lib/Lower/OpenMP/Clauses.cpp
@@ -484,8 +484,19 @@ AtomicDefaultMemOrder make(const parser::OmpClause::AtomicDefaultMemOrder &inp,
Bind make(const parser::OmpClause::Bind &inp,
semantics::SemanticsContext &semaCtx) {
- // inp -> empty
- llvm_unreachable("Empty: bind");
+ // inp.v -> parser::OmpBindClause
+ using wrapped = parser::OmpBindClause;
+
+ CLAUSET_ENUM_CONVERT( //
+ convert, wrapped::Type, Bind::Binding,
+ // clang-format off
+ MS(Teams, Teams)
+ MS(Parallel, Parallel)
+ MS(Thread, Thread)
+ // clang-format on
+ );
+
+ return Bind{/*Binding=*/convert(inp.v.v)};
}
// CancellationConstructType: empty
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 4f9e2347308aa1..d397c9c2e4e059 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -19,6 +19,7 @@
#include "DirectivesCommon.h"
#include "ReductionProcessor.h"
#include "Utils.h"
+#include "flang/Common/OpenMP-utils.h"
#include "flang/Common/idioms.h"
#include "flang/Lower/Bridge.h"
#include "flang/Lower/ConvertExpr.h"
@@ -40,57 +41,12 @@
#include "llvm/Frontend/OpenMP/OMPConstants.h"
using namespace Fortran::lower::omp;
+using namespace Fortran::openmp::common;
//===----------------------------------------------------------------------===//
// Code generation helper functions
//===----------------------------------------------------------------------===//
-namespace {
-/// Structure holding the information needed to create and bind entry block
-/// arguments associated to a single clause.
-struct EntryBlockArgsEntry {
- llvm::ArrayRef<const semantics::Symbol *> syms;
- llvm::ArrayRef<mlir::Value> vars;
-
- bool isValid() const {
- // This check allows specifying a smaller number of symbols than values
- // because in some case cases a single symbol generates multiple block
- // arguments.
- return syms.size() <= vars.size();
- }
-};
-
-/// Structure holding the information needed to create and bind entry block
-/// arguments associated to all clauses that can define them.
-struct EntryBlockArgs {
- EntryBlockArgsEntry inReduction;
- EntryBlockArgsEntry map;
- EntryBlockArgsEntry priv;
- EntryBlockArgsEntry reduction;
- EntryBlockArgsEntry taskReduction;
- EntryBlockArgsEntry useDeviceAddr;
- EntryBlockArgsEntry useDevicePtr;
-
- bool isValid() const {
- return inReduction.isValid() && map.isValid() && priv.isValid() &&
- reduction.isValid() && taskReduction.isValid() &&
- useDeviceAddr.isValid() && useDevicePtr.isValid();
- }
-
- auto getSyms() const {
- return llvm::concat<const semantics::Symbol *const>(
- inReduction.syms, map.syms, priv.syms, reduction.syms,
- taskReduction.syms, useDeviceAddr.syms, useDevicePtr.syms);
- }
-
- auto getVars() const {
- return llvm::concat<const mlir::Value>(
- inReduction.vars, map.vars, priv.vars, reduction.vars,
- taskReduction.vars, useDeviceAddr.vars, useDevicePtr.vars);
- }
-};
-} // namespace
-
static void genOMPDispatch(lower::AbstractConverter &converter,
lower::SymMap &symTable,
semantics::SemanticsContext &semaCtx,
@@ -622,50 +578,6 @@ static void genLoopVars(
firOpBuilder.setInsertionPointAfter(storeOp);
}
-/// Create an entry block for the given region, including the clause-defined
-/// arguments specified.
-///
-/// \param [in] converter - PFT to MLIR conversion interface.
-/// \param [in] args - entry block arguments information for the given
-/// operation.
-/// \param [in] region - Empty region in which to create the entry block.
-static mlir::Block *genEntryBlock(lower::AbstractConverter &converter,
- const EntryBlockArgs &args,
- mlir::Region ®ion) {
- assert(args.isValid() && "invalid args");
- assert(region.empty() && "non-empty region");
- fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-
- llvm::SmallVector<mlir::Type> types;
- llvm::SmallVector<mlir::Location> locs;
- unsigned numVars = args.inReduction.vars.size() + args.map.vars.size() +
- args.priv.vars.size() + args.reduction.vars.size() +
- args.taskReduction.vars.size() +
- args.useDeviceAddr.vars.size() +
- args.useDevicePtr.vars.size();
- types.reserve(numVars);
- locs.reserve(numVars);
-
- auto extractTypeLoc = [&types, &locs](llvm::ArrayRef<mlir::Value> vals) {
- llvm::transform(vals, std::back_inserter(types),
- [](mlir::Value v) { return v.getType(); });
- llvm::transform(vals, std::back_inserter(locs),
- [](mlir::Value v) { return v.getLoc(); });
- };
-
- // Populate block arguments in clause name alphabetical order to match
- // expected order by the BlockArgOpenMPOpInterface.
- extractTypeLoc(args.inReduction.vars);
- extractTypeLoc(args.map.vars);
- extractTypeLoc(args.priv.vars);
- extractTypeLoc(args.reduction.vars);
- extractTypeLoc(args.taskReduction.vars);
- extractTypeLoc(args.useDeviceAddr.vars);
- extractTypeLoc(args.useDevicePtr.vars);
-
- return firOpBuilder.createBlock(®ion, {}, types, locs);
-}
-
static void
markDeclareTarget(mlir::Operation *op, lower::AbstractConverter &converter,
mlir::omp::DeclareTargetCaptureClause captureClause,
@@ -918,7 +830,7 @@ static void genBodyOfTargetDataOp(
ConstructQueue::const_iterator item) {
fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
- genEntryBlock(converter, args, dataOp.getRegion());
+ genEntryBlock(converter.getFirOpBuilder(), args, dataOp.getRegion());
bindEntryBlockArgs(converter, dataOp, args);
// Insert dummy instruction to remember the insertion position. The
@@ -995,7 +907,8 @@ static void genBodyOfTargetOp(
auto argIface = llvm::cast<mlir::omp::BlockArgOpenMPOpInterface>(*targetOp);
mlir::Region ®ion = targetOp.getRegion();
- mlir::Block *entryBlock = genEntryBlock(converter, args, region);
+ mlir::Block *entryBlock =
+ genEntryBlock(converter.getFirOpBuilder(), args, region);
bindEntryBlockArgs(converter, targetOp, args);
// Check if cloning the bounds introduced any dependency on the outer region.
@@ -1121,7 +1034,7 @@ static OpTy genWrapperOp(lower::AbstractConverter &converter,
auto op = firOpBuilder.create<OpTy>(loc, clauseOps);
// Create entry block with arguments.
- genEntryBlock(converter, args, op.getRegion());
+ genEntryBlock(converter.getFirOpBuilder(), args, op.getRegion());
return op;
}
@@ -1176,6 +1089,18 @@ genLoopNestClauses(lower::AbstractConverter &converter,
clauseOps.loopInclusive = converter.getFirOpBuilder().getUnitAttr();
}
+static void genLoopClauses(
+ lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx,
+ const List<Clause> &clauses, mlir::Location loc,
+ mlir::omp::LoopOperands &clauseOps,
+ llvm::SmallVectorImpl<const semantics::Symbol *> &reductionSyms) {
+ ClauseProcessor cp(converter, semaCtx, clauses);
+ cp.processBind(clauseOps);
+ cp.processOrder(clauseOps);
+ cp.processReduction(loc, clauseOps, reductionSyms);
+ cp.processTODO<clause::Lastprivate>(loc, llvm::omp::Directive::OMPD_loop);
+}
+
static void genMaskedClauses(lower::AbstractConverter &converter,
semantics::SemanticsContext &semaCtx,
lower::StatementContext &stmtCtx,
@@ -1532,7 +1457,7 @@ genParallelOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
const EntryBlockArgs &args, DataSharingProcessor *dsp,
bool isComposite = false) {
auto genRegionEntryCB = [&](mlir::Operation *op) {
- genEntryBlock(converter, args, op->getRegion(0));
+ genEntryBlock(converter.getFirOpBuilder(), args, op->getRegion(0));
bindEntryBlockArgs(
converter, llvm::cast<mlir::omp::BlockArgOpenMPOpInterface>(op), args);
return llvm::to_vector(args.getSyms());
@@ -1605,12 +1530,12 @@ genSectionsOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
args.reduction.syms = reductionSyms;
args.reduction.vars = clauseOps.reductionVars;
- genEntryBlock(converter, args, sectionsOp.getRegion());
+ genEntryBlock(converter.getFirOpBuilder(), args, sectionsOp.getRegion());
mlir::Operation *terminator =
lower::genOpenMPTerminator(builder, sectionsOp, loc);
auto genRegionEntryCB = [&](mlir::Operation *op) {
- genEntryBlock(converter, args, op->getRegion(0));
+ genEntryBlock(converter.getFirOpBuilder(), args, op->getRegion(0));
bindEntryBlockArgs(
converter, llvm::cast<mlir::omp::BlockArgOpenMPOpInterface>(op), args);
return llvm::to_vector(args.getSyms());
@@ -2051,6 +1976,40 @@ static void genStandaloneDo(lower::AbstractConverter &converter,
llvm::omp::Directive::OMPD_do, dsp);
}
+static void genLoopOp(lower::AbstractConverter &converter,
+ lower::SymMap &symTable,
+ semantics::SemanticsContext &semaCtx,
+ lower::pft::Evaluation &eval, mlir::Location loc,
+ const ConstructQueue &queue,
+ ConstructQueue::const_iterator item) {
+ mlir::omp::LoopOperands loopClauseOps;
+ llvm::SmallVector<const semantics::Symbol *> loopReductionSyms;
+ genLoopClauses(converter, semaCtx, item->clauses, loc, loopClauseOps,
+ loopReductionSyms);
+
+ DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval,
+ /*shouldCollectPreDeterminedSymbols=*/true,
+ /*useDelayedPrivatization=*/true, &symTable);
+ dsp.processStep1(&loopClauseOps);
+
+ mlir::omp::LoopNestOperands loopNestClauseOps;
+ llvm::SmallVector<const semantics::Symbol *> iv;
+ genLoopNestClauses(converter, semaCtx, eval, item->clauses, loc,
+ loopNestClauseOps, iv);
+
+ EntryBlockArgs loopArgs;
+ loopArgs.priv.syms = dsp.getDelayedPrivSymbols();
+ loopArgs.priv.vars = loopClauseOps.privateVars;
+ loopArgs.reduction.syms = loopReductionSyms;
+ loopArgs.reduction.vars = loopClauseOps.reductionVars;
+
+ auto loopOp =
+ genWrapperOp<mlir::omp::LoopOp>(converter, loc, loopClauseOps, loopArgs);
+ genLoopNestOp(converter, symTable, semaCtx, eval, loc, queue, item,
+ loopNestClauseOps, iv, {{loopOp, loopArgs}},
+ llvm::omp::Directive::OMPD_loop, dsp);
+}
+
static void genStandaloneParallel(lower::AbstractConverter &converter,
lower::SymMap &symTable,
semantics::SemanticsContext &semaCtx,
@@ -2479,7 +2438,7 @@ static void genOMPDispatch(lower::AbstractConverter &converter,
genStandaloneDo(converter, symTable, semaCtx, eval, loc, queue, item);
break;
case llvm::omp::Directive::OMPD_loop:
- TODO(loc, "Unhandled directive " + llvm::omp::getOpenMPDirectiveName(dir));
+ genLoopOp(converter, symTable, semaCtx, eval, loc, queue, item);
break;
case llvm::omp::Directive::OMPD_masked:
genMaskedOp(converter, symTable, semaCtx, eval, loc, queue, item);
diff --git a/flang/lib/Optimizer/OpenMP/CMakeLists.txt b/flang/lib/Optimizer/OpenMP/CMakeLists.txt
index 035d0d5ca46c76..f9d284b8d77820 100644
--- a/flang/lib/Optimizer/OpenMP/CMakeLists.txt
+++ b/flang/lib/Optimizer/OpenMP/CMakeLists.txt
@@ -2,6 +2,7 @@ get_property(dialect_libs GLOBAL PROPERTY MLIR_DIALECT_LIBS)
add_flang_library(FlangOpenMPTransforms
FunctionFiltering...
[truncated]
|
3aaf579
to
dfbf88d
Compare
09b0c66
to
b2ed6da
Compare
873405d
to
e7e1a4d
Compare
// TODO we probably need to move the `loop_nest` bounds ops from the `teams` | ||
// region to the `parallel` region to avoid making these values `shared`. We can | ||
// find the backward slices of these bounds that are within the `teams` region | ||
// and move these slices to the `parallel` op. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skatrak I think we don't have to worry about this once host_eval is upstreamed, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't say we don't have to worry about it, but rather there are some updates needed in other places. We'd have to make sure to enforce host-evaluation of loop bounds for target teams loop
or equivalent construct nests in the omp.target
verifier, update Flang lowering accordingly and then this pass seems to me that it should work unmodified for the tests in this PR.
However, what's expected to happen in this case (or equivalent Fortran)?
#pragma omp target teams
{
#pragma omp loop
for (...) { ... }
#pragma omp loop
for (...) { ... }
}
In that case, loop bounds wouldn't be host-evaluated if this pass introduces two distribute parallel for
nests, resulting in something similar to what is being produced by this pass right now. If these values being shared causes issues, perhaps we would need to introduce a firstprivate clause into the created omp.parallel
ops.
omp.loop
to semantically equivalent opsomp.loop
to semantically equivalent ops
e7e1a4d
to
63bf014
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except small typo
63bf014
to
85a6fc6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates and explanations
I have a nit comment about the name. Is this implementing parts of https://discourse.llvm.org/t/rfc-representing-combined-composite-constructs-in-the-openmp-dialect/76986? If so a link to the discourse post would be useful. |
85a6fc6
to
032afb5
Compare
The term
Done. |
032afb5
to
a459f4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Kareem for this work, and sorry for the delay getting to it! I have a few minor suggestions and comments.
// TODO we probably need to move the `loop_nest` bounds ops from the `teams` | ||
// region to the `parallel` region to avoid making these values `shared`. We can | ||
// find the backward slices of these bounds that are within the `teams` region | ||
// and move these slices to the `parallel` op. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't say we don't have to worry about it, but rather there are some updates needed in other places. We'd have to make sure to enforce host-evaluation of loop bounds for target teams loop
or equivalent construct nests in the omp.target
verifier, update Flang lowering accordingly and then this pass seems to me that it should work unmodified for the tests in this PR.
However, what's expected to happen in this case (or equivalent Fortran)?
#pragma omp target teams
{
#pragma omp loop
for (...) { ... }
#pragma omp loop
for (...) { ... }
}
In that case, loop bounds wouldn't be host-evaluated if this pass introduces two distribute parallel for
nests, resulting in something similar to what is being produced by this pass right now. If these values being shared causes issues, perhaps we would need to introduce a firstprivate clause into the created omp.parallel
ops.
|
||
/// A conversion pattern to handle various combined forms of `omp.loop`. For how | ||
/// combined/composite directive are handled see: | ||
/// https://discourse.llvm.org/t/rfc-representing-combined-composite-constructs-in-the-openmp-dialect/76986. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this pass would be in the MLIR directory. Is there any reason for it to be not there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with other OpenMP passes, nothing more. I agree with you though, we could probably move pretty much everything under flang/lib/Optimizer/OpenMP/
to the dialect directory under mlir
. But I think this should be done in a separate PR in one go and keep things consistent here.
Let me know if you disagree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handling in a separate PR is fine with me.
I agree with you though, we could probably move pretty much everything under flang/lib/Optimizer/OpenMP/ to the dialect directory under mlir
I am not fully sure about this. Generic transformations should go to MLIR. But there could be some transformations that are Flang specific where for ease of lowering we lower in one way but immediately correct in a pass to the right form.
4f954cf
to
9d6475d
Compare
Introduces a new conversion pass that rewrites `omp.loop` ops to their semantically equivalent op nests bases on the surrounding/binding context of the `loop` op. Not all forms of `omp.loop` are supported yet.
9d6475d
to
82c22ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on my comments, this LGTM. Please wait for @kiranchandramohan's approval before merging.
@@ -41,57 +42,12 @@ | |||
#include "llvm/Frontend/OpenMP/OMPConstants.h" | |||
|
|||
using namespace Fortran::lower::omp; | |||
using namespace Fortran::common::openmp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: For consistency, it looks like Fortran::common::omp
would be a better name for the namespace in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
Introduces a new conversion pass that rewrites
omp.loop
ops to theirsemantically equivalent op nests bases on the surrounding/binding
context of the
loop
op. Not all forms ofomp.loop
are supported yet.See
isLoopConversionSupported
for more info on which forms are supported.