-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[MLIR][OpenMP] Add omp.loop_nest operation #87083
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch introduces an operation intended to hold loop information associated to the `omp.distribute`, `omp.simdloop`, `omp.taskloop` and `omp.wsloop` operations. This is a stopgap solution to unblock work on transitioning these operations to becoming wrappers, as discussed in [this RFC](https://discourse.llvm.org/t/rfc-representing-combined-composite-constructs-in-the-openmp-dialect/76986). Long-term, this operation will likely be replaced by `omp.canonical_loop`, which is being designed to address missing support for loop transformations, etc.
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-flang-openmp Author: Sergio Afonso (skatrak) ChangesThis patch introduces an operation intended to hold loop information associated to the Long-term, this operation will likely be replaced by Full diff: https://github.com/llvm/llvm-project/pull/87083.diff 4 Files Affected:
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index f33942b3c7c02d..ffd00948915153 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -511,6 +511,69 @@ def SingleOp : OpenMP_Op<"single", [AttrSizedOperandSegments]> {
let hasVerifier = 1;
}
+//===----------------------------------------------------------------------===//
+// Loop Nest
+//===----------------------------------------------------------------------===//
+
+def LoopNestOp : OpenMP_Op<"loop_nest", [SameVariadicOperandSize,
+ AllTypesMatch<["lowerBound", "upperBound", "step"]>,
+ ParentOneOf<["DistributeOp", "SimdLoopOp", "TaskloopOp",
+ "WsloopOp"]>,
+ RecursiveMemoryEffects]> {
+ let summary = "rectangular loop nest";
+ let description = [{
+ This operation represents a collapsed rectangular loop nest. For each
+ rectangular loop of the nest represented by an instance of this operation,
+ lower and upper bounds, as well as a step variable, must be defined.
+
+ The lower and upper bounds specify a half-open range: the range includes the
+ lower bound but does not include the upper bound. If the `inclusive`
+ attribute is specified then the upper bound is also included.
+
+ The body region can contain any number of blocks. The region is terminated
+ by an `omp.yield` instruction without operands. The induction variables,
+ represented as entry block arguments to the loop nest operation's single
+ region, match the types of the `lowerBound`, `upperBound` and `step`
+ arguments.
+
+ ```mlir
+ omp.loop_nest (%i1, %i2) : i32 = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
+ %a = load %arrA[%i1, %i2] : memref<?x?xf32>
+ %b = load %arrB[%i1, %i2] : memref<?x?xf32>
+ %sum = arith.addf %a, %b : f32
+ store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
+ omp.yield
+ }
+ ```
+
+ This is a temporary simplified definition of a loop based on existing OpenMP
+ loop operations intended to serve as a stopgap solution until the long-term
+ representation of canonical loops is defined. Specifically, this operation
+ is intended to serve as a unique source for loop information during the
+ transition to making `omp.distribute`, `omp.simdloop`, `omp.taskloop` and
+ `omp.wsloop` wrapper operations. It is not intended to help with the
+ addition of support for loop transformations.
+ }];
+
+ let arguments = (ins Variadic<IntLikeType>:$lowerBound,
+ Variadic<IntLikeType>:$upperBound,
+ Variadic<IntLikeType>:$step,
+ UnitAttr:$inclusive);
+
+ let regions = (region AnyRegion:$region);
+
+ let extraClassDeclaration = [{
+ /// Returns the number of loops in the loop nest.
+ unsigned getNumLoops() { return getLowerBound().size(); }
+
+ /// Returns the induction variables of the loop nest.
+ ArrayRef<BlockArgument> getIVs() { return getRegion().getArguments(); }
+ }];
+
+ let hasCustomAssemblyFormat = 1;
+ let hasVerifier = 1;
+}
+
//===----------------------------------------------------------------------===//
// 2.9.2 Workshare Loop Construct
//===----------------------------------------------------------------------===//
@@ -724,7 +787,7 @@ def SimdLoopOp : OpenMP_Op<"simdloop", [AttrSizedOperandSegments,
def YieldOp : OpenMP_Op<"yield",
[Pure, ReturnLike, Terminator,
- ParentOneOf<["WsloopOp", "DeclareReductionOp",
+ ParentOneOf<["LoopNestOp", "WsloopOp", "DeclareReductionOp",
"AtomicUpdateOp", "SimdLoopOp", "PrivateClauseOp"]>]> {
let summary = "loop yield and termination operation";
let description = [{
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index bf5875071e0dc4..796df1d13e6564 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -1662,6 +1662,77 @@ LogicalResult TaskloopOp::verify() {
return success();
}
+//===----------------------------------------------------------------------===//
+// LoopNestOp
+//===----------------------------------------------------------------------===//
+
+ParseResult LoopNestOp::parse(OpAsmParser &parser, OperationState &result) {
+ // Parse an opening `(` followed by induction variables followed by `)`
+ SmallVector<OpAsmParser::Argument> ivs;
+ SmallVector<OpAsmParser::UnresolvedOperand> lbs, ubs;
+ Type loopVarType;
+ if (parser.parseArgumentList(ivs, OpAsmParser::Delimiter::Paren) ||
+ parser.parseColonType(loopVarType) ||
+ // Parse loop bounds.
+ parser.parseEqual() ||
+ parser.parseOperandList(lbs, ivs.size(), OpAsmParser::Delimiter::Paren) ||
+ parser.parseKeyword("to") ||
+ parser.parseOperandList(ubs, ivs.size(), OpAsmParser::Delimiter::Paren))
+ return failure();
+
+ for (auto &iv : ivs)
+ iv.type = loopVarType;
+
+ // Parse "inclusive" flag.
+ if (succeeded(parser.parseOptionalKeyword("inclusive")))
+ result.addAttribute("inclusive",
+ UnitAttr::get(parser.getBuilder().getContext()));
+
+ // Parse step values.
+ SmallVector<OpAsmParser::UnresolvedOperand> steps;
+ if (parser.parseKeyword("step") ||
+ parser.parseOperandList(steps, ivs.size(), OpAsmParser::Delimiter::Paren))
+ return failure();
+
+ // Parse the body.
+ Region *region = result.addRegion();
+ if (parser.parseRegion(*region, ivs))
+ return failure();
+
+ // Resolve operands.
+ if (parser.resolveOperands(lbs, loopVarType, result.operands) ||
+ parser.resolveOperands(ubs, loopVarType, result.operands) ||
+ parser.resolveOperands(steps, loopVarType, result.operands))
+ return failure();
+
+ // Parse the optional attribute list.
+ return parser.parseOptionalAttrDict(result.attributes);
+}
+
+void LoopNestOp::print(OpAsmPrinter &p) {
+ Region ®ion = getRegion();
+ auto args = region.getArguments();
+ p << " (" << args << ") : " << args[0].getType() << " = (" << getLowerBound()
+ << ") to (" << getUpperBound() << ") ";
+ if (getInclusive())
+ p << "inclusive ";
+ p << "step (" << getStep() << ") ";
+ p.printRegion(region, /*printEntryBlockArgs=*/false);
+}
+
+LogicalResult LoopNestOp::verify() {
+ if (getLowerBound().size() != getIVs().size())
+ return emitOpError() << "number of range arguments and IVs do not match";
+
+ for (auto [lb, iv] : llvm::zip_equal(getLowerBound(), getIVs())) {
+ if (lb.getType() != iv.getType())
+ return emitOpError()
+ << "range argument type does not match corresponding IV type";
+ }
+
+ return success();
+}
+
//===----------------------------------------------------------------------===//
// WsloopOp
//===----------------------------------------------------------------------===//
diff --git a/mlir/test/Dialect/OpenMP/invalid.mlir b/mlir/test/Dialect/OpenMP/invalid.mlir
index a00383cf44057c..760ebb14d94121 100644
--- a/mlir/test/Dialect/OpenMP/invalid.mlir
+++ b/mlir/test/Dialect/OpenMP/invalid.mlir
@@ -87,6 +87,43 @@ func.func @proc_bind_once() {
// -----
+func.func @invalid_parent(%lb : index, %ub : index, %step : index) {
+ // expected-error@+1 {{op expects parent op to be one of 'omp.distribute, omp.simdloop, omp.taskloop, omp.wsloop'}}
+ omp.loop_nest (%iv) : index = (%lb) to (%ub) step (%step) {
+ omp.yield
+ }
+}
+
+// -----
+
+func.func @type_mismatch(%lb : index, %ub : index, %step : index) {
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // expected-error@+1 {{range argument type does not match corresponding IV type}}
+ "omp.loop_nest" (%lb, %ub, %step) ({
+ ^bb0(%iv2: i32):
+ omp.yield
+ }) : (index, index, index) -> ()
+ omp.yield
+ }
+}
+
+// -----
+
+func.func @iv_number_mismatch(%lb : index, %ub : index, %step : index) {
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // expected-error@+1 {{number of range arguments and IVs do not match}}
+ "omp.loop_nest" (%lb, %ub, %step) ({
+ ^bb0(%iv1 : index, %iv2 : index):
+ omp.yield
+ }) : (index, index, index) -> ()
+ omp.yield
+ }
+}
+
+// -----
+
func.func @inclusive_not_a_clause(%lb : index, %ub : index, %step : index) {
// expected-error @below {{expected 'for'}}
omp.wsloop nowait inclusive
diff --git a/mlir/test/Dialect/OpenMP/ops.mlir b/mlir/test/Dialect/OpenMP/ops.mlir
index 30ce77423005ac..8d9acab67e0358 100644
--- a/mlir/test/Dialect/OpenMP/ops.mlir
+++ b/mlir/test/Dialect/OpenMP/ops.mlir
@@ -133,6 +133,85 @@ func.func @omp_parallel_pretty(%data_var : memref<i32>, %if_cond : i1, %num_thre
return
}
+// CHECK-LABEL: omp_loop_nest
+func.func @omp_loop_nest(%lb : index, %ub : index, %step : index) -> () {
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}) to (%{{.*}}) step (%{{.*}})
+ "omp.loop_nest" (%lb, %ub, %step) ({
+ ^bb0(%iv2: index):
+ omp.yield
+ }) : (index, index, index) -> ()
+ omp.yield
+ }
+
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}})
+ "omp.loop_nest" (%lb, %ub, %step) ({
+ ^bb0(%iv2: index):
+ omp.yield
+ }) {inclusive} : (index, index, index) -> ()
+ omp.yield
+ }
+
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}, %{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}, %{{.*}}) to (%{{.*}}, %{{.*}}) step (%{{.*}}, %{{.*}})
+ "omp.loop_nest" (%lb, %lb, %ub, %ub, %step, %step) ({
+ ^bb0(%iv2: index, %iv3: index):
+ omp.yield
+ }) : (index, index, index, index, index, index) -> ()
+ omp.yield
+ }
+
+ return
+}
+
+// CHECK-LABEL: omp_loop_nest_pretty
+func.func @omp_loop_nest_pretty(%lb : index, %ub : index, %step : index) -> () {
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}) to (%{{.*}}) step (%{{.*}})
+ omp.loop_nest (%iv2) : index = (%lb) to (%ub) step (%step) {
+ omp.yield
+ }
+ omp.yield
+ }
+
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}})
+ omp.loop_nest (%iv2) : index = (%lb) to (%ub) inclusive step (%step) {
+ omp.yield
+ }
+ omp.yield
+ }
+
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}, %{{.*}}) to (%{{.*}}, %{{.*}}) step (%{{.*}}, %{{.*}})
+ omp.loop_nest (%iv2, %iv3) : index = (%lb, %lb) to (%ub, %ub) step (%step, %step) {
+ omp.yield
+ }
+ omp.yield
+ }
+
+ return
+}
+
// CHECK-LABEL: omp_wsloop
func.func @omp_wsloop(%lb : index, %ub : index, %step : index, %data_var : memref<i32>, %linear_var : i32, %chunk_var : i32) -> () {
|
// TODO Remove induction variables from omp.wsloop. | ||
omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[remark] I fear there will be inconcistencies if potentially both omp.loop_nest
and omp.wsloop
both define potentially inconsistent loop bounds and intermediate passes not considering the wrapper passes. Since we already discussed this, only a remark here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is true. However, the transition of omp.wsloop
to be a wrapper rather than defining the whole loop will be part of a separate patch. So, after this patch alone, it's not expected to use omp.loop_nest
inside of omp.wsloop
. These tests just needed to add a "future-wrapper" parent operation so that the new op could be tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
What is the mechanism we now have to avoid code appearing between loop_nest and the wrapper operation? |
The One operation where this doesn't always need to be checked would be |
Would the code motion passes still end up moving the operation and then subsequently the verifier errors out? Have you tried out a code sinking or hoisting pass to see what is the behaviour? For omp.wsloop where will collapse be represented? In the loop_nest operation? |
I'm currently working on making // cat test.mlir
func.func private @bar(i32) -> ()
func.func @foo(%arg0 : i32) -> () {
func.call @bar(%arg0) : (i32) -> ()
return
}
func.func @test_cse(%x : i32) -> () {
%c1_0 = arith.constant 1 : i32
%x_plus_1_0 = arith.addi %x, %c1_0 : i32
func.call @foo(%x_plus_1_0) : (i32) -> ()
%c2_0 = arith.constant 2 : i32
%x_plus_2_0 = arith.addi %x, %c2_0 : i32
omp.wsloop {
omp.loop_nest (%i) : i32 = (%x) to (%x_plus_2_0) step (%c1_0) {
%c2_1 = arith.constant 2 : i32
%x_plus_2_1 = arith.addi %x, %c2_1 : i32
func.call @foo(%x_plus_2_1) : (i32) -> ()
omp.yield
}
omp.terminator
}
return
}
// mlir-opt test.mlir -inline -cse -o -
func.func private @bar(i32)
func.func @foo(%arg0: i32) {
call @bar(%arg0) : (i32) -> ()
return
}
func.func @test_cse(%arg0: i32) {
%c2_i32 = arith.constant 2 : i32
%c1_i32 = arith.constant 1 : i32
%0 = arith.addi %arg0, %c1_i32 : i32
call @bar(%0) : (i32) -> ()
%1 = arith.addi %arg0, %c2_i32 : i32
omp.wsloop {
omp.loop_nest (%arg1) : i32 = (%arg0) to (%1) step (%c1_i32) {
func.call @foo(%1) : (i32) -> ()
omp.yield
}
omp.terminator
}
return
}
Yes, so it will be possible to represent |
I have seen canonicalize transformation pull constants outside of OpenMP constructs.
./bin/mlir-opt --canonicalize t1.mlir
|
Yes, that's the constant hoisting part of the canonicalization pass. As per the documentation, it will hoist constants "into the entry block of the first parent barrier region". Which is an func.func @constant_sink_test(%x : !llvm.ptr, %y : index) {
omp.wsloop {
omp.loop_nest (%i) : index = (%y) to (%y) step (%y) {
%c1 = arith.constant 10 : i32
llvm.store %c1, %x : i32, !llvm.ptr
omp.yield
}
omp.terminator
}
return
} mlir-opt --canonicalize test.mlir func.func @constant_sink_test(%arg0: !llvm.ptr, %arg1: index) {
%c10_i32 = arith.constant 10 : i32
omp.wsloop {
omp.loop_nest (%arg2) : index = (%arg1) to (%arg1) step (%arg1) {
llvm.store %c10_i32, %arg0 : i32, !llvm.ptr
omp.yield
}
omp.terminator
}
return
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This patch introduces an operation intended to hold loop information associated to the
omp.distribute
,omp.simdloop
,omp.taskloop
andomp.wsloop
operations. This is a stopgap solution to unblock work on transitioning these operations to becoming wrappers, as discussed in this RFC.Long-term, this operation will likely be replaced by
omp.canonical_loop
, which is being designed to address missing support for loop transformations, etc.