[acc] acc.loop verifier now requires parallelism determination flag #143720
The OpenACC specification for `acc loop` describes a loop's parallelism determination mode as one of auto, independent, or seq. The rules are as follows:
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop construct with no auto or seq clause is treated as if it has the independent clause when it is an orphaned loop construct or its parent compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent compute construct is a kernels construct, a loop construct with no independent or seq clause is treated as if it has the auto clause.
- Additionally, loops marked with gang, worker, or vector are not guaranteed to be parallel. Specifically noted in 2.9.7 auto clause: If not, or if it is unable to make a determination, it must treat the auto clause as if it is a seq clause, and it must ignore any gang, worker, or vector clauses on the loop construct.

The verifier for `acc.loop` was updated to enforce this marking because the context in which a loop appears is not trivially determined once IR transformations begin. For example, orphaned loops are implicitly `independent`, but after inlining into an `acc.kernels` region they would be implicitly considered `auto`. The verifier therefore now requires the frontend, which knows the context, to generate acc dialect with this marking explicitly.
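The defaulting rules above can be sketched as a small standalone C++ function. This is only an illustration of the decision a frontend would have to make before emitting `acc.loop`; the names `ParMode`, `LoopContext`, and `resolveParMode` are hypothetical and are not part of the acc dialect or of this patch. Only the three contexts named in the description are modeled.

```cpp
#include <optional>

// Hypothetical names for illustration only; not the acc dialect's API.
enum class ParMode { Auto, Independent, Seq };
enum class LoopContext { Orphaned, Parallel, Kernels };

// Resolve the mode a frontend would record on the loop: an explicit clause
// wins, otherwise the default depends on where the loop appears.
ParMode resolveParMode(std::optional<ParMode> explicitClause,
                       LoopContext ctx) {
  if (explicitClause)
    return *explicitClause;
  switch (ctx) {
  case LoopContext::Orphaned:
  case LoopContext::Parallel:
    return ParMode::Independent; // OpenACC 3.3, section 2.9.6
  case LoopContext::Kernels:
    return ParMode::Auto; // OpenACC 3.3, section 2.9.7
  }
  return ParMode::Seq; // not reached for the contexts modeled here
}
```

The same loop with no explicit clause resolves to `Independent` when orphaned and to `Auto` inside a kernels construct, which is why the implicit mode would change after inlining unless it is recorded on the operation up front.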
@llvm/pr-subscribers-openacc @llvm/pr-subscribers-mlir-openacc
Author: Razvan Lupusoru (razvanlupusoru)
Changes: as in the PR description above.
Patch is 23.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143720.diff
5 Files Affected:
diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
index c72ec47be9f04..de378a921a7af 100644
--- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
+++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
@@ -2461,10 +2461,34 @@ LogicalResult acc::LoopOp::verify() {
if (hasDuplicateDeviceTypes(getAuto_(), deviceTypes) ||
hasDuplicateDeviceTypes(getIndependent(), deviceTypes) ||
hasDuplicateDeviceTypes(getSeq(), deviceTypes)) {
- return emitError() << "only one of \"" << acc::LoopOp::getAutoAttrStrName()
- << "\", " << getIndependentAttrName() << ", "
- << getSeqAttrName()
- << " can be present at the same time";
+ return emitError() << "only one of auto, independent, seq can be present "
+ "at the same time";
+ }
+
+ // Check that at least one of auto, independent, or seq is present
+ // for the device-independent default clauses.
+ auto hasDeviceNone = [](mlir::acc::DeviceTypeAttr attr) -> bool {
+ return attr.getValue() == mlir::acc::DeviceType::None;
+ };
+ bool hasDefaultSeq =
+ getSeqAttr()
+ ? llvm::any_of(getSeqAttr().getAsRange<mlir::acc::DeviceTypeAttr>(),
+ hasDeviceNone)
+ : false;
+ bool hasDefaultIndependent =
+ getIndependentAttr()
+ ? llvm::any_of(
+ getIndependentAttr().getAsRange<mlir::acc::DeviceTypeAttr>(),
+ hasDeviceNone)
+ : false;
+ bool hasDefaultAuto =
+ getAuto_Attr()
+ ? llvm::any_of(getAuto_Attr().getAsRange<mlir::acc::DeviceTypeAttr>(),
+ hasDeviceNone)
+ : false;
+ if (!hasDefaultSeq && !hasDefaultIndependent && !hasDefaultAuto) {
+ return emitError()
+ << "at least one of auto, independent, seq must be present";
}
// Gang, worker and vector are incompatible with seq.
@@ -2483,7 +2507,7 @@ LogicalResult acc::LoopOp::verify() {
getGangValue(mlir::acc::GangArgType::Static,
deviceTypeAttr.getValue()))
return emitError()
- << "gang, worker or vector cannot appear with the seq attr";
+ << "gang, worker or vector cannot appear with seq";
}
}
diff --git a/mlir/test/Dialect/OpenACC/canonicalize.mlir b/mlir/test/Dialect/OpenACC/canonicalize.mlir
index e43a27f6b9e89..fdc8e6b5cae6e 100644
--- a/mlir/test/Dialect/OpenACC/canonicalize.mlir
+++ b/mlir/test/Dialect/OpenACC/canonicalize.mlir
@@ -116,10 +116,10 @@ func.func @testhostdataop(%a: memref<f32>, %ifCond: i1) -> () {
acc.host_data dataOperands(%0 : memref<f32>) if(%false) {
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
acc.yield
- } attributes { inclusiveUpperbound = array<i1: true> }
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
acc.yield
- } attributes { inclusiveUpperbound = array<i1: true> }
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.terminator
}
return
diff --git a/mlir/test/Dialect/OpenACC/invalid.mlir b/mlir/test/Dialect/OpenACC/invalid.mlir
index aadf189273212..8f6e961a06163 100644
--- a/mlir/test/Dialect/OpenACC/invalid.mlir
+++ b/mlir/test/Dialect/OpenACC/invalid.mlir
@@ -2,7 +2,7 @@
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{gang, worker or vector cannot appear with the seq attr}}
+// expected-error@+1 {{gang, worker or vector cannot appear with seq}}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
@@ -12,7 +12,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{gang, worker or vector cannot appear with the seq attr}}
+// expected-error@+1 {{gang, worker or vector cannot appear with seq}}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
@@ -22,7 +22,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{gang, worker or vector cannot appear with the seq attr}}
+// expected-error@+1 {{gang, worker or vector cannot appear with seq}}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
@@ -32,7 +32,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{gang, worker or vector cannot appear with the seq attr}}
+// expected-error@+1 {{gang, worker or vector cannot appear with seq}}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
@@ -42,7 +42,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{gang, worker or vector cannot appear with the seq attr}}
+// expected-error@+1 {{gang, worker or vector cannot appear with seq}}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
@@ -52,7 +52,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{gang, worker or vector cannot appear with the seq attr}}
+// expected-error@+1 {{gang, worker or vector cannot appear with seq}}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
@@ -62,7 +62,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{gang, worker or vector cannot appear with the seq attr}}
+// expected-error@+1 {{gang, worker or vector cannot appear with seq}}
acc.loop {
"test.openacc_dummy_op"() : () -> ()
acc.yield
@@ -72,7 +72,7 @@ acc.loop {
// expected-error@+1 {{expected non-empty body.}}
acc.loop {
-}
+} attributes {independent = [#acc.device_type<none>]}
// -----
@@ -99,7 +99,7 @@ acc.loop {
%1 = arith.constant 1 : i32
%2 = arith.constant 10 : i32
-// expected-error@+1 {{only one of "auto", "independent", "seq" can be present at the same time}}
+// expected-error@+1 {{only one of auto, independent, seq can be present at the same time}}
acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
acc.yield
} attributes {auto_ = [#acc.device_type<none>], seq = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
@@ -168,7 +168,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32){
// expected-error@+1 {{'acc.init' op cannot be nested in a compute operation}}
acc.init
acc.yield
-} attributes {inclusiveUpperbound = array<i1: true>}
+} attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
// -----
@@ -186,7 +186,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
// expected-error@+1 {{'acc.shutdown' op cannot be nested in a compute operation}}
acc.shutdown
acc.yield
-} attributes {inclusiveUpperbound = array<i1: true>}
+} attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
// -----
@@ -198,7 +198,7 @@ acc.loop control(%iv : i32) = (%1 : i32) to (%2 : i32) step (%1 : i32) {
acc.shutdown
}) : () -> ()
acc.yield
-} attributes {inclusiveUpperbound = array<i1: true>}
+} attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
// -----
@@ -797,7 +797,7 @@ func.func @acc_loop_container() {
scf.yield
}
acc.yield
- } attributes { collapse = [2], collapseDeviceType = [#acc.device_type<none>] }
+ } attributes { collapse = [2], collapseDeviceType = [#acc.device_type<none>], independent = [#acc.device_type<none>]}
return
}
@@ -816,6 +816,6 @@ func.func @acc_loop_container() {
scf.yield
}
acc.yield
- } attributes { collapse = [3], collapseDeviceType = [#acc.device_type<none>] }
+ } attributes { collapse = [3], collapseDeviceType = [#acc.device_type<none>], independent = [#acc.device_type<none>]}
return
}
diff --git a/mlir/test/Dialect/OpenACC/legalize-data.mlir b/mlir/test/Dialect/OpenACC/legalize-data.mlir
index 28ef6761a6ef4..40604dcc736de 100644
--- a/mlir/test/Dialect/OpenACC/legalize-data.mlir
+++ b/mlir/test/Dialect/OpenACC/legalize-data.mlir
@@ -96,7 +96,7 @@ func.func @test(%a: memref<10xf32>) {
acc.loop control(%i : index) = (%lb : index) to (%c10 : index) step (%st : index) {
%ci = memref.load %a[%i] : memref<10xf32>
acc.yield
- }
+ } attributes {independent = [#acc.device_type<none>]}
acc.yield
}
return
@@ -109,7 +109,7 @@ func.func @test(%a: memref<10xf32>) {
// CHECK: acc.loop control(%[[I:.*]] : index) = (%{{.*}} : index) to (%{{.*}} : index) step (%{{.*}} : index) {
// DEVICE: %{{.*}} = memref.load %[[CREATE:.*]][%[[I]]] : memref<10xf32>
// CHECK: acc.yield
-// CHECK: }
+// CHECK: } attributes {independent = [#acc.device_type<none>]}
// CHECK: acc.yield
// CHECK: }
@@ -134,7 +134,7 @@ func.func @test(%a: memref<10xf32>) {
acc.loop control(%i : index) = (%lb : index) to (%c10 : index) step (%st : index) {
%ci = memref.load %a[%i] : memref<10xf32>
acc.yield
- }
+ } attributes {independent = [#acc.device_type<none>]}
acc.yield
}
return
@@ -147,7 +147,7 @@ func.func @test(%a: memref<10xf32>) {
// CHECK: acc.loop control(%[[I:.*]] : index) = (%{{.*}} : index) to (%{{.*}} : index) step (%{{.*}} : index) {
// DEVICE: %{{.*}} = memref.load %[[PRIVATE:.*]][%[[I]]] : memref<10xf32>
// CHECK: acc.yield
-// CHECK: }
+// CHECK: } attributes {independent = [#acc.device_type<none>]}
// CHECK: acc.yield
// CHECK: }
@@ -172,7 +172,7 @@ func.func @test(%a: memref<10xf32>) {
acc.loop private(@privatization_memref_10_f32 -> %p1 : memref<10xf32>) control(%i : index) = (%lb : index) to (%c10 : index) step (%st : index) {
%ci = memref.load %a[%i] : memref<10xf32>
acc.yield
- }
+ } attributes {independent = [#acc.device_type<none>]}
acc.yield
}
return
@@ -185,7 +185,7 @@ func.func @test(%a: memref<10xf32>) {
// CHECK: acc.loop private(@privatization_memref_10_f32 -> %[[PRIVATE]] : memref<10xf32>) control(%[[I:.*]] : index) = (%{{.*}} : index) to (%{{.*}} : index) step (%{{.*}} : index) {
// DEVICE: %{{.*}} = memref.load %[[PRIVATE:.*]][%[[I]]] : memref<10xf32>
// CHECK: acc.yield
-// CHECK: }
+// CHECK: } attributes {independent = [#acc.device_type<none>]}
// CHECK: acc.yield
// CHECK: }
@@ -210,7 +210,7 @@ func.func @test(%a: memref<10xf32>) {
acc.loop control(%i : index) = (%lb : index) to (%c10 : index) step (%st : index) {
%ci = memref.load %a[%i] : memref<10xf32>
acc.yield
- }
+ } attributes {seq = [#acc.device_type<none>]}
acc.yield
}
return
@@ -223,7 +223,7 @@ func.func @test(%a: memref<10xf32>) {
// CHECK: acc.loop control(%[[I:.*]] : index) = (%{{.*}} : index) to (%{{.*}} : index) step (%{{.*}} : index) {
// DEVICE: %{{.*}} = memref.load %[[PRIVATE:.*]][%[[I]]] : memref<10xf32>
// CHECK: acc.yield
-// CHECK: }
+// CHECK: } attributes {seq = [#acc.device_type<none>]}
// CHECK: acc.yield
// CHECK: }
diff --git a/mlir/test/Dialect/OpenACC/ops.mlir b/mlir/test/Dialect/OpenACC/ops.mlir
index 550f295f074a2..97278f869534b 100644
--- a/mlir/test/Dialect/OpenACC/ops.mlir
+++ b/mlir/test/Dialect/OpenACC/ops.mlir
@@ -19,7 +19,7 @@ func.func @compute1(%A: memref<10x10xf32>, %B: memref<10x10xf32>, %C: memref<10x
%co = arith.addf %cij, %p : f32
memref.store %co, %C[%arg3, %arg4] : memref<10x10xf32>
acc.yield
- } attributes { collapse = [3], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true, true>}
+ } attributes { collapse = [3], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true, true>, independent = [#acc.device_type<none>]}
acc.yield
}
@@ -40,7 +40,7 @@ func.func @compute1(%A: memref<10x10xf32>, %B: memref<10x10xf32>, %C: memref<10x
// CHECK-NEXT: %{{.*}} = arith.addf %{{.*}}, %{{.*}} : f32
// CHECK-NEXT: memref.store %{{.*}}, %{{.*}}[%{{.*}}, %{{.*}}] : memref<10x10xf32>
// CHECK-NEXT: acc.yield
-// CHECK-NEXT: } attributes {collapse = [3], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true, true>}
+// CHECK-NEXT: } attributes {collapse = [3], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true, true>, independent = [#acc.device_type<none>]}
// CHECK-NEXT: acc.yield
// CHECK-NEXT: }
// CHECK-NEXT: return %{{.*}} : memref<10x10xf32>
@@ -129,7 +129,7 @@ func.func @compute3(%a: memref<10x10xf32>, %b: memref<10x10xf32>, %c: memref<10x
%tmp = arith.addf %axy, %bxy : f32
memref.store %tmp, %c[%y] : memref<10xf32>
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop control(%i : index) = (%lb : index) to (%c10 : index) step (%st : index) {
// for i = 0 to 10 step 1
@@ -139,9 +139,9 @@ func.func @compute3(%a: memref<10x10xf32>, %b: memref<10x10xf32>, %c: memref<10x
%z = arith.addf %ci, %dx : f32
memref.store %z, %d[%x] : memref<10xf32>
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>, seq = [#acc.device_type<nvidia>]}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>], seq = [#acc.device_type<nvidia>]}
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.yield
}
acc.terminator
@@ -166,16 +166,16 @@ func.func @compute3(%a: memref<10x10xf32>, %b: memref<10x10xf32>, %c: memref<10x
// CHECK-NEXT: %{{.*}} = arith.addf %{{.*}}, %{{.*}} : f32
// CHECK-NEXT: memref.store %{{.*}}, %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: acc.yield
-// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
// CHECK-NEXT: acc.loop control(%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) {
// CHECK-NEXT: %{{.*}} = memref.load %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: %{{.*}} = memref.load %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: %{{.*}} = arith.addf %{{.*}}, %{{.*}} : f32
// CHECK-NEXT: memref.store %{{.*}}, %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: acc.yield
-// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, seq = [#acc.device_type<nvidia>]}
+// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>], seq = [#acc.device_type<nvidia>]}
// CHECK-NEXT: acc.yield
-// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
// CHECK-NEXT: acc.yield
// CHECK-NEXT: }
// CHECK-NEXT: acc.terminator
@@ -196,72 +196,72 @@ func.func @testloopop(%a : memref<10xf32>) -> () {
acc.loop gang vector worker control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i64Value: i64}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({static=%i64Value: i64}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop worker(%i64Value: i64) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop worker(%i32Value: i32) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop worker(%idxValue: index) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop vector(%i64Value: i64) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop vector(%i32Value: i32) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop vector(%idxValue: index) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i64Value: i64}) worker vector control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i64Value: i64, static=%i64Value: i64}) worker(%i64Value: i64) vector(%i64Value: i64) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i32Value: i32, static=%idxValue: index}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop tile({%i64Value : i64, %i64Value : i64}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop tile({%i32Value : i32, %i32Value : i32}) control(%iv : index) = (%...
[truncated]
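The new check in `acc::LoopOp::verify()` can be modeled without MLIR as follows. This is a simplified sketch, not the patch's code: `DeviceType`, `LoopAttrs`, `hasDefault`, and `verifyParMode` are stand-ins invented for illustration, and only the "at least one default" rule is modeled (the duplicate-device-type and gang/worker/vector checks are omitted). The diagnostic string is the one added by the patch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Stand-in for mlir::acc::DeviceType; only the values that appear in the
// tests above are listed.
enum class DeviceType { None, Nvidia };

// Simplified model of the three acc.loop attributes: each list holds the
// device types for which that attribute is set.
struct LoopAttrs {
  std::vector<DeviceType> autoTypes;
  std::vector<DeviceType> independentTypes;
  std::vector<DeviceType> seqTypes;
};

// True if the attribute is set for the device-independent default.
static bool hasDefault(const std::vector<DeviceType> &types) {
  return std::any_of(types.begin(), types.end(),
                     [](DeviceType t) { return t == DeviceType::None; });
}

// Returns an empty string on success, otherwise the diagnostic text.
std::string verifyParMode(const LoopAttrs &attrs) {
  if (!hasDefault(attrs.seqTypes) && !hasDefault(attrs.independentTypes) &&
      !hasDefault(attrs.autoTypes))
    return "at least one of auto, independent, seq must be present";
  return "";
}
```

Under this model, a loop with `independent = [#acc.device_type<none>]` and `seq = [#acc.device_type<nvidia>]` passes, as in the updated `ops.mlir` test, while a loop carrying only `seq = [#acc.device_type<nvidia>]` would be rejected because no mode is given for the default device type.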
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop control(%i : index) = (%lb : index) to (%c10 : index) step (%st : index) {
// for i = 0 to 10 step 1
@@ -139,9 +139,9 @@ func.func @compute3(%a: memref<10x10xf32>, %b: memref<10x10xf32>, %c: memref<10x
%z = arith.addf %ci, %dx : f32
memref.store %z, %d[%x] : memref<10xf32>
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>, seq = [#acc.device_type<nvidia>]}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>], seq = [#acc.device_type<nvidia>]}
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.yield
}
acc.terminator
@@ -166,16 +166,16 @@ func.func @compute3(%a: memref<10x10xf32>, %b: memref<10x10xf32>, %c: memref<10x
// CHECK-NEXT: %{{.*}} = arith.addf %{{.*}}, %{{.*}} : f32
// CHECK-NEXT: memref.store %{{.*}}, %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: acc.yield
-// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
// CHECK-NEXT: acc.loop control(%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) {
// CHECK-NEXT: %{{.*}} = memref.load %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: %{{.*}} = memref.load %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: %{{.*}} = arith.addf %{{.*}}, %{{.*}} : f32
// CHECK-NEXT: memref.store %{{.*}}, %{{.*}}[%{{.*}}] : memref<10xf32>
// CHECK-NEXT: acc.yield
-// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, seq = [#acc.device_type<nvidia>]}
+// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>], seq = [#acc.device_type<nvidia>]}
// CHECK-NEXT: acc.yield
-// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+// CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
// CHECK-NEXT: acc.yield
// CHECK-NEXT: }
// CHECK-NEXT: acc.terminator
@@ -196,72 +196,72 @@ func.func @testloopop(%a : memref<10xf32>) -> () {
acc.loop gang vector worker control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i64Value: i64}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({static=%i64Value: i64}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop worker(%i64Value: i64) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop worker(%i32Value: i32) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop worker(%idxValue: index) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop vector(%i64Value: i64) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop vector(%i32Value: i32) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop vector(%idxValue: index) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i64Value: i64}) worker vector control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i64Value: i64, static=%i64Value: i64}) worker(%i64Value: i64) vector(%i64Value: i64) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop gang({num=%i32Value: i32, static=%idxValue: index}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop tile({%i64Value : i64, %i64Value : i64}) control(%iv : index) = (%c0 : index) to (%c10 : index) step (%c1 : index) {
"test.openacc_dummy_op"() : () -> ()
acc.yield
- } attributes {inclusiveUpperbound = array<i1: true>}
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
acc.loop tile({%i32Value : i32, %i32Value : i32}) control(%iv : index) = (%...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
So it appears that I just need to make sure that 'loop' ALWAYS has a default auto/seq/independent, right?
If you could hold off on merging this patch (which otherwise looks fine) until I have a patch ready so that Clang doesn't break, I'd appreciate it.
PR llvm#143720 adds a requirement to the ACC dialect that every acc.loop must have a seq, independent, or auto attribute for the 'default' device_type. The standard has rules for how this can be intuited:
- orphan/parallel/parallel loop: independent
- kernels/kernels loop: auto
- serial/serial loop: seq, unless there is a gang/worker/vector, at which point it should be 'auto'.

This patch implements all of these rules as a 'cleanup' step on the IR generation for combined/loop operations. Note that the test impact is much less since I inadvertently have my 'operation' terminating curly brace matching the end curly brace from 'attribute' instead of the front of the line, so I've added sufficient tests to ensure I captured the above.
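The defaulting rules in that commit message can be sketched as a small lookup. This is only an illustration under stated assumptions: `ParentConstruct`, `LoopMode`, and `defaultLoopMode` are hypothetical names, not Clang or MLIR APIs, and a combined construct such as `parallel loop` is assumed to behave like a loop nested in the corresponding compute construct.

```cpp
// Hypothetical types for illustration; these are not Clang or MLIR APIs.
enum class ParentConstruct { Orphan, Parallel, Kernels, Serial };
enum class LoopMode { Seq, Independent, Auto };

// Picks the mode for a loop that carries no explicit seq, independent, or
// auto clause. Combined constructs (e.g. 'parallel loop') are assumed to map
// to the same entry as a loop inside the matching compute construct.
constexpr LoopMode defaultLoopMode(ParentConstruct parent,
                                   bool hasGangWorkerOrVector) {
  switch (parent) {
  case ParentConstruct::Orphan:
  case ParentConstruct::Parallel:
    return LoopMode::Independent;
  case ParentConstruct::Kernels:
    return LoopMode::Auto;
  case ParentConstruct::Serial:
    // gang/worker/vector on a loop under 'serial' switches the default to auto.
    return hasGangWorkerOrVector ? LoopMode::Auto : LoopMode::Seq;
  }
  return LoopMode::Seq; // not reached for valid enumerators
}
```

A frontend following this approach would apply such a function only when the user wrote none of the three clauses, then record the result for the default device_type so the `acc.loop` verifier accepts the operation.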
LGTM
#143751 did the Clang changes, so I'm fine with merging this whenever you'd like. Thanks for the heads up!