-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Flang][OpenMP] Make boxed procedure pass aware of OpenMP private ops #118261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Your org has enabled the Graphite merge queue for merging into mainAdd the label “FP Bundles” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link. |
@llvm/pr-subscribers-flang-codegen @llvm/pr-subscribers-flang-fir-hlfir Author: Kiran Chandramohan (kiranchandramohan) ChangesFixes #109727 Full diff: https://github.com/llvm/llvm-project/pull/118261.diff 2 Files Affected:
diff --git a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp
index c536fd19fcc69a..850b5b4367a30d 100644
--- a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp
+++ b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp
@@ -16,6 +16,7 @@
#include "flang/Optimizer/Dialect/Support/FIRContext.h"
#include "flang/Optimizer/Support/FatalError.h"
#include "flang/Optimizer/Support/InternalNames.h"
+#include "mlir/Dialect/OpenMP/OpenMPDialect.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Transforms/DialectConversion.h"
@@ -374,6 +375,15 @@ class BoxedProcedurePass
op->getResult(i.index()).setType(toTy);
}
rewriter.finalizeOpModification(op);
+ } else if (auto privateOp =
+ mlir::dyn_cast<mlir::omp::PrivateClauseOp>(op)) {
+ mlir::Type dataTy = privateOp.getType();
+ if (typeConverter.needsConversion(dataTy)) {
+ auto toTy = typeConverter.convertType(dataTy);
+ rewriter.startOpModification(privateOp);
+ privateOp.setType(toTy);
+ rewriter.finalizeOpModification(privateOp);
+ }
}
// Ensure block arguments are updated if needed.
if (opIsValid && op->getNumRegions() != 0) {
diff --git a/flang/test/Fir/boxproc-openmp.fir b/flang/test/Fir/boxproc-openmp.fir
new file mode 100644
index 00000000000000..01582a13c0da4d
--- /dev/null
+++ b/flang/test/Fir/boxproc-openmp.fir
@@ -0,0 +1,86 @@
+// RUN: fir-opt --boxed-procedure %s | FileCheck %s
+// Test the boxed procedure pass with OpenMP declaration operations
+
+// Test a private declaration with one region (alloc)
+//CHECK: omp.private {type = private} @_QFsub1Et1_private_ref_rec__QFsub1Tt : !fir.ref<!fir.type<_QFsub1TtUnboxProc{p1:() -> ()}>> alloc {
+omp.private {type = private} @_QFsub1Et1_private_ref_rec__QFsub1Tt : !fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>> alloc {
+//CHECK: ^bb0(%arg0: !fir.ref<!fir.type<_QFsub1TtUnboxProc{p1:() -> ()}>>):
+^bb0(%arg0: !fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>):
+ %c1_i32 = arith.constant 1 : i32
+ %0 = fir.alloca !fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}> {bindc_name = "t1", pinned, uniq_name = "_QFsub1Et1"}
+ %1 = fir.declare %0 {uniq_name = "_QFsub1Et1"} : (!fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>) -> !fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>
+ %2 = fir.embox %1 : (!fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>) -> !fir.box<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>
+ %3 = fir.address_of(@_QQclXea6256ba131ddd9c2210e68030a0edd3) : !fir.ref<!fir.char<1,49>>
+ %4 = fir.convert %2 : (!fir.box<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>) -> !fir.box<none>
+ %5 = fir.convert %3 : (!fir.ref<!fir.char<1,49>>) -> !fir.ref<i8>
+ %6 = fir.call @_FortranAInitialize(%4, %5, %c1_i32) fastmath<contract> : (!fir.box<none>, !fir.ref<i8>, i32) -> none
+//CHECK: omp.yield(%1 : !fir.ref<!fir.type<_QFsub1TtUnboxProc{p1:() -> ()}>>)
+ omp.yield(%1 : !fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>)
+}
+func.func @_QPsub1() {
+ %0 = fir.alloca !fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}> {bindc_name = "t1", uniq_name = "_QFsub1Et1"}
+ %1 = fir.declare %0 {uniq_name = "_QFsub1Et1"} : (!fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>) -> !fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>
+//CHECK: omp.parallel private(@_QFsub1Et1_private_ref_rec__QFsub1Tt %1 -> %arg0 : !fir.ref<!fir.type<_QFsub1TtUnboxProc{p1:() -> ()}>>) {
+ omp.parallel private(@_QFsub1Et1_private_ref_rec__QFsub1Tt %1 -> %arg0 : !fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>) {
+ %2 = fir.declare %arg0 {uniq_name = "_QFsub1Et1"} : (!fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>) -> !fir.ref<!fir.type<_QFsub1Tt{p1:!fir.boxproc<() -> ()>}>>
+ omp.terminator
+ }
+ return
+}
+
+
+// Test a private declaration with all regions (alloc, copy, dealloc)
+//CHECK: omp.private {type = firstprivate} @_QFsub2Et1_firstprivate_ref_box_heap_rec__QFsub2Tt :
+//CHECK-SAME: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>> alloc {
+omp.private {type = firstprivate} @_QFsub2Et1_firstprivate_ref_box_heap_rec__QFsub2Tt : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>> alloc {
+//CHECK: ^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>>):
+^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>):
+ %0 = fir.alloca !fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>> {bindc_name = "t1", pinned, uniq_name = "_QFsub2Et1"}
+ %1 = fir.declare %0 {fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFsub2Et1"} : (!fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>
+//CHECK: omp.yield(%1 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>>)
+ omp.yield(%1 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>)
+} copy {
+//CHECK: ^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>>,
+//CHECK-SAME: %arg1: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>>):
+^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>, %arg1: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>):
+ %c5_i32 = arith.constant 5 : i32
+ %0 = fir.load %arg0 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>
+ %1 = fir.box_addr %0 : (!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>) -> !fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>
+ %2 = fir.embox %1 : (!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>) -> !fir.box<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>
+ %3 = fir.address_of(@_QQclXea) : !fir.ref<!fir.char<1,8>>
+ %4 = fir.convert %arg1 : (!fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>) -> !fir.ref<!fir.box<none>>
+ %5 = fir.convert %2 : (!fir.box<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>) -> !fir.box<none>
+ %6 = fir.convert %3 : (!fir.ref<!fir.char<1,8>>) -> !fir.ref<i8>
+ %7 = fir.call @_FortranAAssign(%4, %5, %6, %c5_i32) : (!fir.ref<!fir.box<none>>, !fir.box<none>, !fir.ref<i8>, i32) -> none
+//CHECK: omp.yield(%arg1 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>>)
+ omp.yield(%arg1 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>)
+} dealloc {
+//CHECK: ^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>>):
+^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>):
+ %c5_i32 = arith.constant 5 : i32
+ %false = arith.constant false
+ %0 = fir.absent !fir.box<none>
+ %1 = fir.address_of(@_QQclXea) : !fir.ref<!fir.char<1,8>>
+ %2 = fir.convert %arg0 : (!fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>) -> !fir.ref<!fir.box<none>>
+ %3 = fir.convert %1 : (!fir.ref<!fir.char<1,8>>) -> !fir.ref<i8>
+ %4 = fir.call @_FortranAAllocatableDeallocate(%2, %false, %0, %3, %c5_i32) fastmath<contract> : (!fir.ref<!fir.box<none>>, i1, !fir.box<none>, !fir.ref<i8>, i32) -> i32
+ omp.yield
+}
+func.func @_QPsub2() {
+ %0 = fir.alloca !fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>> {bindc_name = "t1", uniq_name = "_QFsub2Et1"}
+ %1 = fir.declare %0 {fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFsub2Et1"} : (!fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>
+//CHECK: omp.parallel private(@_QFsub2Et1_firstprivate_ref_box_heap_rec__QFsub2Tt %1 -> %arg0 :
+//CHECK-SAME: !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2TtUnboxProc{p1:() -> ()}>>>>) {
+ omp.parallel private(@_QFsub2Et1_firstprivate_ref_box_heap_rec__QFsub2Tt %1 -> %arg0 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>) {
+ %2 = fir.declare %arg0 {fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFsub2Et1"} : (!fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.type<_QFsub2Tt{p1:!fir.boxproc<() -> ()>}>>>>
+ omp.terminator
+ }
+ return
+}
+func.func private @_FortranAInitialize(!fir.box<none>, !fir.ref<i8>, i32) -> none attributes {fir.runtime}
+fir.global linkonce @_QQclXea constant : !fir.char<1,8> {
+ %0 = fir.string_lit "pp.f90\00"(8) : !fir.char<1,8>
+ fir.has_value %0 : !fir.char<1,8>
+}
+func.func private @_FortranAAllocatableDeallocate(!fir.ref<!fir.box<none>>, i1, !fir.box<none>, !fir.ref<i8>, i32) -> i32 attributes {fir.runtime}
+func.func private @_FortranAAssign(!fir.ref<!fir.box<none>>, !fir.box<none>, !fir.ref<i8>, i32) -> none attributes {fir.runtime}
|
ccca3af
to
2ff85cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kiran. I wonder if the generic fix is not to remove the "op->getDialect() == firDialect" in the case just above. I do not recall the rational of limiting it to fir ops. Most likely this was an early compiler-perf optimization that was based on the wrong assumption that such FIR type would only be returned by FIR ops, but I see no harm removing that limitation (the needsConversion
lookup cost should be pretty limited for most non FIR operation result types).
The right thing would likely to be to have a real dialect conversion pass, but these are expensive, so it may be better to keep the current approach for now (the next bag of issue I see would come from FIR type inside attributes of operations not handled here. There is some work to make dialect conversion cheaper happening in MLIR, maybe we can revisit at that point).
Thanks @jeanPerier for the quick response.
This is a good idea. The fir dialect-specific check currently converts the results. For this particular case, see the next point.
The type is stored as an attribute in the OpenMP
Some kind of generalization might still be required for the other declaration/recipe-based operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kiran for the details, then the only generic solution I see with the current pass would be to also update the TypeAttr attribute in the code that updates the result type (like what is done for SymbolRefAttr in ExternalNameConversion).
If that does not work, you can assign an issue to me and I am OK with your patch.
Thanks @jeanPerier for the reference. Just an update here. I added something like the following, which iterates through the attributes and updates if it is a typeAttr that needs conversion. This works fine for my usecase.
But I see a failure in the gfortan test. I will have a detailed look later this week.
|
Thanks for testing @kiranchandramohan, I think the issue is the Side note: the TypeDescOp design of duplicating the type (inside the result, and as attribute) is probably debatable/useless. |
Thanks for the hint. I will continue with this. |
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix looks good (small nit on the include) and I worked fine for me. Thanks for making it generic @kiranchandramohan!
@@ -16,6 +16,7 @@ | |||
#include "flang/Optimizer/Dialect/Support/FIRContext.h" | |||
#include "flang/Optimizer/Support/FatalError.h" | |||
#include "flang/Optimizer/Support/InternalNames.h" | |||
#include "mlir/Dialect/OpenMP/OpenMPDialect.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: not needed anymore
Fixes #109727