-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[flang][AddAliasTags] Generalise tbaa tags beyond functions #92379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
TL;DR without using the undocumented option this should be NFC. Even with the option, it is probably still NFC in practice. This patch is intended to upstream some experiment code that might be more broadly useful. This changes the per-function TBAA trees to be per-(top-level)-symbol. This allows running the pass on other top level operations e.g. openmp reductions. Symbol names should be unique so the trees should not collide (without multiple inlining like in llvm#89829). The current alias code can't generate useful alias tags for reductions (it just sees the reduction variables come from a block argument and correctly concludes it doesn't know anything about them). But in the future this could be useful to allow array reductions to vectorize. For this reason, I don't think the current code will generate alias tags for existing top level operations other than functions, but I would prefer to leave the option off until there is a real use case just to be safe. I have tested with the option enabled and do not know of any misscompilations.
@llvm/pr-subscribers-flang-fir-hlfir Author: Tom Eccles (tblah) ChangesTL;DR without using the undocumented option this should be NFC. Even with the option, it is probably still NFC in practice. This patch is intended to upstream some experiment code that might be more broadly useful. This changes the per-function TBAA trees to be per-(top-level)-symbol. This allows running the pass on other top level operations e.g. openmp reductions. Symbol names should be unique so the trees should not collide (without multiple inlining like in #89829). The current alias code can't generate useful alias tags for reductions (it just sees the reduction variables come from a block argument and correctly concludes it doesn't know anything about them). But in the future this could be useful to allow array reductions to vectorize. For this reason, I don't think the current code will generate alias tags for existing top level operations other than functions, but I would prefer to leave the option off until there is a real use case just to be safe. I have tested with the option enabled and do not know of any misscompilations. Full diff: https://github.com/llvm/llvm-project/pull/92379.diff 3 Files Affected:
diff --git a/flang/include/flang/Optimizer/Analysis/TBAAForest.h b/flang/include/flang/Optimizer/Analysis/TBAAForest.h
index 619ed4939c51c..f82a3fce3f9e9 100644
--- a/flang/include/flang/Optimizer/Analysis/TBAAForest.h
+++ b/flang/include/flang/Optimizer/Analysis/TBAAForest.h
@@ -73,15 +73,15 @@ struct TBAATree {
//===----------------------------------------------------------------------===//
// TBAAForrest
//===----------------------------------------------------------------------===//
-/// Collection of TBAATrees, usually indexed by function (so that each function
-/// has a different TBAATree)
+/// Collection of TBAATrees, usually indexed by symbol (so that each
+/// function-like container operation has a different TBAATree)
class TBAAForrest {
public:
explicit TBAAForrest(bool separatePerFunction = true)
: separatePerFunction{separatePerFunction} {}
- inline const TBAATree &operator[](mlir::func::FuncOp func) {
- return getFuncTree(func.getSymNameAttr());
+ inline const TBAATree &operator[](mlir::SymbolOpInterface sym) {
+ return getFuncTree(sym.getNameAttr());
}
inline const TBAATree &operator[](mlir::LLVM::LLVMFuncOp func) {
// the external name conversion pass may rename some functions. Their old
diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
index 3642a812096db..643218583b0d9 100644
--- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
+++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
@@ -49,6 +49,13 @@ static llvm::cl::opt<bool> enableLocalAllocs(
"local-alloc-tbaa", llvm::cl::init(false), llvm::cl::Hidden,
llvm::cl::desc("Add TBAA tags to local allocations. UNSAFE."));
+// Run on function-like top-level operations (e.g. omp.declare_reduction).
+// This needs more testing before enabling by default:
+static llvm::cl::opt<bool> runOnNonFunctions(
+ "nonfunc-tbaa", llvm::cl::init(false), llvm::cl::Hidden,
+ llvm::cl::desc("Add TBAA tags to operations insode of container operations "
+ "other than func.func"));
+
namespace {
/// Shared state per-module
@@ -61,9 +68,9 @@ class PassState {
return analysisCache[value];
}
- /// get the per-function TBAATree for this function
- inline const fir::TBAATree &getFuncTree(mlir::func::FuncOp func) {
- return forrest[func];
+ /// get the per-function TBAATree for this function-like op
+ inline const fir::TBAATree &getFuncTree(mlir::SymbolOpInterface symbol) {
+ return forrest[symbol];
}
private:
@@ -113,10 +120,22 @@ static std::string getFuncArgName(mlir::Value arg) {
return attr.str();
}
+static mlir::SymbolOpInterface getTopLevelSymbolParent(mlir::Operation *op) {
+ mlir::SymbolOpInterface parent =
+ op->getParentOfType<mlir::SymbolOpInterface>();
+ mlir::SymbolOpInterface oldParent = parent;
+ while (parent) {
+ parent = getTopLevelSymbolParent(parent);
+ if (parent)
+ oldParent = parent;
+ }
+ return oldParent;
+}
+
void AddAliasTagsPass::runOnAliasInterface(fir::FirAliasTagOpInterface op,
PassState &state) {
- mlir::func::FuncOp func = op->getParentOfType<mlir::func::FuncOp>();
- if (!func)
+ mlir::SymbolOpInterface parent = getTopLevelSymbolParent(op);
+ if (!runOnNonFunctions && !mlir::isa<mlir::func::FuncOp>(parent))
return;
llvm::SmallVector<mlir::Value> accessedOperands = op.getAccessedOperands();
@@ -147,7 +166,7 @@ void AddAliasTagsPass::runOnAliasInterface(fir::FirAliasTagOpInterface op,
<< "Found reference to dummy argument at " << *op << "\n");
std::string name = getFuncArgName(source.u.get<mlir::Value>());
if (!name.empty())
- tag = state.getFuncTree(func).dummyArgDataTree.getTag(name);
+ tag = state.getFuncTree(parent).dummyArgDataTree.getTag(name);
else
LLVM_DEBUG(llvm::dbgs().indent(2)
<< "WARN: couldn't find a name for dummy argument " << *op
@@ -160,7 +179,7 @@ void AddAliasTagsPass::runOnAliasInterface(fir::FirAliasTagOpInterface op,
const char *name = glbl.getRootReference().data();
LLVM_DEBUG(llvm::dbgs().indent(2) << "Found reference to global " << name
<< " at " << *op << "\n");
- tag = state.getFuncTree(func).globalDataTree.getTag(name);
+ tag = state.getFuncTree(parent).globalDataTree.getTag(name);
// TBAA for SourceKind::Direct
} else if (enableDirect &&
@@ -170,7 +189,7 @@ void AddAliasTagsPass::runOnAliasInterface(fir::FirAliasTagOpInterface op,
const char *name = glbl.getRootReference().data();
LLVM_DEBUG(llvm::dbgs().indent(2) << "Found reference to direct " << name
<< " at " << *op << "\n");
- tag = state.getFuncTree(func).directDataTree.getTag(name);
+ tag = state.getFuncTree(parent).directDataTree.getTag(name);
} else {
// SourceKind::Direct is likely to be extended to cases which are not a
// SymbolRefAttr in the future
@@ -190,7 +209,7 @@ void AddAliasTagsPass::runOnAliasInterface(fir::FirAliasTagOpInterface op,
if (name) {
LLVM_DEBUG(llvm::dbgs().indent(2) << "Found reference to allocation "
<< name << " at " << *op << "\n");
- tag = state.getFuncTree(func).allocatedDataTree.getTag(*name);
+ tag = state.getFuncTree(parent).allocatedDataTree.getTag(*name);
} else {
LLVM_DEBUG(llvm::dbgs().indent(2)
<< "WARN: couldn't find a name for allocation " << *op
diff --git a/flang/test/Transforms/tbaa3.fir b/flang/test/Transforms/tbaa3.fir
new file mode 100644
index 0000000000000..18121edfe1a73
--- /dev/null
+++ b/flang/test/Transforms/tbaa3.fir
@@ -0,0 +1,72 @@
+// Test experimental option to add tbaa trees for things that aren't functions
+
+// RUN: fir-opt --nonfunc-tbaa --fir-add-alias-tags %s | FileCheck %s
+
+
+// CHECK: #[[TBAA_ROOT:.*]] = #llvm.tbaa_root<id = "Flang function root add_reduction_byref_box_3xi32">
+// CHECK: #[[ANY_ACCESS:.*]] = #llvm.tbaa_type_desc<id = "any access", members = {<#[[TBAA_ROOT]], 0>}>
+// CHECK: #[[ANY_DATA:.*]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[ANY_ACCESS]], 0>}>
+// CHECK: #[[DIRECT_DATA:.*]] = #llvm.tbaa_type_desc<id = "direct data", members = {<#[[ANY_DATA]], 0>}>
+// CHECK: #[[GLOBAL_A:.*]] = #llvm.tbaa_type_desc<id = "direct data/_QMmodEa", members = {<#[[DIRECT_DATA]], 0>}>
+// CHECK: #[[GLOBAL_A_TAG:.*]] = #llvm.tbaa_tag<base_type = #[[GLOBAL_A]], access_type = #[[GLOBAL_A]], offset = 0>
+
+fir.global @_QMmodEa : !fir.box<!fir.heap<!fir.array<?xi32>>> {
+ %c0 = arith.constant 0 : index
+ %0 = fir.zero_bits !fir.heap<!fir.array<?xi32>>
+ %1 = fir.shape %c0 : (index) -> !fir.shape<1>
+ %2 = fir.embox %0(%1) : (!fir.heap<!fir.array<?xi32>>, !fir.shape<1>) -> !fir.box<!fir.heap<!fir.array<?xi32>>>
+ fir.has_value %2 : !fir.box<!fir.heap<!fir.array<?xi32>>>
+}
+
+// nonsense reduction to give us something that we can generate TBAA tags for
+omp.declare_reduction @add_reduction_byref_box_3xi32 : !fir.ref<!fir.box<!fir.array<3xi32>>> init {
+^bb0(%arg0: !fir.ref<!fir.box<!fir.array<3xi32>>>):
+ %c0_i32 = arith.constant 0 : i32
+ %0 = fir.load %arg0 : !fir.ref<!fir.box<!fir.array<3xi32>>>
+ %1 = fir.address_of(@_QMmodEa) : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
+ %c3 = arith.constant 3 : index
+ %2 = fir.shape %c3 : (index) -> !fir.shape<1>
+ %3 = fir.allocmem !fir.array<3xi32> {bindc_name = ".tmp", uniq_name = ""}
+ %true = arith.constant true
+ %4:2 = hlfir.declare %3(%2) {uniq_name = ".tmp"} : (!fir.heap<!fir.array<3xi32>>, !fir.shape<1>) -> (!fir.heap<!fir.array<3xi32>>, !fir.heap<!fir.array<3xi32>>)
+ %c0 = arith.constant 0 : index
+ %5:3 = fir.box_dims %0, %c0 : (!fir.box<!fir.array<3xi32>>, index) -> (index, index, index)
+ %6 = fir.shape_shift %5#0, %5#1 : (index, index) -> !fir.shapeshift<1>
+ %7 = fir.embox %4#0(%6) : (!fir.heap<!fir.array<3xi32>>, !fir.shapeshift<1>) -> !fir.box<!fir.array<3xi32>>
+ hlfir.assign %c0_i32 to %7 : i32, !fir.box<!fir.array<3xi32>>
+ %8 = fir.convert %7 : (!fir.box<!fir.array<3xi32>>) -> !fir.box<!fir.heap<!fir.array<?xi32>>>
+ fir.store %8 to %1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
+ %9 = fir.convert %1 : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.array<3xi32>>>
+ omp.yield(%9 : !fir.ref<!fir.box<!fir.array<3xi32>>>)
+} combiner {
+^bb0(%arg0: !fir.ref<!fir.box<!fir.array<3xi32>>>, %arg1: !fir.ref<!fir.box<!fir.array<3xi32>>>):
+ %0 = fir.load %arg0 : !fir.ref<!fir.box<!fir.array<3xi32>>>
+ %1 = fir.address_of(@_QMmodEa) : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
+ %a = fir.load %1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
+ %c0 = arith.constant 0 : index
+ %2:3 = fir.box_dims %0, %c0 : (!fir.box<!fir.array<3xi32>>, index) -> (index, index, index)
+ %3 = fir.shape_shift %2#0, %2#1 : (index, index) -> !fir.shapeshift<1>
+ %c1 = arith.constant 1 : index
+ fir.do_loop %arg2 = %c1 to %2#1 step %c1 unordered {
+ %4 = fir.array_coor %0(%3) %arg2 : (!fir.box<!fir.array<3xi32>>, !fir.shapeshift<1>, index) -> !fir.ref<i32>
+ %5 = fir.array_coor %a(%3) %arg2 : (!fir.box<!fir.heap<!fir.array<?xi32>>>, !fir.shapeshift<1>, index) -> !fir.ref<i32>
+ %6 = fir.load %4 : !fir.ref<i32>
+ %7 = fir.load %5 : !fir.ref<i32>
+// CHECK: fir.load %{{.*}} {tbaa = [#[[GLOBAL_A_TAG]]]} : !fir.ref<i32>
+ %8 = arith.addi %6, %7 : i32
+ fir.store %8 to %4 : !fir.ref<i32>
+ }
+ omp.yield(%arg0 : !fir.ref<!fir.box<!fir.array<3xi32>>>)
+} cleanup {
+^bb0(%arg0: !fir.ref<!fir.box<!fir.array<3xi32>>>):
+ %0 = fir.load %arg0 : !fir.ref<!fir.box<!fir.array<3xi32>>>
+ %1 = fir.box_addr %0 : (!fir.box<!fir.array<3xi32>>) -> !fir.ref<!fir.array<3xi32>>
+ %2 = fir.convert %1 : (!fir.ref<!fir.array<3xi32>>) -> i64
+ %c0_i64 = arith.constant 0 : i64
+ %3 = arith.cmpi ne, %2, %c0_i64 : i64
+ fir.if %3 {
+ %4 = fir.convert %1 : (!fir.ref<!fir.array<3xi32>>) -> !fir.heap<!fir.array<3xi32>>
+ fir.freemem %4 : !fir.heap<!fir.array<3xi32>>
+ }
+ omp.yield
+}
|
Ping for review |
Hi Tom, it looks okay to me, but I do not really see what useful TBAA information we can generate for standalone isolated-from-above OpenMP operations. Can you please give some examples of how this may evolve and become useful? It seems to me that we can only attach some useful TBAA tags after such operations are inlined into their users' contexts, but then this change does not really apply to this case. |
Hi, thanks for taking a look. Here is an example reduction declaration for a fixed size array:
It would be great if that do-loop could be vectorized for large arrays. Alias analysis could help with this. You are right that one approach would be to do all of the inlining in MLIR and add TBAA tags afterwards. This is not how OpenMP currently works. There was a desire to share as much OpenMP codegen with clang as possible so these are only inlined (and outlined) when the MLIR LLVM+OpenMP dialects are transformed into LLVM IR. Given this design, I would prefer to adapt TBAA tags to understand openmp operations before they are inlined. In this case, the two block arguments do not alias (fir::AliasAnalysis doesn't yet know this and I have not added that to this PR). One alternative would be not to treat this like Fortran code at all and just generate correct alias information (perhaps with noalias) when generating the reduction declaration in OpenMP lowering. |
I have no immediate plans to work on what I described above (until we have a concrete usecase for this optimization). I wanted to upstream this code in case it was useful for anyone else working on the same thing, but I don't mind if you would prefer I abandon this PR until it is actually useful for something. |
Thank you for the explanation, Tom. Let me review your example in more detail today, and give you more feedback. I am on a fence about merging this right now :) |
Tom, regarding your example, while I understand the desire to provide better aliasing information for the OpenMP isolated from above operations, I do not think this is related to FIR (meaning Fortran) aliasing rules. And, in general, I do not think FIR AliasAnalysis can apply Fortran aliasing rules just based on the fact that the memory references originate from the block arguments of some isolated from above operation. Just think of a pass that, for whatever reason, outlines a set of FIR operations into a standalone function, and some live-in memrefs become block arguments of the function: at this point there is no guarantee that this function follows Fortran aliasing rules for these blocks arguments, since they do not represent dummy arguments in Fortran terms. I think it is more appropriate to rely on the Fortran rules only when it is known that a source of a memory reference comes from an instantiation of a declaration of a Fortran dummy argument, which is conveyed in via So maybe it is better to actually generate such With all that said, I would advise not merging this. |
Not really an area I am intimately familiar with, so I can't quite comment one way or the other, but I wouldn't be against gathering alias information for OpenMP operations, however, I'll leave the discussion to people more familiar with this area/work :-) |
Thank you for your thoughts Slava. I will close this pull request. |
TL;DR without using the undocumented option this should be NFC. Even with the option, it is probably still NFC in practice.
This patch is intended to upstream some experiment code that might be more broadly useful.
This changes the per-function TBAA trees to be per-(top-level)-symbol. This allows running the pass on other top level operations e.g. openmp reductions. Symbol names should be unique so the trees should not collide (without multiple inlining like in #89829).
The current alias code can't generate useful alias tags for reductions (it just sees the reduction variables come from a block argument and correctly concludes it doesn't know anything about them). But in the future this could be useful to allow array reductions to vectorize. For this reason, I don't think the current code will generate alias tags for existing top level operations other than functions, but I would prefer to leave the option off until there is a real use case just to be safe.
I have tested with the option enabled and do not know of any misscompilations.