-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[flang][StackArrays] track pointers through fir.convert #121919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This does add a little computational complexity because now every freemem operation has to be tested for every allocation. This could be improved with some more memoisation but I think it is easier to read this way. Let me know if you would prefer me to change this to pre-compute the normalised addresses each freemem operation is using. Weirdly, this change resulted in a verifier failure for the fir.declare in the previous test case. Maybe it was previously removed as dead code and now it isn't. Anyway I fixed that too.
This is a follow up to #121864 (review) |
@llvm/pr-subscribers-flang-fir-hlfir Author: Tom Eccles (tblah) ChangesThis does add a little computational complexity because now every freemem operation has to be tested for every allocation. This could be improved with some more memoisation but I think it is easier to read this way. Let me know if you would prefer me to change this to pre-compute the normalised addresses each freemem operation is using. Weirdly, this change resulted in a verifier failure for the fir.declare in the previous test case. Maybe it was previously removed as dead code and now it isn't. Anyway I fixed that too. Full diff: https://github.com/llvm/llvm-project/pull/121919.diff 2 Files Affected:
diff --git a/flang/lib/Optimizer/Transforms/StackArrays.cpp b/flang/lib/Optimizer/Transforms/StackArrays.cpp
index bdcb8199b790de..2a9d3397e87b08 100644
--- a/flang/lib/Optimizer/Transforms/StackArrays.cpp
+++ b/flang/lib/Optimizer/Transforms/StackArrays.cpp
@@ -330,6 +330,18 @@ std::optional<AllocationState> LatticePoint::get(mlir::Value val) const {
return it->second;
}
+static mlir::Value lookThroughDeclaresAndConverts(mlir::Value value) {
+ while (mlir::Operation *op = value.getDefiningOp()) {
+ if (auto declareOp = llvm::dyn_cast<fir::DeclareOp>(op))
+ value = declareOp.getMemref();
+ else if (auto convertOp = llvm::dyn_cast<fir::ConvertOp>(op))
+ value = convertOp->getOperand(0);
+ else
+ return value;
+ }
+ return value;
+}
+
mlir::LogicalResult AllocationAnalysis::visitOperation(
mlir::Operation *op, const LatticePoint &before, LatticePoint *after) {
LLVM_DEBUG(llvm::dbgs() << "StackArrays: Visiting operation: " << *op
@@ -363,10 +375,10 @@ mlir::LogicalResult AllocationAnalysis::visitOperation(
mlir::Value operand = op->getOperand(0);
// Note: StackArrays is scheduled in the pass pipeline after lowering hlfir
- // to fir. Therefore, we only need to handle `fir::DeclareOp`s.
- if (auto declareOp =
- llvm::dyn_cast_if_present<fir::DeclareOp>(operand.getDefiningOp()))
- operand = declareOp.getMemref();
+ // to fir. Therefore, we only need to handle `fir::DeclareOp`s. Also look
+ // past converts in case the pointer was changed between different pointer
+ // types.
+ operand = lookThroughDeclaresAndConverts(operand);
std::optional<AllocationState> operandState = before.get(operand);
if (operandState && *operandState == AllocationState::Allocated) {
@@ -535,17 +547,12 @@ AllocMemConversion::matchAndRewrite(fir::AllocMemOp allocmem,
// remove freemem operations
llvm::SmallVector<mlir::Operation *> erases;
- for (mlir::Operation *user : allocmem.getOperation()->getUsers()) {
- if (auto declareOp = mlir::dyn_cast_if_present<fir::DeclareOp>(user)) {
- for (mlir::Operation *user : declareOp->getUsers()) {
- if (mlir::isa<fir::FreeMemOp>(user))
- erases.push_back(user);
- }
- }
-
- if (mlir::isa<fir::FreeMemOp>(user))
- erases.push_back(user);
- }
+ mlir::Operation *parent = allocmem->getParentOp();
+ // TODO: this shouldn't need to be re-calculated for every allocmem
+ parent->walk([&](fir::FreeMemOp freeOp) {
+ if (lookThroughDeclaresAndConverts(freeOp->getOperand(0)) == allocmem)
+ erases.push_back(freeOp);
+ });
// now we are done iterating the users, it is safe to mutate them
for (mlir::Operation *erase : erases)
diff --git a/flang/test/Transforms/stack-arrays.fir b/flang/test/Transforms/stack-arrays.fir
index 66cd2a5aa910b9..444136d53e0340 100644
--- a/flang/test/Transforms/stack-arrays.fir
+++ b/flang/test/Transforms/stack-arrays.fir
@@ -379,7 +379,8 @@ func.func @placement_loop_declare() {
%3 = arith.addi %c1, %c2 : index
// operand is now available
%4 = fir.allocmem !fir.array<?xi32>, %3
- %5 = fir.declare %4 {uniq_name = "temp"} : (!fir.heap<!fir.array<?xi32>>) -> !fir.heap<!fir.array<?xi32>>
+ %shape = fir.shape %3 : (index) -> !fir.shape<1>
+ %5 = fir.declare %4(%shape) {uniq_name = "temp"} : (!fir.heap<!fir.array<?xi32>>, !fir.shape<1>) -> !fir.heap<!fir.array<?xi32>>
// ...
fir.freemem %5 : !fir.heap<!fir.array<?xi32>>
fir.result %3, %c1_i32 : index, i32
@@ -400,3 +401,20 @@ func.func @placement_loop_declare() {
// CHECK-NEXT: }
// CHECK-NEXT: return
// CHECK-NEXT: }
+
+// Can we look through fir.convert and fir.declare?
+func.func @lookthrough() {
+ %0 = fir.allocmem !fir.array<42xi32>
+ %c42 = arith.constant 42 : index
+ %shape = fir.shape %c42 : (index) -> !fir.shape<1>
+ %1 = fir.declare %0(%shape) {uniq_name = "name"} : (!fir.heap<!fir.array<42xi32>>, !fir.shape<1>) -> !fir.heap<!fir.array<42xi32>>
+ %2 = fir.convert %1 : (!fir.heap<!fir.array<42xi32>>) -> !fir.ref<!fir.array<42xi32>>
+ // use the ref so the converts aren't folded
+ %3 = fir.load %2 : !fir.ref<!fir.array<42xi32>>
+ %4 = fir.convert %2 : (!fir.ref<!fir.array<42xi32>>) -> !fir.heap<!fir.array<42xi32>>
+ fir.freemem %4 : !fir.heap<!fir.array<42xi32>>
+ return
+}
+// CHECK: func.func @lookthrough() {
+// CHECK: fir.alloca !fir.array<42xi32>
+// CHECK-NOT: fir.freemem
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
`findAllocaLoopInsertionPoint()` hit assertion not being able to find the `fir.freemem` because of the `fir.convert`. I think it is better to look for `fir.freemem` same way with the look-through walk.
This does add a little computational complexity because now every freemem operation has to be tested for every allocation. This could be improved with some more memoisation but I think it is easier to read this way. Let me know if you would prefer me to change this to pre-compute the normalised addresses each freemem operation is using.
Weirdly, this change resulted in a verifier failure for the fir.declare in the previous test case. Maybe it was previously removed as dead code and now it isn't. Anyway I fixed that too.