Skip to content

Commit b2a9501

Browse files
committed
[flang][hlfir] Fixed associate-op codegen for optimized HLFIR.
This effectively reverts D154715. The issue appears as the dialect conversion error because we try to erase an op that has already been erased. See the added LIT test case with HLFIR that may appear as a result of CSE. The `adaptor.getSource()` is an operation producing a tuple, which does not have users, so `allOtherUsesAreSafeForAssociate` just looks at the empty list of users. So we get completely wrong answers from it. This causes problems with the following `eraseAllUsesInDestroys` that tries to remove the `DestroyOp` twice during both `hflir.associate` processing. But we also cannot use `associate.getSource()` *efficiently*, because the original users may still hang around: one example is the original body of hlfir.elemental (see D154715), another example is other already converted AssociateOp's that are pending removal in the rewriter (that is why we have a temporary created for each hlfir.associate in the newly added LIT case). This patch just fixes the correctness issue. I think we have to separate the buffer reuse analysis from the conversion itself. I also tried to address the issues with the cloned bodies of `hlfir.elemental`, but this should not matter since D155778: if `hlfir.associate` is inside `hlfir.elemental`, it will end up inside a do-loop body region, so the early exit added in D155778 will prevent the buffer reuse. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D158471
1 parent 06d3ee9 commit b2a9501

File tree

2 files changed

+112
-53
lines changed

2 files changed

+112
-53
lines changed

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

Lines changed: 29 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@
2525
#include "flang/Optimizer/HLFIR/HLFIRDialect.h"
2626
#include "flang/Optimizer/HLFIR/HLFIROps.h"
2727
#include "flang/Optimizer/HLFIR/Passes.h"
28+
#include "mlir/IR/Dominance.h"
2829
#include "mlir/IR/PatternMatch.h"
2930
#include "mlir/Pass/Pass.h"
3031
#include "mlir/Pass/PassManager.h"
@@ -448,9 +449,10 @@ static bool allOtherUsesAreSafeForAssociate(mlir::Value value,
448449
mlir::isa<hlfir::GetLengthOp>(useOp)) {
449450
if (!endAssociate)
450451
continue;
451-
// not known to occur in practice:
452+
// If useOp dominates the endAssociate, then it is definitely safe.
452453
if (useOp->getBlock() != endAssociate->getBlock())
453-
TODO(endAssociate->getLoc(), "Associate split over multiple blocks");
454+
if (mlir::DominanceInfo{}.dominates(useOp, endAssociate))
455+
continue;
454456
if (useOp->isBeforeInBlock(endAssociate))
455457
continue;
456458
}
@@ -530,14 +532,31 @@ struct AssociateOpConversion
530532
firVar = adjustVar(firVar, associateFirVarType);
531533
associate.getResult(1).replaceAllUsesWith(firVar);
532534
associate.getResult(2).replaceAllUsesWith(flag);
533-
rewriter.replaceOp(associate, {hlfirVar, firVar, flag});
535+
// FIXME: note that the AssociateOp that is being erased
536+
// here will continue to be a user of the original Source
537+
// operand (e.g. a result of hlfir.elemental), because
538+
// the erasure is not immediate in the rewriter.
539+
// In case there are multiple uses of the Source operand,
540+
// the allOtherUsesAreSafeForAssociate() below will always
541+
// see them, so there is no way to reuse the buffer.
542+
// I think we have to run this analysis before doing
543+
// the conversions, so that we can analyze HLFIR in its
544+
// original form and decide which of the AssociateOp
545+
// users of hlfir.expr can reuse the buffer (if it can).
546+
rewriter.eraseOp(associate);
534547
};
535548

536549
// If this is the last use of the expression value and this is an hlfir.expr
537550
// that was bufferized, re-use the storage.
538551
// Otherwise, create a temp and assign the storage to it.
552+
//
553+
// WARNING: it is important to use the original Source operand
554+
// of the AssociateOp to look for the users, because its replacement
555+
// has zero materialized users at this point.
556+
// So allOtherUsesAreSafeForAssociate() may incorrectly return
557+
// true here.
539558
if (!isTrivialValue && allOtherUsesAreSafeForAssociate(
540-
adaptor.getSource(), associate.getOperation(),
559+
associate.getSource(), associate.getOperation(),
541560
getEndAssociate(associate))) {
542561
// Re-use hlfir.expr buffer if this is the only use of the hlfir.expr
543562
// outside of the hlfir.destroy. Take on the cleaning-up responsibility
@@ -771,6 +790,12 @@ struct ElementalOpConversion
771790

772791
mlir::Value bufferizedExpr =
773792
packageBufferizedExpr(loc, builder, temp, cleanup);
793+
// Explicitly delete the body of the elemental to get rid
794+
// of any users of hlfir.expr values inside the body as early
795+
// as possible.
796+
rewriter.startRootUpdate(elemental);
797+
rewriter.eraseBlock(elemental.getBody());
798+
rewriter.finalizeRootUpdate(elemental);
774799
rewriter.replaceOp(elemental, bufferizedExpr);
775800
return mlir::success();
776801
}

flang/test/HLFIR/associate-codegen.fir

Lines changed: 83 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -239,55 +239,6 @@ func.func @test_shape_of(%arg0: !fir.ref<!fir.array<4x3xi32>>) {
239239
// CHECK: return
240240
// CHECK: }
241241

242-
// The hlfir.associate op is cloned when the elemental is bufferized into the
243-
// fir.do_loop. When the associate op conversion is run, if the source of the
244-
// assoicate is used directly (not accessing the bufferized version through
245-
// the adaptor) then both the associate inside the elemental and the associate
246-
// inside the fir.do_loop are found as uses. Therefore being erroneously
247-
// flagged as an associate with more than one use
248-
func.func @test_cloned_associate() {
249-
%false = arith.constant false
250-
%c1 = arith.constant 1 : index
251-
%c10 = arith.constant 10 : index
252-
%0 = fir.alloca !fir.char<1>
253-
%2 = fir.shape %c10 : (index) -> !fir.shape<1>
254-
%4 = hlfir.as_expr %0 move %false : (!fir.ref<!fir.char<1>>, i1) -> !hlfir.expr<!fir.char<1>>
255-
%5 = hlfir.elemental %2 unordered : (!fir.shape<1>) -> !hlfir.expr<10x!fir.logical<4>> {
256-
^bb0(%arg0: index):
257-
%8:3 = hlfir.associate %4 typeparams %c1 {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<!fir.char<1>>, index) -> (!fir.ref<!fir.char<1>>, !fir.ref<!fir.char<1>>, i1)
258-
hlfir.end_associate %8#1, %8#2 : !fir.ref<!fir.char<1>>, i1
259-
%15 = fir.convert %false : (i1) -> !fir.logical<4>
260-
hlfir.yield_element %15 : !fir.logical<4>
261-
}
262-
hlfir.destroy %5 : !hlfir.expr<10x!fir.logical<4>>
263-
hlfir.destroy %4 : !hlfir.expr<!fir.char<1>>
264-
return
265-
}
266-
// CHECK-LABEL: func.func @test_cloned_associate() {
267-
// CHECK: %[[VAL_0:.*]] = arith.constant false
268-
// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index
269-
// CHECK: %[[VAL_2:.*]] = arith.constant 10 : index
270-
// CHECK: %[[VAL_3:.*]] = fir.alloca !fir.char<1>
271-
// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1>
272-
// CHECK: %[[VAL_5:.*]] = fir.undefined tuple<!fir.ref<!fir.char<1>>, i1>
273-
// CHECK: %[[VAL_6:.*]] = fir.insert_value %[[VAL_5]], %[[VAL_0]], [1 : index] : (tuple<!fir.ref<!fir.char<1>>, i1>, i1) -> tuple<!fir.ref<!fir.char<1>>, i1>
274-
// CHECK: %[[VAL_7:.*]] = fir.insert_value %[[VAL_6]], %[[VAL_3]], [0 : index] : (tuple<!fir.ref<!fir.char<1>>, i1>, !fir.ref<!fir.char<1>>) -> tuple<!fir.ref<!fir.char<1>>, i1>
275-
// CHECK: %[[VAL_8:.*]] = fir.allocmem !fir.array<10x!fir.logical<4>> {bindc_name = ".tmp.array", uniq_name = ""}
276-
// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_8]](%[[VAL_4]]) {uniq_name = ".tmp.array"} : (!fir.heap<!fir.array<10x!fir.logical<4>>>, !fir.shape<1>) -> (!fir.heap<!fir.array<10x!fir.logical<4>>>, !fir.heap<!fir.array<10x!fir.logical<4>>>)
277-
// CHECK: %[[VAL_10:.*]] = arith.constant true
278-
// CHECK: %[[VAL_11:.*]] = arith.constant 1 : index
279-
// CHECK: fir.do_loop %[[VAL_12:.*]] = %[[VAL_11]] to %[[VAL_2]] step %[[VAL_11]] unordered {
280-
// CHECK: %[[VAL_13:.*]] = fir.convert %[[VAL_0]] : (i1) -> !fir.logical<4>
281-
// CHECK: %[[VAL_14:.*]] = hlfir.designate %[[VAL_9]]#0 (%[[VAL_12]]) : (!fir.heap<!fir.array<10x!fir.logical<4>>>, index) -> !fir.ref<!fir.logical<4>>
282-
// CHECK: hlfir.assign %[[VAL_13]] to %[[VAL_14]] temporary_lhs : !fir.logical<4>, !fir.ref<!fir.logical<4>>
283-
// CHECK: }
284-
// CHECK: %[[VAL_15:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>
285-
// CHECK: %[[VAL_16:.*]] = fir.insert_value %[[VAL_15]], %[[VAL_10]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>
286-
// CHECK: %[[VAL_17:.*]] = fir.insert_value %[[VAL_16]], %[[VAL_9]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>, !fir.heap<!fir.array<10x!fir.logical<4>>>) -> tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>
287-
// CHECK: fir.freemem %[[VAL_9]]#1 : !fir.heap<!fir.array<10x!fir.logical<4>>>
288-
// CHECK: return
289-
// CHECK: }
290-
291242
func.func @test_multiple_associations(%arg0: !hlfir.expr<1x2xi32>) {
292243
%c1 = arith.constant 1 : index
293244
%c2 = arith.constant 2 : index
@@ -420,3 +371,86 @@ func.func @_QPtest_multiple_expr_uses_inside_elemental() {
420371
// CHECK: %[[VAL_27:.*]] = fir.insert_value %[[VAL_26]], %[[VAL_12]]#0, [0 : index] : (tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>, !fir.box<!fir.array<?x!fir.logical<4>>>) -> tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>
421372
// CHECK: return
422373
// CHECK: }
374+
375+
// Verify that we properly recognize mutliple consequent hlfir.associate using
376+
// the same result of hlfir.elemental.
377+
func.func @_QPtest_multitple_associates_for_same_expr() {
378+
%c1 = arith.constant 1 : index
379+
%c10 = arith.constant 10 : index
380+
%4 = fir.shape %c10 : (index) -> !fir.shape<1>
381+
%11 = hlfir.elemental %4 typeparams %c1 unordered : (!fir.shape<1>, index) -> !hlfir.expr<10x!fir.char<1>> {
382+
^bb0(%arg1: index):
383+
%44 = fir.undefined !fir.ref<!fir.char<1>>
384+
hlfir.yield_element %44 : !fir.ref<!fir.char<1>>
385+
}
386+
%12:3 = hlfir.associate %11(%4) typeparams %c1 {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<10x!fir.char<1>>, !fir.shape<1>, index) -> (!fir.ref<!fir.array<10x!fir.char<1>>>, !fir.ref<!fir.array<10x!fir.char<1>>>, i1)
387+
hlfir.end_associate %12#1, %12#2 : !fir.ref<!fir.array<10x!fir.char<1>>>, i1
388+
%31:3 = hlfir.associate %11(%4) typeparams %c1 {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<10x!fir.char<1>>, !fir.shape<1>, index) -> (!fir.ref<!fir.array<10x!fir.char<1>>>, !fir.ref<!fir.array<10x!fir.char<1>>>, i1)
389+
hlfir.end_associate %31#1, %31#2 : !fir.ref<!fir.array<10x!fir.char<1>>>, i1
390+
hlfir.destroy %11 : !hlfir.expr<10x!fir.char<1>>
391+
return
392+
}
393+
// CHECK-LABEL: func.func @_QPtest_multitple_associates_for_same_expr() {
394+
// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index
395+
// CHECK: %[[VAL_1:.*]] = arith.constant 10 : index
396+
// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_1]] : (index) -> !fir.shape<1>
397+
// CHECK: %[[VAL_3:.*]] = fir.allocmem !fir.array<10x!fir.char<1>> {bindc_name = ".tmp.array", uniq_name = ""}
398+
// CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]](%[[VAL_2]]) typeparams %[[VAL_0]] {uniq_name = ".tmp.array"} : (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.shape<1>, index) -> (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>)
399+
// CHECK: %[[VAL_5:.*]] = arith.constant true
400+
// CHECK: %[[VAL_6:.*]] = arith.constant 1 : index
401+
// CHECK: fir.do_loop %[[VAL_7:.*]] = %[[VAL_6]] to %[[VAL_1]] step %[[VAL_6]] unordered {
402+
// CHECK: %[[VAL_8:.*]] = fir.undefined !fir.ref<!fir.char<1>>
403+
// CHECK: %[[VAL_9:.*]] = hlfir.designate %[[VAL_4]]#0 (%[[VAL_7]]) typeparams %[[VAL_0]] : (!fir.heap<!fir.array<10x!fir.char<1>>>, index, index) -> !fir.ref<!fir.char<1>>
404+
// CHECK: hlfir.assign %[[VAL_8]] to %[[VAL_9]] temporary_lhs : !fir.ref<!fir.char<1>>, !fir.ref<!fir.char<1>>
405+
// CHECK: }
406+
// CHECK: %[[VAL_10:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
407+
// CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_5]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
408+
// CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_4]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, !fir.heap<!fir.array<10x!fir.char<1>>>) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
409+
// CHECK: %[[VAL_13:.*]] = fir.allocmem !fir.array<10x!fir.char<1>> {bindc_name = ".tmp", uniq_name = ""}
410+
// CHECK: %[[VAL_14:.*]] = arith.constant true
411+
// CHECK: %[[VAL_15:.*]]:2 = hlfir.declare %[[VAL_13]](%[[VAL_2]]) typeparams %[[VAL_0]] {uniq_name = ".tmp"} : (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.shape<1>, index) -> (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>)
412+
// CHECK: hlfir.assign %[[VAL_4]]#0 to %[[VAL_15]]#0 temporary_lhs : !fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>
413+
// CHECK: %[[VAL_16:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
414+
// CHECK: %[[VAL_17:.*]] = fir.insert_value %[[VAL_16]], %[[VAL_14]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
415+
// CHECK: %[[VAL_18:.*]] = fir.insert_value %[[VAL_17]], %[[VAL_15]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, !fir.heap<!fir.array<10x!fir.char<1>>>) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
416+
// CHECK: %[[VAL_19:.*]] = fir.convert %[[VAL_15]]#0 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
417+
// CHECK: %[[VAL_20:.*]] = fir.convert %[[VAL_15]]#1 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
418+
// CHECK: %[[VAL_21:.*]] = fir.convert %[[VAL_20]] : (!fir.ref<!fir.array<10x!fir.char<1>>>) -> !fir.heap<!fir.array<10x!fir.char<1>>>
419+
// CHECK: fir.freemem %[[VAL_21]] : !fir.heap<!fir.array<10x!fir.char<1>>>
420+
// CHECK: %[[VAL_22:.*]] = fir.allocmem !fir.array<10x!fir.char<1>> {bindc_name = ".tmp", uniq_name = ""}
421+
// CHECK: %[[VAL_23:.*]] = arith.constant true
422+
// CHECK: %[[VAL_24:.*]]:2 = hlfir.declare %[[VAL_22]](%[[VAL_2]]) typeparams %[[VAL_0]] {uniq_name = ".tmp"} : (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.shape<1>, index) -> (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>)
423+
// CHECK: hlfir.assign %[[VAL_4]]#0 to %[[VAL_24]]#0 temporary_lhs : !fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>
424+
// CHECK: %[[VAL_25:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
425+
// CHECK: %[[VAL_26:.*]] = fir.insert_value %[[VAL_25]], %[[VAL_23]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
426+
// CHECK: %[[VAL_27:.*]] = fir.insert_value %[[VAL_26]], %[[VAL_24]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, !fir.heap<!fir.array<10x!fir.char<1>>>) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
427+
// CHECK: %[[VAL_28:.*]] = fir.convert %[[VAL_24]]#0 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
428+
// CHECK: %[[VAL_29:.*]] = fir.convert %[[VAL_24]]#1 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
429+
// CHECK: %[[VAL_30:.*]] = fir.convert %[[VAL_29]] : (!fir.ref<!fir.array<10x!fir.char<1>>>) -> !fir.heap<!fir.array<10x!fir.char<1>>>
430+
// CHECK: fir.freemem %[[VAL_30]] : !fir.heap<!fir.array<10x!fir.char<1>>>
431+
// CHECK: fir.freemem %[[VAL_4]]#1 : !fir.heap<!fir.array<10x!fir.char<1>>>
432+
// CHECK: return
433+
// CHECK: }
434+
435+
// Test hlfir.associate codegen, when its operand is used
436+
// by hlfir.shape located in a block different from the block
437+
// of the hlfir.end_associate.
438+
func.func @_QPtest(%arg0: index, %arg1: index, %arg2 : i32) {
439+
%c1_i32 = arith.constant 1 : i32
440+
%3 = fir.shape %arg0, %arg1 : (index, index) -> !fir.shape<2>
441+
%4 = hlfir.elemental %3 unordered : (!fir.shape<2>) -> !hlfir.expr<?x?xi32> {
442+
^bb0(%arg3: index, %arg4: index):
443+
%16 = fir.undefined i32
444+
hlfir.yield_element %16 : i32
445+
}
446+
%5 = hlfir.shape_of %4 : (!hlfir.expr<?x?xi32>) -> !fir.shape<2>
447+
%6:3 = hlfir.associate %4(%5) {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<?x?xi32>, !fir.shape<2>) -> (!fir.box<!fir.array<?x?xi32>>, !fir.ref<!fir.array<?x?xi32>>, i1)
448+
%13 = arith.cmpi ne, %arg2, %c1_i32 : i32
449+
cf.cond_br %13, ^bb1, ^bb2
450+
^bb1: // pred: ^bb0
451+
fir.unreachable
452+
^bb2: // pred: ^bb0
453+
hlfir.end_associate %6#1, %6#2 : !fir.ref<!fir.array<?x?xi32>>, i1
454+
hlfir.destroy %4 : !hlfir.expr<?x?xi32>
455+
return
456+
}

0 commit comments

Comments
 (0)