[flang] Canonicalize redundant pointer converts. #121864

vzakhari · 2025-01-07T01:42:28Z

This patch adds a canonicalization pattern for optimizing redundant
"pointer" fir.converts. Such converts prevent the StackArrays pass
to recognize fir.freemem for the corresponding fir.allocmem, e.g.:

    %69 = fir.allocmem !fir.array<2xi32>
    %71:2 = hlfir.declare %69(%70) {uniq_name = ".tmp.arrayctor"} :
        (!fir.heap<!fir.array<2xi32>>, !fir.shape<1>) ->
        (!fir.heap<!fir.array<2xi32>>, !fir.heap<!fir.array<2xi32>>)
    %95 = fir.convert %71#1 :
        (!fir.heap<!fir.array<2xi32>>) -> !fir.ref<!fir.array<2xi32>>
    %100 = fir.convert %95 :
        (!fir.ref<!fir.array<2xi32>>) -> !fir.heap<!fir.array<2xi32>>
    fir.freemem %100 : !fir.heap<!fir.array<2xi32>>

I found this in tonto, but the change does not affect performance at all.
Anyway, it looks like a reasonable thing to do, and it makes easier
to compare the performance profiles with other compilers'.

This patch adds a canonicalization pattern for optimizing redundant "pointer" fir.converts. Such converts prevent the StackArrays pass to recognize fir.freemem for the corresponding fir.allocmem, e.g.: ``` %69 = fir.allocmem !fir.array<2xi32> %71:2 = hlfir.declare %69(%70) {uniq_name = ".tmp.arrayctor"} : (!fir.heap<!fir.array<2xi32>>, !fir.shape<1>) -> (!fir.heap<!fir.array<2xi32>>, !fir.heap<!fir.array<2xi32>>) %95 = fir.convert %71#1 : (!fir.heap<!fir.array<2xi32>>) -> !fir.ref<!fir.array<2xi32>> %100 = fir.convert %95 : (!fir.ref<!fir.array<2xi32>>) -> !fir.heap<!fir.array<2xi32>> fir.freemem %100 : !fir.heap<!fir.array<2xi32>> ``` I found this in `tonto`, but the change does not affect performance at all. Anyway, it looks like a reasonable thing to do, and it makes easier to compare the performance profiles with other compilers'.

llvmbot · 2025-01-07T01:43:03Z

@llvm/pr-subscribers-flang-fir-hlfir

Author: Slava Zakharin (vzakhari)

Changes

This patch adds a canonicalization pattern for optimizing redundant
"pointer" fir.converts. Such converts prevent the StackArrays pass
to recognize fir.freemem for the corresponding fir.allocmem, e.g.:

    %69 = fir.allocmem !fir.array&lt;2xi32&gt;
    %71:2 = hlfir.declare %69(%70) {uniq_name = ".tmp.arrayctor"} :
        (!fir.heap&lt;!fir.array&lt;2xi32&gt;&gt;, !fir.shape&lt;1&gt;) -&gt;
        (!fir.heap&lt;!fir.array&lt;2xi32&gt;&gt;, !fir.heap&lt;!fir.array&lt;2xi32&gt;&gt;)
    %95 = fir.convert %71#<!-- -->1 :
        (!fir.heap&lt;!fir.array&lt;2xi32&gt;&gt;) -&gt; !fir.ref&lt;!fir.array&lt;2xi32&gt;&gt;
    %100 = fir.convert %95 :
        (!fir.ref&lt;!fir.array&lt;2xi32&gt;&gt;) -&gt; !fir.heap&lt;!fir.array&lt;2xi32&gt;&gt;
    fir.freemem %100 : !fir.heap&lt;!fir.array&lt;2xi32&gt;&gt;

I found this in tonto, but the change does not affect performance at all.
Anyway, it looks like a reasonable thing to do, and it makes easier
to compare the performance profiles with other compilers'.

Full diff: https://github.com/llvm/llvm-project/pull/121864.diff

5 Files Affected:

(modified) flang/include/flang/Optimizer/Dialect/CanonicalizationPatterns.td (+17)
(modified) flang/lib/Optimizer/Dialect/FIROps.cpp (+2-1)
(modified) flang/test/Fir/convert-fold.fir (+9)
(modified) flang/test/Lower/array-substring.f90 (+1-2)
(modified) flang/test/Lower/vector-subscript-io.f90 (+1-2)

diff --git a/flang/include/flang/Optimizer/Dialect/CanonicalizationPatterns.td b/flang/include/flang/Optimizer/Dialect/CanonicalizationPatterns.td
index 1dbde5c1c73024..2414de496d45bc 100644
--- a/flang/include/flang/Optimizer/Dialect/CanonicalizationPatterns.td
+++ b/flang/include/flang/Optimizer/Dialect/CanonicalizationPatterns.td
@@ -57,6 +57,9 @@ def StrictSmallerWidthPred : Constraint<CPred<
                        "$0.getType().getIntOrFloatBitWidth() < "
                        "$1.getType().getIntOrFloatBitWidth()">>;
 
+def PointerCompatiblePred
+    : Constraint<CPred<"fir::ConvertOp::isPointerCompatible($0.getType())">>;
+
 // floats or ints that undergo successive extensions or successive truncations.
 def ConvertConvertOptPattern
     : Pat<(fir_ConvertOp:$res (fir_ConvertOp:$irm $arg)),
@@ -112,4 +115,18 @@ def ForwardConstantConvertPattern
           (createConstantOp $res, $attr),
           [(IndexTypePred $res), (IntegerTypePred $cnt)]>;
 
+// Optimize redundant pointer conversions, e.g.:
+// %1 = fir.convert %0 :
+//     (!fir.heap<!fir.array<2xf32>>) -> !fir.ref<!fir.array<2xf32>>
+// %2 = fir.convert %1 :
+//     (!fir.ref<!fir.array<2xf32>>) -> !fir.heap<!fir.array<2xf32>>
+// Will be optimized into:
+// %2 = fir.convert %0 :
+//     (!fir.heap<!fir.array<2xf32>>) -> !fir.heap<!fir.array<2xf32>>
+// which is redundant due to RedundantConvertOptPattern.
+def ChainedPointerConvertsPattern
+    : Pat<(fir_ConvertOp:$res(fir_ConvertOp:$irm $arg)), (fir_ConvertOp $arg),
+          [(PointerCompatiblePred $arg), (PointerCompatiblePred $irm),
+           (PointerCompatiblePred $res)]>;
+
 #endif // FORTRAN_FIR_REWRITE_PATTERNS
diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp
index cdcf9bda49a627..fa83aa380e489c 100644
--- a/flang/lib/Optimizer/Dialect/FIROps.cpp
+++ b/flang/lib/Optimizer/Dialect/FIROps.cpp
@@ -1313,7 +1313,8 @@ void fir::ConvertOp::getCanonicalizationPatterns(
   results.insert<ConvertConvertOptPattern, ConvertAscendingIndexOptPattern,
                  ConvertDescendingIndexOptPattern, RedundantConvertOptPattern,
                  CombineConvertOptPattern, CombineConvertTruncOptPattern,
-                 ForwardConstantConvertPattern>(context);
+                 ForwardConstantConvertPattern, ChainedPointerConvertsPattern>(
+      context);
 }
 
 mlir::OpFoldResult fir::ConvertOp::fold(FoldAdaptor adaptor) {
diff --git a/flang/test/Fir/convert-fold.fir b/flang/test/Fir/convert-fold.fir
index ebb6c8db7c891c..fb30e634ba5e6d 100644
--- a/flang/test/Fir/convert-fold.fir
+++ b/flang/test/Fir/convert-fold.fir
@@ -35,3 +35,12 @@ func.func @ctest() -> index {
   // CHECK-NEXT: return %{{.*}} : index
   return %2 : index
 }
+
+// CHECK-LABEL:   func.func @ptrtest(
+// CHECK-SAME:                       %[[VAL_0:.*]]: !fir.heap<!fir.array<2xf32>>) -> !fir.heap<!fir.array<2xf32>> {
+func.func @ptrtest(%0 : !fir.heap<!fir.array<2xf32>>) -> !fir.heap<!fir.array<2xf32>> {
+  %1 = fir.convert %0 : (!fir.heap<!fir.array<2xf32>>) -> !fir.ref<!fir.array<2xf32>>
+  %2 = fir.convert %1 : (!fir.ref<!fir.array<2xf32>>) -> !fir.heap<!fir.array<2xf32>>
+// CHECK:           return %[[VAL_0]] : !fir.heap<!fir.array<2xf32>>
+  return %2 : !fir.heap<!fir.array<2xf32>>
+}
diff --git a/flang/test/Lower/array-substring.f90 b/flang/test/Lower/array-substring.f90
index 02101039120e9f..7544fbb989627b 100644
--- a/flang/test/Lower/array-substring.f90
+++ b/flang/test/Lower/array-substring.f90
@@ -24,9 +24,8 @@
 ! CHECK:         %[[VAL_16:.*]] = fir.array_coor %[[VAL_7]](%[[VAL_9]]) {{\[}}%[[VAL_10]]] %[[VAL_15]] : (!fir.ref<!fir.array<1x!fir.char<1,12>>>, !fir.shape<1>, !fir.slice<1>, index) -> !fir.ref<!fir.char<1,12>>
 ! CHECK:         %[[VAL_17:.*]] = fir.convert %[[VAL_16]] : (!fir.ref<!fir.char<1,12>>) -> !fir.ref<!fir.array<12x!fir.char<1>>>
 ! CHECK:         %[[VAL_18:.*]] = fir.coordinate_of %[[VAL_17]], %[[VAL_2]] : (!fir.ref<!fir.array<12x!fir.char<1>>>, index) -> !fir.ref<!fir.char<1>>
-! CHECK:         %[[VAL_19:.*]] = fir.convert %[[VAL_18]] : (!fir.ref<!fir.char<1>>) -> !fir.ref<!fir.char<1,?>>
 ! CHECK:         %[[VAL_20:.*]] = fir.array_coor %[[VAL_11]](%[[VAL_9]]) %[[VAL_15]] : (!fir.ref<!fir.array<1x!fir.char<1,8>>>, !fir.shape<1>, index) -> !fir.ref<!fir.char<1,8>>
-! CHECK:         %[[VAL_21:.*]] = fir.convert %[[VAL_19]] : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
+! CHECK:         %[[VAL_21:.*]] = fir.convert %[[VAL_18]] : (!fir.ref<!fir.char<1>>) -> !fir.ref<i8>
 ! CHECK:         %[[VAL_22:.*]] = fir.convert %[[VAL_20]] : (!fir.ref<!fir.char<1,8>>) -> !fir.ref<i8>
 ! CHECK:         %[[VAL_23:.*]] = fir.convert %[[VAL_4]] : (index) -> i64
 ! CHECK:         %[[VAL_24:.*]] = fir.call @_FortranACharacterCompareScalar1(%[[VAL_21]], %[[VAL_22]], %[[VAL_23]], %[[VAL_23]]) {{.*}}: (!fir.ref<i8>, !fir.ref<i8>, i64, i64) -> i32
diff --git a/flang/test/Lower/vector-subscript-io.f90 b/flang/test/Lower/vector-subscript-io.f90
index 372130fd099074..9a041af16c88cc 100644
--- a/flang/test/Lower/vector-subscript-io.f90
+++ b/flang/test/Lower/vector-subscript-io.f90
@@ -325,12 +325,11 @@ subroutine substring(x, y, i, j)
 ! CHECK:   %[[VAL_230:.*]] = arith.subi %[[VAL_216]], %[[VAL_210]] : index
 ! CHECK:   %[[VAL_231:.*]] = fir.convert %[[VAL_228]] : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<!fir.array<?x!fir.char<1>>>
 ! CHECK:   %[[VAL_232:.*]] = fir.coordinate_of %[[VAL_231]], %[[VAL_230]] : (!fir.ref<!fir.array<?x!fir.char<1>>>, index) -> !fir.ref<!fir.char<1>>
-! CHECK:   %[[VAL_233:.*]] = fir.convert %[[VAL_232]] : (!fir.ref<!fir.char<1>>) -> !fir.ref<!fir.char<1,?>>
 ! CHECK:   %[[VAL_234:.*]] = arith.subi %[[VAL_219]], %[[VAL_216]] : index
 ! CHECK:   %[[VAL_235:.*]] = arith.addi %[[VAL_234]], %[[VAL_210]] : index
 ! CHECK:   %[[VAL_236:.*]] = arith.cmpi slt, %[[VAL_235]], %[[VAL_209]] : index
 ! CHECK:   %[[VAL_237:.*]] = arith.select %[[VAL_236]], %[[VAL_209]], %[[VAL_235]] : index
-! CHECK:   %[[VAL_238:.*]] = fir.convert %[[VAL_233]] : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
+! CHECK:   %[[VAL_238:.*]] = fir.convert %[[VAL_232]] : (!fir.ref<!fir.char<1>>) -> !fir.ref<i8>
 ! CHECK:   %[[VAL_239:.*]] = fir.convert %[[VAL_237]] : (index) -> i64
 ! CHECK:   %[[VAL_240:.*]] = fir.call @_FortranAioInputAscii(%[[VAL_213]], %[[VAL_238]], %[[VAL_239]]) {{.*}}: (!fir.ref<i8>, !fir.ref<i8>, i64) -> i1
 ! CHECK:   %[[VAL_241:.*]] = arith.addi %[[VAL_221]], %[[VAL_210]] overflow<nsw> : index

jeanPerier

Thanks! Makes sense to me to make the IR cleaner here.

tblah

This LGTM, but maybe it is a broader problem that stack arrays doesn't understand fir.convert because it could still get confused by non-redundant converts. I'll add that to my to-do list.

vzakhari · 2025-01-07T16:26:19Z

This LGTM, but maybe it is a broader problem that stack arrays doesn't understand fir.convert because it could still get confused by non-redundant converts. I'll add that to my to-do list.

Thanks for the follow-up! I did not want to modify just the StackArrays pass. It seems to me that the conversions canonicalization potentially helps other passes, so I decided to go this route instead.

kawashima-fj · 2025-01-16T08:56:16Z

@vzakhari After this PR is merged, https://github.com/fujitsu/compiler-test-suite/tree/2024-12/Fortran/0393/0393_0171.f90 and https://github.com/fujitsu/compiler-test-suite/blob/2024-12/Fortran/0393/0393_0174.f90 started to fail by segmentation fault on AArch64.

I can see the error with -O2.

$ flang -O2 Fortran/0393/0393_0171.f90
$ ./a.out
Segmentation fault (core dumped)

I confirmed the error occurs with this commit (2e637db) and the latest main branch (f30ff0b) but doesn't occur with the previous commit (51c9c82).

Could you check? If you cannot reproduce the error, please let me know.

vzakhari · 2025-01-16T15:55:49Z

Thanks for reporting, @kawashima-fj! I saw the bug report, but could not reproduce on my side. I will try again today and will let you know if I still cannot.

vzakhari · 2025-01-17T03:31:12Z

I tried differently built compilers today, but could not reproduce it. @kawashima-fj, can you please run it under gdb and let me know what the backtrace is?

kawashima-fj · 2025-01-17T05:00:33Z

@vzakhari Assembly output by flang -S -g -O2 (Flang/LLVM: commit f30ff0b) and GDB outputs are attached.

0393_0171.s.txt
0393_0174.s.txt
0393_0171.gdb.txt
0393_0174.gdb.txt

I don't know why 0393_0174.gdb.txt cannot get a program counter.

I run it on Ubuntu 24.04.1 LTS docker container in Rocky Linux 8.6 host on Grace (Neoverse V2).

Please let me know you need more information.

vzakhari · 2025-01-17T05:06:08Z

Thank you! I will take a closer look tomorrow. I would appreciate it if you can collect the assembly files with my commit reverted. No problem, if you do not have time for this.

kawashima-fj · 2025-01-17T05:35:07Z

I'll do it after some meetings are over. You can obtan the result tomorrow in your timezone.

kawashima-fj · 2025-01-17T09:09:51Z

@vzakhari I'm very sorry. The root cause of the segmentation fault does not seem to be your commit. Reverting your commit (2e637db) from latest main branch (f30ff0b) results in the same assembly file and the same segmentation fault. Your commit 2e637db seems to have triggered an existing bug at that time, and the latest main branch can generate the segmentation fault without your change.

I'll take a look and create a new issue if needed.

vzakhari · 2025-01-17T16:22:12Z

Thank you for figuring it out! No worries!

vzakhari requested review from clementval and jeanPerier January 7, 2025 01:42

llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Jan 7, 2025

jeanPerier approved these changes Jan 7, 2025

View reviewed changes

tblah approved these changes Jan 7, 2025

View reviewed changes

tblah mentioned this pull request Jan 7, 2025

[flang][StackArrays] track pointers through fir.convert #121919

Merged

vzakhari merged commit 2e637db into llvm:main Jan 7, 2025
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[flang] Canonicalize redundant pointer converts. #121864

[flang] Canonicalize redundant pointer converts. #121864

Uh oh!

vzakhari commented Jan 7, 2025

Uh oh!

llvmbot commented Jan 7, 2025

Uh oh!

jeanPerier left a comment

Uh oh!

tblah left a comment

Uh oh!

vzakhari commented Jan 7, 2025

Uh oh!

Uh oh!

kawashima-fj commented Jan 16, 2025

Uh oh!

vzakhari commented Jan 16, 2025

Uh oh!

vzakhari commented Jan 17, 2025 •

edited

Loading

Uh oh!

kawashima-fj commented Jan 17, 2025

Uh oh!

vzakhari commented Jan 17, 2025

Uh oh!

kawashima-fj commented Jan 17, 2025

Uh oh!

kawashima-fj commented Jan 17, 2025

Uh oh!

vzakhari commented Jan 17, 2025

Uh oh!

Uh oh!

[flang] Canonicalize redundant pointer converts. #121864

[flang] Canonicalize redundant pointer converts. #121864

Uh oh!

Conversation

vzakhari commented Jan 7, 2025

Uh oh!

llvmbot commented Jan 7, 2025

Uh oh!

jeanPerier left a comment

Choose a reason for hiding this comment

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

vzakhari commented Jan 7, 2025

Uh oh!

Uh oh!

kawashima-fj commented Jan 16, 2025

Uh oh!

vzakhari commented Jan 16, 2025

Uh oh!

vzakhari commented Jan 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kawashima-fj commented Jan 17, 2025

Uh oh!

vzakhari commented Jan 17, 2025

Uh oh!

kawashima-fj commented Jan 17, 2025

Uh oh!

kawashima-fj commented Jan 17, 2025

Uh oh!

vzakhari commented Jan 17, 2025

Uh oh!

Uh oh!

vzakhari commented Jan 17, 2025 •

edited

Loading