Skip to content

[flang] AliasAnalysis: Fix pointer component logic #94242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Oct 15, 2024

Conversation

jdenny-ornl
Copy link
Collaborator

@jdenny-ornl jdenny-ornl commented Jun 3, 2024

This PR applies the changes discussed in [RFC] Rationale for Flang AliasAnalysis pointer component logic.

In summary, this PR replaces the existing pointer component logic in Flang's AliasAnalysis implementation. That logic focuses on aliasing between pointers and non-pointer, non-target composites that have pointer components. However, it is more conservative than necessary, and some existing tests expect its current results when less conservative results seem reasonable.

This PR splits the logic into two cases:

  1. Source values are the same: Return MayAlias when one value is the address of a composite, and the other value is statically the address of a pointer component of that composite.
  2. Source values are different: Return MayAlias when one value is the address of a composite (actual argument), and the other value is the address of a pointer (dummy arg) that might dynamically be a component of that composite.

In both cases, the actual implementation is still more conservative than described above, but it can be improved further later. Details appear in the comments.

Additionally, this PR revises the logic that reports MayAlias for a pointer/target vs. another pointer/target. It constrains the existing logic to handle only isData cases, and it adds less conservative handling of !isData cases elsewhere. First, it extends case 2 listed above to cover the case where the actual argument is the address of a pointer rather than a composite. Second, it adds a third case: where target attributes enable aliasing with a dummy argument.

Copy link

github-actions bot commented Jun 3, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 144 to 150
if (lhsSrc.origin.u == rhsSrc.origin.u &&
((lhsSrc.approximateSource && !lhsSrc.isData() && !rhsSrc.approximateSource) ||
(rhsSrc.approximateSource && !rhsSrc.isData() && !lhsSrc.approximateSource))) {
LLVM_DEBUG(llvm::dbgs()
<< " aliasing between composite and pointer component\n");
return AliasResult::MayAlias;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Renaud-K posted the following comment as part of a commit, and I'm trying to move the conversation into this PR:

I don't think approximateSource is a reliable way of meaning component. We could have an optimized out fir.ccordinate_of %memref %c0, still referring to the same source.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Renaud-K I could extend Source with a field like bool component; It would be set to true for any DesignateOp specifying a component, and approximateSource would also still be set to true then. Does that seem like a reasonable direction to try?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DesignateOp is later lowered to coordinate_of for instance. The absence of which does not mean we are not dealing with a component. The alias analysis has to work at all levels. Again, fir.coordinate_of %memref %c0 is the same as %memref. We need to provide consistent results whether fir.coordinate_of is present or at least error on the conservative side.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DesignateOp is later lowered to coordinate_of for instance.

Thanks. In a simple example, I see that

%10 = hlfir.designate %1#0{"p"}   {fortran_attrs = #fir.var_attrs<pointer>} : (!fir.ref<!fir.type<_QMmTt{p:!fir.box<!fir.ptr<f32>>}>>) -> !fir.ref<!fir.box<!fir.ptr<f32>>>

becomes

%10 = fir.field_index p, !fir.type<_QMmTt{p:!fir.box<!fir.ptr<f32>>}>
%11 = fir.coordinate_of %1, %10 : (!fir.ref<!fir.type<_QMmTt{p:!fir.box<!fir.ptr<f32>>}>>, !fir.field) -> !fir.ref<!fir.box<!fir.ptr<f32>>>

The latter pattern seems simple enough to detect. I did not find a later version of that IR where fir.coordinate_of appeared but !fir.field and fir.field_index had been optimized away. Is your fir.coordinate_of %memref %c0 an example of that?

The absence of which does not mean we are not dealing with a component.

Is it true that either hlfir.designate or fir.coordinate_of must be present when accessing a component? If not, then, independently of this PR, we risk approximateSource=false and an invalid MustAlias, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MustAlias is correct if we can determine the source more accurately, and MayAlias is a conservative error. But in your case we could go from MayAlias (because we detect components) to NoAlias which is too optimistic. Maybe we can keep using the type, such as, if the type of the source is a derived type and the type of the starting point is not then we have a component.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To finish the thought, I don't see why we have to give up (SourceKind=Indirect forces MayAlias) on tracking aliasing for a pointer just because it's a member of an alloca rather than the entire alloca.

Copy link
Contributor

@Renaud-K Renaud-K Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern would be to modify getOriginalDef to return anything other than Source::Indirect.
%11 with the designate case is always indirect. We have no idea what was stored in %9, it could a global, an allocatable or a dummy: we do not know.

There is room to improve on SourceKind::Indirect, such as using the type system to distinguish !fir.ptr from other memory references but that would be a different discussion. In the example just above, %11 would then be indirect too since we are following the data and not interested in the box address which happens to be an SourceKind::Alloca. Today, I would guess, we would get SourceKind::Alloca with an isData flag but it would be better modeled with SourceKind::Indirect and isData. Then we could do with SourceKind::Indirect what we used to do clumsily with SourceKind::Direct.

But I think, we should take it one step at a time. It would impact @tblah who is handling the (global, isData) separately. These two would become (indirect, isData)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern would be to modify getOriginalDef to return anything other than Source::Indirect.

For the record, getOriginalDef does not itself choose a SourceKind. Moreover, after getSource's LoadOp case calls getOriginalDef, getSource currently returns any of Indirect, Argument, or Global. That is true independently of this patch. No modification required.

(As you have noted, there are cases like alloca that getSource's LoadOp case doesn't currently fully handle. As a result, it currently returns Indirect sometimes where, for consistency with the current handling of dummy args and globals, it should return Allocate.)

%11 with the designate case is always indirect. We have no idea what was stored in %9, it could a global, an allocatable or a dummy: we do not know.

In that example, what was stored in %9 is the address of the pointer. In my above example without hlfir.designate, %3 is the address of the pointer. As far as I understand, the difference between those is whether that address is at a base address (for a global, dummy arg, alloca, etc.) or at an offset from that base address.

I don't see why that difference should affect the SourceKind for any value that is computed from the address of the pointer. In particular, if the source origin (i.e., %2) were a dummy arg or global, the SourceKind for %11 in my above example without hflir.designate would currently be Argument or Global, but the SourceKind for %11 in my above example with hlfir.designate would currently be Indirect. I see no reason for that difference. In a later patch, I would like to fix that. (Again, I would also like to add support for the alloca/Allocate case shown in the examples.)

There is room to improve on SourceKind::Indirect, such as using the type system to distinguish !fir.ptr from other memory references but that would be a different discussion. In the example just above, %11 would then be indirect too since we are following the data and not interested in the box address which happens to be an SourceKind::Alloca. Today, I would guess, we would get SourceKind::Alloca with an isData flag but it would be better modeled with SourceKind::Indirect and isData. Then we could do with SourceKind::Indirect what we used to do clumsily with SourceKind::Direct.

Whatever we decide to call the SourceKind for that case, do we agree that hlfir.designate should not change the SourceKind for %11 between my previous two examples?

But I think, we should take it one step at a time.

Agreed. Hopefully the current patch I'm trying to develop can be a next step.

It would impact @tblah who is handling the (global, isData) separately. These two would become (indirect, isData)

Where is the discussion of his work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdenny-ornl my work was proposed here https://discourse.llvm.org/t/rfc-propagate-fir-alias-analysis-information-using-tbaa/73755

There have since been changes since then to add support for Direct variables: 6242c8c

I think in this case the proposed changes won't alter the TBAA tags. My pass doesn't tag SourceKind::Alloca (by default) or SourceKind::Indirect (at all) (there are still TBAA tags added later during codegen to distinguish box metadata access from data access, but that is different again).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tblah Thanks for the pointers!

For the static case (when source values are the same):
- As discussed in this PR with Renaud Kauffmann, rely on types and
  `!isData` instead of `approximateSource`, which doesn't necessarily
  indicate a component.
- Improve comments to document effect on allocatables and some
  remaining limitations.

For the dynamic case:
- For the composite, as discussed in the RFC, don't restrict it to a
  global.  It might be an argument.  Just exclude locals, which cannot
  be aliased by a pointer dummy arg.
- For the pointer, make sure it's the address of the pointer
  (`!isData`) not the address in the pointer.
- For the pointer, make sure it's a dummy argument not a component of
  a composite dummy argument.

For both cases, for the composite, call `isRecordWithPointerComponent`
on the value (`lhs` or `rhs`) not on its source.  Otherwise, we can
mistake a separate component for a composite.

Fix `getSource` to record pointer attributes for `DesignateOp`.
Otherwise, we miss MayAlias for pointer components vs. other pointers.
Before the changes in this PR, `isRecordWithPointerComponent` returned
true on the source of a pointer component, and that was enough to
return MayAlias for it vs. another pointer.

Extend tests, including coverage of all fixes above.
@@ -17,13 +25,13 @@
// end module

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check aliasing between x%next and and y%next?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing. I've added that check. It returns MayAlias. As the comment I added says, I think it should return NoAlias. We could try to fix that in this patch, but I think it would be better to address it in a separate patch because it involves a different part of the AliasAnalysis logic: the isTargetOrPointer() checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the cases I wanted covered here: #95287

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I plan to take a look today.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That part looks good to me. Thanks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing. I've added that check. It returns MayAlias. As the comment I added says, I think it should return NoAlias. We could try to fix that in this patch, but I think it would be better to address it in a separate patch because it involves a different part of the AliasAnalysis logic: the isTargetOrPointer() checks.

I decided to go ahead and fix that in this PR as it helped me to work through another issue.

Renaud-K added a commit that referenced this pull request Jun 13, 2024
Ahead of #94242 and as
requested in the technical call, I am adding a couple of tests for
pointer components that I would like to make sure are covered.
Copy link
Contributor

@vzakhari vzakhari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes!

bool isDummyArgument() const;
bool isData() const;
bool isBoxData() const;

mlir::Type getType() const;

/// Return true, if `ty` is a reference type to an object of derived type
/// that contains a component with POINTER attribute.
static bool isRecordWithPointerComponent(mlir::Type ty);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it may be better to make it a private static member of the AliasAnalysis class.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for the isPointerReference member function that follows it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems appropriate as well. Thank you!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the review.

@jdenny-ornl jdenny-ornl marked this pull request as ready for review June 13, 2024 20:52
@jdenny-ornl jdenny-ornl changed the title WIP: [flang] AliasAnalysis: Fix pointer component logic [flang] AliasAnalysis: Fix pointer component logic Jun 13, 2024
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Jun 13, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 13, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: Joel E. Denny (jdenny-ornl)

Changes

This PR is not yet being proposed for merging. For now, it is intended to facilitate the discussion of an RFC on Flang AliasAnalysis pointer component logic.


Patch is 23.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/94242.diff

4 Files Affected:

  • (modified) flang/include/flang/Optimizer/Analysis/AliasAnalysis.h (+4-5)
  • (modified) flang/lib/Optimizer/Analysis/AliasAnalysis.cpp (+91-17)
  • (modified) flang/test/Analysis/AliasAnalysis/alias-analysis-3.fir (+7-8)
  • (modified) flang/test/Analysis/AliasAnalysis/alias-analysis-9.fir (+192-7)
diff --git a/flang/include/flang/Optimizer/Analysis/AliasAnalysis.h b/flang/include/flang/Optimizer/Analysis/AliasAnalysis.h
index 8cb6e92e41d97..4963d39c52bdd 100644
--- a/flang/include/flang/Optimizer/Analysis/AliasAnalysis.h
+++ b/flang/include/flang/Optimizer/Analysis/AliasAnalysis.h
@@ -153,17 +153,16 @@ struct AliasAnalysis {
     /// Return true, if Target or Pointer attribute is set.
     bool isTargetOrPointer() const;
 
-    /// Return true, if the memory source's `valueType` is a reference type
-    /// to an object of derived type that contains a component with POINTER
-    /// attribute.
-    bool isRecordWithPointerComponent() const;
-
     bool isDummyArgument() const;
     bool isData() const;
     bool isBoxData() const;
 
     mlir::Type getType() const;
 
+    /// Return true, if `ty` is a reference type to an object of derived type
+    /// that contains a component with POINTER attribute.
+    static bool isRecordWithPointerComponent(mlir::Type ty);
+
     /// Return true, if `ty` is a reference type to a boxed
     /// POINTER object or a raw fir::PointerType.
     static bool isPointerReference(mlir::Type ty);
diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
index 2084962fde729..87389a848b9bf 100644
--- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
+++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
@@ -60,6 +60,14 @@ void AliasAnalysis::Source::print(llvm::raw_ostream &os) const {
   attributes.Dump(os, EnumToString);
 }
 
+bool AliasAnalysis::Source::isRecordWithPointerComponent(mlir::Type ty) {
+  auto eleTy = fir::dyn_cast_ptrEleTy(ty);
+  if (!eleTy)
+    return false;
+  // TO DO: Look for pointer components
+  return mlir::isa<fir::RecordType>(eleTy);
+}
+
 bool AliasAnalysis::Source::isPointerReference(mlir::Type ty) {
   auto eleTy = fir::dyn_cast_ptrEleTy(ty);
   if (!eleTy)
@@ -86,14 +94,6 @@ bool AliasAnalysis::Source::isBoxData() const {
          origin.isData;
 }
 
-bool AliasAnalysis::Source::isRecordWithPointerComponent() const {
-  auto eleTy = fir::dyn_cast_ptrEleTy(valueType);
-  if (!eleTy)
-    return false;
-  // TO DO: Look for pointer components
-  return mlir::isa<fir::RecordType>(eleTy);
-}
-
 AliasResult AliasAnalysis::alias(Value lhs, Value rhs) {
   // TODO: alias() has to be aware of the function scopes.
   // After MLIR inlining, the current implementation may
@@ -115,6 +115,10 @@ AliasResult AliasAnalysis::alias(Value lhs, Value rhs) {
   }
 
   if (lhsSrc.kind == rhsSrc.kind) {
+    // If the kinds and origins are the same, then lhs and rhs must alias unless
+    // either source is approximate.  Approximate sources are for parts of the
+    // origin, but we don't have info here on which parts and whether they
+    // overlap, so we normally return MayAlias in that case.
     if (lhsSrc.origin == rhsSrc.origin) {
       LLVM_DEBUG(llvm::dbgs()
                  << "  aliasing because same source kind and origin\n");
@@ -122,6 +126,34 @@ AliasResult AliasAnalysis::alias(Value lhs, Value rhs) {
         return AliasResult::MayAlias;
       return AliasResult::MustAlias;
     }
+    // If one value is the address of a composite, and if the other value is the
+    // address of a pointer/allocatable component of that composite, their
+    // origins compare unequal because the latter has !isData().  As for the
+    // address of any component vs. the address of the composite, a store to one
+    // can affect a load from the other, so the result should be MayAlias.  To
+    // catch this case, we conservatively return MayAlias when one value is the
+    // address of a composite, the other value is non-data, and they have the
+    // same origin value.
+    //
+    // TODO: That logic does not check that the latter is actually a component
+    // of the former, so it can return MayAlias when unnecessary.  For example,
+    // they might both be addresses of components of a larger composite.
+    //
+    // FIXME: Actually, we should generalize from
+    // Source::isRecordWithPointerComponent to any composite because a component
+    // with !isData() is not always a pointer.  However,
+    // Source::isRecordWithPointerComponent currently doesn't actually check for
+    // pointer components, so it's fine for now.
+    if (lhsSrc.origin.u == rhsSrc.origin.u &&
+        ((Source::isRecordWithPointerComponent(lhs.getType()) &&
+          !rhsSrc.isData()) ||
+         (Source::isRecordWithPointerComponent(rhs.getType()) &&
+          !lhsSrc.isData()))) {
+      LLVM_DEBUG(llvm::dbgs()
+                 << "  aliasing between composite and non-data component with "
+                 << "same source kind and origin value\n");
+      return AliasResult::MayAlias;
+    }
 
     // Two host associated accesses may overlap due to an equivalence.
     if (lhsSrc.kind == SourceKind::HostAssoc) {
@@ -131,12 +163,17 @@ AliasResult AliasAnalysis::alias(Value lhs, Value rhs) {
   }
 
   Source *src1, *src2;
+  Value *val1, *val2;
   if (lhsSrc.kind < rhsSrc.kind) {
     src1 = &lhsSrc;
     src2 = &rhsSrc;
+    val1 = &lhs;
+    val2 = &rhs;
   } else {
     src1 = &rhsSrc;
     src2 = &lhsSrc;
+    val1 = &rhs;
+    val2 = &lhs;
   }
 
   if (src1->kind == SourceKind::Argument &&
@@ -159,21 +196,56 @@ AliasResult AliasAnalysis::alias(Value lhs, Value rhs) {
     src2->attributes.set(Attribute::Target);
   }
 
-  // Dummy TARGET/POINTER argument may alias with a global TARGET/POINTER.
+  // Two TARGET/POINTERs may alias.
   if (src1->isTargetOrPointer() && src2->isTargetOrPointer() &&
       src1->isData() == src2->isData()) {
     LLVM_DEBUG(llvm::dbgs() << "  aliasing because of target or pointer\n");
     return AliasResult::MayAlias;
   }
 
-  // Box for POINTER component inside an object of a derived type
-  // may alias box of a POINTER object, as well as boxes for POINTER
-  // components inside two objects of derived types may alias.
-  if ((src1->isRecordWithPointerComponent() && src2->isTargetOrPointer()) ||
-      (src2->isRecordWithPointerComponent() && src1->isTargetOrPointer()) ||
-      (src1->isRecordWithPointerComponent() &&
-       src2->isRecordWithPointerComponent())) {
-    LLVM_DEBUG(llvm::dbgs() << "  aliasing because of pointer components\n");
+  // A pointer dummy arg (but not a pointer component of a dummy arg) may alias
+  // a pointer component and thus the associated composite.  That composite
+  // might be a global or another dummy arg.  This is an example of the global
+  // composite case:
+  //
+  // module m
+  //   type t
+  //      real, pointer :: p
+  //   end type
+  //   type(t) :: a
+  //   type(t) :: b
+  // contains
+  //   subroutine test(p)
+  //     real, pointer :: p
+  //     p = 42
+  //     a = b
+  //     print *, p
+  //   end subroutine
+  // end module
+  // program
+  //   use m
+  //   real, target :: x1 = 1
+  //   real, target :: x2 = 2
+  //   a%p => x1
+  //   b%p => x2
+  //   call test(a%p)
+  // end
+  //
+  // The dummy argument p is an alias for a%p, even for the purposes of pointer
+  // association during the assignment a = b.  Thus, the program should print 2.
+  if ((Source::isRecordWithPointerComponent(val1->getType()) &&
+       src1->kind != SourceKind::Allocate &&
+       src2->kind == SourceKind::Argument &&
+       src2->attributes.test(Attribute::Pointer) && !src2->isData() &&
+       !Source::isRecordWithPointerComponent(src2->valueType)) ||
+      (Source::isRecordWithPointerComponent(val2->getType()) &&
+       src2->kind != SourceKind::Allocate &&
+       src1->kind == SourceKind::Argument &&
+       src1->attributes.test(Attribute::Pointer) && !src1->isData() &&
+       !Source::isRecordWithPointerComponent(src1->valueType))) {
+    LLVM_DEBUG(llvm::dbgs()
+               << "  aliasing between pointer arg and composite with pointer "
+               << "component\n");
     return AliasResult::MayAlias;
   }
 
@@ -360,6 +432,8 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
           defOp = v.getDefiningOp();
         })
         .Case<hlfir::DesignateOp>([&](auto op) {
+          auto varIf = llvm::cast<fir::FortranVariableOpInterface>(defOp);
+          attributes |= getAttrsFromVariable(varIf);
           // Track further through the memory indexed into
           // => if the source arrays/structures don't alias then nor do the
           //    results of hlfir.designate
diff --git a/flang/test/Analysis/AliasAnalysis/alias-analysis-3.fir b/flang/test/Analysis/AliasAnalysis/alias-analysis-3.fir
index 91829a461dc72..eab438576c2bc 100644
--- a/flang/test/Analysis/AliasAnalysis/alias-analysis-3.fir
+++ b/flang/test/Analysis/AliasAnalysis/alias-analysis-3.fir
@@ -20,8 +20,8 @@
 // CHECK-LABEL: Testing : "_QMmPtest
 // CHECK: a#0 <-> func.region0#0: MayAlias
 
-// FIXME: a's box cannot alias with raw reference to f32 (x), so MayAlias below must be NoAlias:
-// CHECK: a#0 <-> func.region0#1: MayAlias
+// a's box cannot alias with raw reference to f32 (x)
+// CHECK: a#0 <-> func.region0#1: NoAlias
 
 // pointer_dummy's box cannot alias with raw reference to f32 (x)
 // CHECK: func.region0#0 <-> func.region0#1: NoAlias
@@ -46,7 +46,7 @@ func.func private @_QPtest2(!fir.ref<f32>)
 
 // -----
 
-// A composite with a pointer component may alias with a dummy
+// A composite with a pointer component does not alias with a dummy
 // argument of composite type with a pointer component:
 // module m
 //   type t
@@ -63,7 +63,7 @@ func.func private @_QPtest2(!fir.ref<f32>)
 // end module m
 
 // CHECK-LABEL: Testing : "_QMmPtest"
-// CHECK: a#0 <-> func.region0#0: MayAlias
+// CHECK: a#0 <-> func.region0#0: NoAlias
 
 fir.global @_QMmEa : !fir.type<_QMmTt{pointer_component:!fir.box<!fir.ptr<f32>>}> {
   %0 = fir.undefined !fir.type<_QMmTt{pointer_component:!fir.box<!fir.ptr<f32>>}>
@@ -88,7 +88,7 @@ func.func private @_QPtest2(!fir.ref<f32>)
 // -----
 
 // Two dummy arguments of composite type with a pointer component
-// may alias each other:
+// do not alias each other:
 // module m
 //   type t
 //      real, pointer :: pointer_component
@@ -103,7 +103,7 @@ func.func private @_QPtest2(!fir.ref<f32>)
 // end module m
 
 // CHECK-LABEL: Testing : "_QMmPtest"
-// CHECK: func.region0#0 <-> func.region0#1: MayAlias
+// CHECK: func.region0#0 <-> func.region0#1: NoAlias
 
 func.func @_QMmPtest(%arg0: !fir.ref<!fir.type<_QMmTt{pointer_component:!fir.box<!fir.ptr<f32>>}>> {fir.bindc_name = "a"}, %arg1: !fir.ref<!fir.type<_QMmTt{pointer_component:!fir.box<!fir.ptr<f32>>}>> {fir.bindc_name = "b"}, %arg2: !fir.ref<f32> {fir.bindc_name = "x", fir.target}) attributes {test.ptr = "func"} {
   %0 = fir.field_index pointer_component, !fir.type<_QMmTt{pointer_component:!fir.box<!fir.ptr<f32>>}>
@@ -137,8 +137,7 @@ func.func private @_QPtest2(!fir.ref<f32>)
 // end module m
 
 // CHECK-LABEL: Testing : "_QMmPtest"
-// FIXME: MayAlias must be NoAlias
-// CHECK: func.region0#0 <-> func.region0#1: MayAlias
+// CHECK: func.region0#0 <-> func.region0#1: NoAlias
 
 func.func @_QMmPtest(%arg0: !fir.ref<!fir.type<_QMmTt{allocatable_component:!fir.box<!fir.heap<f32>>}>> {fir.bindc_name = "a"}, %arg1: !fir.ref<!fir.type<_QMmTt{allocatable_component:!fir.box<!fir.heap<f32>>}>> {fir.bindc_name = "b"}) attributes {test.ptr = "func"} {
   %0 = fir.field_index allocatable_component, !fir.type<_QMmTt{allocatable_component:!fir.box<!fir.heap<f32>>}>
diff --git a/flang/test/Analysis/AliasAnalysis/alias-analysis-9.fir b/flang/test/Analysis/AliasAnalysis/alias-analysis-9.fir
index df24a6d32aa59..a5380bde9d108 100644
--- a/flang/test/Analysis/AliasAnalysis/alias-analysis-9.fir
+++ b/flang/test/Analysis/AliasAnalysis/alias-analysis-9.fir
@@ -1,5 +1,13 @@
-// RUN: fir-opt %s -pass-pipeline='builtin.module(func.func(test-fir-alias-analysis))' 2>&1 | FileCheck %s
+// RUN: fir-opt -debug %s -split-input-file \
+// RUN:   -pass-pipeline='builtin.module(func.func(test-fir-alias-analysis))' \
+// RUN:   2>&1 | FileCheck -match-full-lines %s
 
+// FIXME: Extend much of this to check that it also works after
+// convert-hlfir-to-fir, where component access is via fir.coordinate_of instead
+// of hlfir.designate.
+
+// FIXME: What about renaming this test to ptr-component.fir?  What about
+// merging with alias-analysis-3.fir as it has the same focus?
 
 // module m
 // type t
@@ -17,13 +25,13 @@
 // end module
 
 // CHECK-LABEL: Testing : "_QMmPfoo"
-// TODO: x and y are non pointer, non target argument and therefore do not alias.
-// CHECK-DAG: x#0 <-> y#0: MayAlias
+// x and y are non pointer, non target argument and therefore do not alias.
+// CHECK-DAG: x#0 <-> y#0: NoAlias
 
-// TODO: y is not a pointer object and therefore does not alias with the x%next component. 
-//       Also assigning x to y would not modify x.next
-// CHECK-DAG: y#0 <-> xnext1#0: MayAlias
-// CHECK-DAG: y#0 <-> xnext2#0: MayAlias
+// y is not a pointer object and therefore does not alias with the x%next
+// component.  Also assigning x to y would not modify x.next
+// CHECK-DAG: y#0 <-> xnext1#0: NoAlias
+// CHECK-DAG: y#0 <-> xnext2#0: NoAlias
 
 // We need to catch the fact that assigning y to x will modify xnext. 
 // The only side-effect between the 2 loads of x.next is the assignment to x, 
@@ -35,6 +43,12 @@
 //       we are currently not comparing operands involved in offset computations
 // CHECK-DAG: xnext1#0 <-> xnext2#0: MayAlias
 
+// TODO: This should be NoAlias.  However, AliasAnalysis currently
+// indiscriminately treats all pointers as aliasing.  That makes sense for the
+// addresses within the pointers but not necessarily for the addresses of the
+// pointers.
+// CHECK-DAG: xnext1#0 <-> ynext#0: MayAlias
+
 func.func @_QMmPfoo(%arg0: !fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>> {fir.bindc_name = "x"}, %arg1: !fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>> {fir.bindc_name = "y"}) {
   %0 = fir.alloca i32 {bindc_name = "i1", uniq_name = "_QMmFfooEi1"}
   %1:2 = hlfir.declare %0 {uniq_name = "_QMmFfooEi1"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
@@ -55,5 +69,176 @@ func.func @_QMmPfoo(%arg0: !fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir
   %14 = hlfir.designate %13{"i"} : (!fir.ptr<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>) -> !fir.ref<i32>
   %15 = fir.load %14 : !fir.ref<i32>
   hlfir.assign %15 to %3#0 : i32, !fir.ref<i32>
+  %16 = hlfir.designate %5#0{"next"}   {fortran_attrs = #fir.var_attrs<pointer>, test.ptr = "ynext"} : (!fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>) -> !fir.ref<!fir.box<!fir.ptr<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>>>
+  return
+}
+
+// -----
+
+// The address of a composite aliases the address of any component, including an
+// allocatable component.  Like the address of a pointer, the address of an
+// allocatable is considered non-data, so AliasAnalysis has special handling to
+// detect the aliasing.
+
+// module m
+//   type t
+//     integer, allocatable :: p
+//   end type
+//   type(t) :: x
+// contains
+//   subroutine test()
+//     ! access x%p
+//   end subroutine
+// end module
+
+// CHECK-LABEL: Testing : "_QMmPtest"
+// CHECK-DAG: x#0 <-> x%p#0: MayAlias
+
+func.func @_QMmPtest() {
+  %0 = fir.address_of(@_QMmEx) : !fir.ref<!fir.type<_QMmTt{p:!fir.box<!fir.heap<i32>>}>>
+  %1:2 = hlfir.declare %0 {test.ptr="x", uniq_name = "_QMmEx"} : (!fir.ref<!fir.type<_QMmTt{p:!fir.box<!fir.heap<i32>>}>>) -> (!fir.ref<!fir.type<_QMmTt{p:!fir.box<!fir.heap<i32>>}>>, !fir.ref<!fir.type<_QMmTt{p:!fir.box<!fir.heap<i32>>}>>)
+  %2 = hlfir.designate %1#0{"p"}   {test.ptr="x%p", fortran_attrs = #fir.var_attrs<allocatable>} : (!fir.ref<!fir.type<_QMmTt{p:!fir.box<!fir.heap<i32>>}>>) -> !fir.ref<!fir.box<!fir.heap<i32>>>
+  return
+}
+
+// -----
+
+// Nested composites.
+
+// module m
+//   type t1
+//     integer, pointer :: p
+//   end type
+//   type t2
+//     type(t1) :: x
+//     integer, pointer :: p
+//     integer :: i
+//   end type
+// contains
+//   subroutine test()
+//     type(t2) :: x
+//   end subroutine
+// end module
+
+// CHECK-LABEL: Testing : "_QMmPtest"
+
+// The addresses of a composite and its pointer component alias even if the
+// composite is nested within another composite.
+// CHECK-DAG: x#0 <-> x%p#0: MayAlias
+// CHECK-DAG: x%x#0 <-> x%x%p#0: MayAlias
+
+// The addresses of different components of the same composite do not alias.
+//
+// TODO: Thus, all results below should be NoAlias.  However, AliasAnalysis
+// normally does not recognize when two values (x%x vs. x%i) are distinct
+// components of the same composite (x) as opposed to being potentially
+// overlapping parts of something, so it returns MayAlias.  There is special
+// handling for a pointer component (x%p) that does recognize it as separate
+// from other components (x%i).  But it does not yet distinguish the composite
+// (x) from a component (x%x) that is also a composite with a pointer component
+// (x%x%p).  Thus, because x and x%p can alias, it assumes x%x and x%p can too.
+// CHECK-DAG: x%x#0 <-> x%i#0: MayAlias
+// CHECK-DAG: x%p#0 <-> x%i#0: NoAlias
+// CHECK-DAG: x%x#0 <-> x%p#0: MayAlias
+
+func.func @_QMmPtest() {
+  %0 = fir.alloca !fir.type<_QMmTt2{x:!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>,p:!fir.box<!fir.ptr<i32>>,i:i32}> {bindc_name = "x", uniq_name = "_QMmFtestEx"}
+  %1:2 = hlfir.declare %0 {test.ptr="x", uniq_name = "_QMmFtestEx"} : (!fir.ref<!fir.type<_QMmTt2{x:!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>,p:!fir.box<!fir.ptr<i32>>,i:i32}>>) -> (!fir.ref<!fir.type<_QMmTt2{x:!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>,p:!fir.box<!fir.ptr<i32>>,i:i32}>>, !fir.ref<!fir.type<_QMmTt2{x:!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>,p:!fir.box<!fir.ptr<i32>>,i:i32}>>)
+  %2 = hlfir.designate %1#0{"x"}  {test.ptr="x%x"}  : (!fir.ref<!fir.type<_QMmTt2{x:!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>,p:!fir.box<!fir.ptr<i32>>,i:i32}>>) -> !fir.ref<!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>>
+  %3 = hlfir.designate %1#0{"p"}   {test.ptr="x%p", fortran_attrs = #fir.var_attrs<pointer>} : (!fir.ref<!fir.type<_QMmTt2{x:!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>,p:!fir.box<!fir.ptr<i32>>,i:i32}>>) -> !fir.ref<!fir.box<!fir.ptr<i32>>>
+  %4 = hlfir.designate %1#0{"i"} {test.ptr="x%i"}  : (!fir.ref<!fir.type<_QMmTt2{x:!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>,p:!fir.box<!fir.ptr<i32>>,i:i32}>>) -> !fir.ref<i32>
+  %5 = hlfir.designate %2{"p"}   {test.ptr="x%x%p", fortran_attrs = #fir.var_attrs<pointer>} : (!fir.ref<!fir.type<_QMmTt1{p:!fir.box<!fir.ptr<i32>>}>>) -> !fir.ref<!fir.box<!fir.ptr<i32>>>
+  return
+}
+
+// -----
+
+// Pointer that might dynamically be a component.
+
+// The address of a pointer dummy arg (argp) might alias the address of a
+// pointer component (x%p) and thus the address of the associated composite (x),
+// but it does not alias the addresses of other components (x%i) of the
+// composite.  Moreover, the address *in* argp does not alias any of those.
+// Finally, an allocatable dummy arg (arga) should not be mistaken for a pointer
+// dummy arg and cannot have such aliasing.
+
+// module m
+//   type t
+//     integer, pointer :: p
+//     integer i
+//   end type
+//   type(t) :: glob
+// contains
+//   subroutine test(argp, arga, arg)
+//     integer, pointer :: argp
+//     integer, allocatable :: arga
+//     type(t) :: arg
+//     type(t) :: loc
+//   end subroutine
+// end module
+
+// CHECK-LABEL: Testing : "_QMmPtest"
+
+// Check when composite is a dummy arg.
+// CHECK-DAG: argp#0 <-> arg#0: MayAlias
+// CHECK-DAG: argp#0 <-> arg%p#0: MayAlias
+// CHECK-DAG: argp#0 <-> arg%i#0: NoAlias
+// CHECK-DAG: argp.tgt#0 <-> arg#0: NoAlias
+// CHECK-DAG: argp.tgt#0 <-> arg%p#0: NoAlias
+// CHECK-DAG: argp.tgt#0 <-> arg%i#0: NoAlias
+// CHECK-DAG: arga#0 <-> arg#0: NoAlias
+// CHECK-DAG: arga#0 <-> arg%p#0: NoAlias
+
+// Check when composite is a global.
+// CHECK-DAG: argp#0 <-> glob#0: MayAlias
+// CHECK-DAG: argp#0 <-> glob%p#0: MayAlias
+// CHECK-DAG: argp#0 <-> glob%i#0: NoAlias
+// CHECK-DAG: argp.tgt#0 <-> glob#0: NoAlias
+// CHECK-DAG: argp.tgt#0 <-> glob%p#0: NoAlias
+// CHECK-DAG: argp.tgt#0 <-> glob%i#0: NoAlias
+// CHECK-DAG: arga#0 <-> glob#0: NoAlias
+// CHECK-DAG: arga#0 <-> glob%p#0: NoAlias
+
+// Check when composite is a local and thus cannot alias a dummy arg.
+//
+// TODO: The argp vs. loc%p cas...
[truncated]

However, FileCheck directives then fail.  fir-opt's stdout (MLIR
output) and stderr (diagnostics) are jumbled together, and somehow
-debug changes the jumble sufficiently for FileCheck directives to
pass.  We don't actually need stdout, so send it to /dev/null.  That
makes the output easier to read anyway when debugging.
Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jdenny-ornl for the update. This looks better!

I am still a bit bothered by the fact that the code is turned in a way were it is proving aliasing instead of no aliasing (i.e., it is doing if (cdt1) then mayalias else if (cdt2) then mayalias .... else noalias ), this makes it harder to validate that aliasing cases are not missing via human review (at least in my experience). But I have to concede the standard is a bit written in that way in section 15.5.2.14 of F2023, and I do not have immediate suggestions to reverse the logic in a clearer way.

Can I ask you to push the isRecordWithPointerComponent and isPointerReference NFC change as well as the alias-analysis-9.fir renaming in a preliminary patch so that your functional changes stands out?

Then I will run this update through some end-to-end apps and keep you posted.

Do you know any apps where I should see performance improvement (i.e., was your change motivated by some use case in some app?). Note that in general I think we should try to change aliasing code in the future either to simplify the logic, or when motivated by some actual performance use case because these changes need thorough review, which require a lot of brain bandwidth. I have limited faith in end-to-end testing currently for aliasing given flang has a rather limited usage of it, so my concern is the introduction of hard to catch dormant optimization bugs (so thank you for all the LIT tests you are adding in your patch, this is great).

jdenny-ornl added a commit to jdenny-ornl/llvm-project that referenced this pull request Aug 23, 2024
This PR extracts several small NFC changes from PR llvm#94242 to make it
more readable.
@jdenny-ornl
Copy link
Collaborator Author

jdenny-ornl commented Aug 23, 2024

Thanks @jdenny-ornl for the update. This looks better!

Thanks for reviewing.

I am still a bit bothered by the fact that the code is turned in a way were it is proving aliasing instead of no aliasing (i.e., it is doing if (cdt1) then mayalias else if (cdt2) then mayalias .... else noalias ), this makes it harder to validate that aliasing cases are not missing via human review (at least in my experience). But I have to concede the standard is a bit written in that way in section 15.5.2.14 of F2023, and I do not have immediate suggestions to reverse the logic in a clearer way.

I think that is a description of fir::AliasAnalysis::alias, and it holds with or without this PR. I agree the reverse seems safer, but I don't know how to achieve it either.

Can I ask you to push the isRecordWithPointerComponent and isPointerReference NFC change as well as the alias-analysis-9.fir renaming in a preliminary patch so that your functional changes stands out?

Done in PR #105899. I haven't bothered with a stacked PR as I assume that will land quickly, but let me know if that would be helpful.

Then I will run this update through some end-to-end apps and keep you posted.

Sounds great. Thanks.

Do you know any apps where I should see performance improvement

No, not so far.

(i.e., was your change motivated by some use case in some app?).

Yes, indirectly. I have developed some other Flang alias analysis improvements that benefit the performance of LLNL's UMT proxy app. However, those improvements alter some alias analysis results in Flang's test suite. Often where the existing expected result is MayAlias, I was originally not sure whether authors felt that is a necessary result or just a conservative result due to limitations of the current implementation. I also then doubted new tests I was adding. I wrote the RFC to get answers so I could make further progress on my improvements. Developing this PR seemed like the best way to make the RFC discussion concrete.

Note that in general I think we should try to change aliasing code in the future either to simplify the logic, or when motivated by some actual performance use case because these changes need thorough review, which require a lot of brain bandwidth. I have limited faith in end-to-end testing currently for aliasing given flang has a rather limited usage of it, so my concern is the introduction of hard to catch dormant optimization bugs (so thank you for all the LIT tests you are adding in your patch, this is great).

I understand. When I started the RFC, I wasn't sure whether we'd end up landing changes to the implementation or just the test suite comments. I hope the PR we ended up with serves to clarify both even as it exposes more detailed use cases in the Fortran standard.

jdenny-ornl added a commit to jdenny-ornl/llvm-project that referenced this pull request Aug 23, 2024
chsigg pushed a commit that referenced this pull request Aug 26, 2024
This PR extracts several small NFC changes from PR #94242 to make it
more readable.
@jdenny-ornl
Copy link
Collaborator Author

Can I ask you to push the isRecordWithPointerComponent and isPointerReference NFC change as well as the alias-analysis-9.fir renaming in a preliminary patch so that your functional changes stands out?

Done in PR #105899. I haven't bothered with a stacked PR as I assume that will land quickly, but let me know if that would be helpful.

That has landed, and I've merged it here.

@jdenny-ornl
Copy link
Collaborator Author

Then I will run this update through some end-to-end apps and keep you posted.

@jeanPerier Any update here? Let me know if there's any way I can help.

@jdenny-ornl
Copy link
Collaborator Author

Ping

@jdenny-ornl
Copy link
Collaborator Author

Ping.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @jdenny-ornl. Testing on a few apps seemed OK. Re-reading the PR, I was unconfutable with the names of the added helpers, so I am suggesting new names.

@@ -94,6 +94,39 @@ bool AliasAnalysis::Source::isBoxData() const {
origin.isData;
}

bool AliasAnalysis::Source::aliasesLikeDummyArg() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mayBeDummyArgOrHostAssociatedDummyArg(), although ugly and verbose, would be better. aliasesLikeDummyArg raises a lot of questions about what it means to "alias like a dummy argument".

Same for aliasesLikePtrDummyArg.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but I went with mayBeDummyArgOrHostAssoc to make it a little shorter. I think it's still clear given that we're talking about SourceKind::HostAssoc.

Comment on lines 167 to 168
bool canBeActualArg() const;
bool canBeActualArgWithPtr(const mlir::Value *val) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some short explanation, it is not very clear what is meant looking at the names it could be read as implying that they can be passed as actual argument in call inside the current function, not that they can be actual argument of the current function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@jdenny-ornl
Copy link
Collaborator Author

Thanks for the updates @jdenny-ornl. Testing on a few apps seemed OK.

Good to hear. Thanks.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates

@jdenny-ornl jdenny-ornl merged commit 7323533 into llvm:main Oct 15, 2024
8 checks passed
@jdenny-ornl
Copy link
Collaborator Author

Thanks for the review.

DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request Oct 16, 2024
This PR applies the changes discussed in [[RFC] Rationale for Flang
AliasAnalysis pointer component
logic](https://discourse.llvm.org/t/rfc-rationale-for-flang-aliasanalysis-pointer-component-logic/79252).

In summary, this PR replaces the existing pointer component logic in
Flang's AliasAnalysis implementation. That logic focuses on aliasing
between pointers and non-pointer, non-target composites that have
pointer components. However, it is more conservative than necessary, and
some existing tests expect its current results when less conservative
results seem reasonable.

This PR splits the logic into two cases:

1. Source values are the same: Return MayAlias when one value is the
address of a composite, and the other value is statically the address of
a pointer component of that composite.
2. Source values are different: Return MayAlias when one value is the
address of a composite (actual argument), and the other value is the
address of a pointer (dummy arg) that might dynamically be a component
of that composite.

In both cases, the actual implementation is still more conservative than
described above, but it can be improved further later. Details appear in
the comments.

Additionally, this PR revises the logic that reports MayAlias for a
pointer/target vs. another pointer/target. It constrains the existing
logic to handle only isData cases, and it adds less conservative
handling of !isData cases elsewhere. First, it extends case 2 listed
above to cover the case where the actual argument is the address of a
pointer rather than a composite. Second, it adds a third case: where
target attributes enable aliasing with a dummy argument.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants