Skip to content

[globals-aa] Improved isNonEscapingGlobalNoAlias. #127707

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 20, 2025

Conversation

vzakhari
Copy link
Contributor

A non-escaping global should never alias with non-pointer values
and ptr null. This should improve disambiguation of global
pointers with relation to Flang runtime calls (given that
nosync nocallback attributes are properly set).
It also seems to be an obvious improvement with little overhead.

A non-escaping global should never alias with non-pointer values
and `ptr null`. This should improve disambiguation of global
pointers with relation to Flang runtime calls (given that
`nosync nocallback` attributes are properly set).
It also seems to be an obvious improvement with little overhead.
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Feb 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 18, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Slava Zakharin (vzakhari)

Changes

A non-escaping global should never alias with non-pointer values
and ptr null. This should improve disambiguation of global
pointers with relation to Flang runtime calls (given that
nosync nocallback attributes are properly set).
It also seems to be an obvious improvement with little overhead.


Full diff: https://github.com/llvm/llvm-project/pull/127707.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Analysis/GlobalsModRef.h (+2-1)
  • (modified) llvm/lib/Analysis/GlobalsModRef.cpp (+17-4)
  • (modified) llvm/test/Analysis/GlobalsModRef/nonescaping-noalias.ll (+21)
diff --git a/llvm/include/llvm/Analysis/GlobalsModRef.h b/llvm/include/llvm/Analysis/GlobalsModRef.h
index ab8ab8295b556..36a95e095aaa5 100644
--- a/llvm/include/llvm/Analysis/GlobalsModRef.h
+++ b/llvm/include/llvm/Analysis/GlobalsModRef.h
@@ -118,7 +118,8 @@ class GlobalsAAResult : public AAResultBase {
   bool AnalyzeIndirectGlobalMemory(GlobalVariable *GV);
   void CollectSCCMembership(CallGraph &CG);
 
-  bool isNonEscapingGlobalNoAlias(const GlobalValue *GV, const Value *V);
+  bool isNonEscapingGlobalNoAlias(const GlobalValue *GV, const Value *V,
+                                  const Instruction *CtxI);
   ModRefInfo getModRefInfoForArgument(const CallBase *Call,
                                       const GlobalValue *GV, AAQueryInfo &AAQI);
 };
diff --git a/llvm/lib/Analysis/GlobalsModRef.cpp b/llvm/lib/Analysis/GlobalsModRef.cpp
index 1ceb1b2629418..ade9ffe333b2d 100644
--- a/llvm/lib/Analysis/GlobalsModRef.cpp
+++ b/llvm/lib/Analysis/GlobalsModRef.cpp
@@ -713,7 +713,8 @@ static bool isNonEscapingGlobalNoAliasWithLoad(const GlobalValue *GV,
 // active, or to be forced to operate as a module pass that cannot co-exist
 // with an alias analysis such as GMR.
 bool GlobalsAAResult::isNonEscapingGlobalNoAlias(const GlobalValue *GV,
-                                                 const Value *V) {
+                                                 const Value *V,
+                                                 const Instruction *CtxI) {
   // In order to know that the underlying object cannot alias the
   // non-addr-taken global, we must know that it would have to be an escape.
   // Thus if the underlying object is a function argument, a load from
@@ -762,6 +763,18 @@ bool GlobalsAAResult::isNonEscapingGlobalNoAlias(const GlobalValue *GV,
       continue;
     }
 
+    // A non-addr-taken global cannot alias with any non-pointer value.
+    if (!Input->getType()->isPointerTy())
+      continue;
+
+    if (CtxI) {
+      // Null pointer cannot alias with a non-addr-taken global.
+      const Function *F = CtxI->getFunction();
+      if (const ConstantPointerNull *CPN = dyn_cast<ConstantPointerNull>(Input))
+        if (!NullPointerIsDefined(F, CPN->getType()->getAddressSpace()))
+          continue;
+    }
+
     // Recurse through a limited number of selects, loads and PHIs. This is an
     // arbitrary depth of 4, lower numbers could be used to fix compile time
     // issues if needed, but this is generally expected to be only be important
@@ -820,7 +833,7 @@ bool GlobalsAAResult::invalidate(Module &, const PreservedAnalyses &PA,
 /// address of the global isn't taken.
 AliasResult GlobalsAAResult::alias(const MemoryLocation &LocA,
                                    const MemoryLocation &LocB,
-                                   AAQueryInfo &AAQI, const Instruction *) {
+                                   AAQueryInfo &AAQI, const Instruction *CtxI) {
   // Get the base object these pointers point to.
   const Value *UV1 =
       getUnderlyingObject(LocA.Ptr->stripPointerCastsForAliasAnalysis());
@@ -856,7 +869,7 @@ AliasResult GlobalsAAResult::alias(const MemoryLocation &LocA,
     if ((GV1 || GV2) && GV1 != GV2) {
       const GlobalValue *GV = GV1 ? GV1 : GV2;
       const Value *UV = GV1 ? UV2 : UV1;
-      if (isNonEscapingGlobalNoAlias(GV, UV))
+      if (isNonEscapingGlobalNoAlias(GV, UV, CtxI))
         return AliasResult::NoAlias;
     }
 
@@ -920,7 +933,7 @@ ModRefInfo GlobalsAAResult::getModRefInfoForArgument(const CallBase *Call,
         !all_of(Objects, [&](const Value *V) {
           return this->alias(MemoryLocation::getBeforeOrAfter(V),
                              MemoryLocation::getBeforeOrAfter(GV), AAQI,
-                             nullptr) == AliasResult::NoAlias;
+                             Call) == AliasResult::NoAlias;
         }))
       return ConservativeResult;
 
diff --git a/llvm/test/Analysis/GlobalsModRef/nonescaping-noalias.ll b/llvm/test/Analysis/GlobalsModRef/nonescaping-noalias.ll
index 4a2c10ca55cdc..eed93cf0df8ef 100644
--- a/llvm/test/Analysis/GlobalsModRef/nonescaping-noalias.ll
+++ b/llvm/test/Analysis/GlobalsModRef/nonescaping-noalias.ll
@@ -175,3 +175,24 @@ entry:
   %v = load i32, ptr @g1
   ret i32 %v
 }
+
+define i32 @test6(ptr %param) {
+; Ensure that we can fold a store to a load of a global across a set of
+; calls that cannot use in any way a non-escaping global.
+;
+; CHECK-LABEL: @test6(
+; CHECK: store i32 42, ptr @g1
+; CHECK-NOT: load i32
+; CHECK: ret i32 42
+entry:
+  store i32 42, ptr @g1
+  %1 = call ptr @_FortranAioBeginExternalFormattedOutput(ptr null, i64 3, ptr null, i32 6, ptr null, i32 2)
+  %2 = call i1 @_FortranAioOutputAscii(ptr %1, ptr null, i64 4)
+  %3 = call i32 @_FortranAioEndIoStatement(ptr %1)
+  %v = load i32, ptr @g1
+  ret i32 %v
+}
+declare ptr @_FortranAioBeginExternalFormattedOutput(ptr, i64, ptr, i32, ptr, i32) #0
+declare zeroext i1 @_FortranAioOutputAscii(ptr, ptr, i64) #0
+declare i32 @_FortranAioEndIoStatement(ptr) #0
+attributes #0 = { nocallback nosync }

@@ -762,6 +763,18 @@ bool GlobalsAAResult::isNonEscapingGlobalNoAlias(const GlobalValue *GV,
continue;
}

// A non-addr-taken global cannot alias with any non-pointer value.
if (!Input->getType()->isPointerTy())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we end up with a value that isn't a pointer here? alias() itself should only be queried with pointers (it doesn't make sense to ask whether a non-pointer aliases anything). And the recursive steps in isNonEscapingGlobalNoAlias should produce pointers if the input is a pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We end up here from line 934.

I guess a pointer captured via ptrtoint or any other indirect "conversion" to a non-pointer value might be considered aliasing with this non-pointer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a bug in that code; we're only supposed to be trying to analyze pointers.

A ptrtoint would be considered an escape, so we don't need to try to analyze integers to figure out if they were originally pointers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that BasicAAResult::aliasCheck seems to expect that any of the values can have non-pointer type:

if (!V1->getType()->isPointerTy() || !V2->getType()->isPointerTy())

So I am not sure there is a strict requirement that alias can only be called with pointers. I can easily move this check to the code around line 934, but it seems it would be more appropriate staying here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea why we don't forbid alias queries on non-pointers. CC @nikic.

In any case, you can hoist this check out of the loop.


if (CtxI) {
// Null pointer cannot alias with a non-addr-taken global.
const Function *F = CtxI->getFunction();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sink the variable declaration into the if?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

continue;

if (CtxI)
if (const ConstantPointerNull *CPN =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use "auto" here.

@@ -762,6 +763,18 @@ bool GlobalsAAResult::isNonEscapingGlobalNoAlias(const GlobalValue *GV,
continue;
}

// A non-addr-taken global cannot alias with any non-pointer value.
if (!Input->getType()->isPointerTy())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea why we don't forbid alias queries on non-pointers. CC @nikic.

In any case, you can hoist this check out of the loop.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vzakhari vzakhari merged commit 75bd46e into llvm:main Feb 20, 2025
8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Feb 20, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/13308

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/pgo1.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fprofile-generate      -Xclang "-fprofile-instrument=llvm"
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fprofile-generate -Xclang -fprofile-instrument=llvm
# RUN: at line 3
env LLVM_PROFILE_FILE=llvm.profraw /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp 2>&1
# executed command: env LLVM_PROFILE_FILE=llvm.profraw /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp
# RUN: at line 4
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/llvm-profdata show --all-functions --counts      amdgcn-amd-amdhsa.llvm.profraw | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c      --check-prefix="LLVM-PGO"
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/llvm-profdata show --all-functions --counts amdgcn-amd-amdhsa.llvm.profraw
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c --check-prefix=LLVM-PGO
# RUN: at line 8
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fprofile-instr-generate      -Xclang "-fprofile-instrument=clang"
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fprofile-instr-generate -Xclang -fprofile-instrument=clang
# RUN: at line 10
env LLVM_PROFILE_FILE=clang.profraw /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp 2>&1
# executed command: env LLVM_PROFILE_FILE=clang.profraw /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp
# RUN: at line 11
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/llvm-profdata show --all-functions --counts      amdgcn-amd-amdhsa.clang.profraw | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c      --check-prefix="CLANG-PGO"
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/llvm-profdata show --all-functions --counts amdgcn-amd-amdhsa.clang.profraw
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c --check-prefix=CLANG-PGO
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c:54:15: error: CLANG-PGO: expected string not found in input
# | // CLANG-PGO: Block counts: [11, 20]
# |               ^
# | <stdin>:5:19: note: scanning from here
# |  Function count: 0
# |                   ^
# | <stdin>:6:2: note: possible intended match here
# |  Block counts: [13, 20]
# |  ^
# | 
# | Input file: <stdin>
# | Check file: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Counters: 
# |             2:  pgo1.c:__omp_offloading_802_d8283c4_main_l27: 
# |             3:  Hash: 0x000000011b11b451 
# |             4:  Counters: 3 
# |             5:  Function count: 0 
# | check:54'0                       X error: no match found
# |             6:  Block counts: [13, 20] 
...

vzakhari added a commit to vzakhari/llvm-project that referenced this pull request Feb 21, 2025
This change is inspired by a case in facerec benchmark, where performance
of scalar code may improve by about 6%@aarch64 due to getting rid of redundant
loads from Fortran descriptors. These descriptors are corresponding
to subroutine local ALLOCATABLE, SAVE variables. The scalar loop nest
in LocalMove subroutine contains call to Fortran runtime IO functions,
and LLVM globals-aa analysis cannot prove that these calls do not modify
the globalized descriptors with internal linkage.

This patch sets and propagates llvm.memory_effects attribute for fir.call
operations calling Fortran runtime functions. In particular, it tries
to set the Other memory effect to NoModRef. The Other memory effect
includes accesses to globals and captured pointers, so we cannot set
it for functions taking Fortran descriptors with one exception
for calls where the Fortran descriptor arguments are all null.

As long as different calls to the same Fortran runtime function may have
different attributes, I decided to attach the attributes to the calls
rather than functions. Moreover, attaching the attributes to func.func
will require propagating these attributes to llvm.func, which is not
happening right now.

In addition to llvm.memory_effects, the new pass sets llvm.nosync
and llvm.nocallback attributes that may also help LLVM alias analysis
(e.g. see llvm#127707). These attributes are ignored currently.
I will support them in LLVM IR dialect in a separate patch.

I also added another pass for developers to be able to print
declarations/calls of all Fortran runtime functions that are recognized
by the attributes setting pass. It should help with maintenance
of the LIT tests.
vzakhari added a commit that referenced this pull request Feb 24, 2025
#128093)

This change is inspired by a case in facerec benchmark, where
performance
of scalar code may improve by about 6%@aarch64 due to getting rid of
redundant
loads from Fortran descriptors. These descriptors are corresponding
to subroutine local ALLOCATABLE, SAVE variables. The scalar loop nest
in LocalMove subroutine contains call to Fortran runtime IO functions,
and LLVM globals-aa analysis cannot prove that these calls do not modify
the globalized descriptors with internal linkage.

This patch sets and propagates llvm.memory_effects attribute for
fir.call
operations calling Fortran runtime functions. In particular, it tries
to set the Other memory effect to NoModRef. The Other memory effect
includes accesses to globals and captured pointers, so we cannot set
it for functions taking Fortran descriptors with one exception
for calls where the Fortran descriptor arguments are all null.

As long as different calls to the same Fortran runtime function may have
different attributes, I decided to attach the attributes to the calls
rather than functions. Moreover, attaching the attributes to func.func
will require propagating these attributes to llvm.func, which is not
happening right now.

In addition to llvm.memory_effects, the new pass sets llvm.nosync
and llvm.nocallback attributes that may also help LLVM alias analysis
(e.g. see #127707). These attributes are ignored currently.
I will support them in LLVM IR dialect in a separate patch.

I also added another pass for developers to be able to print
declarations/calls of all Fortran runtime functions that are recognized
by the attributes setting pass. It should help with maintenance
of the LIT tests.
vzakhari added a commit that referenced this pull request Feb 24, 2025
… runtime. (#128093)"

This change is inspired by a case in facerec benchmark, where
performance
of scalar code may improve by about 6%@aarch64 due to getting rid of
redundant
loads from Fortran descriptors. These descriptors are corresponding
to subroutine local ALLOCATABLE, SAVE variables. The scalar loop nest
in LocalMove subroutine contains call to Fortran runtime IO functions,
and LLVM globals-aa analysis cannot prove that these calls do not modify
the globalized descriptors with internal linkage.

This patch sets and propagates llvm.memory_effects attribute for
fir.call
operations calling Fortran runtime functions. In particular, it tries
to set the Other memory effect to NoModRef. The Other memory effect
includes accesses to globals and captured pointers, so we cannot set
it for functions taking Fortran descriptors with one exception
for calls where the Fortran descriptor arguments are all null.

As long as different calls to the same Fortran runtime function may have
different attributes, I decided to attach the attributes to the calls
rather than functions. Moreover, attaching the attributes to func.func
will require propagating these attributes to llvm.func, which is not
happening right now.

In addition to llvm.memory_effects, the new pass sets llvm.nosync
and llvm.nocallback attributes that may also help LLVM alias analysis
(e.g. see #127707). These attributes are ignored currently.
I will support them in LLVM IR dialect in a separate patch.

I also added another pass for developers to be able to print
declarations/calls of all Fortran runtime functions that are recognized
by the attributes setting pass. It should help with maintenance
of the LIT tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants