Skip to content

[CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering #97727

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 1, 2024

Conversation

aengelke
Copy link
Contributor

@aengelke aengelke commented Jul 4, 2024

Currently, the LowerConstantIntrinsics pass does an RPO traversal of every function... only to find that many functions don't have constant intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is already a pre-isel intrinsic lowering pass, which iterates over intrinsic declarations and lowers all users. Call lowerConstantIntrinsics from this pass to avoid the extra iteration over the entire IR and the RPO traversal.


I'm not sure how this interacts with AMDGPUPromoteAlloca, but tests seem to pass.

This avoids one non-cheap pass in the -O0 pipeline. c-t-t.

Currently, the LowerConstantIntrinsics pass does an RPO traversal of
every function... only to find that many functions don't have constant
intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is
already a pre-isel intrinsic lowering pass, which iterates over
intrinsic declarations and lowers all users. Call
lowerConstantIntrinsics from this pass to avoid the extra iteration over
the entire IR and the RPO traversal.

I'm not sure how this interacts with AMDGPUPromoteAlloca, but tests seem
to pass.
@llvmbot
Copy link
Member

llvmbot commented Jul 4, 2024

@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-loongarch
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-powerpc

@llvm/pr-subscribers-backend-amdgpu

Author: Alexis Engelke (aengelke)

Changes

Currently, the LowerConstantIntrinsics pass does an RPO traversal of every function... only to find that many functions don't have constant intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is already a pre-isel intrinsic lowering pass, which iterates over intrinsic declarations and lowers all users. Call lowerConstantIntrinsics from this pass to avoid the extra iteration over the entire IR and the RPO traversal.


I'm not sure how this interacts with AMDGPUPromoteAlloca, but tests seem to pass.

This avoids one non-cheap pass in the -O0 pipeline. c-t-t.


Patch is 20.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97727.diff

19 Files Affected:

  • (modified) llvm/include/llvm/LinkAllPasses.h (-1)
  • (modified) llvm/include/llvm/Transforms/Scalar.h (-7)
  • (modified) llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h (+5)
  • (modified) llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp (+47-3)
  • (modified) llvm/lib/CodeGen/TargetPassConfig.cpp (-1)
  • (modified) llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp (+2-44)
  • (modified) llvm/lib/Transforms/Scalar/Scalar.cpp (-1)
  • (modified) llvm/test/CodeGen/AArch64/O0-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/AArch64/O3-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/AMDGPU/llc-pipeline.ll (-5)
  • (modified) llvm/test/CodeGen/ARM/O3-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/LoongArch/O0-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/LoongArch/opt-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/PowerPC/O0-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/PowerPC/O3-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/RISCV/O0-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/RISCV/O3-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/X86/O0-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/X86/opt-pipeline.ll (-1)
diff --git a/llvm/include/llvm/LinkAllPasses.h b/llvm/include/llvm/LinkAllPasses.h
index 311d38e8a751f..8d7a1a21385f3 100644
--- a/llvm/include/llvm/LinkAllPasses.h
+++ b/llvm/include/llvm/LinkAllPasses.h
@@ -89,7 +89,6 @@ namespace {
       (void) llvm::createLoopSimplifyPass();
       (void) llvm::createLoopStrengthReducePass();
       (void) llvm::createLoopUnrollPass();
-      (void) llvm::createLowerConstantIntrinsicsPass();
       (void) llvm::createLowerGlobalDtorsLegacyPass();
       (void) llvm::createLowerInvokePass();
       (void) llvm::createLowerSwitchPass();
diff --git a/llvm/include/llvm/Transforms/Scalar.h b/llvm/include/llvm/Transforms/Scalar.h
index f74a49785e11b..98d0adca35521 100644
--- a/llvm/include/llvm/Transforms/Scalar.h
+++ b/llvm/include/llvm/Transforms/Scalar.h
@@ -149,13 +149,6 @@ extern char &InferAddressSpacesID;
 //
 FunctionPass *createTLSVariableHoistPass();
 
-//===----------------------------------------------------------------------===//
-//
-// LowerConstantIntrinsicss - Expand any remaining llvm.objectsize and
-// llvm.is.constant intrinsic calls, even for the unknown cases.
-//
-FunctionPass *createLowerConstantIntrinsicsPass();
-
 //===----------------------------------------------------------------------===//
 //
 // PartiallyInlineLibCalls - Tries to inline the fast path of library
diff --git a/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h b/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h
index e8e404bb93d6e..3d4d1034c0b24 100644
--- a/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h
+++ b/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h
@@ -19,7 +19,12 @@
 
 namespace llvm {
 
+class DominatorTree;
 class Function;
+class TargetLibraryInfo;
+
+bool lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
+                             DominatorTree *DT);
 
 struct LowerConstantIntrinsicsPass :
     PassInfoMixin<LowerConstantIntrinsicsPass> {
diff --git a/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp b/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
index 0777acf633187..368e7e6052674 100644
--- a/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
+++ b/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
@@ -14,6 +14,7 @@
 #include "llvm/CodeGen/PreISelIntrinsicLowering.h"
 #include "llvm/Analysis/ObjCARCInstKind.h"
 #include "llvm/Analysis/ObjCARCUtil.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
 #include "llvm/CodeGen/Passes.h"
 #include "llvm/CodeGen/TargetLowering.h"
@@ -24,10 +25,12 @@
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
+#include "llvm/IR/Use.h"
 #include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Target/TargetMachine.h"
+#include "llvm/Transforms/Scalar/LowerConstantIntrinsics.h"
 #include "llvm/Transforms/Utils/LowerMemIntrinsics.h"
 
 using namespace llvm;
@@ -45,6 +48,7 @@ namespace {
 struct PreISelIntrinsicLowering {
   const TargetMachine &TM;
   const function_ref<TargetTransformInfo &(Function &)> LookupTTI;
+  const function_ref<TargetLibraryInfo &(Function &)> LookupTLI;
 
   /// If this is true, assume it's preferably to leave memory intrinsic calls
   /// for replacement with a library call later. Otherwise this depends on
@@ -54,8 +58,9 @@ struct PreISelIntrinsicLowering {
   explicit PreISelIntrinsicLowering(
       const TargetMachine &TM_,
       function_ref<TargetTransformInfo &(Function &)> LookupTTI_,
+      function_ref<TargetLibraryInfo &(Function &)> LookupTLI_,
       bool UseMemIntrinsicLibFunc_ = true)
-      : TM(TM_), LookupTTI(LookupTTI_),
+      : TM(TM_), LookupTTI(LookupTTI_), LookupTLI(LookupTLI_),
         UseMemIntrinsicLibFunc(UseMemIntrinsicLibFunc_) {}
 
   static bool shouldExpandMemIntrinsicWithSize(Value *Size,
@@ -66,6 +71,27 @@ struct PreISelIntrinsicLowering {
 
 } // namespace
 
+template <class T>
+static bool forEachCall(Function &Intrin, T Callback) {
+  // Lowering all intrinsics in a function will delete multiple uses, so we
+  // can't use an early-inc-range. In case some remain, we don't want to look
+  // at them again. Unfortunately, Value::UseList is private, so we can't use a
+  // simple Use**. If LastUse is null, we the next use to consider is
+  // Intrin.use_begin(), otherwise it's LastUse->getNext().
+  Use *LastUse = nullptr;
+  bool Changed = false;
+  while (!Intrin.use_empty() && (!LastUse || LastUse->getNext())) {
+    Use *U = LastUse ? LastUse->getNext() : &*Intrin.use_begin();
+    bool Removed = false;
+    auto CI = dyn_cast<CallInst>(U->getUser());
+    if (CI && CI->getCalledOperand() == &Intrin)
+      Changed |= Removed = Callback(CI);
+    if (!Removed)
+      LastUse = U;
+  }
+  return Changed;
+}
+
 static bool lowerLoadRelative(Function &F) {
   if (F.use_empty())
     return false;
@@ -285,6 +311,16 @@ bool PreISelIntrinsicLowering::lowerIntrinsics(Module &M) const {
     case Intrinsic::load_relative:
       Changed |= lowerLoadRelative(F);
       break;
+    case Intrinsic::is_constant:
+    case Intrinsic::objectsize:
+      Changed |= forEachCall(F, [&](CallInst *CI) {
+        Function *Parent = CI->getParent()->getParent();
+        TargetLibraryInfo &TLI = LookupTLI(*Parent);
+        bool Changed = lowerConstantIntrinsics(*Parent, TLI, /*DT=*/nullptr);
+        assert(Changed && "lowerConstantIntrinsics did not lower intrinsic");
+        return Changed;
+      });
+      break;
     case Intrinsic::objc_autorelease:
       Changed |= lowerObjCCall(F, "objc_autorelease");
       break;
@@ -375,6 +411,7 @@ class PreISelIntrinsicLoweringLegacyPass : public ModulePass {
 
   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.addRequired<TargetTransformInfoWrapperPass>();
+    AU.addRequired<TargetLibraryInfoWrapperPass>();
     AU.addRequired<TargetPassConfig>();
   }
 
@@ -382,9 +419,12 @@ class PreISelIntrinsicLoweringLegacyPass : public ModulePass {
     auto LookupTTI = [this](Function &F) -> TargetTransformInfo & {
       return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
     };
+    auto LookupTLI = [this](Function &F) -> TargetLibraryInfo & {
+      return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
+    };
 
     const auto &TM = getAnalysis<TargetPassConfig>().getTM<TargetMachine>();
-    PreISelIntrinsicLowering Lowering(TM, LookupTTI);
+    PreISelIntrinsicLowering Lowering(TM, LookupTTI, LookupTLI);
     return Lowering.lowerIntrinsics(M);
   }
 };
@@ -396,6 +436,7 @@ char PreISelIntrinsicLoweringLegacyPass::ID;
 INITIALIZE_PASS_BEGIN(PreISelIntrinsicLoweringLegacyPass,
                       "pre-isel-intrinsic-lowering",
                       "Pre-ISel Intrinsic Lowering", false, false)
+INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
 INITIALIZE_PASS_DEPENDENCY(TargetPassConfig)
 INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
 INITIALIZE_PASS_END(PreISelIntrinsicLoweringLegacyPass,
@@ -413,8 +454,11 @@ PreservedAnalyses PreISelIntrinsicLoweringPass::run(Module &M,
   auto LookupTTI = [&FAM](Function &F) -> TargetTransformInfo & {
     return FAM.getResult<TargetIRAnalysis>(F);
   };
+  auto LookupTLI = [&FAM](Function &F) -> TargetLibraryInfo & {
+    return FAM.getResult<TargetLibraryAnalysis>(F);
+  };
 
-  PreISelIntrinsicLowering Lowering(TM, LookupTTI);
+  PreISelIntrinsicLowering Lowering(TM, LookupTTI, LookupTLI);
   if (!Lowering.lowerIntrinsics(M))
     return PreservedAnalyses::all();
   else
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index 3658e8320a0cc..eb74c6f3e5d44 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -845,7 +845,6 @@ void TargetPassConfig::addIRPasses() {
   // TODO: add a pass insertion point here
   addPass(&GCLoweringID);
   addPass(&ShadowStackGCLoweringID);
-  addPass(createLowerConstantIntrinsicsPass());
 
   // For MachO, lower @llvm.global_dtors into @llvm.global_ctors with
   // __cxa_atexit() calls to avoid emitting the deprecated __mod_term_func.
diff --git a/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp b/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp
index 939c36164f781..50d1b4297cf5e 100644
--- a/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp
+++ b/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp
@@ -96,8 +96,8 @@ static bool replaceConditionalBranchesOnConstant(Instruction *II,
   return HasDeadBlocks;
 }
 
-static bool lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
-                                    DominatorTree *DT) {
+bool llvm::lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
+                                   DominatorTree *DT) {
   std::optional<DomTreeUpdater> DTU;
   if (DT)
     DTU.emplace(DT, DomTreeUpdater::UpdateStrategy::Lazy);
@@ -165,45 +165,3 @@ LowerConstantIntrinsicsPass::run(Function &F, FunctionAnalysisManager &AM) {
 
   return PreservedAnalyses::all();
 }
-
-namespace {
-/// Legacy pass for lowering is.constant intrinsics out of the IR.
-///
-/// When this pass is run over a function it converts is.constant intrinsics
-/// into 'true' or 'false'. This complements the normal constant folding
-/// to 'true' as part of Instruction Simplify passes.
-class LowerConstantIntrinsics : public FunctionPass {
-public:
-  static char ID;
-  LowerConstantIntrinsics() : FunctionPass(ID) {
-    initializeLowerConstantIntrinsicsPass(*PassRegistry::getPassRegistry());
-  }
-
-  bool runOnFunction(Function &F) override {
-    const TargetLibraryInfo &TLI =
-        getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
-    DominatorTree *DT = nullptr;
-    if (auto *DTWP = getAnalysisIfAvailable<DominatorTreeWrapperPass>())
-      DT = &DTWP->getDomTree();
-    return lowerConstantIntrinsics(F, TLI, DT);
-  }
-
-  void getAnalysisUsage(AnalysisUsage &AU) const override {
-    AU.addRequired<TargetLibraryInfoWrapperPass>();
-    AU.addPreserved<GlobalsAAWrapperPass>();
-    AU.addPreserved<DominatorTreeWrapperPass>();
-  }
-};
-} // namespace
-
-char LowerConstantIntrinsics::ID = 0;
-INITIALIZE_PASS_BEGIN(LowerConstantIntrinsics, "lower-constant-intrinsics",
-                      "Lower constant intrinsics", false, false)
-INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
-INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
-INITIALIZE_PASS_END(LowerConstantIntrinsics, "lower-constant-intrinsics",
-                    "Lower constant intrinsics", false, false)
-
-FunctionPass *llvm::createLowerConstantIntrinsicsPass() {
-  return new LowerConstantIntrinsics();
-}
diff --git a/llvm/lib/Transforms/Scalar/Scalar.cpp b/llvm/lib/Transforms/Scalar/Scalar.cpp
index cb1456b146325..86669e8c5aa49 100644
--- a/llvm/lib/Transforms/Scalar/Scalar.cpp
+++ b/llvm/lib/Transforms/Scalar/Scalar.cpp
@@ -32,7 +32,6 @@ void llvm::initializeScalarOpts(PassRegistry &Registry) {
   initializeLoopStrengthReducePass(Registry);
   initializeLoopUnrollPass(Registry);
   initializeLowerAtomicLegacyPassPass(Registry);
-  initializeLowerConstantIntrinsicsPass(Registry);
   initializeMergeICmpsLegacyPassPass(Registry);
   initializeNaryReassociateLegacyPassPass(Registry);
   initializePartiallyInlineLibCallsLegacyPassPass(Registry);
diff --git a/llvm/test/CodeGen/AArch64/O0-pipeline.ll b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
index a0306b8e1e924..90911c0e33b5b 100644
--- a/llvm/test/CodeGen/AArch64/O0-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
@@ -21,7 +21,6 @@
 ; CHECK-NEXT:       Module Verifier
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Expand vector predication intrinsics
 ; CHECK-NEXT:       Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/AArch64/O3-pipeline.ll b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
index 84e672d14d99d..b63aefa70d863 100644
--- a/llvm/test/CodeGen/AArch64/O3-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
@@ -52,7 +52,6 @@
 ; CHECK-NEXT:       Expand memcmp() to load/stores
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Natural Loop Information
 ; CHECK-NEXT:       Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index 08cf83fd2bd0f..161b229ba052a 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -42,7 +42,6 @@
 ; GCN-O0-NEXT:    Lower uses of LDS variables from non-kernel functions
 ; GCN-O0-NEXT:    FunctionPass Manager
 ; GCN-O0-NEXT:      Expand Atomic instructions
-; GCN-O0-NEXT:      Lower constant intrinsics
 ; GCN-O0-NEXT:      Remove unreachable blocks from the CFG
 ; GCN-O0-NEXT:      Expand vector predication intrinsics
 ; GCN-O0-NEXT:      Instrument function entry/exit with calls to e.g. mcount() (post inlining)
@@ -218,7 +217,6 @@
 ; GCN-O1-NEXT:      Lazy Branch Probability Analysis
 ; GCN-O1-NEXT:      Lazy Block Frequency Analysis
 ; GCN-O1-NEXT:      Expand memcmp() to load/stores
-; GCN-O1-NEXT:      Lower constant intrinsics
 ; GCN-O1-NEXT:      Remove unreachable blocks from the CFG
 ; GCN-O1-NEXT:      Natural Loop Information
 ; GCN-O1-NEXT:      Post-Dominator Tree Construction
@@ -508,7 +506,6 @@
 ; GCN-O1-OPTS-NEXT:      Lazy Branch Probability Analysis
 ; GCN-O1-OPTS-NEXT:      Lazy Block Frequency Analysis
 ; GCN-O1-OPTS-NEXT:      Expand memcmp() to load/stores
-; GCN-O1-OPTS-NEXT:      Lower constant intrinsics
 ; GCN-O1-OPTS-NEXT:      Remove unreachable blocks from the CFG
 ; GCN-O1-OPTS-NEXT:      Natural Loop Information
 ; GCN-O1-OPTS-NEXT:      Post-Dominator Tree Construction
@@ -817,7 +814,6 @@
 ; GCN-O2-NEXT:      Lazy Branch Probability Analysis
 ; GCN-O2-NEXT:      Lazy Block Frequency Analysis
 ; GCN-O2-NEXT:      Expand memcmp() to load/stores
-; GCN-O2-NEXT:      Lower constant intrinsics
 ; GCN-O2-NEXT:      Remove unreachable blocks from the CFG
 ; GCN-O2-NEXT:      Natural Loop Information
 ; GCN-O2-NEXT:      Post-Dominator Tree Construction
@@ -1134,7 +1130,6 @@
 ; GCN-O3-NEXT:      Lazy Branch Probability Analysis
 ; GCN-O3-NEXT:      Lazy Block Frequency Analysis
 ; GCN-O3-NEXT:      Expand memcmp() to load/stores
-; GCN-O3-NEXT:      Lower constant intrinsics
 ; GCN-O3-NEXT:      Remove unreachable blocks from the CFG
 ; GCN-O3-NEXT:      Natural Loop Information
 ; GCN-O3-NEXT:      Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/ARM/O3-pipeline.ll b/llvm/test/CodeGen/ARM/O3-pipeline.ll
index 461920e1c5da1..aa92c7ab4fb9f 100644
--- a/llvm/test/CodeGen/ARM/O3-pipeline.ll
+++ b/llvm/test/CodeGen/ARM/O3-pipeline.ll
@@ -30,7 +30,6 @@
 ; CHECK-NEXT:      Expand memcmp() to load/stores
 ; CHECK-NEXT:      Lower Garbage Collection Instructions
 ; CHECK-NEXT:      Shadow Stack GC Lowering
-; CHECK-NEXT:      Lower constant intrinsics
 ; CHECK-NEXT:      Remove unreachable blocks from the CFG
 ; CHECK-NEXT:      Natural Loop Information
 ; CHECK-NEXT:      Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
index 13f774c96d5b1..138f0c81238b3 100644
--- a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
@@ -25,7 +25,6 @@
 ; CHECK-NEXT:       Module Verifier
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Expand vector predication intrinsics
 ; CHECK-NEXT:       Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
index b0c77155c095b..033c61905c1a7 100644
--- a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
@@ -53,7 +53,6 @@
 ; LAXX-NEXT:       Expand memcmp() to load/stores
 ; LAXX-NEXT:       Lower Garbage Collection Instructions
 ; LAXX-NEXT:       Shadow Stack GC Lowering
-; LAXX-NEXT:       Lower constant intrinsics
 ; LAXX-NEXT:       Remove unreachable blocks from the CFG
 ; LAXX-NEXT:       Natural Loop Information
 ; LAXX-NEXT:       Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
index cd37bee4592d6..f1c9da078a16e 100644
--- a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
@@ -24,7 +24,6 @@
 ; CHECK-NEXT:       Module Verifier
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Expand vector predication intrinsics
 ; CHECK-NEXT:       Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
index a564a5517f579..be0fbf3abcda2 100644
--- a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
@@ -54,7 +54,6 @@
 ; CHECK-NEXT:       Expand memcmp() to load/stores
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Natural Loop Information
 ; CHECK-NEXT:       Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 953eb873b660b..fd2ba49ea8610 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -25,7 +25,6 @@
 ; CHECK-NEXT:       Module Verifier
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Expand vector predication intrinsics
 ; CHECK-NEXT:       Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 3611d92826235..373bdb9d1db0b 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -54,7 +54,6 @@
 ; CHECK-NEXT:       Expand memcmp() to load/stores
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Natural Loop Information
 ; CHECK-NEXT:       Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 40648adeb91cd..9e92c10233fd9 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -25,7 +25,6 @@
 ; CHECK-NEXT:       Module Verifier
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Expand vector predication intrinsics
 ; CHECK-NEXT:       Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/X86/opt-pipeline.ll b/llvm/test/CodeGen/X86/opt-pipeline.ll
index 15c496bfb7f66..b921ef915b77d 100644
--- a/llvm/test/CodeGen/X86/opt-pipeline.ll
+++ b/llvm/test/CodeGen/X86/opt-pipeline.ll
@@ -51,7 +51,6 @@
 ; CHECK-NEXT:       Expand memcmp() to load/stores
 ; CHECK-NEXT:       Lower Garbage Collection Instructions
 ; CHECK-NEXT:       Shadow Stack GC Lowering
-; CHECK-NEXT:       Lower constant intrinsics
 ; CHECK-NEXT:       Remove unreachable blocks from the CFG
 ; CHECK-NEXT:       Natural Loop Information
 ; CHE...
[truncated]

Copy link

github-actions bot commented Jul 4, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Use *U = LastUse ? LastUse->getNext() : &*Intrin.use_begin();
bool Removed = false;
auto CI = dyn_cast<CallInst>(U->getUser());
if (CI && CI->getCalledOperand() == &Intrin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These conditions should always be true. The IR verifier checks that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's the case of assume-like intrinsics, which are permitted (example with debug record). I think that's also the reason this check is found elsewhere in the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I didn't know that. I guess you could still drop the getCalledOperand check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. I couldn't come up with a case where this invariant does not hold. It might make sense to simplify the other checks later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But metadata uses are not part of the use list.

@aengelke aengelke merged commit b5fc083 into llvm:main Aug 1, 2024
7 checks passed
@aengelke aengelke deleted the passdrop/constintrin branch August 1, 2024 15:44
aengelke added a commit to aengelke/llvm-project that referenced this pull request Aug 2, 2024
Similar to llvm#97727; avoid an extra pass over the entire IR by performing
the lowering as part of the pre-isel-intrinsic-lowering pass.
aengelke added a commit to aengelke/llvm-project that referenced this pull request Aug 2, 2024
Similar to llvm#97727; avoid an extra pass over the entire IR by performing
the lowering as part of the pre-isel-intrinsic-lowering pass.
aengelke added a commit that referenced this pull request Aug 6, 2024
Similar to #97727; avoid an extra pass over the entire IR by performing
the lowering as part of the pre-isel-intrinsic-lowering pass.
@@ -96,8 +96,8 @@ static bool replaceConditionalBranchesOnConstant(Instruction *II,
return HasDeadBlocks;
}

static bool lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
DominatorTree *DT) {
bool llvm::lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is still doing an RPOT traversal, although the commit message says that we now avoid the RPOT traversal. Is that intentional?

I guess the goal was to call lowerIsConstantIntrisic and lowerObjectSizeCall directly from pre-isel-lowering, rather than running lowerConstantIntrinsics for each such call in the function?

I think it is really weird to call this helper multiple times for the same function. Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although the commit message says that we now avoid the RPOT traversal. Is that intentional?

Yes. For most functions, the RPOT is avoided, because they have no constant intrinsics. If a function has constant intrinsics, it still does a RPOT.

I think it is really weird to call this helper multiple times for the same function. Right?

This function is called once per function (well, except for the degenerate case you point out below...): here we (somewhat) iterate over the uses, and when we find one, we lower all intrinsic calls in the function. These are then no longer in the use list, the next use is in a different function.

Copy link
Collaborator

@bjope bjope Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right. So PreISelIntrinsicLowering::lowerIntrinsics is iterating over used intrinsic functions. And then the forEachCall thing is finding out in which functions that the intrinsic is called. That way you avoid doing the RPO scanning for every function in the module.

Just a little bit weird that there is a dependency, that the switch in PreISelIntrinsicLowering::lowerIntrinsics for example may trigger on Intrinsic::objectsize, and then the calls to llvm::lowerConstantIntrinsics may lower lots of other instrinsics, and even remove dead code, as a "side-effect". There is no clear correspondance/correlation between what llvm::lowerConstantIntrinsics is doing and what is switched on in PreISelIntrinsicLowering::lowerIntrinsics.

Function *Parent = CI->getParent()->getParent();
TargetLibraryInfo &TLI = LookupTLI(*Parent);
bool Changed = lowerConstantIntrinsics(*Parent, TLI, /*DT=*/nullptr);
assert(Changed && "lowerConstantIntrinsics did not lower intrinsic");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert fails for input like this

define void @bbi_97828() {
  ret void

dead:
  %x = call i32 @llvm.objectsize.i32.p21(ptr addrspace(21) null, i1 false, i1 false, i1 false)
  br label %dead
}

since lowerConstantIntrinsics is doing an RPOT traversal to find out which intrinsics to lower. So it may fail to find CI, and then there is no changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Opened #102442 to remove the assertion and add a test case.

darkbuck added a commit that referenced this pull request Feb 15, 2025
- After #97727 and #101652, `LowerConstantIntrinsics` and
  `ExpandVectorPredicationPass` are no longer dedicated passes.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
- After llvm#97727 and llvm#101652, `LowerConstantIntrinsics` and
  `ExpandVectorPredicationPass` are no longer dedicated passes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants