[CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering #97727
Conversation
Currently, the LowerConstantIntrinsics pass does an RPO traversal of every function... only to find that many functions don't have constant intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is already a pre-isel intrinsic lowering pass, which iterates over intrinsic declarations and lowers all users. Call lowerConstantIntrinsics from this pass to avoid the extra iteration over the entire IR and the RPO traversal. I'm not sure how this interacts with AMDGPUPromoteAlloca, but tests seem to pass.
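For readers unfamiliar with the pass structure, here is a minimal sketch (assumed names and structure, not code from this patch) of the declaration-driven iteration described above: instead of doing an RPO walk over every function body, the pre-isel lowering only looks at the module's intrinsic declarations and visits their users.

#include "llvm/IR/Instructions.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Sketch only: find is.constant/objectsize calls by walking the users of the
// intrinsic declarations. Functions without such calls are never traversed.
// Real code must iterate more carefully, since lowering deletes uses.
static bool sketchLowerConstantIntrinsics(Module &M) {
  bool Changed = false;
  for (Function &F : M.functions()) {
    // Intrinsics only ever appear as declarations; skip defined functions.
    if (!F.isDeclaration())
      continue;
    switch (F.getIntrinsicID()) {
    case Intrinsic::is_constant:
    case Intrinsic::objectsize:
      for (User *U : F.users())
        if (isa<CallInst>(U))
          Changed = true; // placeholder: the real pass lowers the call here
      break;
    default:
      break;
    }
  }
  return Changed;
}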
@llvm/pr-subscribers-backend-arm @llvm/pr-subscribers-backend-amdgpu

Author: Alexis Engelke (aengelke)

Changes: see the description above. This avoids one non-cheap pass in the -O0 pipeline. c-t-t.

Patch is 20.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97727.diff

19 Files Affected:
diff --git a/llvm/include/llvm/LinkAllPasses.h b/llvm/include/llvm/LinkAllPasses.h
index 311d38e8a751f..8d7a1a21385f3 100644
--- a/llvm/include/llvm/LinkAllPasses.h
+++ b/llvm/include/llvm/LinkAllPasses.h
@@ -89,7 +89,6 @@ namespace {
(void) llvm::createLoopSimplifyPass();
(void) llvm::createLoopStrengthReducePass();
(void) llvm::createLoopUnrollPass();
- (void) llvm::createLowerConstantIntrinsicsPass();
(void) llvm::createLowerGlobalDtorsLegacyPass();
(void) llvm::createLowerInvokePass();
(void) llvm::createLowerSwitchPass();
diff --git a/llvm/include/llvm/Transforms/Scalar.h b/llvm/include/llvm/Transforms/Scalar.h
index f74a49785e11b..98d0adca35521 100644
--- a/llvm/include/llvm/Transforms/Scalar.h
+++ b/llvm/include/llvm/Transforms/Scalar.h
@@ -149,13 +149,6 @@ extern char &InferAddressSpacesID;
//
FunctionPass *createTLSVariableHoistPass();
-//===----------------------------------------------------------------------===//
-//
-// LowerConstantIntrinsicss - Expand any remaining llvm.objectsize and
-// llvm.is.constant intrinsic calls, even for the unknown cases.
-//
-FunctionPass *createLowerConstantIntrinsicsPass();
-
//===----------------------------------------------------------------------===//
//
// PartiallyInlineLibCalls - Tries to inline the fast path of library
diff --git a/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h b/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h
index e8e404bb93d6e..3d4d1034c0b24 100644
--- a/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h
+++ b/llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h
@@ -19,7 +19,12 @@
namespace llvm {
+class DominatorTree;
class Function;
+class TargetLibraryInfo;
+
+bool lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
+ DominatorTree *DT);
struct LowerConstantIntrinsicsPass :
PassInfoMixin<LowerConstantIntrinsicsPass> {
diff --git a/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp b/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
index 0777acf633187..368e7e6052674 100644
--- a/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
+++ b/llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
@@ -14,6 +14,7 @@
#include "llvm/CodeGen/PreISelIntrinsicLowering.h"
#include "llvm/Analysis/ObjCARCInstKind.h"
#include "llvm/Analysis/ObjCARCUtil.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/CodeGen/Passes.h"
#include "llvm/CodeGen/TargetLowering.h"
@@ -24,10 +25,12 @@
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
+#include "llvm/IR/Use.h"
#include "llvm/InitializePasses.h"
#include "llvm/Pass.h"
#include "llvm/Support/Casting.h"
#include "llvm/Target/TargetMachine.h"
+#include "llvm/Transforms/Scalar/LowerConstantIntrinsics.h"
#include "llvm/Transforms/Utils/LowerMemIntrinsics.h"
using namespace llvm;
@@ -45,6 +48,7 @@ namespace {
struct PreISelIntrinsicLowering {
const TargetMachine &TM;
const function_ref<TargetTransformInfo &(Function &)> LookupTTI;
+ const function_ref<TargetLibraryInfo &(Function &)> LookupTLI;
/// If this is true, assume it's preferably to leave memory intrinsic calls
/// for replacement with a library call later. Otherwise this depends on
@@ -54,8 +58,9 @@ struct PreISelIntrinsicLowering {
explicit PreISelIntrinsicLowering(
const TargetMachine &TM_,
function_ref<TargetTransformInfo &(Function &)> LookupTTI_,
+ function_ref<TargetLibraryInfo &(Function &)> LookupTLI_,
bool UseMemIntrinsicLibFunc_ = true)
- : TM(TM_), LookupTTI(LookupTTI_),
+ : TM(TM_), LookupTTI(LookupTTI_), LookupTLI(LookupTLI_),
UseMemIntrinsicLibFunc(UseMemIntrinsicLibFunc_) {}
static bool shouldExpandMemIntrinsicWithSize(Value *Size,
@@ -66,6 +71,27 @@ struct PreISelIntrinsicLowering {
} // namespace
+template <class T>
+static bool forEachCall(Function &Intrin, T Callback) {
+ // Lowering all intrinsics in a function will delete multiple uses, so we
+ // can't use an early-inc-range. In case some remain, we don't want to look
+ // at them again. Unfortunately, Value::UseList is private, so we can't use a
+ // simple Use**. If LastUse is null, the next use to consider is
+ // Intrin.use_begin(), otherwise it's LastUse->getNext().
+ Use *LastUse = nullptr;
+ bool Changed = false;
+ while (!Intrin.use_empty() && (!LastUse || LastUse->getNext())) {
+ Use *U = LastUse ? LastUse->getNext() : &*Intrin.use_begin();
+ bool Removed = false;
+ auto CI = dyn_cast<CallInst>(U->getUser());
+ if (CI && CI->getCalledOperand() == &Intrin)
+ Changed |= Removed = Callback(CI);
+ if (!Removed)
+ LastUse = U;
+ }
+ return Changed;
+}
+
static bool lowerLoadRelative(Function &F) {
if (F.use_empty())
return false;
@@ -285,6 +311,16 @@ bool PreISelIntrinsicLowering::lowerIntrinsics(Module &M) const {
case Intrinsic::load_relative:
Changed |= lowerLoadRelative(F);
break;
+ case Intrinsic::is_constant:
+ case Intrinsic::objectsize:
+ Changed |= forEachCall(F, [&](CallInst *CI) {
+ Function *Parent = CI->getParent()->getParent();
+ TargetLibraryInfo &TLI = LookupTLI(*Parent);
+ bool Changed = lowerConstantIntrinsics(*Parent, TLI, /*DT=*/nullptr);
+ assert(Changed && "lowerConstantIntrinsics did not lower intrinsic");
+ return Changed;
+ });
+ break;
case Intrinsic::objc_autorelease:
Changed |= lowerObjCCall(F, "objc_autorelease");
break;
@@ -375,6 +411,7 @@ class PreISelIntrinsicLoweringLegacyPass : public ModulePass {
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<TargetTransformInfoWrapperPass>();
+ AU.addRequired<TargetLibraryInfoWrapperPass>();
AU.addRequired<TargetPassConfig>();
}
@@ -382,9 +419,12 @@ class PreISelIntrinsicLoweringLegacyPass : public ModulePass {
auto LookupTTI = [this](Function &F) -> TargetTransformInfo & {
return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
};
+ auto LookupTLI = [this](Function &F) -> TargetLibraryInfo & {
+ return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
+ };
const auto &TM = getAnalysis<TargetPassConfig>().getTM<TargetMachine>();
- PreISelIntrinsicLowering Lowering(TM, LookupTTI);
+ PreISelIntrinsicLowering Lowering(TM, LookupTTI, LookupTLI);
return Lowering.lowerIntrinsics(M);
}
};
@@ -396,6 +436,7 @@ char PreISelIntrinsicLoweringLegacyPass::ID;
INITIALIZE_PASS_BEGIN(PreISelIntrinsicLoweringLegacyPass,
"pre-isel-intrinsic-lowering",
"Pre-ISel Intrinsic Lowering", false, false)
+INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetPassConfig)
INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
INITIALIZE_PASS_END(PreISelIntrinsicLoweringLegacyPass,
@@ -413,8 +454,11 @@ PreservedAnalyses PreISelIntrinsicLoweringPass::run(Module &M,
auto LookupTTI = [&FAM](Function &F) -> TargetTransformInfo & {
return FAM.getResult<TargetIRAnalysis>(F);
};
+ auto LookupTLI = [&FAM](Function &F) -> TargetLibraryInfo & {
+ return FAM.getResult<TargetLibraryAnalysis>(F);
+ };
- PreISelIntrinsicLowering Lowering(TM, LookupTTI);
+ PreISelIntrinsicLowering Lowering(TM, LookupTTI, LookupTLI);
if (!Lowering.lowerIntrinsics(M))
return PreservedAnalyses::all();
else
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index 3658e8320a0cc..eb74c6f3e5d44 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -845,7 +845,6 @@ void TargetPassConfig::addIRPasses() {
// TODO: add a pass insertion point here
addPass(&GCLoweringID);
addPass(&ShadowStackGCLoweringID);
- addPass(createLowerConstantIntrinsicsPass());
// For MachO, lower @llvm.global_dtors into @llvm.global_ctors with
// __cxa_atexit() calls to avoid emitting the deprecated __mod_term_func.
diff --git a/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp b/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp
index 939c36164f781..50d1b4297cf5e 100644
--- a/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp
+++ b/llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp
@@ -96,8 +96,8 @@ static bool replaceConditionalBranchesOnConstant(Instruction *II,
return HasDeadBlocks;
}
-static bool lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
- DominatorTree *DT) {
+bool llvm::lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
+ DominatorTree *DT) {
std::optional<DomTreeUpdater> DTU;
if (DT)
DTU.emplace(DT, DomTreeUpdater::UpdateStrategy::Lazy);
@@ -165,45 +165,3 @@ LowerConstantIntrinsicsPass::run(Function &F, FunctionAnalysisManager &AM) {
return PreservedAnalyses::all();
}
-
-namespace {
-/// Legacy pass for lowering is.constant intrinsics out of the IR.
-///
-/// When this pass is run over a function it converts is.constant intrinsics
-/// into 'true' or 'false'. This complements the normal constant folding
-/// to 'true' as part of Instruction Simplify passes.
-class LowerConstantIntrinsics : public FunctionPass {
-public:
- static char ID;
- LowerConstantIntrinsics() : FunctionPass(ID) {
- initializeLowerConstantIntrinsicsPass(*PassRegistry::getPassRegistry());
- }
-
- bool runOnFunction(Function &F) override {
- const TargetLibraryInfo &TLI =
- getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
- DominatorTree *DT = nullptr;
- if (auto *DTWP = getAnalysisIfAvailable<DominatorTreeWrapperPass>())
- DT = &DTWP->getDomTree();
- return lowerConstantIntrinsics(F, TLI, DT);
- }
-
- void getAnalysisUsage(AnalysisUsage &AU) const override {
- AU.addRequired<TargetLibraryInfoWrapperPass>();
- AU.addPreserved<GlobalsAAWrapperPass>();
- AU.addPreserved<DominatorTreeWrapperPass>();
- }
-};
-} // namespace
-
-char LowerConstantIntrinsics::ID = 0;
-INITIALIZE_PASS_BEGIN(LowerConstantIntrinsics, "lower-constant-intrinsics",
- "Lower constant intrinsics", false, false)
-INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
-INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
-INITIALIZE_PASS_END(LowerConstantIntrinsics, "lower-constant-intrinsics",
- "Lower constant intrinsics", false, false)
-
-FunctionPass *llvm::createLowerConstantIntrinsicsPass() {
- return new LowerConstantIntrinsics();
-}
diff --git a/llvm/lib/Transforms/Scalar/Scalar.cpp b/llvm/lib/Transforms/Scalar/Scalar.cpp
index cb1456b146325..86669e8c5aa49 100644
--- a/llvm/lib/Transforms/Scalar/Scalar.cpp
+++ b/llvm/lib/Transforms/Scalar/Scalar.cpp
@@ -32,7 +32,6 @@ void llvm::initializeScalarOpts(PassRegistry &Registry) {
initializeLoopStrengthReducePass(Registry);
initializeLoopUnrollPass(Registry);
initializeLowerAtomicLegacyPassPass(Registry);
- initializeLowerConstantIntrinsicsPass(Registry);
initializeMergeICmpsLegacyPassPass(Registry);
initializeNaryReassociateLegacyPassPass(Registry);
initializePartiallyInlineLibCallsLegacyPassPass(Registry);
diff --git a/llvm/test/CodeGen/AArch64/O0-pipeline.ll b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
index a0306b8e1e924..90911c0e33b5b 100644
--- a/llvm/test/CodeGen/AArch64/O0-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
@@ -21,7 +21,6 @@
; CHECK-NEXT: Module Verifier
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Expand vector predication intrinsics
; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/AArch64/O3-pipeline.ll b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
index 84e672d14d99d..b63aefa70d863 100644
--- a/llvm/test/CodeGen/AArch64/O3-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
@@ -52,7 +52,6 @@
; CHECK-NEXT: Expand memcmp() to load/stores
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index 08cf83fd2bd0f..161b229ba052a 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -42,7 +42,6 @@
; GCN-O0-NEXT: Lower uses of LDS variables from non-kernel functions
; GCN-O0-NEXT: FunctionPass Manager
; GCN-O0-NEXT: Expand Atomic instructions
-; GCN-O0-NEXT: Lower constant intrinsics
; GCN-O0-NEXT: Remove unreachable blocks from the CFG
; GCN-O0-NEXT: Expand vector predication intrinsics
; GCN-O0-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
@@ -218,7 +217,6 @@
; GCN-O1-NEXT: Lazy Branch Probability Analysis
; GCN-O1-NEXT: Lazy Block Frequency Analysis
; GCN-O1-NEXT: Expand memcmp() to load/stores
-; GCN-O1-NEXT: Lower constant intrinsics
; GCN-O1-NEXT: Remove unreachable blocks from the CFG
; GCN-O1-NEXT: Natural Loop Information
; GCN-O1-NEXT: Post-Dominator Tree Construction
@@ -508,7 +506,6 @@
; GCN-O1-OPTS-NEXT: Lazy Branch Probability Analysis
; GCN-O1-OPTS-NEXT: Lazy Block Frequency Analysis
; GCN-O1-OPTS-NEXT: Expand memcmp() to load/stores
-; GCN-O1-OPTS-NEXT: Lower constant intrinsics
; GCN-O1-OPTS-NEXT: Remove unreachable blocks from the CFG
; GCN-O1-OPTS-NEXT: Natural Loop Information
; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction
@@ -817,7 +814,6 @@
; GCN-O2-NEXT: Lazy Branch Probability Analysis
; GCN-O2-NEXT: Lazy Block Frequency Analysis
; GCN-O2-NEXT: Expand memcmp() to load/stores
-; GCN-O2-NEXT: Lower constant intrinsics
; GCN-O2-NEXT: Remove unreachable blocks from the CFG
; GCN-O2-NEXT: Natural Loop Information
; GCN-O2-NEXT: Post-Dominator Tree Construction
@@ -1134,7 +1130,6 @@
; GCN-O3-NEXT: Lazy Branch Probability Analysis
; GCN-O3-NEXT: Lazy Block Frequency Analysis
; GCN-O3-NEXT: Expand memcmp() to load/stores
-; GCN-O3-NEXT: Lower constant intrinsics
; GCN-O3-NEXT: Remove unreachable blocks from the CFG
; GCN-O3-NEXT: Natural Loop Information
; GCN-O3-NEXT: Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/ARM/O3-pipeline.ll b/llvm/test/CodeGen/ARM/O3-pipeline.ll
index 461920e1c5da1..aa92c7ab4fb9f 100644
--- a/llvm/test/CodeGen/ARM/O3-pipeline.ll
+++ b/llvm/test/CodeGen/ARM/O3-pipeline.ll
@@ -30,7 +30,6 @@
; CHECK-NEXT: Expand memcmp() to load/stores
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
index 13f774c96d5b1..138f0c81238b3 100644
--- a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
@@ -25,7 +25,6 @@
; CHECK-NEXT: Module Verifier
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Expand vector predication intrinsics
; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
index b0c77155c095b..033c61905c1a7 100644
--- a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
@@ -53,7 +53,6 @@
; LAXX-NEXT: Expand memcmp() to load/stores
; LAXX-NEXT: Lower Garbage Collection Instructions
; LAXX-NEXT: Shadow Stack GC Lowering
-; LAXX-NEXT: Lower constant intrinsics
; LAXX-NEXT: Remove unreachable blocks from the CFG
; LAXX-NEXT: Natural Loop Information
; LAXX-NEXT: Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
index cd37bee4592d6..f1c9da078a16e 100644
--- a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
@@ -24,7 +24,6 @@
; CHECK-NEXT: Module Verifier
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Expand vector predication intrinsics
; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
index a564a5517f579..be0fbf3abcda2 100644
--- a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
@@ -54,7 +54,6 @@
; CHECK-NEXT: Expand memcmp() to load/stores
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 953eb873b660b..fd2ba49ea8610 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -25,7 +25,6 @@
; CHECK-NEXT: Module Verifier
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Expand vector predication intrinsics
; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 3611d92826235..373bdb9d1db0b 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -54,7 +54,6 @@
; CHECK-NEXT: Expand memcmp() to load/stores
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: Post-Dominator Tree Construction
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 40648adeb91cd..9e92c10233fd9 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -25,7 +25,6 @@
; CHECK-NEXT: Module Verifier
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Expand vector predication intrinsics
; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
diff --git a/llvm/test/CodeGen/X86/opt-pipeline.ll b/llvm/test/CodeGen/X86/opt-pipeline.ll
index 15c496bfb7f66..b921ef915b77d 100644
--- a/llvm/test/CodeGen/X86/opt-pipeline.ll
+++ b/llvm/test/CodeGen/X86/opt-pipeline.ll
@@ -51,7 +51,6 @@
; CHECK-NEXT: Expand memcmp() to load/stores
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Natural Loop Information
; CHE...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
Use *U = LastUse ? LastUse->getNext() : &*Intrin.use_begin();
bool Removed = false;
auto CI = dyn_cast<CallInst>(U->getUser());
if (CI && CI->getCalledOperand() == &Intrin)
These conditions should always be true. The IR verifier checks that.
There's the case of assume-like intrinsics, which are permitted (example with debug record). I think that's also the reason this check is found elsewhere in the file.
Huh, I didn't know that. I guess you could still drop the getCalledOperand check.
Changed. I couldn't come up with a case where this invariant does not hold. It might make sense to simplify the other checks later.
But metadata uses are not part of the use list.
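A small illustration of the distinction discussed in this thread (the helper name is made up, not part of the patch): a Use of the intrinsic declaration can come from a call to it, or from an instruction that merely takes the declaration as an operand; the getCalledOperand() comparison in forEachCall accepts only the former, while metadata-wrapped references, as noted above, never appear in the use list in the first place.

#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Hypothetical helper, shown only to spell out the check discussed above:
// accept a use only if its user is a CallInst whose callee operand is the
// intrinsic declaration itself (i.e. a direct call, not an operand use).
static bool isDirectCallTo(const Use &U, const Function &Intrin) {
  const auto *CI = dyn_cast<CallInst>(U.getUser());
  return CI && CI->getCalledOperand() == &Intrin;
}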
Similar to #97727; avoid an extra pass over the entire IR by performing the lowering as part of the pre-isel-intrinsic-lowering pass.
@@ -96,8 +96,8 @@ static bool replaceConditionalBranchesOnConstant(Instruction *II,
   return HasDeadBlocks;
 }

-static bool lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
-                                    DominatorTree *DT) {
+bool llvm::lowerConstantIntrinsics(Function &F, const TargetLibraryInfo &TLI,
This function is still doing an RPOT traversal, although the commit message says that we now avoid the RPOT traversal. Is that intentional?
I guess the goal was to call lowerIsConstantIntrinsic and lowerObjectSizeCall directly from pre-isel-lowering, rather than running lowerConstantIntrinsics for each such call in the function?
I think it is really weird to call this helper multiple times for the same function. Right?
> although the commit message says that we now avoid the RPOT traversal. Is that intentional?

Yes. For most functions, the RPOT is avoided because they have no constant intrinsics. If a function has constant intrinsics, it still does an RPOT.

> I think it is really weird to call this helper multiple times for the same function. Right?

This function is called once per function (well, except for the degenerate case you point out below...): here we (somewhat) iterate over the uses, and when we find one, we lower all intrinsic calls in the function. These are then no longer in the use list, so the next use is in a different function.
Oh, right. So PreISelIntrinsicLowering::lowerIntrinsics is iterating over used intrinsic functions. And then the forEachCall thing is finding out in which functions the intrinsic is called. That way you avoid doing the RPO scanning for every function in the module.
Just a little bit weird that there is a dependency: the switch in PreISelIntrinsicLowering::lowerIntrinsics may trigger on, for example, Intrinsic::objectsize, and then the calls to llvm::lowerConstantIntrinsics may lower lots of other intrinsics, and even remove dead code, as a "side-effect". There is no clear correspondence/correlation between what llvm::lowerConstantIntrinsics is doing and what is switched on in PreISelIntrinsicLowering::lowerIntrinsics.
Function *Parent = CI->getParent()->getParent();
TargetLibraryInfo &TLI = LookupTLI(*Parent);
bool Changed = lowerConstantIntrinsics(*Parent, TLI, /*DT=*/nullptr);
assert(Changed && "lowerConstantIntrinsics did not lower intrinsic");
This assert fails for input like this:

define void @bbi_97828() {
  ret void
dead:
  %x = call i32 @llvm.objectsize.i32.p21(ptr addrspace(21) null, i1 false, i1 false, i1 false)
  br label %dead
}

since lowerConstantIntrinsics is doing an RPOT traversal to find out which intrinsics to lower. So it may fail to find CI, and then there are no changes.
Good catch! Opened #102442 to remove the assertion and add a test case.
- After llvm#97727 and llvm#101652, `LowerConstantIntrinsics` and `ExpandVectorPredicationPass` are no longer dedicated passes.