Commit 17ea305

[AMDGPU] Run LowerLDS at the end of the fullLTO pipeline
This change allows `--lto-partitions` to be used in some cases (it is not guaranteed to work perfectly), because LDS is now lowered before the module is split for parallel codegen.
1 parent d4569d4 commit 17ea305

1 file changed: +9 -0 lines changed

llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp

Lines changed: 9 additions & 0 deletions
@@ -793,6 +793,15 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(
 
         PM.addPass(createCGSCCToFunctionPassAdaptor(std::move(FPM)));
       });
+
+  PB.registerFullLinkTimeOptimizationLastEPCallback(
+      [this](ModulePassManager &PM, OptimizationLevel Level) {
+        // We want to support the -lto-partitions=N option as "best effort".
+        // For that, we need to lower LDS earlier in the pipeline before the
+        // module is partitioned for codegen.
+        if (EnableLowerModuleLDS)
+          PM.addPass(AMDGPULowerModuleLDSPass(*this));
+      });
 }
 
 int64_t AMDGPUTargetMachine::getNullPointerValue(unsigned AddrSpace) {
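
The ordering this change relies on can be illustrated with a small standalone model. This is only a sketch and does not use LLVM: `ToyPassManager`, `ToyPassBuilder`, `registerFullLTOLastEPCallback`, `toyFullLTOSteps`, and the pass-name strings are all hypothetical stand-ins invented for illustration. It assumes only what the diff shows: a callback registered at the "last" full-LTO extension point appends its pass at the end of the pipeline, and that pipeline finishes before the module is partitioned for codegen.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a module pass manager (NOT llvm::ModulePassManager).
// It only records pass names in the order they were added.
struct ToyPassManager {
  std::vector<std::string> Passes;
  void addPass(std::string Name) { Passes.push_back(std::move(Name)); }
};

// Toy stand-in for a pass builder with one extension point
// (NOT llvm::PassBuilder).
class ToyPassBuilder {
  using Callback = std::function<void(ToyPassManager &)>;
  std::vector<Callback> FullLTOLastEPCallbacks;

public:
  // Hypothetical analogue of registering a "full LTO, last" callback.
  void registerFullLTOLastEPCallback(Callback C) {
    FullLTOLastEPCallbacks.push_back(std::move(C));
  }

  // Builds a made-up full-LTO pipeline: generic passes first, then every
  // registered last-extension-point callback, in registration order.
  ToyPassManager buildFullLTOPipeline() const {
    ToyPassManager PM;
    PM.addPass("generic-lto-opts"); // placeholder for the usual LTO passes
    for (const Callback &C : FullLTOLastEPCallbacks) {
      assert(C && "callback must be callable");
      C(PM);
    }
    return PM;
  }
};

// Hypothetical driver: returns the ordered steps of the toy flow, ending
// with the split of the module for parallel codegen.
std::vector<std::string> toyFullLTOSteps(bool EnableLowerModuleLDS) {
  ToyPassBuilder PB;
  PB.registerFullLTOLastEPCallback([EnableLowerModuleLDS](ToyPassManager &PM) {
    // Same shape as the callback in the diff: add the lowering pass only
    // when the option is enabled.
    if (EnableLowerModuleLDS)
      PM.addPass("lower-module-lds");
  });
  std::vector<std::string> Steps = PB.buildFullLTOPipeline().Passes;
  Steps.push_back("partition-for-codegen"); // happens after the pipeline
  return Steps;
}
```

In this model, `toyFullLTOSteps(true)` places the LDS lowering step directly before the partitioning step, which is the property the commit message depends on. With the option disabled, the lowering step is simply absent and the rest of the flow is unchanged.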
