Commit 80fd5fa

[AMDGPU] Replace non-kernel function uses of LDS globals by pointers.
The main motivation for replacing LDS uses within non-kernel functions by pointers is to keep the subsequent LDS lowering pass from packing LDS globals (which may be large) directly into a struct type, which would otherwise allocate a huge struct instance within every kernel.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103225
1 parent 7a97cd9 commit 80fd5fa

24 files changed: +1968 -1 lines changed

llvm/lib/Target/AMDGPU/AMDGPU.h

Lines changed: 9 additions & 0 deletions
@@ -71,6 +71,7 @@ FunctionPass *createAMDGPUMachineCFGStructurizerPass();
 FunctionPass *createAMDGPUPropagateAttributesEarlyPass(const TargetMachine *);
 ModulePass *createAMDGPUPropagateAttributesLatePass(const TargetMachine *);
 FunctionPass *createAMDGPURewriteOutArgumentsPass();
+ModulePass *createAMDGPUReplaceLDSUseWithPointerPass();
 ModulePass *createAMDGPULowerModuleLDSPass();
 FunctionPass *createSIModeRegisterPass();

@@ -146,6 +147,14 @@ struct AMDGPUPropagateAttributesLatePass
   TargetMachine &TM;
 };

+void initializeAMDGPUReplaceLDSUseWithPointerPass(PassRegistry &);
+extern char &AMDGPUReplaceLDSUseWithPointerID;
+
+struct AMDGPUReplaceLDSUseWithPointerPass
+    : PassInfoMixin<AMDGPUReplaceLDSUseWithPointerPass> {
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
+};
+
 void initializeAMDGPULowerModuleLDSPass(PassRegistry &);
 extern char &AMDGPULowerModuleLDSID;

llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp

Lines changed: 7 additions & 0 deletions
@@ -24,6 +24,13 @@
 // A possible future refinement is to specialise the structure per-kernel, so
 // that fields can be elided based on more expensive analysis.
 //
+// NOTE: Since this pass will directly pack LDS (assume large LDS) into a struct
+// type which would cause allocating huge memory for struct instance within
+// every kernel. Hence, before running this pass, it is advisable to run the
+// pass "amdgpu-replace-lds-use-with-pointer" which will replace LDS uses within
+// non-kernel functions by pointers and thereby minimizes the unnecessary per
+// kernel allocation of LDS memory.
+//
 //===----------------------------------------------------------------------===//

 #include "AMDGPU.h"
