Commit 9398247

[AMDGPU] Fix DynLDS causing crash when LowerLDS is run at fullLTO pipeline
Direct-mapped dynamic LDS is not lowered in the LowerLDSModule pass, so it is never marked with an absolute symbol. When the lowerLDS pass is rerun in LTO, compilation fails with the fatal error "Module cannot mix absolute and non-absolute LDS GVs". This patch changes the check so that a GV whose absoluteness differs from that of the GVs seen so far is accepted if it is direct-mapped dynamic LDS; any other mix of absolute and non-absolute GVs still fails with the same error. Fixes SWDEV-454281
1 parent 0c02811 commit 9398247
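
As a rough illustration of the check described above, the following standalone sketch models it without any LLVM dependency. The names `LdsVar` and `ldsVarsAreConsistent` are hypothetical and do not exist in LLVM. The two boolean fields stand in for `GV->isAbsoluteSymbolRef()` and for the patch's `IsDirectMapDynLDSGV` condition. The `else` branch that records the first value is an assumption, since that part of the function is outside the hunk shown in this commit. The sketch returns `false` where the real code calls `report_fatal_error`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an LDS global variable; this is not an LLVM type.
struct LdsVar {
  std::string Name;
  bool IsAbsolute;        // models GV->isAbsoluteSymbolRef()
  bool IsDirectMapDynLDS; // models isDynamicLDS(*GV) && DirectMapKernel.contains(Fn)
};

// Returns true if the variables pass the consistency check, and false where
// the patched code would call report_fatal_error.
bool ldsVarsAreConsistent(const std::vector<LdsVar> &Vars) {
  std::optional<bool> HasAbsoluteGVs;
  for (const LdsVar &V : Vars) {
    if (HasAbsoluteGVs.has_value()) {
      if (*HasAbsoluteGVs != V.IsAbsolute) {
        // New in the patch: direct-mapped dynamic LDS is tolerated even
        // though its absoluteness differs from what was seen so far.
        if (V.IsDirectMapDynLDS)
          continue;
        return false; // "Module cannot mix absolute and non-absolute LDS GVs"
      }
    } else {
      // Assumption: the first variable seen fixes the expected value. This
      // branch is not part of the hunk shown in the commit.
      HasAbsoluteGVs = V.IsAbsolute;
    }
  }
  return true;
}
```

Under these assumptions, a module that has already been lowered once (an absolute `@lds` plus a non-absolute, direct-mapped `@dynlds`, as in the updated test) passes the check, while an absolute variable mixed with an ordinary non-absolute one is still rejected.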

2 files changed, +9 -3 lines changed

llvm/lib/Target/AMDGPU/Utils/AMDGPUMemoryUtils.cpp

Lines changed: 7 additions & 2 deletions
@@ -207,16 +207,21 @@ LDSUsesInfoTy getTransitiveUsesOfLDS(const CallGraph &CG, Module &M) {
   }
 
   // Verify that we fall into one of 2 cases:
-  //    - All variables are absolute: this is a re-run of the pass
+  //    - All variables are either absolute
+  //      or direct mapped dynamic LDS that is not lowered.
+  //      this is a re-run of the pass
   //      so we don't have anything to do.
   //    - No variables are absolute.
   std::optional<bool> HasAbsoluteGVs;
   for (auto &Map : {DirectMapKernel, IndirectMapKernel}) {
     for (auto &[Fn, GVs] : Map) {
       for (auto *GV : GVs) {
         bool IsAbsolute = GV->isAbsoluteSymbolRef();
+        bool IsDirectMapDynLDSGV = AMDGPU::isDynamicLDS(*GV) && DirectMapKernel.contains(Fn);
         if (HasAbsoluteGVs.has_value()) {
-          if (*HasAbsoluteGVs != IsAbsolute) {
+          if (*HasAbsoluteGVs != IsAbsolute ) {
+            if(IsDirectMapDynLDSGV)
+              continue;
             report_fatal_error(
                 "Module cannot mix absolute and non-absolute LDS GVs");
           }

llvm/test/CodeGen/AMDGPU/lto-lower-module-lds.ll

Lines changed: 2 additions & 1 deletion
@@ -39,9 +39,10 @@
 ; CHECK: Lower uses of LDS variables from non-kernel functions
 
 @lds = internal unnamed_addr addrspace(3) global i32 poison, align 4
-
+@dynlds = external addrspace(3) global [0 x i32]
 define amdgpu_kernel void @test() {
 entry:
   store i32 1, ptr addrspace(3) @lds
+  store i32 0, ptr addrspace(3) @dynlds
   ret void
 }
