-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AMDGPU] Don't realign already allocated LDS. Point fix for 106412 #106421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-amdgpu Author: Jon Chesterfield (JonChesterfield) ChangesFixes 106412. The logic that skips the pass on already-lowered variables doesn't cover the path that increases alignment of variables. If a variable is allocated at 24 and then given 16 byte alignment, the backend notices and fatal-errors on the inconsistency. Full diff: https://github.com/llvm/llvm-project/pull/106421.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
index 92b42e27a035eb..c96ce740ce026e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
@@ -1131,6 +1131,11 @@ class AMDGPULowerModuleLDS {
continue;
}
+ if (GV.isAbsoluteSymbolRef()) {
+ // If the variable is already allocated, don't change the alignment
+ continue;
+ }
+
Align Alignment = AMDGPU::getAlign(DL, &GV);
TypeSize GVSize = DL.getTypeAllocSize(GV.getValueType());
diff --git a/llvm/test/CodeGen/AMDGPU/lds-no-realign-allocated-variables.ll b/llvm/test/CodeGen/AMDGPU/lds-no-realign-allocated-variables.ll
new file mode 100644
index 00000000000000..5acc5cefbbd9c6
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/lds-no-realign-allocated-variables.ll
@@ -0,0 +1,15 @@
+; RUN: opt -S -mtriple=amdgcn-- -amdgpu-lower-module-lds < %s | FileCheck %s
+
+; Can't have a second variable without absolute_symbol showing it is realigned as
+; there is a fatal error on mixing absolute and non-absolute symbols
+
+; CHECK: @lds.dont_realign = internal addrspace(3) global i64 undef, align 2, !absolute_symbol !0
+@lds.dont_realign = internal addrspace(3) global i64 undef, align 2, !absolute_symbol !0
+
+; CHECK: void @use_variables
+define amdgpu_kernel void @use_variables(i64 %val) {
+ store i64 %val, ptr addrspace(3) @lds.dont_realign, align 2
+ ret void
+}
+
+!0 = !{i32 2, i32 3}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -0,0 +1,15 @@ | |||
; RUN: opt -S -mtriple=amdgcn-- -amdgpu-lower-module-lds < %s | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use -passes=amdgpu-lower-module-lds
; there is a fatal error on mixing absolute and non-absolute symbols | ||
|
||
; CHECK: @lds.dont_realign = internal addrspace(3) global i64 undef, align 2, !absolute_symbol !0 | ||
@lds.dont_realign = internal addrspace(3) global i64 undef, align 2, !absolute_symbol !0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use poison
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why poison? Working model is that uses of uninitialised LDS are fine but meaningless, undef seems to capture that more accurately than poison
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have one test with poison and one with undef? I've seen poison being used for these LDS vars in certain cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
undef is soft deprecated and there's no difference here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't do much harm here but in general duplicating test cases for zero changes in code path seems bad
I'd lean towards the default for tests should probably correspond to whatever clang marks them with, which seems to be poison now but I'm sure used to be undef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
poison is new-ish so yes, it used to be undef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with just a test with poison
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Poison takes it for the test case, but in general it seems unfortunate. Reading from uninitialised LDS could be totally well defined and making it poison will propagate dead code elimination from it, though I concede that is the direction LLVM is moving in
You can test this locally with the following command:git-clang-format --diff 829c47f4e0e63dd2a0f181b8f0c21817e1b25ccf e7058e39cf64354cad506f3d63643f6af69557fe --extensions cpp -- llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp View the diff from clang-format here.diff --git a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
index c96ce740ce..38f0b6dda1 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
@@ -1135,7 +1135,7 @@ private:
// If the variable is already allocated, don't change the alignment
continue;
}
-
+
Align Alignment = AMDGPU::getAlign(DL, &GV);
TypeSize GVSize = DL.getTypeAllocSize(GV.getValueType());
|
e7058e3
to
d22d4f9
Compare
|
||
; CHECK: @lds.dont_realign_undef = internal addrspace(3) global i64 undef, align 2, !absolute_symbol !0 | ||
; CHECK: @lds.dont_realign_poison = internal addrspace(3) global i64 poison, align 2, !absolute_symbol !0 | ||
@lds.dont_realign_undef = internal addrspace(3) global i64 undef, align 2, !absolute_symbol !0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no reason to have the undef one, or separate names. just have one normal variable with poison
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, race condition between review requests and updating the branch
d22d4f9
to
2eed1ac
Compare
2eed1ac
to
3ee8cf8
Compare
Fixes 106412. The logic that skips the pass on already-lowered variables doesn't cover the path that increases alignment of variables. If a variable is allocated at 24 and then given 16 byte alignment, the backend notices and fatal-errors on the inconsistency.