-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Move HIP fatbin sections farther away from .text #95949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-lld Author: Siu Chi Chan (scchan) ChangesThis would avoid wasting relocation range to jump over the HIP fatbin sections and therefore alleviate relocation overflow pressure. Full diff: https://github.com/llvm/llvm-project/pull/95949.diff 2 Files Affected:
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 640cb2a445f7d..66e729b2d74d5 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -617,6 +617,7 @@ static bool isRelroSection(const OutputSection *sec) {
enum RankFlags {
RF_NOT_ADDR_SET = 1 << 27,
RF_NOT_ALLOC = 1 << 26,
+ RF_HIP_FATBIN = 1 << 19,
RF_PARTITION = 1 << 18, // Partition number (8 bits)
RF_LARGE_ALT = 1 << 15,
RF_WRITE = 1 << 14,
@@ -714,6 +715,15 @@ unsigned elf::getSectionRank(OutputSection &osec) {
if (osec.type == SHT_NOBITS)
rank |= RF_BSS;
+ // Put HIP fatbin related sections further away to avoid wasting relocation
+ // range to jump over them. Make sure .hip_fatbin is the furthest.
+ if (osec.name == ".hipFatBinSegment")
+ rank |= RF_HIP_FATBIN;
+ if (osec.name == ".hip_gpubin_handle")
+ rank |= RF_HIP_FATBIN | 2;
+ if (osec.name == ".hip_fatbin")
+ rank |= RF_HIP_FATBIN | RF_WRITE | 3;
+
// Some architectures have additional ordering restrictions for sections
// within the same PT_LOAD.
if (config->emachine == EM_PPC64) {
diff --git a/lld/test/ELF/hip-section-layout.s b/lld/test/ELF/hip-section-layout.s
new file mode 100644
index 0000000000000..ec6fe3e457829
--- /dev/null
+++ b/lld/test/ELF/hip-section-layout.s
@@ -0,0 +1,37 @@
+# REQUIRES: x86
+## Test HIP specific sections layout.
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux --defsym=HIP_SECTIONS=1 --defsym=NON_HIP_SECTIONS=1 %s -o %t.o
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-readobj --sections %t.out | FileCheck %s
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux --defsym=NON_HIP_SECTIONS=1 %s -o %t.1.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux --defsym=HIP_SECTIONS=1 %s -o %t.2.o
+# RUN: ld.lld %t.1.o %t.2.o -o %t.s.out
+# RUN: llvm-readobj --sections %t.s.out | FileCheck %s
+
+.ifdef HIP_SECTIONS
+.section .hipFatBinSegment,"aw",@progbits; .space 1
+.section .hip_gpubin_handle,"aw",@progbits; .space 1
+.section .hip_fatbin,"a",@progbits; .space 1
+.endif
+
+.ifdef NON_HIP_SECTIONS
+.global _start
+.text
+_start:
+.section .bss,"aw",@nobits; .space 1
+.section .debug_info,"",@progbits
+.section .debug_line,"",@progbits
+.section .debug_str,"MS",@progbits,1
+.endif
+
+# Check that the HIP sections are placed towards the end but before non allocated sections
+
+// CHECK: Name: .bss
+// CHECK: Name: .hipFatBinSegment
+// CHECK: Name: .hip_gpubin_handle
+// CHECK: Name: .hip_fatbin
+// CHECK: Name: .debug_info
+// CHECK: Name: .debug_line
+// CHECK: Name: .debug_str
\ No newline at end of file
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
lld/test/ELF/hip-section-layout.s
Outdated
// CHECK: Name: .hip_fatbin | ||
// CHECK: Name: .debug_info | ||
// CHECK: Name: .debug_line | ||
// CHECK: Name: .debug_str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing new line at the end
To my understanding, you haven't contributed to lld before and your prior contribution (only a few commits since 2022) has been limited to a few AMD-specific directories. Given this, could you please explain the rationale behind committing 048f350 and f7bbc40 without first seeking review from a designated lld reviewer? |
I don't think we should have section order special cases like FWIW in the past I received internal requests to add such very application-specific ordering stuff (including CUDA fatbin). I've convinced them to adopt I am revertingthis. |
It was due to my poor judgement in believing that there was no objection since the PR has been raised for 1.5 months, my apologies... |
That works but the motivation to add to lld is have a permanent solution because the relocation overflow error isn't always obvious to most developers about the underlying issue. Also, a hip specific linker isn't always available, for example when someone uses a clang/lld from a distro. |
Sorry I did not think through when reviewing this PR. I should have added more experienced lld developers as reviewers. |
Even if teaching LLD about the sections is the right thing to do, the implementation failed to account for RF_PARTITION's 8 bit encoding, as is clearly stated on the comment next to it on the line below your RF_HIP_FATBIN. That is, bits 18 through 25 are used by it, and are not free for use. |
This would avoid wasting relocation range to jump over the HIP fatbin sections and therefore alleviate relocation overflow pressure.