Skip to content

Move HIP fatbin sections farther away from .text #95949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from

Conversation

scchan
Copy link
Contributor

@scchan scchan commented Jun 18, 2024

This would avoid wasting relocation range to jump over the HIP fatbin sections and therefore alleviate relocation overflow pressure.

@scchan scchan added the lld label Jun 18, 2024
@scchan scchan requested a review from yxsamliu June 18, 2024 16:16
@llvmbot
Copy link
Member

llvmbot commented Jun 18, 2024

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Siu Chi Chan (scchan)

Changes

This would avoid wasting relocation range to jump over the HIP fatbin sections and therefore alleviate relocation overflow pressure.


Full diff: https://github.com/llvm/llvm-project/pull/95949.diff

2 Files Affected:

  • (modified) lld/ELF/Writer.cpp (+10)
  • (added) lld/test/ELF/hip-section-layout.s (+37)
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 640cb2a445f7d..66e729b2d74d5 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -617,6 +617,7 @@ static bool isRelroSection(const OutputSection *sec) {
 enum RankFlags {
   RF_NOT_ADDR_SET = 1 << 27,
   RF_NOT_ALLOC = 1 << 26,
+  RF_HIP_FATBIN = 1 << 19,
   RF_PARTITION = 1 << 18, // Partition number (8 bits)
   RF_LARGE_ALT = 1 << 15,
   RF_WRITE = 1 << 14,
@@ -714,6 +715,15 @@ unsigned elf::getSectionRank(OutputSection &osec) {
   if (osec.type == SHT_NOBITS)
     rank |= RF_BSS;
 
+  // Put HIP fatbin related sections further away to avoid wasting relocation 
+  // range to jump over them.  Make sure .hip_fatbin is the furthest.
+  if (osec.name == ".hipFatBinSegment")
+    rank |= RF_HIP_FATBIN;
+  if (osec.name == ".hip_gpubin_handle")
+    rank |= RF_HIP_FATBIN | 2;
+  if (osec.name == ".hip_fatbin")
+    rank |= RF_HIP_FATBIN | RF_WRITE | 3;
+
   // Some architectures have additional ordering restrictions for sections
   // within the same PT_LOAD.
   if (config->emachine == EM_PPC64) {
diff --git a/lld/test/ELF/hip-section-layout.s b/lld/test/ELF/hip-section-layout.s
new file mode 100644
index 0000000000000..ec6fe3e457829
--- /dev/null
+++ b/lld/test/ELF/hip-section-layout.s
@@ -0,0 +1,37 @@
+# REQUIRES: x86
+## Test HIP specific sections layout.
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux --defsym=HIP_SECTIONS=1 --defsym=NON_HIP_SECTIONS=1 %s -o %t.o
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-readobj --sections %t.out | FileCheck %s
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux --defsym=NON_HIP_SECTIONS=1 %s -o %t.1.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux --defsym=HIP_SECTIONS=1 %s -o %t.2.o
+# RUN: ld.lld %t.1.o %t.2.o -o %t.s.out
+# RUN: llvm-readobj --sections %t.s.out | FileCheck %s
+
+.ifdef HIP_SECTIONS
+.section .hipFatBinSegment,"aw",@progbits; .space 1
+.section .hip_gpubin_handle,"aw",@progbits; .space 1
+.section .hip_fatbin,"a",@progbits; .space 1
+.endif
+
+.ifdef NON_HIP_SECTIONS
+.global _start
+.text
+_start:
+.section .bss,"aw",@nobits; .space 1
+.section .debug_info,"",@progbits
+.section .debug_line,"",@progbits
+.section .debug_str,"MS",@progbits,1
+.endif
+
+# Check that the HIP sections are placed towards the end but before non allocated sections
+
+// CHECK: Name: .bss
+// CHECK: Name: .hipFatBinSegment
+// CHECK: Name: .hip_gpubin_handle
+// CHECK: Name: .hip_fatbin
+// CHECK: Name: .debug_info
+// CHECK: Name: .debug_line
+// CHECK: Name: .debug_str
\ No newline at end of file

Copy link

github-actions bot commented Aug 6, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

// CHECK: Name: .hip_fatbin
// CHECK: Name: .debug_info
// CHECK: Name: .debug_line
// CHECK: Name: .debug_str
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing new line at the end

@scchan scchan closed this Aug 6, 2024
@MaskRay
Copy link
Member

MaskRay commented Jan 27, 2025

To my understanding, you haven't contributed to lld before and your prior contribution (only a few commits since 2022) has been limited to a few AMD-specific directories. Given this, could you please explain the rationale behind committing 048f350 and f7bbc40 without first seeking review from a designated lld reviewer?

@MaskRay
Copy link
Member

MaskRay commented Jan 27, 2025

I don't think we should have section order special cases like .hip_gpubin_handle for very niche use cases.
Can't you use INSERT AFTER (https://maskray.me/blog/2021-07-04-sections-and-overwrite-sections )?

FWIW in the past I received internal requests to add such very application-specific ordering stuff (including CUDA fatbin). I've convinced them to adopt INSERT [AFTER|BEFORE] instead.

I am revertingthis.

MaskRay added a commit that referenced this pull request Jan 27, 2025
This reverts commit 048f350.
This reverts commit f7bbc40.

Related to #95949. A developer with no prior lld contribution and very
little AMD contribution sneaked in these application-specific section
order rules we discourage.
@scchan
Copy link
Contributor Author

scchan commented Jan 27, 2025

To my understanding, you haven't contributed to lld before and your prior contribution (only a few commits since 2022) has been limited to a few AMD-specific directories. Given this, could you please explain the rationale behind committing 048f350 and f7bbc40 without first seeking review from a designated lld reviewer?

It was due to my poor judgement in believing that there was no objection since the PR has been raised for 1.5 months, my apologies...

@scchan
Copy link
Contributor Author

scchan commented Jan 27, 2025

I don't think we should have section order special cases like .hip_gpubin_handle for very niche use cases. Can't you use INSERT AFTER (https://maskray.me/blog/2021-07-04-sections-and-overwrite-sections )?

FWIW in the past I received internal requests to add such very application-specific ordering stuff (including CUDA fatbin). I've convinced them to adopt INSERT [AFTER|BEFORE] instead.

I am revertingthis.

That works but the motivation to add to lld is have a permanent solution because the relocation overflow error isn't always obvious to most developers about the underlying issue. Also, a hip specific linker isn't always available, for example when someone uses a clang/lld from a distro.

@yxsamliu
Copy link
Collaborator

To my understanding, you haven't contributed to lld before and your prior contribution (only a few commits since 2022) has been limited to a few AMD-specific directories. Given this, could you please explain the rationale behind committing 048f350 and f7bbc40 without first seeking review from a designated lld reviewer?

It was due to my poor judgement in believing that there was no objection since the PR has been raised for 1.5 months, my apologies...

Sorry I did not think through when reviewing this PR. I should have added more experienced lld developers as reviewers.

@jrtc27
Copy link
Collaborator

jrtc27 commented Jan 27, 2025

Even if teaching LLD about the sections is the right thing to do, the implementation failed to account for RF_PARTITION's 8 bit encoding, as is clearly stated on the comment next to it on the line below your RF_HIP_FATBIN. That is, bits 18 through 25 are used by it, and are not free for use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants