-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[X86] Don't set SHF_X86_64_LARGE for variables with explicit section name of a well-known small data section prefix #70748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…name of a well-known small data section prefix Commit f3ea731 allows SHF_X86_64_LARGE for all global variables with an explicit section. For the following variables, their data sections will be annotated as SHF_X86_64_LARGE. ``` const char relro[512] __attribute__((section(".rodata"))) = "a"; const char *const relro __attribute__((section(".data.rel.ro"))) = "a"; char data[512] __attribute__((section(".data"))) = "a"; ``` The typical linker requirement is that we do not create more than one output section with the same name, and the only output section should have the bitwise OR value of all input section flags. Therefore, the output .data section will have the SHF_X86_64_LARGE flag and be moved away from the regular sections. This is undesired but benign. However, .data.rel.ro having the SHF_X86_64_LARGE flag is problematic because dynamic loaders do not support more than one PT_GNU_RELRO program header, and LLD produces the error `error: section: .jcr is not contiguous with other relro sections`. I believe the most appropriate solution is to disallow SHF_X86_64_LARGE on variables with an explicit section of certain prefixes ( .bss/.data/.bss) and allow others (e.g. metadata sections for various instrumentation). Fortunately, global variables with an explicit .bss/.data/.bss section are rare, so they should not cause excessive relocation overflow pressure.
@llvm/pr-subscribers-backend-x86 Author: Fangrui Song (MaskRay) ChangesCommit f3ea731 allows SHF_X86_64_LARGE
The typical linker requirement is that we do not create more than one I believe the most appropriate solution is to disallow SHF_X86_64_LARGE Full diff: https://github.com/llvm/llvm-project/pull/70748.diff 2 Files Affected:
diff --git a/llvm/lib/Target/TargetMachine.cpp b/llvm/lib/Target/TargetMachine.cpp
index 45fb612cb91da19..9dc00ff85e009be 100644
--- a/llvm/lib/Target/TargetMachine.cpp
+++ b/llvm/lib/Target/TargetMachine.cpp
@@ -46,6 +46,20 @@ bool TargetMachine::isLargeData(const GlobalVariable *GV) const {
// restrict this to medium.
if (getCodeModel() != CodeModel::Medium)
return false;
+
+ // Allowing large metadata sections in the presence of an explicit section is
+ // useful, even if GCC does not allow them. However, we should not mark
+ // certain well-known prefixes as large, because it would make the whole
+ // output section large and cause the linker to move it, which is almost
+ // always undesired.
+ StringRef Name = GV->getSection();
+ auto IsPrefix = [&](StringRef Prefix) {
+ StringRef S = Name;
+ return S.consume_front(Prefix) && (S.empty() || S[0] == '.');
+ };
+ if (IsPrefix(".bss") || IsPrefix(".data") || IsPrefix(".rodata"))
+ return false;
+
const DataLayout &DL = GV->getParent()->getDataLayout();
uint64_t Size = DL.getTypeSizeInBits(GV->getValueType()) / 8;
return Size == 0 || Size > LargeDataThreshold;
diff --git a/llvm/test/CodeGen/X86/code-model-elf-sections.ll b/llvm/test/CodeGen/X86/code-model-elf-sections.ll
index fe659fa9a46e727..5f579edc440d6b2 100644
--- a/llvm/test/CodeGen/X86/code-model-elf-sections.ll
+++ b/llvm/test/CodeGen/X86/code-model-elf-sections.ll
@@ -17,6 +17,8 @@
; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=SMALL-DS
; SMALL: .data {{.*}} WA {{.*}}
+; SMALL: .data.x {{.*}} WA {{.*}}
+; SMALL: .data0 {{.*}} WA {{.*}}
; SMALL: foo {{.*}} WA {{.*}}
; SMALL: .bss {{.*}} WA {{.*}}
; SMALL: .rodata {{.*}} A {{.*}}
@@ -24,6 +26,9 @@
; SMALL: .tbss {{.*}} WAT {{.*}}
; SMALL: .tdata {{.*}} WAT {{.*}}
+; SMALL-DS: .data {{.*}} WA {{.*}}
+; SMALL-DS: .data.x {{.*}} WA {{.*}}
+; SMALL-DS: .data0 {{.*}} WA {{.*}}
; SMALL-DS: .data.data {{.*}} WA {{.*}}
; SMALL-DS: foo {{.*}} WA {{.*}}
; SMALL-DS: .bss.bss {{.*}} WA {{.*}}
@@ -32,17 +37,27 @@
; SMALL-DS: .tbss.tbss {{.*}} WAT {{.*}}
; SMALL-DS: .tdata.tdata {{.*}} WAT {{.*}}
+; LARGE: .data {{.*}} WA {{.*}}
+; LARGE: .data.x {{.*}} WA {{.*}}
+; LARGE: .data0 {{.*}} WAl {{.*}}
; LARGE: .ldata {{.*}} WAl {{.*}}
; LARGE: foo {{.*}} WAl {{.*}}
+; LARGE: .bss {{.*}} WA {{.*}}
; LARGE: .lbss {{.*}} WAl {{.*}}
+; LARGE: .rodata {{.*}} A {{.*}}
; LARGE: .lrodata {{.*}} Al {{.*}}
; LARGE: .ldata.rel.ro {{.*}} WAl {{.*}}
; LARGE: .tbss {{.*}} WAT {{.*}}
; LARGE: .tdata {{.*}} WAT {{.*}}
+; LARGE-DS: .data {{.*}} WA {{.*}}
+; LARGE-DS: .data.x {{.*}} WA {{.*}}
+; LARGE-DS: .data0 {{.*}} WAl {{.*}}
; LARGE-DS: .ldata.data {{.*}} WAl {{.*}}
; LARGE-DS: foo {{.*}} WAl {{.*}}
+; LARGE-DS: .bss {{.*}} WA {{.*}}
; LARGE-DS: .lbss.bss {{.*}} WAl {{.*}}
+; LARGE-DS: .rodata {{.*}} A {{.*}}
; LARGE-DS: .lrodata.rodata {{.*}} Al {{.*}}
; LARGE-DS: .ldata.rel.ro.relro {{.*}} WAl {{.*}}
; LARGE-DS: .tbss.tbss {{.*}} WAT {{.*}}
@@ -51,9 +66,14 @@
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64--linux"
+@data_with_explicit_section = internal global [10 x i64] [i64 1, i64 2, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0], section ".data"
+@data_with_explicit_section2 = internal global [10 x i64] [i64 1, i64 2, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0], section ".data.x"
+@data_with_explicit_section0 = internal global [10 x i64] [i64 1, i64 2, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0], section ".data0"
@data = internal global [10 x i64] [i64 1, i64 2, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0]
-@data_with_explicit_section = internal global [10 x i64] [i64 1, i64 2, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0], section "foo"
+@foo_with_explicit_section = internal global [10 x i64] [i64 1, i64 2, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0], section "foo"
+@bss_with_explicit_section = internal global [10 x i64] zeroinitializer, section ".bss"
@bss = internal global [10 x i64] zeroinitializer
+@rodata_with_explicit_section = internal constant [10 x i64] zeroinitializer, section ".rodata"
@rodata = internal constant [10 x i64] zeroinitializer
@relro = internal constant [10 x ptr] [ptr @func, ptr @func, ptr @func, ptr @func, ptr @func, ptr @func, ptr @func, ptr @func, ptr @func, ptr @func]
@tbss = internal thread_local global [10 x i64] zeroinitializer
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, but it seems unfortunate to have a hardcoded list of common section names. is there anywhere else we check section names like this and if so can we factor out this name checking?
Thanks for the quick review!
Yes, we hard code section names in many places, including llvm/lib/MC/MCContext.cpp |
… section name Globals marked with the .lbss/.ldata/.lrodata should automatically be treated as large. Do this regardless of the code model for consistency when mixing object files compiled with different code models. Basically the opposite of llvm#70748. Example in the wild: https://codebrowser.dev/qt5/qtbase/src/testlib/qtestcase.cpp.html#1664
… section name (#74381) Globals marked with the .lbss/.ldata/.lrodata should automatically be treated as large. Do this regardless of the code model for consistency when mixing object files compiled with different code models. Basically the other half of #70748. Example in the wild: https://codebrowser.dev/qt5/qtbase/src/testlib/qtestcase.cpp.html#1664
Commit f3ea731 allows SHF_X86_64_LARGE
for all global variables with an explicit section. For the following
variables, their data sections will be annotated as SHF_X86_64_LARGE.
The typical linker requirement is that we do not create more than one
output section with the same name, and the only output section should
have the bitwise OR value of all input section flags. Therefore, the
output .data section will have the SHF_X86_64_LARGE flag and be
moved away from the regular sections. This is undesired but benign.
However, .data.rel.ro having the SHF_X86_64_LARGE flag is problematic
because dynamic loaders do not support more than one PT_GNU_RELRO
program header, and LLD produces the error
error: section: .jcr is not contiguous with other relro sections
.I believe the most appropriate solution is to disallow SHF_X86_64_LARGE
on variables with an explicit section of certain prefixes (
.bss/.data/.bss) and allow others (e.g. metadata sections for various
instrumentation). Fortunately, global variables with an explicit
.bss/.data/.bss section are rare, so they should not cause excessive
relocation overflow pressure.