[LLD][ELF] Skip non-SHF_ALLOC sections when checking max VA and max VA difference in relaxOnce() #145863

Merged
merged 1 commit into from
Jul 1, 2025

Conversation

Enna1
Contributor

@Enna1 Enna1 commented Jun 26, 2025

For non-SHF_ALLOC sections, sh_addr is set to 0.
Skip sections without the SHF_ALLOC flag, so that they no longer force minVA to 0 and their sizes no longer contribute to maxVA.

@llvmbot
Member

llvmbot commented Jun 26, 2025

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: Mingjie Xu (Enna1)

Changes

For non-SHF_ALLOC sections, sh_addr is set to 0.
Skip sections without SHF_ALLOC flag, so minVA will not be set to 0 with non-SHF_ALLOC sections, and the size of non-SHF_ALLOC sections will not contribute to maxVA.


Full diff: https://github.com/llvm/llvm-project/pull/145863.diff

1 Files Affected:

  • (modified) lld/ELF/Arch/X86_64.cpp (+2)
diff --git a/lld/ELF/Arch/X86_64.cpp b/lld/ELF/Arch/X86_64.cpp
index 163505102d0ec..488f4803b2cb4 100644
--- a/lld/ELF/Arch/X86_64.cpp
+++ b/lld/ELF/Arch/X86_64.cpp
@@ -320,6 +320,8 @@ bool X86_64::deleteFallThruJmpInsn(InputSection &is, InputFile *file,
 bool X86_64::relaxOnce(int pass) const {
   uint64_t minVA = UINT64_MAX, maxVA = 0;
   for (OutputSection *osec : ctx.outputSections) {
+    if (!(osec->flags & SHF_ALLOC))
+      continue;
     minVA = std::min(minVA, osec->addr);
     maxVA = std::max(maxVA, osec->addr + osec->size);
   }
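
To make the effect of the two added lines concrete, here is a minimal standalone sketch of the min/max VA loop. It is not lld code: `Section`, `VaRange`, `computeVaRange`, and `kShfAlloc` are hypothetical stand-ins introduced for illustration, and the `skipNonAlloc` switch exists only so both behaviors can be compared side by side.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for lld's OutputSection; only the fields used here.
struct Section {
  uint64_t flags;
  uint64_t addr;
  uint64_t size;
};

// SHF_ALLOC bit value as defined by the ELF specification.
constexpr uint64_t kShfAlloc = 0x2;

struct VaRange {
  uint64_t minVA;
  uint64_t maxVA;
};

// Computes the lowest start address and highest end address over the
// sections. When skipNonAlloc is true, sections without SHF_ALLOC are
// ignored, mirroring the condition added in this patch.
VaRange computeVaRange(const std::vector<Section> &sections,
                       bool skipNonAlloc) {
  VaRange range{UINT64_MAX, 0};
  for (const Section &sec : sections) {
    if (skipNonAlloc && !(sec.flags & kShfAlloc))
      continue;
    range.minVA = std::min(range.minVA, sec.addr);
    range.maxVA = std::max(range.maxVA, sec.addr + sec.size);
  }
  return range;
}
```

With one allocated section at 0x200000 and one large non-allocated section at address 0, the unfiltered range starts at 0 and ends at the non-allocated section's size, while the filtered range covers only the allocated section.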

@MaskRay
Member

MaskRay commented Jun 26, 2025

This is correct but can you edit /x86-64-gotpc-relax-too-far.s to make this testable?

Member

@MaskRay MaskRay left a comment

.

@Enna1
Contributor Author

Enna1 commented Jun 27, 2025

Thanks for the review!

> This is correct but can you edit /x86-64-gotpc-relax-too-far.s to make this testable?

I tried but couldn't get it to work...

The max VA and max VA difference checks only decide whether relaxOnce() can return early.
The real decision not to relax R_X86_64_(REX_)GOTPCRELX is made by checking !isInt<32>(getRelocTargetVA).
Even if the max VA or the max VA difference in -pie/-shared is >= 2^31 (say the size of the .strtab section is 2^31), isInt<32>(getRelocTargetVA) can still be true, and R_X86_64_(REX_)GOTPCRELX will be relaxed.
I think the current implementation doesn't have a correctness issue, but this change can remove some redundant isInt<32>(getRelocTargetVA) checks, so I couldn't figure out how to extend /x86-64-gotpc-relax-too-far.s to test this.
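
The relationship between the two checks can be sketched roughly as follows. This is a simplified model, not the lld implementation: `canSkipScan`, `relocFits`, and `fitsInt32` are hypothetical names, the 2^31 threshold is taken from the discussion above, and `relocFits` assumes the PC-relative case where the checked value is target minus place.

```cpp
#include <cstdint>

// Same idea as llvm::isInt<32>: does v fit in a signed 32-bit integer?
constexpr bool fitsInt32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

// Cheap whole-image test (hypothetical helper). If it returns true, no
// relocation can overflow, so the per-relocation scan can be skipped.
// In position-independent output only the VA difference matters.
constexpr bool canSkipScan(uint64_t minVA, uint64_t maxVA, bool isPic) {
  constexpr uint64_t limit = uint64_t(1) << 31;
  return maxVA < limit || (isPic && maxVA - minVA < limit);
}

// Per-relocation test (hypothetical helper, PC-relative case assumed).
// This is the check that actually decides whether a relaxation is kept.
constexpr bool relocFits(uint64_t targetVA, uint64_t placeVA) {
  return fitsInt32(static_cast<int64_t>(targetVA - placeVA));
}
```

In this model, an inflated maxVA only makes `canSkipScan` return false more often. Every relocation is then checked individually by `relocFits`, which gives the same answer as before, so the output is unchanged and only the amount of work differs.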

Do you have any suggestions? Thanks!

@MaskRay
Member

MaskRay commented Jun 27, 2025

You are right that the SHF_ALLOC condition just skips some check and does not change the behavior. Then why does it matter? Do you find a scenario where having the condition improves performance?

@Enna1
Contributor Author

Enna1 commented Jun 27, 2025

I'm looking for strategies that alleviate relocation overflow pressure. While trying to understand 9d6ec28 and f3c4dae, I found the max VA difference takes non-SHF_ALLOC sections into account, which is a bit confusing.

> Do you find a scenario where having the condition improves performance?

I will test with internal cases to see if there is any performance improvement.

@Enna1
Contributor Author

Enna1 commented Jun 30, 2025

Tested an internal large binary.
The .debug_info and .debug_ranges sections are each larger than 2^31 bytes.
With non-SHF_ALLOC sections included, maxVA is 0x8ab15c0c (the size of the .debug_info section).
With non-SHF_ALLOC sections skipped, maxVA is 0x3f42d818 (the end VA of the .bss section).

readelf output:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        00000000002002e0 0002e0 000035 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            0000000000200318 000318 000020 00   A  0   0  4
  [ 3] .note.gnu.build-id NOTE            0000000000200338 000338 000024 00   A  0   0  4
  [ 4] .dynsym           DYNSYM          0000000000200360 000360 1b54900 18   A  8   1  8
  [ 5] .gnu.version      VERSYM          0000000001d54c60 1b54c60 2470c0 02   A  4   0  2
  [ 6] .gnu.version_r    VERNEED         0000000001f9bd20 1d9bd20 000510 00   A  8  14  4
  [ 7] .gnu.hash         GNU_HASH        0000000001f9c230 1d9c230 7af7b8 00   A  4   0  8
  [ 8] .dynstr           STRTAB          000000000274b9e8 254b9e8 d86753a 00   A  0   0  1
  [ 9] .rela.dyn         RELA            000000000ffb2f28 fdb2f28 280680 18   A  4   0  8
  [10] .rela.plt         RELA            00000000102335a8 100335a8 008c58 18  AI  4  33  8
  [11] .rodata           PROGBITS        000000001023d000 1003d000 211a18e 00 AMS  0   0 4096
  [12] .gcc_except_table PROGBITS        0000000012357190 12157190 1b1d73c 00   A  0   0  4
  [13] xxx               PROGBITS        0000000013e748d0 13c748d0 2fbf20 00   A  0   0 16
  [14] .eh_frame_hdr     PROGBITS        00000000141707f0 13f707f0 73e204 00   A  0   0  4
  [15] .eh_frame         PROGBITS        00000000148ae9f8 146ae9f8 327fe64 00   A  0   0  8
  [16] .text             PROGBITS        0000000017b2f900 1792e900 24cf90a0 00  AX  0   0 256
  [17] .init             PROGBITS        000000003c8289a0 3c6279a0 00001c 00  AX  0   0  4
  [18] .fini             PROGBITS        000000003c8289bc 3c6279bc 000009 00  AX  0   0  4
  [19] xxxx              PROGBITS        000000003c8289c6 3c6279c6 0001c0 00  AX  0   0  2
  [20] .plt              PROGBITS        000000003c828b90 3c627b90 005da0 00  AX  0   0 16
  [21] .tdata            PROGBITS        000000003c82f940 3c62d940 000b68 00 WAT  0   0 32
  [22] .tbss             NOBITS          000000003c8304b0 3c62e4a8 17e2a9 00 WAT  0   0 16
  [23] .fini_array       FINI_ARRAY      000000003c8304a8 3c62e4a8 000010 00  WA  0   0  8
  [24] .init_array       INIT_ARRAY      000000003c8304b8 3c62e4b8 00c520 00  WA  0   0  8
  [25] .data.rel.ro      PROGBITS        000000003c83c9e0 3c63a9e0 15afba0 00  WA  0   0 16
  [26] .dynamic          DYNAMIC         000000003ddec580 3dbea580 000460 10  WA  8   0  8
  [27] .got              PROGBITS        000000003ddec9e0 3dbea9e0 000848 00  WA  0   0  8
  [28] .bss.rel.ro       NOBITS          000000003dded228 3dbeb228 000018 00  WA  0   0  8
  [29] .relro_padding    NOBITS          000000003dded240 3dbeb228 000dc0 00  WA  0   0  1
  [30] .data             PROGBITS        000000003ddee240 3dbeb240 113c00 00  WA  0   0 16
  [31] .tm_clone_table   PROGBITS        000000003df01e40 3dcfee40 000000 00  WA  0   0  8
  [32] xxxxx             PROGBITS        000000003df01e40 3dcfee40 000060 00  WA  0   0 16
  [33] .got.plt          PROGBITS        000000003df01ea0 3dcfeea0 002ee0 00  WA  0   0  8
  [34] .bss              NOBITS          000000003df04d80 3dd01d80 1528a98 00 WAo  0   0 128
  [35] .comment          PROGBITS        0000000000000000 3dd01d80 000302 01  MS  0   0  1
  [36] .debug_aranges    PROGBITS        0000000000000000 3dd02090 000220 00      0   0 16
  [37] .debug_info       PROGBITS        0000000000000000 3dd022b0 8ab15c0c 00      0   0  1
  [38] .debug_abbrev     PROGBITS        0000000000000000 c8817ebc 620275 00      0   0  1
  [39] .debug_line       PROGBITS        0000000000000000 c8e38131 4311b0ab 00      0   0  1
  [40] .debug_str        PROGBITS        0000000000000000 10bf531dc 366716bd 01  MS  0   0  1
  [41] .debug_ranges     PROGBITS        0000000000000000 1425c48a0 8857f890 00      0   0 16
  [42] .debug_addr       PROGBITS        0000000000000000 1cab44130 2bf15390 00      0   0  1
  [43] .GCC.command.line PROGBITS        0000000000000000 1f6a594c0 2db0cf8 01  MS  0   0  1
  [44] .debug_loc        PROGBITS        0000000000000000 1f980a1b8 d81682 00      0   0  1
  [45] .debug_frame      PROGBITS        0000000000000000 1fa58b840 000060 00      0   0  8
  [46] .note.stapsdt     NOTE            0000000000000000 1fa58b8a0 00057c 00      0   0  4
  [47] .debug_macinfo    PROGBITS        0000000000000000 1fa58be1c fb2ea4 00      0   0  1
  [48] .gdb_index        PROGBITS        0000000000000000 1fb53ecc0 3142da3e 00      0   0  1
  [49] .symtab           SYMTAB          0000000000000000 22c96c700 28dba88 18     51 591036  8
  [50] .shstrtab         STRTAB          0000000000000000 22f248188 00022e 00      0   0  1
  [51] .strtab           STRTAB          0000000000000000 22f2483b6 ebe91ca 00      0   0  1

I added llvm::TimeTraceScope timeScope("relaxOnce"); to the bool X86_64::relaxOnce(int pass) function.
The time trace shows that skipping non-SHF_ALLOC sections avoids the relaxOnce() cost, about 141 ms in this case.
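
For readers unfamiliar with scope-based timing, the general pattern looks roughly like the following. This is a standalone analog built on std::chrono, not LLVM's TimeTraceScope: `ScopeTimer` and `TimedRegion` are hypothetical names, and the real class records into LLVM's time profiler rather than a vector.

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one timed region; not LLVM's data model.
struct TimedRegion {
  std::string name;
  std::chrono::steady_clock::duration elapsed;
};

// Hypothetical RAII timer: starts the clock on construction and appends
// the elapsed time to `out` when the enclosing scope ends.
class ScopeTimer {
public:
  ScopeTimer(std::string name, std::vector<TimedRegion> &out)
      : name(std::move(name)), out(out),
        start(std::chrono::steady_clock::now()) {}

  ~ScopeTimer() {
    out.push_back({name, std::chrono::steady_clock::now() - start});
  }

  ScopeTimer(const ScopeTimer &) = delete;
  ScopeTimer &operator=(const ScopeTimer &) = delete;

private:
  std::string name;
  std::vector<TimedRegion> &out;
  std::chrono::steady_clock::time_point start;
};
```

Declaring such an object as the first statement of a function times the whole call, including every early-return path, which is why a single added line is enough to measure relaxOnce().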

Attached time-trace files:
base.time-trace.json
test.time-trace.json

@Enna1 Enna1 merged commit 6323541 into main Jul 1, 2025
10 checks passed
@Enna1 Enna1 deleted the users/Enna1/avoid-non-shf_alloc-sections-in-relaxonce branch July 1, 2025 01:02
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
…A difference in relaxOnce() (llvm#145863)

For non-SHF_ALLOC sections, sh_addr is set to 0.
Skip sections without SHF_ALLOC flag, so `minVA` will not be set to 0
with non-SHF_ALLOC sections, and the size of non-SHF_ALLOC sections will
not contribute to `maxVA`.