[X86,lld] Handle relocation R_X86_64_REX2_GOTPCRELX #109783

Merged
merged 3 commits into from
Sep 29, 2024

@KanRobert KanRobert commented Sep 24, 2024

For

mov        name@GOTPCREL(%rip), %reg
test       %reg, name@GOTPCREL(%rip)
binop      name@GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions, we added

R_X86_64_REX2_GOTPCRELX = 43

in #106681.

The linker can treat R_X86_64_REX2_GOTPCRELX as R_X86_64_GOTPCREL or convert the above instructions to

lea	name(%rip), %reg
mov	$name, %reg
test	$name, %reg
binop	$name, %reg

when possible, provided that the first byte of the instruction, located at relocation offset - 4, is 0xd5 (that is, the instruction is encoded with a REX2 prefix).
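
As a rough illustration of the prefix check and the `mov` to `lea` conversion described above, here is a minimal C++ sketch. It is not the lld implementation: the helper names `hasRex2Prefix` and `relaxMovToLea` are hypothetical, and the byte layout in the comments assumes a legacy-map REX2 instruction with a RIP-relative ModRM.

```cpp
#include <cstdint>

// Relocation number quoted in this PR description.
constexpr uint32_t R_X86_64_REX2_GOTPCRELX = 43;

// `loc` points at the 4-byte displacement that the relocation patches.
// Assumed layout of the bytes in front of it:
//   loc[-4] = 0xd5 (REX2 escape byte)
//   loc[-3] = REX2 payload byte
//   loc[-2] = opcode
//   loc[-1] = ModRM
// The caller must guarantee that at least 4 bytes precede `loc`.
inline bool hasRex2Prefix(const uint8_t *loc) { return loc[-4] == 0xd5; }

// Hypothetical helper: rewrite "mov name@GOTPCREL(%rip), %reg" (opcode 0x8b)
// into "lea name(%rip), %reg" (opcode 0x8d). Returns false and leaves the
// bytes untouched when the pattern does not match. The caller is still
// responsible for writing the new PC-relative displacement at `loc`.
inline bool relaxMovToLea(uint8_t *loc, uint32_t type) {
  if (type != R_X86_64_REX2_GOTPCRELX || !hasRex2Prefix(loc))
    return false;
  const bool isMov = loc[-2] == 0x8b;
  const bool isRipRelative = (loc[-1] & 0xc7) == 0x05; // mod=00, rm=101
  if (!isMov || !isRipRelative)
    return false;
  loc[-2] = 0x8d;
  return true;
}
```

The relocation type is checked in addition to the `0xd5` byte because, for an instruction without a REX2 prefix, the byte at offset - 4 belongs to whatever precedes the instruction and could be `0xd5` by coincidence.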

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation

For

	mov        name@GOTPCREL(%rip), %reg
	test       %reg, name@GOTPCREL(%rip)
	binop      name@GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions,
add

 R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX = 43

if the instruction starts 4 bytes before the relocation offset.  It is
similar to R_X86_64_GOTPCRELX.

The linker can treat R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX
as R_X86_64_GOTPCREL or convert the above instructions to

	lea	name(%rip), %reg
	mov	$name, %reg
	test	$name, %reg
	binop	$name, %reg

when possible, provided that the first byte of the instruction, located at
relocation `offset - 4`, is `0xd5` (that is, the instruction is encoded with
a REX2 prefix).

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation
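
The immediate forms listed above (`test $name, %reg` and `binop $name, %reg`) encode the register in ModRM.rm instead of ModRM.reg, so the register-extension bits have to move from the R position to the B position of the prefix. The sketch below mirrors the expression used in the `X86_64.cpp` change of this PR; the helper name `moveRegExtToBase` is hypothetical and is not part of lld, and the bit names follow the comment in the diff (REX is `0100WRXB`, the REX2 payload is `M|R2|X2|B2|WRXB`).

```cpp
#include <cstdint>

// Hypothetical helper, not an lld function.
// REX byte:      0 1 0 0  W R X B      -> R is bit 2 (0x04), B is bit 0 (0x01)
// REX2 payload:  M R2 X2 B2  W R X B   -> R2 is bit 6 (0x40), B2 is bit 4 (0x10)
// Moving a register from ModRM.reg to ModRM.rm means shifting R (and R2 for
// REX2) two bit positions to the right, into B (and B2), and clearing the
// old R/R2 bits.
constexpr uint8_t moveRegExtToBase(uint8_t prefixByte, bool isRex2) {
  const uint8_t mask = isRex2 ? 0x44 : 0x04;
  return static_cast<uint8_t>((prefixByte & ~mask) | ((prefixByte & mask) >> 2));
}
```

For example, a REX2 payload of `0x48` (W and R2 set, as for a 64-bit operation on `%r16`) becomes `0x18` (W and B2 set), while a legacy REX byte of `0x4c` (W and R set, as for `%r8`) becomes `0x49` (W and B set).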

llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-lld

Author: Shengchen Kan (KanRobert)

Changes

For

mov        name@GOTPCREL(%rip), %reg
test       %reg, name@GOTPCREL(%rip)
binop      name@GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions, add

R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX = 43

if the instruction starts 4 bytes before the relocation offset. It is similar to R_X86_64_GOTPCRELX.

The linker can treat R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX as R_X86_64_GOTPCREL or convert the above instructions to

lea	name(%rip), %reg
mov	$name, %reg
test	$name, %reg
binop	$name, %reg

when possible, provided that the first byte of the instruction, located at relocation offset - 4, is 0xd5 (that is, the instruction is encoded with a REX2 prefix).

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation


Full diff: https://github.com/llvm/llvm-project/pull/109783.diff

4 Files Affected:

  • (modified) lld/ELF/Arch/X86_64.cpp (+28-9)
  • (modified) lld/test/ELF/x86-64-gotpc-no-relax-err.s (+7-3)
  • (modified) lld/test/ELF/x86-64-gotpc-relax-nopic.s (+69-40)
  • (modified) lld/test/ELF/x86-64-gotpc-relax.s (+40-14)
diff --git a/lld/ELF/Arch/X86_64.cpp b/lld/ELF/Arch/X86_64.cpp
index 48f17718365e24..56e3b882b8b3c7 100644
--- a/lld/ELF/Arch/X86_64.cpp
+++ b/lld/ELF/Arch/X86_64.cpp
@@ -388,6 +388,7 @@ RelExpr X86_64::getRelExpr(RelType type, const Symbol &s,
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_GOTTPOFF:
     return R_GOT_PC;
   case R_X86_64_GOTOFF64:
@@ -725,6 +726,7 @@ int64_t X86_64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_PC32:
   case R_X86_64_GOTTPOFF:
   case R_X86_64_PLT32:
@@ -808,6 +810,7 @@ void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
     break;
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
     if (rel.expr != R_GOT_PC) {
       relaxGot(loc, rel, val);
     } else {
@@ -859,12 +862,13 @@ void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
 
 RelExpr X86_64::adjustGotPcExpr(RelType type, int64_t addend,
                                 const uint8_t *loc) const {
-  // Only R_X86_64_[REX_]GOTPCRELX can be relaxed. GNU as may emit GOTPCRELX
-  // with addend != -4. Such an instruction does not load the full GOT entry, so
-  // we cannot relax the relocation. E.g. movl x@GOTPCREL+4(%rip), %rax
-  // (addend=0) loads the high 32 bits of the GOT entry.
+  // Only R_X86_64_[REX_]|[REX2_]GOTPCRELX can be relaxed. GNU as may emit
+  // GOTPCRELX with addend != -4. Such an instruction does not load the full GOT
+  // entry, so we cannot relax the relocation. E.g. movl x@GOTPCREL+4(%rip),
+  // %rax (addend=0) loads the high 32 bits of the GOT entry.
   if (!ctx.arg.relax || addend != -4 ||
-      (type != R_X86_64_GOTPCRELX && type != R_X86_64_REX_GOTPCRELX))
+      (type != R_X86_64_GOTPCRELX && type != R_X86_64_REX_GOTPCRELX &&
+       type != R_X86_64_REX2_GOTPCRELX))
     return R_GOT_PC;
   const uint8_t op = loc[-2];
   const uint8_t modRm = loc[-1];
@@ -880,7 +884,7 @@ RelExpr X86_64::adjustGotPcExpr(RelType type, int64_t addend,
   if (op == 0xff && (modRm == 0x15 || modRm == 0x25))
     return R_RELAX_GOT_PC;
 
-  // We don't support test/binop instructions without a REX prefix.
+  // We don't support test/binop instructions without a REX/REX2 prefix.
   if (type == R_X86_64_GOTPCRELX)
     return R_GOT_PC;
 
@@ -897,6 +901,7 @@ RelExpr X86_64::adjustGotPcExpr(RelType type, int64_t addend,
 static void relaxGotNoPic(uint8_t *loc, uint64_t val, uint8_t op,
                           uint8_t modRm) {
   const uint8_t rex = loc[-3];
+  const bool isRex2 = loc[-4] == 0xd5;
   // Convert "test %reg, foo@GOTPCREL(%rip)" to "test $foo, %reg".
   if (op == 0x85) {
     // See "TEST-Logical Compare" (4-428 Vol. 2B),
@@ -921,7 +926,7 @@ static void relaxGotNoPic(uint8_t *loc, uint64_t val, uint8_t op,
     // See "TEST-Logical Compare" (4-428 Vol. 2B).
     loc[-2] = 0xf7;
 
-    // Move R bit to the B bit in REX byte.
+    // Move R bit to the B bit in REX/REX2 byte.
     // REX byte is encoded as 0100WRXB, where
     // 0100 is 4bit fixed pattern.
     // REX.W When 1, a 64-bit operand size is used. Otherwise, when 0, the
@@ -932,7 +937,18 @@ static void relaxGotNoPic(uint8_t *loc, uint64_t val, uint8_t op,
     // REX.B This 1-bit value is an extension to the MODRM.rm field or the
     // SIB.base field.
     // See "2.2.1.2 More on REX Prefix Fields " (2-8 Vol. 2A).
-    loc[-3] = (rex & ~0x4) | (rex & 0x4) >> 2;
+    //
+    // REX2 prefix is encoded as 0xd5|M|R2|X2|B2|WRXB, where
+    // 0xd5 is 1byte fixed pattern.
+    // REX2's [W,R,X,B] have the same meanings as REX's.
+    // REX2.M encodes the map id.
+    // R2/X2/B2 provide the fifth and most significant bits of the R/X/B
+    // register identifiers, each of which can now address all 32 GPRs.
+    // TODO: Add the section number here after APX SPEC is merged into SDM.
+    if (isRex2)
+      loc[-3] = (rex & ~0x44) | (rex & 0x44) >> 2;
+    else
+      loc[-3] = (rex & ~0x4) | (rex & 0x4) >> 2;
     write32le(loc, val);
     return;
   }
@@ -953,7 +969,10 @@ static void relaxGotNoPic(uint8_t *loc, uint64_t val, uint8_t op,
   // "INSTRUCTION SET REFERENCE, N-Z" (Vol. 2B 4-1) for
   // descriptions about each operation.
   loc[-2] = 0x81;
-  loc[-3] = (rex & ~0x4) | (rex & 0x4) >> 2;
+  if (isRex2)
+    loc[-3] = (rex & ~0x44) | (rex & 0x44) >> 2;
+  else
+    loc[-3] = (rex & ~0x4) | (rex & 0x4) >> 2;
   write32le(loc, val);
 }
 
diff --git a/lld/test/ELF/x86-64-gotpc-no-relax-err.s b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
index 618dca47755f41..4280c8fd1dc97e 100644
--- a/lld/test/ELF/x86-64-gotpc-no-relax-err.s
+++ b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
@@ -7,15 +7,19 @@
 ## `>>> defined in` for linker synthesized __stop_* symbols (there is no
 ## associated file or linker script line number).
 
-# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483658 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483666 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 # CHECK-EMPTY:
-# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483659 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: >>> defined in <internal>
+# CHECK-EMPTY:
+# CHECK-NEXT: error: {{.*}}:(.text+0x11): relocation R_X86_64_REX2_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 
 #--- a.s
   movl __stop_data@GOTPCREL(%rip), %eax  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # out of range
+  movq __stop_data@GOTPCREL(%rip), %r16  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # in range
 
 .section data,"aw",@progbits
@@ -23,5 +27,5 @@
 #--- lds
 SECTIONS {
   .text 0x200000 : { *(.text) }
-  .got 0x80200010 : { *(.got) }
+  .got 0x80200016 : { *(.got) }
 }
diff --git a/lld/test/ELF/x86-64-gotpc-relax-nopic.s b/lld/test/ELF/x86-64-gotpc-relax-nopic.s
index 7481904d16f1b4..e3cd93d1d57962 100644
--- a/lld/test/ELF/x86-64-gotpc-relax-nopic.s
+++ b/lld/test/ELF/x86-64-gotpc-relax-nopic.s
@@ -10,30 +10,39 @@
 # SYMRELOC:      Symbols [
 # SYMRELOC:       Symbol {
 # SYMRELOC:        Name: bar
-# SYMRELOC-NEXT:   Value: 0x203248
+# SYMRELOC-NEXT:   Value: 0x203290
 
 ## 2105751 = 0x202197 (bar)
 # DISASM:      Disassembly of section .text:
 # DISASM-EMPTY:
 # DISASM-NEXT: <_start>:
-# DISASM-NEXT:   2011c8:       adcl  {{.*}}(%rip), %eax  # 0x202240
-# DISASM-NEXT:                 addl  {{.*}}(%rip), %ebx  # 0x202240
-# DISASM-NEXT:                 andl  {{.*}}(%rip), %ecx  # 0x202240
-# DISASM-NEXT:                 cmpl  {{.*}}(%rip), %edx  # 0x202240
-# DISASM-NEXT:                 orl   {{.*}}(%rip), %edi  # 0x202240
-# DISASM-NEXT:                 sbbl  {{.*}}(%rip), %esi  # 0x202240
-# DISASM-NEXT:                 subl  {{.*}}(%rip), %ebp  # 0x202240
-# DISASM-NEXT:                 xorl  $0x203248, %r8d
-# DISASM-NEXT:                 testl $0x203248, %r15d
-# DISASM-NEXT:   201200:       adcq  $0x203248, %rax
-# DISASM-NEXT:                 addq  $0x203248, %rbx
-# DISASM-NEXT:                 andq  $0x203248, %rcx
-# DISASM-NEXT:                 cmpq  $0x203248, %rdx
-# DISASM-NEXT:                 orq   $0x203248, %rdi
-# DISASM-NEXT:                 sbbq  $0x203248, %rsi
-# DISASM-NEXT:                 subq  $0x203248, %rbp
-# DISASM-NEXT:                 xorq  $0x203248, %r8
-# DISASM-NEXT:                 testq $0x203248, %r15
+# DISASM-NEXT:   2011c8:       adcl  {{.*}}(%rip), %eax  # 0x202288
+# DISASM-NEXT:                 addl  {{.*}}(%rip), %ebx  # 0x202288
+# DISASM-NEXT:                 andl  {{.*}}(%rip), %ecx  # 0x202288
+# DISASM-NEXT:                 cmpl  {{.*}}(%rip), %edx  # 0x202288
+# DISASM-NEXT:                 orl   {{.*}}(%rip), %edi  # 0x202288
+# DISASM-NEXT:                 sbbl  {{.*}}(%rip), %esi  # 0x202288
+# DISASM-NEXT:                 subl  {{.*}}(%rip), %ebp  # 0x202288
+# DISASM-NEXT:                 xorl  $0x203290, %r8d
+# DISASM-NEXT:                 testl $0x203290, %r15d
+# DISASM-NEXT:   201200:       adcq  $0x203290, %rax
+# DISASM-NEXT:                 addq  $0x203290, %rbx
+# DISASM-NEXT:                 andq  $0x203290, %rcx
+# DISASM-NEXT:                 cmpq  $0x203290, %rdx
+# DISASM-NEXT:                 orq   $0x203290, %rdi
+# DISASM-NEXT:                 sbbq  $0x203290, %rsi
+# DISASM-NEXT:                 subq  $0x203290, %rbp
+# DISASM-NEXT:                 xorq  $0x203290, %r8
+# DISASM-NEXT:                 testq $0x203290, %r15
+# DISASM-NEXT:   20123f:       adcq  $0x203290, %r16
+# DISASM-NEXT:                 addq  $0x203290, %r17
+# DISASM-NEXT:                 andq  $0x203290, %r18
+# DISASM-NEXT:                 cmpq  $0x203290, %r19
+# DISASM-NEXT:                 orq   $0x203290, %r20
+# DISASM-NEXT:                 sbbq  $0x203290, %r21
+# DISASM-NEXT:                 subq  $0x203290, %r22
+# DISASM-NEXT:                 xorq  $0x203290, %r23
+# DISASM-NEXT:                 testq $0x203290, %r24
 
 # RUN: ld.lld --hash-style=sysv -shared %t.o -o %t2
 # RUN: llvm-readobj -S -r -d %t2 | FileCheck --check-prefix=SEC-PIC    %s
@@ -46,8 +55,8 @@
 # SEC-PIC-NEXT:     SHF_ALLOC
 # SEC-PIC-NEXT:     SHF_WRITE
 # SEC-PIC-NEXT:   ]
-# SEC-PIC-NEXT:   Address: 0x2380
-# SEC-PIC-NEXT:   Offset: 0x380
+# SEC-PIC-NEXT:   Address: 0x23C8
+# SEC-PIC-NEXT:   Offset: 0x3C8
 # SEC-PIC-NEXT:   Size: 8
 # SEC-PIC-NEXT:   Link:
 # SEC-PIC-NEXT:   Info:
@@ -57,7 +66,7 @@
 # SEC-PIC:      0x000000006FFFFFF9 RELACOUNT            1
 # SEC-PIC:      Relocations [
 # SEC-PIC-NEXT:   Section ({{.*}}) .rela.dyn {
-# SEC-PIC-NEXT:     0x2380 R_X86_64_RELATIVE - 0x3388
+# SEC-PIC-NEXT:     0x23C8 R_X86_64_RELATIVE - 0x33D0
 # SEC-PIC-NEXT:   }
 # SEC-PIC-NEXT: ]
 
@@ -65,24 +74,33 @@
 # DISASM-PIC:      Disassembly of section .text:
 # DISASM-PIC-EMPTY:
 # DISASM-PIC-NEXT: <_start>:
-# DISASM-PIC-NEXT: 1268:       adcl  {{.*}}(%rip), %eax  # 0x2380
-# DISASM-PIC-NEXT:             addl  {{.*}}(%rip), %ebx  # 0x2380
-# DISASM-PIC-NEXT:             andl  {{.*}}(%rip), %ecx  # 0x2380
-# DISASM-PIC-NEXT:             cmpl  {{.*}}(%rip), %edx  # 0x2380
-# DISASM-PIC-NEXT:             orl   {{.*}}(%rip), %edi  # 0x2380
-# DISASM-PIC-NEXT:             sbbl  {{.*}}(%rip), %esi  # 0x2380
-# DISASM-PIC-NEXT:             subl  {{.*}}(%rip), %ebp  # 0x2380
-# DISASM-PIC-NEXT:             xorl  {{.*}}(%rip), %r8d  # 0x2380
-# DISASM-PIC-NEXT:             testl %r15d, {{.*}}(%rip) # 0x2380
-# DISASM-PIC-NEXT: 12a0:       adcq  {{.*}}(%rip), %rax  # 0x2380
-# DISASM-PIC-NEXT:             addq  {{.*}}(%rip), %rbx  # 0x2380
-# DISASM-PIC-NEXT:             andq  {{.*}}(%rip), %rcx  # 0x2380
-# DISASM-PIC-NEXT:             cmpq  {{.*}}(%rip), %rdx  # 0x2380
-# DISASM-PIC-NEXT:             orq   {{.*}}(%rip), %rdi  # 0x2380
-# DISASM-PIC-NEXT:             sbbq  {{.*}}(%rip), %rsi  # 0x2380
-# DISASM-PIC-NEXT:             subq  {{.*}}(%rip), %rbp  # 0x2380
-# DISASM-PIC-NEXT:             xorq  {{.*}}(%rip), %r8   # 0x2380
-# DISASM-PIC-NEXT:             testq %r15, {{.*}}(%rip)  # 0x2380
+# DISASM-PIC-NEXT: 1268:       adcl  {{.*}}(%rip), %eax  # 0x23c8
+# DISASM-PIC-NEXT:             addl  {{.*}}(%rip), %ebx  # 0x23c8
+# DISASM-PIC-NEXT:             andl  {{.*}}(%rip), %ecx  # 0x23c8
+# DISASM-PIC-NEXT:             cmpl  {{.*}}(%rip), %edx  # 0x23c8
+# DISASM-PIC-NEXT:             orl   {{.*}}(%rip), %edi  # 0x23c8
+# DISASM-PIC-NEXT:             sbbl  {{.*}}(%rip), %esi  # 0x23c8
+# DISASM-PIC-NEXT:             subl  {{.*}}(%rip), %ebp  # 0x23c8
+# DISASM-PIC-NEXT:             xorl  {{.*}}(%rip), %r8d  # 0x23c8
+# DISASM-PIC-NEXT:             testl %r15d, {{.*}}(%rip) # 0x23c8
+# DISASM-PIC-NEXT: 12a0:       adcq  {{.*}}(%rip), %rax  # 0x23c8
+# DISASM-PIC-NEXT:             addq  {{.*}}(%rip), %rbx  # 0x23c8
+# DISASM-PIC-NEXT:             andq  {{.*}}(%rip), %rcx  # 0x23c8
+# DISASM-PIC-NEXT:             cmpq  {{.*}}(%rip), %rdx  # 0x23c8
+# DISASM-PIC-NEXT:             orq   {{.*}}(%rip), %rdi  # 0x23c8
+# DISASM-PIC-NEXT:             sbbq  {{.*}}(%rip), %rsi  # 0x23c8
+# DISASM-PIC-NEXT:             subq  {{.*}}(%rip), %rbp  # 0x23c8
+# DISASM-PIC-NEXT:             xorq  {{.*}}(%rip), %r8   # 0x23c8
+# DISASM-PIC-NEXT:             testq %r15, {{.*}}(%rip)  # 0x23c8
+# DISASM-PIC-NEXT: 12df:       adcq  {{.*}}(%rip), %r16  # 0x23c8
+# DISASM-PIC-NEXT:             addq  {{.*}}(%rip), %r17  # 0x23c8
+# DISASM-PIC-NEXT:             andq  {{.*}}(%rip), %r18  # 0x23c8
+# DISASM-PIC-NEXT:             cmpq  {{.*}}(%rip), %r19  # 0x23c8
+# DISASM-PIC-NEXT:             orq   {{.*}}(%rip), %r20  # 0x23c8
+# DISASM-PIC-NEXT:             sbbq  {{.*}}(%rip), %r21  # 0x23c8
+# DISASM-PIC-NEXT:             subq  {{.*}}(%rip), %r22  # 0x23c8
+# DISASM-PIC-NEXT:             xorq  {{.*}}(%rip), %r23   # 0x23c8
+# DISASM-PIC-NEXT:             testq %r24, {{.*}}(%rip)  # 0x23c8
 
 .data
 .type   bar, @object
@@ -115,3 +133,14 @@ _start:
   subq    bar@GOTPCREL(%rip), %rbp
   xorq    bar@GOTPCREL(%rip), %r8
   testq   %r15, bar@GOTPCREL(%rip)
+
+## R_X86_64_REX2_GOTPCRELX
+  adcq    bar@GOTPCREL(%rip), %r16
+  addq    bar@GOTPCREL(%rip), %r17
+  andq    bar@GOTPCREL(%rip), %r18
+  cmpq    bar@GOTPCREL(%rip), %r19
+  orq     bar@GOTPCREL(%rip), %r20
+  sbbq    bar@GOTPCREL(%rip), %r21
+  subq    bar@GOTPCREL(%rip), %r22
+  xorq    bar@GOTPCREL(%rip), %r23
+  testq   %r24, bar@GOTPCREL(%rip)
diff --git a/lld/test/ELF/x86-64-gotpc-relax.s b/lld/test/ELF/x86-64-gotpc-relax.s
index 5945bfc04a0225..b1ff995b3fc211 100644
--- a/lld/test/ELF/x86-64-gotpc-relax.s
+++ b/lld/test/ELF/x86-64-gotpc-relax.s
@@ -1,5 +1,5 @@
 # REQUIRES: x86
-## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX GOT optimization.
+## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX GOT optimization.
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: ld.lld %t.o -o %t1 --no-apply-dynamic-relocs
@@ -15,16 +15,16 @@
 
 ## In our implementation, .got is retained even if all GOT-generating relocations are optimized.
 # CHECK:      Name              Type            Address          Off    Size   ES Flg Lk Inf Al
-# CHECK:      .iplt             PROGBITS        0000000000201280 000280 000010 00  AX  0   0 16
-# CHECK-NEXT: .got              PROGBITS        0000000000202290 000290 000000 00  WA  0   0  8
+# CHECK:      .iplt             PROGBITS        00000000002012e0 0002e0 000010 00  AX  0   0 16
+# CHECK-NEXT: .got              PROGBITS        00000000002022f0 0002f0 000000 00  WA  0   0  8
 
 ## There is one R_X86_64_IRELATIVE relocations.
 # RELOC-LABEL: Relocation section '.rela.dyn' at offset {{.*}} contains 1 entry:
 # CHECK:           Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
-# CHECK:       0000000000203290  0000000000000025 R_X86_64_IRELATIVE                        2011e2
+# CHECK:       00000000002032f0  0000000000000025 R_X86_64_IRELATIVE                        2011e2
 # CHECK-LABEL: Hex dump of section '.got.plt':
-# NOAPPLY-NEXT:  0x00203290 00000000 00000000
-# APPLY-NEXT:    0x00203290 e2112000 00000000
+# NOAPPLY-NEXT:  0x002032f0 00000000 00000000
+# APPLY-NEXT:    0x002032f0 e2112000 00000000
 
 # 0x201173 + 7 - 10 = 0x201170
 # 0x20117a + 7 - 17 = 0x201170
@@ -43,20 +43,20 @@
 # DISASM-NEXT: leaq -17(%rip), %rax
 # DISASM-NEXT: leaq -23(%rip), %rax
 # DISASM-NEXT: leaq -30(%rip), %rax
-# DISASM-NEXT: movq 8330(%rip), %rax
-# DISASM-NEXT: movq 8323(%rip), %rax
+# DISASM-NEXT: movq 8426(%rip), %rax
+# DISASM-NEXT: movq 8419(%rip), %rax
 # DISASM-NEXT: leaq -52(%rip), %rax
 # DISASM-NEXT: leaq -59(%rip), %rax
 # DISASM-NEXT: leaq -65(%rip), %rax
 # DISASM-NEXT: leaq -72(%rip), %rax
-# DISASM-NEXT: movq 8288(%rip), %rax
-# DISASM-NEXT: movq 8281(%rip), %rax
+# DISASM-NEXT: movq 8384(%rip), %rax
+# DISASM-NEXT: movq 8377(%rip), %rax
 # DISASM-NEXT: callq 0x2011e0 <foo>
 # DISASM-NEXT: callq 0x2011e0 <foo>
 # DISASM-NEXT: callq 0x2011e1 <hid>
 # DISASM-NEXT: callq 0x2011e1 <hid>
-# DISASM-NEXT: callq *8251(%rip)
-# DISASM-NEXT: callq *8245(%rip)
+# DISASM-NEXT: callq *8347(%rip)
+# DISASM-NEXT: callq *8341(%rip)
 # DISASM-NEXT: jmp   0x2011e0 <foo>
 # DISASM-NEXT: nop
 # DISASM-NEXT: jmp   0x2011e0 <foo>
@@ -65,13 +65,26 @@
 # DISASM-NEXT: nop
 # DISASM-NEXT: jmp   0x2011e1 <hid>
 # DISASM-NEXT: nop
-# DISASM-NEXT: jmpq  *8215(%rip)
-# DISASM-NEXT: jmpq  *8209(%rip)
+# DISASM-NEXT: jmpq  *8311(%rip)
+# DISASM-NEXT: jmpq  *8305(%rip)
+# DISASM-NEXT: leaq -167(%rip), %r16
+# DISASM-NEXT: leaq -175(%rip), %r16
+# DISASM-NEXT: leaq -182(%rip), %r16
+# DISASM-NEXT: leaq -190(%rip), %r16
+# DISASM-NEXT: movq 8265(%rip), %r16
+# DISASM-NEXT: movq 8257(%rip), %r16
+# DISASM-NEXT: leaq -215(%rip), %r16
+# DISASM-NEXT: leaq -223(%rip), %r16
+# DISASM-NEXT: leaq -230(%rip), %r16
+# DISASM-NEXT: leaq -238(%rip), %r16
+# DISASM-NEXT: movq 8217(%rip), %r16
+# DISASM-NEXT: movq 8209(%rip), %r16
 
 # NORELAX-LABEL: <_start>:
 # NORELAX-COUNT-12: movq
 # NORELAX-COUNT-6:  callq *
 # NORELAX-COUNT-6:  jmpq *
+# NORELAX-COUNT-12: movq
 
 .text
 .globl foo
@@ -120,3 +133,16 @@ _start:
  jmp *hid@GOTPCREL(%rip)
  jmp *ifunc@GOTPCREL(%rip)
  jmp *ifunc@GOTPCREL(%rip)
+
+ movq foo@GOTPCREL(%rip), %r16
+ movq foo@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
+ movq foo@GOTPCREL(%rip), %r16
+ movq foo@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
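
The out-of-range values expected in `x86-64-gotpc-no-relax-err.s` can be reproduced with the usual GOT-relative PC32 computation (GOT entry address plus addend minus the patched location). The sketch below is a worked check, not lld code. It assumes the GOT entry ends up at `0x80200018`, that is, the `0x80200016` from the linker script rounded up to 8-byte alignment; this is inferred from the expected numbers rather than stated in the test. The function names are hypothetical.

```cpp
#include <cstdint>

// Value stored for a GOTPCREL-style relocation: G + GOT + A - P, with the
// GOT entry address passed in directly.
constexpr int64_t gotPcRel(uint64_t gotEntry, int64_t addend, uint64_t place) {
  return static_cast<int64_t>(gotEntry) + addend - static_cast<int64_t>(place);
}

// A 32-bit PC-relative field can only hold values in [-2^31, 2^31 - 1].
constexpr bool fitsInt32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

// Assumed addresses: .text at 0x200000 (from the linker script) and the GOT
// entry at 0x80200018 (inferred, see the paragraph above).
constexpr uint64_t kGotEntry = 0x80200018;
constexpr uint64_t kText = 0x200000;

static_assert(gotPcRel(kGotEntry, -4, kText + 0x2) == 2147483666, "movl %eax");
static_assert(gotPcRel(kGotEntry, -4, kText + 0x9) == 2147483659, "movq %rax");
static_assert(gotPcRel(kGotEntry, -4, kText + 0x11) == 2147483651, "movq %r16");
```

The REX2-encoded `movq ..., %r16` is one byte longer than the REX-encoded `movq ..., %rax`, which is consistent with its relocation sitting at `.text+0x11` and with the `.got` address in the linker script moving from `0x80200010` to `0x80200016` so that the last instruction stays in range.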


llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-lld-elf

Author: Shengchen Kan (KanRobert)

-# DISASM-PIC-NEXT:             orl   {{.*}}(%rip), %edi  # 0x2380
-# DISASM-PIC-NEXT:             sbbl  {{.*}}(%rip), %esi  # 0x2380
-# DISASM-PIC-NEXT:             subl  {{.*}}(%rip), %ebp  # 0x2380
-# DISASM-PIC-NEXT:             xorl  {{.*}}(%rip), %r8d  # 0x2380
-# DISASM-PIC-NEXT:             testl %r15d, {{.*}}(%rip) # 0x2380
-# DISASM-PIC-NEXT: 12a0:       adcq  {{.*}}(%rip), %rax  # 0x2380
-# DISASM-PIC-NEXT:             addq  {{.*}}(%rip), %rbx  # 0x2380
-# DISASM-PIC-NEXT:             andq  {{.*}}(%rip), %rcx  # 0x2380
-# DISASM-PIC-NEXT:             cmpq  {{.*}}(%rip), %rdx  # 0x2380
-# DISASM-PIC-NEXT:             orq   {{.*}}(%rip), %rdi  # 0x2380
-# DISASM-PIC-NEXT:             sbbq  {{.*}}(%rip), %rsi  # 0x2380
-# DISASM-PIC-NEXT:             subq  {{.*}}(%rip), %rbp  # 0x2380
-# DISASM-PIC-NEXT:             xorq  {{.*}}(%rip), %r8   # 0x2380
-# DISASM-PIC-NEXT:             testq %r15, {{.*}}(%rip)  # 0x2380
+# DISASM-PIC-NEXT: 1268:       adcl  {{.*}}(%rip), %eax  # 0x23c8
+# DISASM-PIC-NEXT:             addl  {{.*}}(%rip), %ebx  # 0x23c8
+# DISASM-PIC-NEXT:             andl  {{.*}}(%rip), %ecx  # 0x23c8
+# DISASM-PIC-NEXT:             cmpl  {{.*}}(%rip), %edx  # 0x23c8
+# DISASM-PIC-NEXT:             orl   {{.*}}(%rip), %edi  # 0x23c8
+# DISASM-PIC-NEXT:             sbbl  {{.*}}(%rip), %esi  # 0x23c8
+# DISASM-PIC-NEXT:             subl  {{.*}}(%rip), %ebp  # 0x23c8
+# DISASM-PIC-NEXT:             xorl  {{.*}}(%rip), %r8d  # 0x23c8
+# DISASM-PIC-NEXT:             testl %r15d, {{.*}}(%rip) # 0x23c8
+# DISASM-PIC-NEXT: 12a0:       adcq  {{.*}}(%rip), %rax  # 0x23c8
+# DISASM-PIC-NEXT:             addq  {{.*}}(%rip), %rbx  # 0x23c8
+# DISASM-PIC-NEXT:             andq  {{.*}}(%rip), %rcx  # 0x23c8
+# DISASM-PIC-NEXT:             cmpq  {{.*}}(%rip), %rdx  # 0x23c8
+# DISASM-PIC-NEXT:             orq   {{.*}}(%rip), %rdi  # 0x23c8
+# DISASM-PIC-NEXT:             sbbq  {{.*}}(%rip), %rsi  # 0x23c8
+# DISASM-PIC-NEXT:             subq  {{.*}}(%rip), %rbp  # 0x23c8
+# DISASM-PIC-NEXT:             xorq  {{.*}}(%rip), %r8   # 0x23c8
+# DISASM-PIC-NEXT:             testq %r15, {{.*}}(%rip)  # 0x23c8
+# DISASM-PIC-NEXT: 12df:       adcq  {{.*}}(%rip), %r16  # 0x23c8
+# DISASM-PIC-NEXT:             addq  {{.*}}(%rip), %r17  # 0x23c8
+# DISASM-PIC-NEXT:             andq  {{.*}}(%rip), %r18  # 0x23c8
+# DISASM-PIC-NEXT:             cmpq  {{.*}}(%rip), %r19  # 0x23c8
+# DISASM-PIC-NEXT:             orq   {{.*}}(%rip), %r20  # 0x23c8
+# DISASM-PIC-NEXT:             sbbq  {{.*}}(%rip), %r21  # 0x23c8
+# DISASM-PIC-NEXT:             subq  {{.*}}(%rip), %r22  # 0x23c8
+# DISASM-PIC-NEXT:             xorq  {{.*}}(%rip), %r23   # 0x23c8
+# DISASM-PIC-NEXT:             testq %r24, {{.*}}(%rip)  # 0x23c8
 
 .data
 .type   bar, @object
@@ -115,3 +133,14 @@ _start:
   subq    bar@GOTPCREL(%rip), %rbp
   xorq    bar@GOTPCREL(%rip), %r8
   testq   %r15, bar@GOTPCREL(%rip)
+
+## R_X86_64_REX2_GOTPCRELX
+  adcq    bar@GOTPCREL(%rip), %r16
+  addq    bar@GOTPCREL(%rip), %r17
+  andq    bar@GOTPCREL(%rip), %r18
+  cmpq    bar@GOTPCREL(%rip), %r19
+  orq     bar@GOTPCREL(%rip), %r20
+  sbbq    bar@GOTPCREL(%rip), %r21
+  subq    bar@GOTPCREL(%rip), %r22
+  xorq    bar@GOTPCREL(%rip), %r23
+  testq   %r24, bar@GOTPCREL(%rip)
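
The shifted addresses in the CHECK lines above look consistent with the nine new REX2-prefixed instructions adding 72 bytes to `.text`. The sketch below is a reader's sanity check, not part of the patch. It assumes each new instruction is 8 bytes long (a 2-byte REX2 prefix starting with `0xd5`, one opcode byte, one ModRM byte, and a 4-byte displacement or immediate), which is also why the prefix byte sits at the relocation offset minus 4. All constant names are made up for illustration.

```python
# Assumed layout of a REX2-prefixed `op sym@GOTPCREL(%rip), %reg` instruction.
REX2_PREFIX_LEN = 2  # 0xd5 plus one payload byte
OPCODE_LEN = 1
MODRM_LEN = 1
DISP32_LEN = 4       # the field the relocation points at
REX2_INSN_LEN = REX2_PREFIX_LEN + OPCODE_LEN + MODRM_LEN + DISP32_LEN

# Distance from the relocated field back to the 0xd5 byte.
RELOC_TO_PREFIX = REX2_INSN_LEN - DISP32_LEN

# adc, add, and, cmp, or, sbb, sub, xor, test on %r16..%r24
NEW_INSNS = 9
text_growth = NEW_INSNS * REX2_INSN_LEN

# (old, new) pairs copied from the CHECK lines in the diff.
moved = {
    "bar, non-PIC": (0x203248, 0x203290),
    "GOT slot, non-PIC": (0x202240, 0x202288),
    ".got address, PIC": (0x2380, 0x23C8),
    "R_X86_64_RELATIVE addend, PIC": (0x3388, 0x33D0),
}
deltas = {name: new - old for name, (old, new) in moved.items()}

# The new block should start right after the nine 7-byte REX-prefixed
# instructions that precede it (0x201200 non-PIC, 0x12a0 PIC).
REX_BLOCK_LEN = 9 * 7
new_block_nopic = 0x201200 + REX_BLOCK_LEN
new_block_pic = 0x12A0 + REX_BLOCK_LEN

for name, delta in deltas.items():
    print(f"{name}: moved by {delta:#x}, expected {text_growth:#x}")
print(f"new block starts at {new_block_nopic:#x} (non-PIC), {new_block_pic:#x} (PIC)")
```

Under these assumptions every delta equals the growth of `.text`, and the computed block starts match the `20123f` and `12df` labels in the new DISASM lines.
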
diff --git a/lld/test/ELF/x86-64-gotpc-relax.s b/lld/test/ELF/x86-64-gotpc-relax.s
index 5945bfc04a0225..b1ff995b3fc211 100644
--- a/lld/test/ELF/x86-64-gotpc-relax.s
+++ b/lld/test/ELF/x86-64-gotpc-relax.s
@@ -1,5 +1,5 @@
 # REQUIRES: x86
-## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX GOT optimization.
+## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX GOT optimization.
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: ld.lld %t.o -o %t1 --no-apply-dynamic-relocs
@@ -15,16 +15,16 @@
 
 ## In our implementation, .got is retained even if all GOT-generating relocations are optimized.
 # CHECK:      Name              Type            Address          Off    Size   ES Flg Lk Inf Al
-# CHECK:      .iplt             PROGBITS        0000000000201280 000280 000010 00  AX  0   0 16
-# CHECK-NEXT: .got              PROGBITS        0000000000202290 000290 000000 00  WA  0   0  8
+# CHECK:      .iplt             PROGBITS        00000000002012e0 0002e0 000010 00  AX  0   0 16
+# CHECK-NEXT: .got              PROGBITS        00000000002022f0 0002f0 000000 00  WA  0   0  8
 
 ## There is one R_X86_64_IRELATIVE relocations.
 # RELOC-LABEL: Relocation section '.rela.dyn' at offset {{.*}} contains 1 entry:
 # CHECK:           Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
-# CHECK:       0000000000203290  0000000000000025 R_X86_64_IRELATIVE                        2011e2
+# CHECK:       00000000002032f0  0000000000000025 R_X86_64_IRELATIVE                        2011e2
 # CHECK-LABEL: Hex dump of section '.got.plt':
-# NOAPPLY-NEXT:  0x00203290 00000000 00000000
-# APPLY-NEXT:    0x00203290 e2112000 00000000
+# NOAPPLY-NEXT:  0x002032f0 00000000 00000000
+# APPLY-NEXT:    0x002032f0 e2112000 00000000
 
 # 0x201173 + 7 - 10 = 0x201170
 # 0x20117a + 7 - 17 = 0x201170
@@ -43,20 +43,20 @@
 # DISASM-NEXT: leaq -17(%rip), %rax
 # DISASM-NEXT: leaq -23(%rip), %rax
 # DISASM-NEXT: leaq -30(%rip), %rax
-# DISASM-NEXT: movq 8330(%rip), %rax
-# DISASM-NEXT: movq 8323(%rip), %rax
+# DISASM-NEXT: movq 8426(%rip), %rax
+# DISASM-NEXT: movq 8419(%rip), %rax
 # DISASM-NEXT: leaq -52(%rip), %rax
 # DISASM-NEXT: leaq -59(%rip), %rax
 # DISASM-NEXT: leaq -65(%rip), %rax
 # DISASM-NEXT: leaq -72(%rip), %rax
-# DISASM-NEXT: movq 8288(%rip), %rax
-# DISASM-NEXT: movq 8281(%rip), %rax
+# DISASM-NEXT: movq 8384(%rip), %rax
+# DISASM-NEXT: movq 8377(%rip), %rax
 # DISASM-NEXT: callq 0x2011e0 <foo>
 # DISASM-NEXT: callq 0x2011e0 <foo>
 # DISASM-NEXT: callq 0x2011e1 <hid>
 # DISASM-NEXT: callq 0x2011e1 <hid>
-# DISASM-NEXT: callq *8251(%rip)
-# DISASM-NEXT: callq *8245(%rip)
+# DISASM-NEXT: callq *8347(%rip)
+# DISASM-NEXT: callq *8341(%rip)
 # DISASM-NEXT: jmp   0x2011e0 <foo>
 # DISASM-NEXT: nop
 # DISASM-NEXT: jmp   0x2011e0 <foo>
@@ -65,13 +65,26 @@
 # DISASM-NEXT: nop
 # DISASM-NEXT: jmp   0x2011e1 <hid>
 # DISASM-NEXT: nop
-# DISASM-NEXT: jmpq  *8215(%rip)
-# DISASM-NEXT: jmpq  *8209(%rip)
+# DISASM-NEXT: jmpq  *8311(%rip)
+# DISASM-NEXT: jmpq  *8305(%rip)
+# DISASM-NEXT: leaq -167(%rip), %r16
+# DISASM-NEXT: leaq -175(%rip), %r16
+# DISASM-NEXT: leaq -182(%rip), %r16
+# DISASM-NEXT: leaq -190(%rip), %r16
+# DISASM-NEXT: movq 8265(%rip), %r16
+# DISASM-NEXT: movq 8257(%rip), %r16
+# DISASM-NEXT: leaq -215(%rip), %r16
+# DISASM-NEXT: leaq -223(%rip), %r16
+# DISASM-NEXT: leaq -230(%rip), %r16
+# DISASM-NEXT: leaq -238(%rip), %r16
+# DISASM-NEXT: movq 8217(%rip), %r16
+# DISASM-NEXT: movq 8209(%rip), %r16
 
 # NORELAX-LABEL: <_start>:
 # NORELAX-COUNT-12: movq
 # NORELAX-COUNT-6:  callq *
 # NORELAX-COUNT-6:  jmpq *
+# NORELAX-COUNT-12: movq
 
 .text
 .globl foo
@@ -120,3 +133,16 @@ _start:
  jmp *hid@GOTPCREL(%rip)
  jmp *ifunc@GOTPCREL(%rip)
  jmp *ifunc@GOTPCREL(%rip)
+
+ movq foo@GOTPCREL(%rip), %r16
+ movq foo@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
+ movq foo@GOTPCREL(%rip), %r16
+ movq foo@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq hid@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
+ movq ifunc@GOTPCREL(%rip), %r16
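
The twelve new `%r16` DISASM lines can be cross-checked against the symbol addresses that appear elsewhere in the same test. The sketch below is a reader's consistency check under stated assumptions, not code from the patch: it assumes the REX2-encoded instructions are 8 bytes each and laid out back to back, takes `foo` and `hid` from the `callq` lines and the ifunc slot from the `R_X86_64_IRELATIVE` line, and derives the address of the first instruction from its displacement. The helper name `rip_target` is hypothetical.

```python
INSN_LEN = 8  # assumed: 2-byte REX2 prefix + opcode + ModRM + disp32

FOO, HID = 0x2011E0, 0x2011E1  # from "callq 0x2011e0 <foo>" and "<hid>"
IFUNC_SLOT = 0x2032F0          # .got.plt slot from the IRELATIVE check

# (displacement, expected target) for the 12 new DISASM lines, in order.
expected = [
    (-167, FOO), (-175, FOO), (-182, HID), (-190, HID),
    (8265, IFUNC_SLOT), (8257, IFUNC_SLOT),
    (-215, FOO), (-223, FOO), (-230, HID), (-238, HID),
    (8217, IFUNC_SLOT), (8209, IFUNC_SLOT),
]


def rip_target(insn_addr, disp, insn_len=INSN_LEN):
    """RIP-relative operands are resolved against the next instruction."""
    return insn_addr + insn_len + disp


# Solve for the address of the first REX2 instruction from its displacement.
base = FOO - expected[0][0] - INSN_LEN

# Every later line must then hit its target without further adjustment.
mismatches = [
    (i, disp)
    for i, (disp, target) in enumerate(expected)
    if rip_target(base + i * INSN_LEN, disp) != target
]

# End of .text, rounded up to the 16-byte alignment of .iplt.
text_end = base + len(expected) * INSN_LEN
iplt_addr = (text_end + 15) // 16 * 16

print(hex(base), mismatches, hex(iplt_addr))
```

With these inputs the displacements are mutually consistent, and the rounded-up end of `.text` lands on the `.iplt` address `2012e0` in the updated CHECK line.
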

github-actions bot commented Sep 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@MaskRay
Member

MaskRay commented Sep 29, 2024

Since LLVM does not add R_X86_64_CODE_4_GOTPCRELX, it should be removed from the description.

@KanRobert
Contributor Author

Since LLVM does not add R_X86_64_CODE_4_GOTPCRELX, it should be removed from the description.

Done

@KanRobert KanRobert merged commit 31dd29c into llvm:main Sep 29, 2024
8 checks passed
@llvm-ci
Collaborator

llvm-ci commented Sep 29, 2024

LLVM Buildbot has detected a new failure on builder clang-hip-vega20 running on hip-vega20-0 while building lld at step 3 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/6603

Here is the relevant piece of the build log for reference:
Step 3 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/hip-build.sh --jobs=' (failure)
...
[38/40] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 -DNDEBUG  External/HIP/CMakeFiles/InOneWeekend-hip-6.0.2.dir/workload/ray-tracing/InOneWeekend/main.cc.o -o External/HIP/InOneWeekend-hip-6.0.2  --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt -unwindlib=libgcc -frtlib-add-rpath && cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /usr/local/bin/cmake -E create_symlink /buildbot/llvm-test-suite/External/HIP/InOneWeekend.reference_output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend.reference_output-hip-6.0.2
[39/40] /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG  -O3 -DNDEBUG   -w -Werror=date-time --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 -xhip -mfma -MD -MT External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -MF External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d -o External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -c /buildbot/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc
[40/40] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 -DNDEBUG  External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -o External/HIP/TheNextWeek-hip-6.0.2  --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt -unwindlib=libgcc -frtlib-add-rpath && cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /usr/local/bin/cmake -E create_symlink /buildbot/llvm-test-suite/External/HIP/TheNextWeek.reference_output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/TheNextWeek.reference_output-hip-6.0.2
+ build_step 'Testing HIP test-suite'
+ echo '@@@BUILD_STEP Testing HIP test-suite@@@'
@@@BUILD_STEP Testing HIP test-suite@@@
+ ninja -v check-hip-simple
[0/1] cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test memmove-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
-- Testing: 7 tests, 7 workers --
Testing:  0.. 10.. 20.. 30.. 40
FAIL: test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test (4 of 7)
******************** TEST 'test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test' FAILED ********************

/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out --redirect-input /dev/null --summary /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.time /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend-hip-6.0.2
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP ; /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2

+ cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP
+ /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target: Comparison failed, textual difference between 'M' and 'i'

Input 1:
Memory access fault by GPU node-1 (Agent handle: 0x559c88786ac0) on address (nil). Reason: Page not present or supervisor privilege.
exit 134

Input 2:
image width = 1200 height = 675
block size = (16, 16) grid size = (75, 43)
Start rendering by GPU.
Done.
gpu.ppm and ref.ppm are the same.
exit 0

********************
/usr/bin/strip: /bin/bash.stripped: Bad file descriptor
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
********************
Failed Tests (1):
  test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test


Testing Time: 341.89s

Total Discovered Tests: 7
  Passed: 6 (85.71%)
  Failed: 1 (14.29%)
FAILED: External/HIP/CMakeFiles/check-hip-simple-hip-6.0.2 
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test memmove-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
ninja: build stopped: subcommand failed.
Step 12 (Testing HIP test-suite) failure: Testing HIP test-suite (failure)
program finished with exit code 1
elapsedTime=453.003040
