Skip to content

[X86,MC] Add relocation R_X86_64_REX2_GOTPCRELX #106681

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Sep 24, 2024
Merged

Conversation

KanRobert
Copy link
Contributor

@KanRobert KanRobert commented Aug 30, 2024

For

mov        name@GOTPCREL(%rip), %reg
test       %reg, name@GOTPCREL(%rip)
binop      name@GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions, add

R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX = 43

if the instruction starts at 4 bytes before the relocation offset. It similar to R_X86_64_GOTPCRELX.

Linker can treat R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX as R_X86_64_GOTPCREL or convert the above instructions to

lea	name(%rip), %reg
mov	$name, %reg
test	$name, %reg
binop	$name, %reg

if the first byte of the instruction at the relocation offset - 4 is 0xd5 (namely, encoded w/ REX2 prefix) when possible.

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation

@llvmbot llvmbot added clang Clang issues not falling into any other category lld backend:X86 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' mc Machine (object) code lld:ELF llvm:binary-utilities labels Aug 30, 2024
@KanRobert KanRobert marked this pull request as draft August 30, 2024 08:01
@llvmbot
Copy link
Member

llvmbot commented Aug 30, 2024

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang-driver

Author: Shengchen Kan (KanRobert)

Changes

For

mov        name@<!-- -->GOTPCREL(%rip), %reg
test       %reg, name@<!-- -->GOTPCREL(%rip)
binop      name@<!-- -->GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions, add

R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX = 43

if the instruction starts at 4 bytes before the relocation offset. It similar to R_X86_64_GOTPCRELX.

Linker can treat R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX as R_X86_64_GOTPCREL or convert the above instructions to

lea	name(%rip), %reg
mov	$name, %reg
test	$name, %reg
binop	$name, %reg

if the first byte of the instruction at the relocation offset - 4 is 0xd5 (namely, encoded w/ REX2 prefix) when possible.

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation


Full diff: https://github.com/llvm/llvm-project/pull/106681.diff

14 Files Affected:

  • (modified) clang/test/Driver/relax.s (+2)
  • (modified) lld/ELF/Arch/X86_64.cpp (+3)
  • (modified) lld/test/ELF/x86-64-gotpc-no-relax-err.s (+7-3)
  • (modified) lld/test/ELF/x86-64-gotpc-relax.s (+1-1)
  • (modified) llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def (+1)
  • (modified) llvm/lib/MC/MCTargetOptionsCommandFlags.cpp (+2-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (+4)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp (+6-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h (+4)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (+9-10)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp (+5-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp (+2)
  • (modified) llvm/test/MC/ELF/relocation-alias.s (+3)
  • (modified) llvm/test/MC/X86/gotpcrelx.s (+34)
diff --git a/clang/test/Driver/relax.s b/clang/test/Driver/relax.s
index 154d4db0a31385..b4a696a328eb56 100644
--- a/clang/test/Driver/relax.s
+++ b/clang/test/Driver/relax.s
@@ -8,5 +8,7 @@
 // RUN: llvm-readobj -r %t | FileCheck --check-prefix=REL %s
 
 // REL: R_X86_64_REX_GOTPCRELX foo
+// REL: R_X86_64_REX2_GOTPCRELX foo
 
         movq	foo@GOTPCREL(%rip), %rax
+        movq	foo@GOTPCREL(%rip), %r16
diff --git a/lld/ELF/Arch/X86_64.cpp b/lld/ELF/Arch/X86_64.cpp
index 65a81fe12f8709..ba5ce68e509199 100644
--- a/lld/ELF/Arch/X86_64.cpp
+++ b/lld/ELF/Arch/X86_64.cpp
@@ -388,6 +388,7 @@ RelExpr X86_64::getRelExpr(RelType type, const Symbol &s,
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_GOTTPOFF:
     return R_GOT_PC;
   case R_X86_64_GOTOFF64:
@@ -725,6 +726,7 @@ int64_t X86_64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_PC32:
   case R_X86_64_GOTTPOFF:
   case R_X86_64_PLT32:
@@ -808,6 +810,7 @@ void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
     break;
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
     if (rel.expr != R_GOT_PC) {
       relaxGot(loc, rel, val);
     } else {
diff --git a/lld/test/ELF/x86-64-gotpc-no-relax-err.s b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
index 618dca47755f41..4280c8fd1dc97e 100644
--- a/lld/test/ELF/x86-64-gotpc-no-relax-err.s
+++ b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
@@ -7,15 +7,19 @@
 ## `>>> defined in` for linker synthesized __stop_* symbols (there is no
 ## associated file or linker script line number).
 
-# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483658 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483666 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 # CHECK-EMPTY:
-# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483659 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: >>> defined in <internal>
+# CHECK-EMPTY:
+# CHECK-NEXT: error: {{.*}}:(.text+0x11): relocation R_X86_64_REX2_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 
 #--- a.s
   movl __stop_data@GOTPCREL(%rip), %eax  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # out of range
+  movq __stop_data@GOTPCREL(%rip), %r16  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # in range
 
 .section data,"aw",@progbits
@@ -23,5 +27,5 @@
 #--- lds
 SECTIONS {
   .text 0x200000 : { *(.text) }
-  .got 0x80200010 : { *(.got) }
+  .got 0x80200016 : { *(.got) }
 }
diff --git a/lld/test/ELF/x86-64-gotpc-relax.s b/lld/test/ELF/x86-64-gotpc-relax.s
index 5945bfc04a0225..1fb3a3c76852ab 100644
--- a/lld/test/ELF/x86-64-gotpc-relax.s
+++ b/lld/test/ELF/x86-64-gotpc-relax.s
@@ -1,5 +1,5 @@
 # REQUIRES: x86
-## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX GOT optimization.
+## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX GOT optimization.
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: ld.lld %t.o -o %t1 --no-apply-dynamic-relocs
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
index 18fdcf9472dc48..161b1969abfeb4 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
@@ -43,3 +43,4 @@ ELF_RELOC(R_X86_64_TLSDESC,     36)
 ELF_RELOC(R_X86_64_IRELATIVE,   37)
 ELF_RELOC(R_X86_64_GOTPCRELX,   41)
 ELF_RELOC(R_X86_64_REX_GOTPCRELX,    42)
+ELF_RELOC(R_X86_64_REX2_GOTPCRELX,    43)
diff --git a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
index 1a4f7e93eeb74a..92618bdabbe519 100644
--- a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
+++ b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
@@ -145,8 +145,8 @@ llvm::mc::RegisterMCTargetOptionsFlags::RegisterMCTargetOptionsFlags() {
 
   static cl::opt<bool> X86RelaxRelocations(
       "x86-relax-relocations",
-      cl::desc(
-          "Emit GOTPCRELX/REX_GOTPCRELX instead of GOTPCREL on x86-64 ELF"),
+      cl::desc("Emit GOTPCRELX/REX_GOTPCRELX/REX2_GOTPCRELX instead of "
+               "GOTPCREL on x86-64 ELF"),
       cl::init(true));
   MCBINDOPT(X86RelaxRelocations);
 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index cf0cb92424c166..c02338995f44c0 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -631,8 +631,10 @@ const MCFixupKindInfo &X86AsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
   const static MCFixupKindInfo Infos[X86::NumTargetFixupKinds] = {
       {"reloc_riprel_4byte", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_movq_load", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
+      {"reloc_riprel_4byte_movq_load_rex2", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_relax", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_relax_rex", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
+      {"reloc_riprel_4byte_relax_rex2", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_signed_4byte", 0, 32, 0},
       {"reloc_signed_4byte_relax", 0, 32, 0},
       {"reloc_global_offset_table", 0, 32, 0},
@@ -678,7 +680,9 @@ static unsigned getFixupKindSize(unsigned Kind) {
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
   case X86::reloc_signed_4byte:
   case X86::reloc_signed_4byte_relax:
   case X86::reloc_global_offset_table:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
index 0b2efdfc16cc5d..9fdc62b8c8516f 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
@@ -74,7 +74,9 @@ static X86_64RelType getType64(MCFixupKind Kind,
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
     return RT64_32;
   case X86::reloc_branch_4byte_pcrel:
     Modifier = MCSymbolRefExpr::VK_PLT;
@@ -205,7 +207,7 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
   case MCSymbolRefExpr::VK_GOTPCREL:
     checkIs32(Ctx, Loc, Type);
     // Older versions of ld.bfd/ld.gold/lld
-    // do not support GOTPCRELX/REX_GOTPCRELX,
+    // do not support GOTPCRELX/REX_GOTPCRELX/REX2_GOTPCRELX,
     // and we want to keep back-compatibility.
     if (!Ctx.getTargetOptions()->X86RelaxRelocations)
       return ELF::R_X86_64_GOTPCREL;
@@ -217,6 +219,9 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
     case X86::reloc_riprel_4byte_relax_rex:
     case X86::reloc_riprel_4byte_movq_load:
       return ELF::R_X86_64_REX_GOTPCRELX;
+    case X86::reloc_riprel_4byte_relax_rex2:
+    case X86::reloc_riprel_4byte_movq_load_rex2:
+     return ELF::R_X86_64_REX2_GOTPCRELX;
     }
     llvm_unreachable("unexpected relocation type!");
   case MCSymbolRefExpr::VK_GOTPCREL_NORELAX:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h b/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
index 2d5217115d07cb..29bb7eebae3f22 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
@@ -16,10 +16,14 @@ namespace X86 {
 enum Fixups {
   reloc_riprel_4byte = FirstTargetFixupKind, // 32-bit rip-relative
   reloc_riprel_4byte_movq_load,              // 32-bit rip-relative in movq
+  reloc_riprel_4byte_movq_load_rex2,         // 32-bit rip-relative in movq
+                                             // with rex2 prefix
   reloc_riprel_4byte_relax,                  // 32-bit rip-relative in relaxable
                                              // instruction
   reloc_riprel_4byte_relax_rex,              // 32-bit rip-relative in relaxable
                                              // instruction with rex prefix
+  reloc_riprel_4byte_relax_rex2,             // 32-bit rip-relative in relaxable
+                                             // instruction with rex2 prefix
   reloc_signed_4byte,                        // 32-bit signed. Unlike FK_Data_4
                                              // this will be sign extended at
                                              // runtime.
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
index 469a385e085271..852af1600bb2df 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
@@ -568,8 +568,10 @@ void X86MCCodeEmitter::emitImmediate(const MCOperand &DispOp, SMLoc Loc,
   if (FixupKind == FK_PCRel_4 ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_movq_load) ||
+      FixupKind == MCFixupKind(X86::reloc_riprel_4byte_movq_load_rex2) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax_rex) ||
+      FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax_rex2) ||
       FixupKind == MCFixupKind(X86::reloc_branch_4byte_pcrel)) {
     ImmOffset -= 4;
     // If this is a pc-relative load off _GLOBAL_OFFSET_TABLE_:
@@ -638,12 +640,11 @@ void X86MCCodeEmitter::emitMemModRMByte(
       default:
         return X86::reloc_riprel_4byte;
       case X86::MOV64rm:
-        // movq loads is a subset of reloc_riprel_4byte_relax_rex. It is a
+        // movq loads is a subset of reloc_riprel_4byte_relax_rex/rex2. It is a
         // special case because COFF and Mach-O don't support ELF's more
-        // flexible R_X86_64_REX_GOTPCRELX relaxation.
-        // TODO: Support new relocation for REX2.
-        assert(Kind == REX || Kind == REX2);
-        return X86::reloc_riprel_4byte_movq_load;
+        // flexible R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX relaxation.
+        return Kind == REX2 ? X86::reloc_riprel_4byte_movq_load_rex2
+                            : X86::reloc_riprel_4byte_movq_load;
       case X86::ADC32rm:
       case X86::ADD32rm:
       case X86::AND32rm:
@@ -666,11 +667,9 @@ void X86MCCodeEmitter::emitMemModRMByte(
       case X86::SBB64rm:
       case X86::SUB64rm:
       case X86::XOR64rm:
-        // We haven't support relocation for REX2 prefix, so temporarily use REX
-        // relocation.
-        // TODO: Support new relocation for REX2.
-        return (Kind == REX || Kind == REX2) ? X86::reloc_riprel_4byte_relax_rex
-                                             : X86::reloc_riprel_4byte_relax;
+        return Kind == REX2  ? X86::reloc_riprel_4byte_relax_rex2
+               : Kind == REX ? X86::reloc_riprel_4byte_relax_rex
+                             : X86::reloc_riprel_4byte_relax;
       }
     }();
 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
index ec95b1ffec387d..41ce5c9fcb82ad 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
@@ -66,8 +66,10 @@ class X86MachObjectWriter : public MCMachObjectTargetWriter {
 static bool isFixupKindRIPRel(unsigned Kind) {
   return Kind == X86::reloc_riprel_4byte ||
          Kind == X86::reloc_riprel_4byte_movq_load ||
+         Kind == X86::reloc_riprel_4byte_movq_load_rex2 ||
          Kind == X86::reloc_riprel_4byte_relax ||
-         Kind == X86::reloc_riprel_4byte_relax_rex;
+         Kind == X86::reloc_riprel_4byte_relax_rex ||
+         Kind == X86::reloc_riprel_4byte_relax_rex2;
 }
 
 static unsigned getFixupKindLog2Size(unsigned Kind) {
@@ -83,7 +85,9 @@ static unsigned getFixupKindLog2Size(unsigned Kind) {
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
   case X86::reloc_signed_4byte:
   case X86::reloc_signed_4byte_relax:
   case X86::reloc_branch_4byte_pcrel:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
index 10fc176b59d8ab..7740500fb41830 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
@@ -66,8 +66,10 @@ unsigned X86WinCOFFObjectWriter::getRelocType(MCContext &Ctx,
     case FK_PCRel_4:
     case X86::reloc_riprel_4byte:
     case X86::reloc_riprel_4byte_movq_load:
+    case X86::reloc_riprel_4byte_movq_load_rex2:
     case X86::reloc_riprel_4byte_relax:
     case X86::reloc_riprel_4byte_relax_rex:
+    case X86::reloc_riprel_4byte_relax_rex2:
     case X86::reloc_branch_4byte_pcrel:
       return COFF::IMAGE_REL_AMD64_REL32;
     case FK_Data_4:
diff --git a/llvm/test/MC/ELF/relocation-alias.s b/llvm/test/MC/ELF/relocation-alias.s
index 51fb0c37052fe7..66bf2ceea508ba 100644
--- a/llvm/test/MC/ELF/relocation-alias.s
+++ b/llvm/test/MC/ELF/relocation-alias.s
@@ -16,7 +16,10 @@ movabsq $memcpy+2, %rax
 
 # CHECK:      movq (%rip), %rax
 # CHECK-NEXT:   R_X86_64_REX_GOTPCRELX  abs-0x4
+# CHECK:      movq (%rip), %r16
+# CHECK-NEXT:   R_X86_64_REX2_GOTPCRELX abs-0x4
 movq abs@GOTPCREL(%rip), %rax
+movq abs@GOTPCREL(%rip), %r16
 abs = 42
 
 # CHECK:      movabsq $0, %rbx
diff --git a/llvm/test/MC/X86/gotpcrelx.s b/llvm/test/MC/X86/gotpcrelx.s
index e63e3e9a946fd1..5a8ba454bc904c 100644
--- a/llvm/test/MC/X86/gotpcrelx.s
+++ b/llvm/test/MC/X86/gotpcrelx.s
@@ -37,6 +37,16 @@
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX sbb
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX sub
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX xor
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX mov
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX test
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX adc
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX add
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX and
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX cmp
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX or
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX sbb
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX sub
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX xor
 # CHECK-NEXT:   }
 
 # NORELAX-NEXT:     R_X86_64_GOTPCREL mov
@@ -71,6 +81,16 @@
 # NORELAX-NEXT:     R_X86_64_GOTPCREL sbb
 # NORELAX-NEXT:     R_X86_64_GOTPCREL sub
 # NORELAX-NEXT:     R_X86_64_GOTPCREL xor
+# NORELAX-NEXT:     R_X86_64_GOTPCREL mov
+# NORELAX-NEXT:     R_X86_64_GOTPCREL test
+# NORELAX-NEXT:     R_X86_64_GOTPCREL adc
+# NORELAX-NEXT:     R_X86_64_GOTPCREL add
+# NORELAX-NEXT:     R_X86_64_GOTPCREL and
+# NORELAX-NEXT:     R_X86_64_GOTPCREL cmp
+# NORELAX-NEXT:     R_X86_64_GOTPCREL or
+# NORELAX-NEXT:     R_X86_64_GOTPCREL sbb
+# NORELAX-NEXT:     R_X86_64_GOTPCREL sub
+# NORELAX-NEXT:     R_X86_64_GOTPCREL xor
 # NORELAX-NEXT:   }
 
 movl mov@GOTPCREL(%rip), %eax
@@ -108,10 +128,22 @@ sbb sbb@GOTPCREL(%rip), %rax
 sub sub@GOTPCREL(%rip), %rax
 xor xor@GOTPCREL(%rip), %rax
 
+movq mov@GOTPCREL(%rip), %r16
+test %r16, test@GOTPCREL(%rip)
+adc adc@GOTPCREL(%rip), %r16
+add add@GOTPCREL(%rip), %r16
+and and@GOTPCREL(%rip), %r16
+cmp cmp@GOTPCREL(%rip), %r16
+or  or@GOTPCREL(%rip), %r16
+sbb sbb@GOTPCREL(%rip), %r16
+sub sub@GOTPCREL(%rip), %r16
+xor xor@GOTPCREL(%rip), %r16
+
 # COMMON-NEXT:   Section ({{.*}}) .rela.norelax {
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0x0
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFD
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFC
+# COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFD
 # COMMON-NEXT:   }
 # COMMON-NEXT: ]
 
@@ -123,3 +155,5 @@ movl mov@GOTPCREL+4(%rip), %eax
 movq mov@GOTPCREL+1(%rip), %rax
 ## We could emit R_X86_64_GOTPCRELX, but it is probably unnecessary.
 movl mov@GOTPCREL+0(%rip), %eax
+## Don't emit R_X86_64_GOTPCRELX.
+movq mov@GOTPCREL+1(%rip), %r16

@llvmbot
Copy link
Member

llvmbot commented Aug 30, 2024

@llvm/pr-subscribers-lld

Author: Shengchen Kan (KanRobert)

Changes

For

mov        name@<!-- -->GOTPCREL(%rip), %reg
test       %reg, name@<!-- -->GOTPCREL(%rip)
binop      name@<!-- -->GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions, add

R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX = 43

if the instruction starts at 4 bytes before the relocation offset. It similar to R_X86_64_GOTPCRELX.

Linker can treat R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX as R_X86_64_GOTPCREL or convert the above instructions to

lea	name(%rip), %reg
mov	$name, %reg
test	$name, %reg
binop	$name, %reg

if the first byte of the instruction at the relocation offset - 4 is 0xd5 (namely, encoded w/ REX2 prefix) when possible.

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation


Full diff: https://github.com/llvm/llvm-project/pull/106681.diff

14 Files Affected:

  • (modified) clang/test/Driver/relax.s (+2)
  • (modified) lld/ELF/Arch/X86_64.cpp (+3)
  • (modified) lld/test/ELF/x86-64-gotpc-no-relax-err.s (+7-3)
  • (modified) lld/test/ELF/x86-64-gotpc-relax.s (+1-1)
  • (modified) llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def (+1)
  • (modified) llvm/lib/MC/MCTargetOptionsCommandFlags.cpp (+2-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (+4)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp (+6-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h (+4)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (+9-10)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp (+5-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp (+2)
  • (modified) llvm/test/MC/ELF/relocation-alias.s (+3)
  • (modified) llvm/test/MC/X86/gotpcrelx.s (+34)
diff --git a/clang/test/Driver/relax.s b/clang/test/Driver/relax.s
index 154d4db0a31385..b4a696a328eb56 100644
--- a/clang/test/Driver/relax.s
+++ b/clang/test/Driver/relax.s
@@ -8,5 +8,7 @@
 // RUN: llvm-readobj -r %t | FileCheck --check-prefix=REL %s
 
 // REL: R_X86_64_REX_GOTPCRELX foo
+// REL: R_X86_64_REX2_GOTPCRELX foo
 
         movq	foo@GOTPCREL(%rip), %rax
+        movq	foo@GOTPCREL(%rip), %r16
diff --git a/lld/ELF/Arch/X86_64.cpp b/lld/ELF/Arch/X86_64.cpp
index 65a81fe12f8709..ba5ce68e509199 100644
--- a/lld/ELF/Arch/X86_64.cpp
+++ b/lld/ELF/Arch/X86_64.cpp
@@ -388,6 +388,7 @@ RelExpr X86_64::getRelExpr(RelType type, const Symbol &s,
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_GOTTPOFF:
     return R_GOT_PC;
   case R_X86_64_GOTOFF64:
@@ -725,6 +726,7 @@ int64_t X86_64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_PC32:
   case R_X86_64_GOTTPOFF:
   case R_X86_64_PLT32:
@@ -808,6 +810,7 @@ void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
     break;
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
     if (rel.expr != R_GOT_PC) {
       relaxGot(loc, rel, val);
     } else {
diff --git a/lld/test/ELF/x86-64-gotpc-no-relax-err.s b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
index 618dca47755f41..4280c8fd1dc97e 100644
--- a/lld/test/ELF/x86-64-gotpc-no-relax-err.s
+++ b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
@@ -7,15 +7,19 @@
 ## `>>> defined in` for linker synthesized __stop_* symbols (there is no
 ## associated file or linker script line number).
 
-# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483658 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483666 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 # CHECK-EMPTY:
-# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483659 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: >>> defined in <internal>
+# CHECK-EMPTY:
+# CHECK-NEXT: error: {{.*}}:(.text+0x11): relocation R_X86_64_REX2_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 
 #--- a.s
   movl __stop_data@GOTPCREL(%rip), %eax  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # out of range
+  movq __stop_data@GOTPCREL(%rip), %r16  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # in range
 
 .section data,"aw",@progbits
@@ -23,5 +27,5 @@
 #--- lds
 SECTIONS {
   .text 0x200000 : { *(.text) }
-  .got 0x80200010 : { *(.got) }
+  .got 0x80200016 : { *(.got) }
 }
diff --git a/lld/test/ELF/x86-64-gotpc-relax.s b/lld/test/ELF/x86-64-gotpc-relax.s
index 5945bfc04a0225..1fb3a3c76852ab 100644
--- a/lld/test/ELF/x86-64-gotpc-relax.s
+++ b/lld/test/ELF/x86-64-gotpc-relax.s
@@ -1,5 +1,5 @@
 # REQUIRES: x86
-## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX GOT optimization.
+## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX GOT optimization.
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: ld.lld %t.o -o %t1 --no-apply-dynamic-relocs
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
index 18fdcf9472dc48..161b1969abfeb4 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
@@ -43,3 +43,4 @@ ELF_RELOC(R_X86_64_TLSDESC,     36)
 ELF_RELOC(R_X86_64_IRELATIVE,   37)
 ELF_RELOC(R_X86_64_GOTPCRELX,   41)
 ELF_RELOC(R_X86_64_REX_GOTPCRELX,    42)
+ELF_RELOC(R_X86_64_REX2_GOTPCRELX,    43)
diff --git a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
index 1a4f7e93eeb74a..92618bdabbe519 100644
--- a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
+++ b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
@@ -145,8 +145,8 @@ llvm::mc::RegisterMCTargetOptionsFlags::RegisterMCTargetOptionsFlags() {
 
   static cl::opt<bool> X86RelaxRelocations(
       "x86-relax-relocations",
-      cl::desc(
-          "Emit GOTPCRELX/REX_GOTPCRELX instead of GOTPCREL on x86-64 ELF"),
+      cl::desc("Emit GOTPCRELX/REX_GOTPCRELX/REX2_GOTPCRELX instead of "
+               "GOTPCREL on x86-64 ELF"),
       cl::init(true));
   MCBINDOPT(X86RelaxRelocations);
 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index cf0cb92424c166..c02338995f44c0 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -631,8 +631,10 @@ const MCFixupKindInfo &X86AsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
   const static MCFixupKindInfo Infos[X86::NumTargetFixupKinds] = {
       {"reloc_riprel_4byte", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_movq_load", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
+      {"reloc_riprel_4byte_movq_load_rex2", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_relax", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_relax_rex", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
+      {"reloc_riprel_4byte_relax_rex2", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_signed_4byte", 0, 32, 0},
       {"reloc_signed_4byte_relax", 0, 32, 0},
       {"reloc_global_offset_table", 0, 32, 0},
@@ -678,7 +680,9 @@ static unsigned getFixupKindSize(unsigned Kind) {
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
   case X86::reloc_signed_4byte:
   case X86::reloc_signed_4byte_relax:
   case X86::reloc_global_offset_table:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
index 0b2efdfc16cc5d..9fdc62b8c8516f 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
@@ -74,7 +74,9 @@ static X86_64RelType getType64(MCFixupKind Kind,
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
     return RT64_32;
   case X86::reloc_branch_4byte_pcrel:
     Modifier = MCSymbolRefExpr::VK_PLT;
@@ -205,7 +207,7 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
   case MCSymbolRefExpr::VK_GOTPCREL:
     checkIs32(Ctx, Loc, Type);
     // Older versions of ld.bfd/ld.gold/lld
-    // do not support GOTPCRELX/REX_GOTPCRELX,
+    // do not support GOTPCRELX/REX_GOTPCRELX/REX2_GOTPCRELX,
     // and we want to keep back-compatibility.
     if (!Ctx.getTargetOptions()->X86RelaxRelocations)
       return ELF::R_X86_64_GOTPCREL;
@@ -217,6 +219,9 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
     case X86::reloc_riprel_4byte_relax_rex:
     case X86::reloc_riprel_4byte_movq_load:
       return ELF::R_X86_64_REX_GOTPCRELX;
+    case X86::reloc_riprel_4byte_relax_rex2:
+    case X86::reloc_riprel_4byte_movq_load_rex2:
+     return ELF::R_X86_64_REX2_GOTPCRELX;
     }
     llvm_unreachable("unexpected relocation type!");
   case MCSymbolRefExpr::VK_GOTPCREL_NORELAX:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h b/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
index 2d5217115d07cb..29bb7eebae3f22 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
@@ -16,10 +16,14 @@ namespace X86 {
 enum Fixups {
   reloc_riprel_4byte = FirstTargetFixupKind, // 32-bit rip-relative
   reloc_riprel_4byte_movq_load,              // 32-bit rip-relative in movq
+  reloc_riprel_4byte_movq_load_rex2,         // 32-bit rip-relative in movq
+                                             // with rex2 prefix
   reloc_riprel_4byte_relax,                  // 32-bit rip-relative in relaxable
                                              // instruction
   reloc_riprel_4byte_relax_rex,              // 32-bit rip-relative in relaxable
                                              // instruction with rex prefix
+  reloc_riprel_4byte_relax_rex2,             // 32-bit rip-relative in relaxable
+                                             // instruction with rex2 prefix
   reloc_signed_4byte,                        // 32-bit signed. Unlike FK_Data_4
                                              // this will be sign extended at
                                              // runtime.
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
index 469a385e085271..852af1600bb2df 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
@@ -568,8 +568,10 @@ void X86MCCodeEmitter::emitImmediate(const MCOperand &DispOp, SMLoc Loc,
   if (FixupKind == FK_PCRel_4 ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_movq_load) ||
+      FixupKind == MCFixupKind(X86::reloc_riprel_4byte_movq_load_rex2) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax_rex) ||
+      FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax_rex2) ||
       FixupKind == MCFixupKind(X86::reloc_branch_4byte_pcrel)) {
     ImmOffset -= 4;
     // If this is a pc-relative load off _GLOBAL_OFFSET_TABLE_:
@@ -638,12 +640,11 @@ void X86MCCodeEmitter::emitMemModRMByte(
       default:
         return X86::reloc_riprel_4byte;
       case X86::MOV64rm:
-        // movq loads is a subset of reloc_riprel_4byte_relax_rex. It is a
+        // movq loads is a subset of reloc_riprel_4byte_relax_rex/rex2. It is a
         // special case because COFF and Mach-O don't support ELF's more
-        // flexible R_X86_64_REX_GOTPCRELX relaxation.
-        // TODO: Support new relocation for REX2.
-        assert(Kind == REX || Kind == REX2);
-        return X86::reloc_riprel_4byte_movq_load;
+        // flexible R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX relaxation.
+        return Kind == REX2 ? X86::reloc_riprel_4byte_movq_load_rex2
+                            : X86::reloc_riprel_4byte_movq_load;
       case X86::ADC32rm:
       case X86::ADD32rm:
       case X86::AND32rm:
@@ -666,11 +667,9 @@ void X86MCCodeEmitter::emitMemModRMByte(
       case X86::SBB64rm:
       case X86::SUB64rm:
       case X86::XOR64rm:
-        // We haven't support relocation for REX2 prefix, so temporarily use REX
-        // relocation.
-        // TODO: Support new relocation for REX2.
-        return (Kind == REX || Kind == REX2) ? X86::reloc_riprel_4byte_relax_rex
-                                             : X86::reloc_riprel_4byte_relax;
+        return Kind == REX2  ? X86::reloc_riprel_4byte_relax_rex2
+               : Kind == REX ? X86::reloc_riprel_4byte_relax_rex
+                             : X86::reloc_riprel_4byte_relax;
       }
     }();
 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
index ec95b1ffec387d..41ce5c9fcb82ad 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
@@ -66,8 +66,10 @@ class X86MachObjectWriter : public MCMachObjectTargetWriter {
 static bool isFixupKindRIPRel(unsigned Kind) {
   return Kind == X86::reloc_riprel_4byte ||
          Kind == X86::reloc_riprel_4byte_movq_load ||
+         Kind == X86::reloc_riprel_4byte_movq_load_rex2 ||
          Kind == X86::reloc_riprel_4byte_relax ||
-         Kind == X86::reloc_riprel_4byte_relax_rex;
+         Kind == X86::reloc_riprel_4byte_relax_rex ||
+         Kind == X86::reloc_riprel_4byte_relax_rex2;
 }
 
 static unsigned getFixupKindLog2Size(unsigned Kind) {
@@ -83,7 +85,9 @@ static unsigned getFixupKindLog2Size(unsigned Kind) {
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
   case X86::reloc_signed_4byte:
   case X86::reloc_signed_4byte_relax:
   case X86::reloc_branch_4byte_pcrel:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
index 10fc176b59d8ab..7740500fb41830 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
@@ -66,8 +66,10 @@ unsigned X86WinCOFFObjectWriter::getRelocType(MCContext &Ctx,
     case FK_PCRel_4:
     case X86::reloc_riprel_4byte:
     case X86::reloc_riprel_4byte_movq_load:
+    case X86::reloc_riprel_4byte_movq_load_rex2:
     case X86::reloc_riprel_4byte_relax:
     case X86::reloc_riprel_4byte_relax_rex:
+    case X86::reloc_riprel_4byte_relax_rex2:
     case X86::reloc_branch_4byte_pcrel:
       return COFF::IMAGE_REL_AMD64_REL32;
     case FK_Data_4:
diff --git a/llvm/test/MC/ELF/relocation-alias.s b/llvm/test/MC/ELF/relocation-alias.s
index 51fb0c37052fe7..66bf2ceea508ba 100644
--- a/llvm/test/MC/ELF/relocation-alias.s
+++ b/llvm/test/MC/ELF/relocation-alias.s
@@ -16,7 +16,10 @@ movabsq $memcpy+2, %rax
 
 # CHECK:      movq (%rip), %rax
 # CHECK-NEXT:   R_X86_64_REX_GOTPCRELX  abs-0x4
+# CHECK:      movq (%rip), %r16
+# CHECK-NEXT:   R_X86_64_REX2_GOTPCRELX abs-0x4
 movq abs@GOTPCREL(%rip), %rax
+movq abs@GOTPCREL(%rip), %r16
 abs = 42
 
 # CHECK:      movabsq $0, %rbx
diff --git a/llvm/test/MC/X86/gotpcrelx.s b/llvm/test/MC/X86/gotpcrelx.s
index e63e3e9a946fd1..5a8ba454bc904c 100644
--- a/llvm/test/MC/X86/gotpcrelx.s
+++ b/llvm/test/MC/X86/gotpcrelx.s
@@ -37,6 +37,16 @@
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX sbb
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX sub
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX xor
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX mov
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX test
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX adc
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX add
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX and
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX cmp
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX or
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX sbb
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX sub
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX xor
 # CHECK-NEXT:   }
 
 # NORELAX-NEXT:     R_X86_64_GOTPCREL mov
@@ -71,6 +81,16 @@
 # NORELAX-NEXT:     R_X86_64_GOTPCREL sbb
 # NORELAX-NEXT:     R_X86_64_GOTPCREL sub
 # NORELAX-NEXT:     R_X86_64_GOTPCREL xor
+# NORELAX-NEXT:     R_X86_64_GOTPCREL mov
+# NORELAX-NEXT:     R_X86_64_GOTPCREL test
+# NORELAX-NEXT:     R_X86_64_GOTPCREL adc
+# NORELAX-NEXT:     R_X86_64_GOTPCREL add
+# NORELAX-NEXT:     R_X86_64_GOTPCREL and
+# NORELAX-NEXT:     R_X86_64_GOTPCREL cmp
+# NORELAX-NEXT:     R_X86_64_GOTPCREL or
+# NORELAX-NEXT:     R_X86_64_GOTPCREL sbb
+# NORELAX-NEXT:     R_X86_64_GOTPCREL sub
+# NORELAX-NEXT:     R_X86_64_GOTPCREL xor
 # NORELAX-NEXT:   }
 
 movl mov@GOTPCREL(%rip), %eax
@@ -108,10 +128,22 @@ sbb sbb@GOTPCREL(%rip), %rax
 sub sub@GOTPCREL(%rip), %rax
 xor xor@GOTPCREL(%rip), %rax
 
+movq mov@GOTPCREL(%rip), %r16
+test %r16, test@GOTPCREL(%rip)
+adc adc@GOTPCREL(%rip), %r16
+add add@GOTPCREL(%rip), %r16
+and and@GOTPCREL(%rip), %r16
+cmp cmp@GOTPCREL(%rip), %r16
+or  or@GOTPCREL(%rip), %r16
+sbb sbb@GOTPCREL(%rip), %r16
+sub sub@GOTPCREL(%rip), %r16
+xor xor@GOTPCREL(%rip), %r16
+
 # COMMON-NEXT:   Section ({{.*}}) .rela.norelax {
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0x0
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFD
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFC
+# COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFD
 # COMMON-NEXT:   }
 # COMMON-NEXT: ]
 
@@ -123,3 +155,5 @@ movl mov@GOTPCREL+4(%rip), %eax
 movq mov@GOTPCREL+1(%rip), %rax
 ## We could emit R_X86_64_GOTPCRELX, but it is probably unnecessary.
 movl mov@GOTPCREL+0(%rip), %eax
+## Don't emit R_X86_64_GOTPCRELX.
+movq mov@GOTPCREL+1(%rip), %r16

@llvmbot
Copy link
Member

llvmbot commented Aug 30, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: Shengchen Kan (KanRobert)

Changes

For

mov        name@<!-- -->GOTPCREL(%rip), %reg
test       %reg, name@<!-- -->GOTPCREL(%rip)
binop      name@<!-- -->GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions, add

R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX = 43

if the instruction starts at 4 bytes before the relocation offset. It similar to R_X86_64_GOTPCRELX.

Linker can treat R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX as R_X86_64_GOTPCREL or convert the above instructions to

lea	name(%rip), %reg
mov	$name, %reg
test	$name, %reg
binop	$name, %reg

if the first byte of the instruction at the relocation offset - 4 is 0xd5 (namely, encoded w/ REX2 prefix) when possible.

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation


Full diff: https://github.com/llvm/llvm-project/pull/106681.diff

14 Files Affected:

  • (modified) clang/test/Driver/relax.s (+2)
  • (modified) lld/ELF/Arch/X86_64.cpp (+3)
  • (modified) lld/test/ELF/x86-64-gotpc-no-relax-err.s (+7-3)
  • (modified) lld/test/ELF/x86-64-gotpc-relax.s (+1-1)
  • (modified) llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def (+1)
  • (modified) llvm/lib/MC/MCTargetOptionsCommandFlags.cpp (+2-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (+4)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp (+6-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h (+4)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (+9-10)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp (+5-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp (+2)
  • (modified) llvm/test/MC/ELF/relocation-alias.s (+3)
  • (modified) llvm/test/MC/X86/gotpcrelx.s (+34)
diff --git a/clang/test/Driver/relax.s b/clang/test/Driver/relax.s
index 154d4db0a31385..b4a696a328eb56 100644
--- a/clang/test/Driver/relax.s
+++ b/clang/test/Driver/relax.s
@@ -8,5 +8,7 @@
 // RUN: llvm-readobj -r %t | FileCheck --check-prefix=REL %s
 
 // REL: R_X86_64_REX_GOTPCRELX foo
+// REL: R_X86_64_REX2_GOTPCRELX foo
 
         movq	foo@GOTPCREL(%rip), %rax
+        movq	foo@GOTPCREL(%rip), %r16
diff --git a/lld/ELF/Arch/X86_64.cpp b/lld/ELF/Arch/X86_64.cpp
index 65a81fe12f8709..ba5ce68e509199 100644
--- a/lld/ELF/Arch/X86_64.cpp
+++ b/lld/ELF/Arch/X86_64.cpp
@@ -388,6 +388,7 @@ RelExpr X86_64::getRelExpr(RelType type, const Symbol &s,
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_GOTTPOFF:
     return R_GOT_PC;
   case R_X86_64_GOTOFF64:
@@ -725,6 +726,7 @@ int64_t X86_64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_X86_64_GOTPCREL:
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
   case R_X86_64_PC32:
   case R_X86_64_GOTTPOFF:
   case R_X86_64_PLT32:
@@ -808,6 +810,7 @@ void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
     break;
   case R_X86_64_GOTPCRELX:
   case R_X86_64_REX_GOTPCRELX:
+  case R_X86_64_REX2_GOTPCRELX:
     if (rel.expr != R_GOT_PC) {
       relaxGot(loc, rel, val);
     } else {
diff --git a/lld/test/ELF/x86-64-gotpc-no-relax-err.s b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
index 618dca47755f41..4280c8fd1dc97e 100644
--- a/lld/test/ELF/x86-64-gotpc-no-relax-err.s
+++ b/lld/test/ELF/x86-64-gotpc-no-relax-err.s
@@ -7,15 +7,19 @@
 ## `>>> defined in` for linker synthesized __stop_* symbols (there is no
 ## associated file or linker script line number).
 
-# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483658 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK:      error: {{.*}}:(.text+0x2): relocation R_X86_64_GOTPCRELX out of range: 2147483666 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 # CHECK-EMPTY:
-# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: error: {{.*}}:(.text+0x9): relocation R_X86_64_REX_GOTPCRELX out of range: 2147483659 is not in [-2147483648, 2147483647]; references '__stop_data'
+# CHECK-NEXT: >>> defined in <internal>
+# CHECK-EMPTY:
+# CHECK-NEXT: error: {{.*}}:(.text+0x11): relocation R_X86_64_REX2_GOTPCRELX out of range: 2147483651 is not in [-2147483648, 2147483647]; references '__stop_data'
 # CHECK-NEXT: >>> defined in <internal>
 
 #--- a.s
   movl __stop_data@GOTPCREL(%rip), %eax  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # out of range
+  movq __stop_data@GOTPCREL(%rip), %r16  # out of range
   movq __stop_data@GOTPCREL(%rip), %rax  # in range
 
 .section data,"aw",@progbits
@@ -23,5 +27,5 @@
 #--- lds
 SECTIONS {
   .text 0x200000 : { *(.text) }
-  .got 0x80200010 : { *(.got) }
+  .got 0x80200016 : { *(.got) }
 }
diff --git a/lld/test/ELF/x86-64-gotpc-relax.s b/lld/test/ELF/x86-64-gotpc-relax.s
index 5945bfc04a0225..1fb3a3c76852ab 100644
--- a/lld/test/ELF/x86-64-gotpc-relax.s
+++ b/lld/test/ELF/x86-64-gotpc-relax.s
@@ -1,5 +1,5 @@
 # REQUIRES: x86
-## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX GOT optimization.
+## Test R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX GOT optimization.
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: ld.lld %t.o -o %t1 --no-apply-dynamic-relocs
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
index 18fdcf9472dc48..161b1969abfeb4 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/x86_64.def
@@ -43,3 +43,4 @@ ELF_RELOC(R_X86_64_TLSDESC,     36)
 ELF_RELOC(R_X86_64_IRELATIVE,   37)
 ELF_RELOC(R_X86_64_GOTPCRELX,   41)
 ELF_RELOC(R_X86_64_REX_GOTPCRELX,    42)
+ELF_RELOC(R_X86_64_REX2_GOTPCRELX,    43)
diff --git a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
index 1a4f7e93eeb74a..92618bdabbe519 100644
--- a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
+++ b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
@@ -145,8 +145,8 @@ llvm::mc::RegisterMCTargetOptionsFlags::RegisterMCTargetOptionsFlags() {
 
   static cl::opt<bool> X86RelaxRelocations(
       "x86-relax-relocations",
-      cl::desc(
-          "Emit GOTPCRELX/REX_GOTPCRELX instead of GOTPCREL on x86-64 ELF"),
+      cl::desc("Emit GOTPCRELX/REX_GOTPCRELX/REX2_GOTPCRELX instead of "
+               "GOTPCREL on x86-64 ELF"),
       cl::init(true));
   MCBINDOPT(X86RelaxRelocations);
 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index cf0cb92424c166..c02338995f44c0 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -631,8 +631,10 @@ const MCFixupKindInfo &X86AsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
   const static MCFixupKindInfo Infos[X86::NumTargetFixupKinds] = {
       {"reloc_riprel_4byte", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_movq_load", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
+      {"reloc_riprel_4byte_movq_load_rex2", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_relax", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_riprel_4byte_relax_rex", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
+      {"reloc_riprel_4byte_relax_rex2", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
       {"reloc_signed_4byte", 0, 32, 0},
       {"reloc_signed_4byte_relax", 0, 32, 0},
       {"reloc_global_offset_table", 0, 32, 0},
@@ -678,7 +680,9 @@ static unsigned getFixupKindSize(unsigned Kind) {
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
   case X86::reloc_signed_4byte:
   case X86::reloc_signed_4byte_relax:
   case X86::reloc_global_offset_table:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
index 0b2efdfc16cc5d..9fdc62b8c8516f 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
@@ -74,7 +74,9 @@ static X86_64RelType getType64(MCFixupKind Kind,
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
     return RT64_32;
   case X86::reloc_branch_4byte_pcrel:
     Modifier = MCSymbolRefExpr::VK_PLT;
@@ -205,7 +207,7 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
   case MCSymbolRefExpr::VK_GOTPCREL:
     checkIs32(Ctx, Loc, Type);
     // Older versions of ld.bfd/ld.gold/lld
-    // do not support GOTPCRELX/REX_GOTPCRELX,
+    // do not support GOTPCRELX/REX_GOTPCRELX/REX2_GOTPCRELX,
     // and we want to keep back-compatibility.
     if (!Ctx.getTargetOptions()->X86RelaxRelocations)
       return ELF::R_X86_64_GOTPCREL;
@@ -217,6 +219,9 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
     case X86::reloc_riprel_4byte_relax_rex:
     case X86::reloc_riprel_4byte_movq_load:
       return ELF::R_X86_64_REX_GOTPCRELX;
+    case X86::reloc_riprel_4byte_relax_rex2:
+    case X86::reloc_riprel_4byte_movq_load_rex2:
+     return ELF::R_X86_64_REX2_GOTPCRELX;
     }
     llvm_unreachable("unexpected relocation type!");
   case MCSymbolRefExpr::VK_GOTPCREL_NORELAX:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h b/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
index 2d5217115d07cb..29bb7eebae3f22 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86FixupKinds.h
@@ -16,10 +16,14 @@ namespace X86 {
 enum Fixups {
   reloc_riprel_4byte = FirstTargetFixupKind, // 32-bit rip-relative
   reloc_riprel_4byte_movq_load,              // 32-bit rip-relative in movq
+  reloc_riprel_4byte_movq_load_rex2,         // 32-bit rip-relative in movq
+                                             // with rex2 prefix
   reloc_riprel_4byte_relax,                  // 32-bit rip-relative in relaxable
                                              // instruction
   reloc_riprel_4byte_relax_rex,              // 32-bit rip-relative in relaxable
                                              // instruction with rex prefix
+  reloc_riprel_4byte_relax_rex2,             // 32-bit rip-relative in relaxable
+                                             // instruction with rex2 prefix
   reloc_signed_4byte,                        // 32-bit signed. Unlike FK_Data_4
                                              // this will be sign extended at
                                              // runtime.
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
index 469a385e085271..852af1600bb2df 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
@@ -568,8 +568,10 @@ void X86MCCodeEmitter::emitImmediate(const MCOperand &DispOp, SMLoc Loc,
   if (FixupKind == FK_PCRel_4 ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_movq_load) ||
+      FixupKind == MCFixupKind(X86::reloc_riprel_4byte_movq_load_rex2) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax) ||
       FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax_rex) ||
+      FixupKind == MCFixupKind(X86::reloc_riprel_4byte_relax_rex2) ||
       FixupKind == MCFixupKind(X86::reloc_branch_4byte_pcrel)) {
     ImmOffset -= 4;
     // If this is a pc-relative load off _GLOBAL_OFFSET_TABLE_:
@@ -638,12 +640,11 @@ void X86MCCodeEmitter::emitMemModRMByte(
       default:
         return X86::reloc_riprel_4byte;
       case X86::MOV64rm:
-        // movq loads is a subset of reloc_riprel_4byte_relax_rex. It is a
+        // movq loads is a subset of reloc_riprel_4byte_relax_rex/rex2. It is a
         // special case because COFF and Mach-O don't support ELF's more
-        // flexible R_X86_64_REX_GOTPCRELX relaxation.
-        // TODO: Support new relocation for REX2.
-        assert(Kind == REX || Kind == REX2);
-        return X86::reloc_riprel_4byte_movq_load;
+        // flexible R_X86_64_REX_GOTPCRELX/R_X86_64_REX2_GOTPCRELX relaxation.
+        return Kind == REX2 ? X86::reloc_riprel_4byte_movq_load_rex2
+                            : X86::reloc_riprel_4byte_movq_load;
       case X86::ADC32rm:
       case X86::ADD32rm:
       case X86::AND32rm:
@@ -666,11 +667,9 @@ void X86MCCodeEmitter::emitMemModRMByte(
       case X86::SBB64rm:
       case X86::SUB64rm:
       case X86::XOR64rm:
-        // We haven't support relocation for REX2 prefix, so temporarily use REX
-        // relocation.
-        // TODO: Support new relocation for REX2.
-        return (Kind == REX || Kind == REX2) ? X86::reloc_riprel_4byte_relax_rex
-                                             : X86::reloc_riprel_4byte_relax;
+        return Kind == REX2  ? X86::reloc_riprel_4byte_relax_rex2
+               : Kind == REX ? X86::reloc_riprel_4byte_relax_rex
+                             : X86::reloc_riprel_4byte_relax;
       }
     }();
 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
index ec95b1ffec387d..41ce5c9fcb82ad 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp
@@ -66,8 +66,10 @@ class X86MachObjectWriter : public MCMachObjectTargetWriter {
 static bool isFixupKindRIPRel(unsigned Kind) {
   return Kind == X86::reloc_riprel_4byte ||
          Kind == X86::reloc_riprel_4byte_movq_load ||
+         Kind == X86::reloc_riprel_4byte_movq_load_rex2 ||
          Kind == X86::reloc_riprel_4byte_relax ||
-         Kind == X86::reloc_riprel_4byte_relax_rex;
+         Kind == X86::reloc_riprel_4byte_relax_rex ||
+         Kind == X86::reloc_riprel_4byte_relax_rex2;
 }
 
 static unsigned getFixupKindLog2Size(unsigned Kind) {
@@ -83,7 +85,9 @@ static unsigned getFixupKindLog2Size(unsigned Kind) {
   case X86::reloc_riprel_4byte:
   case X86::reloc_riprel_4byte_relax:
   case X86::reloc_riprel_4byte_relax_rex:
+  case X86::reloc_riprel_4byte_relax_rex2:
   case X86::reloc_riprel_4byte_movq_load:
+  case X86::reloc_riprel_4byte_movq_load_rex2:
   case X86::reloc_signed_4byte:
   case X86::reloc_signed_4byte_relax:
   case X86::reloc_branch_4byte_pcrel:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
index 10fc176b59d8ab..7740500fb41830 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp
@@ -66,8 +66,10 @@ unsigned X86WinCOFFObjectWriter::getRelocType(MCContext &Ctx,
     case FK_PCRel_4:
     case X86::reloc_riprel_4byte:
     case X86::reloc_riprel_4byte_movq_load:
+    case X86::reloc_riprel_4byte_movq_load_rex2:
     case X86::reloc_riprel_4byte_relax:
     case X86::reloc_riprel_4byte_relax_rex:
+    case X86::reloc_riprel_4byte_relax_rex2:
     case X86::reloc_branch_4byte_pcrel:
       return COFF::IMAGE_REL_AMD64_REL32;
     case FK_Data_4:
diff --git a/llvm/test/MC/ELF/relocation-alias.s b/llvm/test/MC/ELF/relocation-alias.s
index 51fb0c37052fe7..66bf2ceea508ba 100644
--- a/llvm/test/MC/ELF/relocation-alias.s
+++ b/llvm/test/MC/ELF/relocation-alias.s
@@ -16,7 +16,10 @@ movabsq $memcpy+2, %rax
 
 # CHECK:      movq (%rip), %rax
 # CHECK-NEXT:   R_X86_64_REX_GOTPCRELX  abs-0x4
+# CHECK:      movq (%rip), %r16
+# CHECK-NEXT:   R_X86_64_REX2_GOTPCRELX abs-0x4
 movq abs@GOTPCREL(%rip), %rax
+movq abs@GOTPCREL(%rip), %r16
 abs = 42
 
 # CHECK:      movabsq $0, %rbx
diff --git a/llvm/test/MC/X86/gotpcrelx.s b/llvm/test/MC/X86/gotpcrelx.s
index e63e3e9a946fd1..5a8ba454bc904c 100644
--- a/llvm/test/MC/X86/gotpcrelx.s
+++ b/llvm/test/MC/X86/gotpcrelx.s
@@ -37,6 +37,16 @@
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX sbb
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX sub
 # CHECK-NEXT:     R_X86_64_REX_GOTPCRELX xor
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX mov
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX test
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX adc
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX add
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX and
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX cmp
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX or
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX sbb
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX sub
+# CHECK-NEXT:     R_X86_64_REX2_GOTPCRELX xor
 # CHECK-NEXT:   }
 
 # NORELAX-NEXT:     R_X86_64_GOTPCREL mov
@@ -71,6 +81,16 @@
 # NORELAX-NEXT:     R_X86_64_GOTPCREL sbb
 # NORELAX-NEXT:     R_X86_64_GOTPCREL sub
 # NORELAX-NEXT:     R_X86_64_GOTPCREL xor
+# NORELAX-NEXT:     R_X86_64_GOTPCREL mov
+# NORELAX-NEXT:     R_X86_64_GOTPCREL test
+# NORELAX-NEXT:     R_X86_64_GOTPCREL adc
+# NORELAX-NEXT:     R_X86_64_GOTPCREL add
+# NORELAX-NEXT:     R_X86_64_GOTPCREL and
+# NORELAX-NEXT:     R_X86_64_GOTPCREL cmp
+# NORELAX-NEXT:     R_X86_64_GOTPCREL or
+# NORELAX-NEXT:     R_X86_64_GOTPCREL sbb
+# NORELAX-NEXT:     R_X86_64_GOTPCREL sub
+# NORELAX-NEXT:     R_X86_64_GOTPCREL xor
 # NORELAX-NEXT:   }
 
 movl mov@GOTPCREL(%rip), %eax
@@ -108,10 +128,22 @@ sbb sbb@GOTPCREL(%rip), %rax
 sub sub@GOTPCREL(%rip), %rax
 xor xor@GOTPCREL(%rip), %rax
 
+movq mov@GOTPCREL(%rip), %r16
+test %r16, test@GOTPCREL(%rip)
+adc adc@GOTPCREL(%rip), %r16
+add add@GOTPCREL(%rip), %r16
+and and@GOTPCREL(%rip), %r16
+cmp cmp@GOTPCREL(%rip), %r16
+or  or@GOTPCREL(%rip), %r16
+sbb sbb@GOTPCREL(%rip), %r16
+sub sub@GOTPCREL(%rip), %r16
+xor xor@GOTPCREL(%rip), %r16
+
 # COMMON-NEXT:   Section ({{.*}}) .rela.norelax {
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0x0
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFD
 # COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFC
+# COMMON-NEXT:     R_X86_64_GOTPCREL mov 0xFFFFFFFFFFFFFFFD
 # COMMON-NEXT:   }
 # COMMON-NEXT: ]
 
@@ -123,3 +155,5 @@ movl mov@GOTPCREL+4(%rip), %eax
 movq mov@GOTPCREL+1(%rip), %rax
 ## We could emit R_X86_64_GOTPCRELX, but it is probably unnecessary.
 movl mov@GOTPCREL+0(%rip), %eax
+## Don't emit R_X86_64_GOTPCRELX.
+movq mov@GOTPCREL+1(%rip), %r16

Copy link

github-actions bot commented Aug 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@KanRobert

This comment was marked as resolved.

@KanRobert KanRobert marked this pull request as ready for review September 1, 2024 10:02
@wwwhhhyyy
Copy link

I suppose the name of relocation used by binutils is R_X86_64_CODE_4_GOTPCRELX
Also binutils has added R_X86_64_CODE_5_GOTPCRELX/R_X86_64_CODE_6_GOTPCRELX, for evex relocation
Refer to bminor/binutils-gdb@3d5a60d
and bminor/binutils-gdb@5bc71c2

@KanRobert
Copy link
Contributor Author

I suppose the name of relocation used by binutils is R_X86_64_CODE_4_GOTPCRELX Also binutils has added R_X86_64_CODE_5_GOTPCRELX/R_X86_64_CODE_6_GOTPCRELX, for evex relocation Refer to bminor/binutils-gdb@3d5a60d and bminor/binutils-gdb@5bc71c2

This is the first patch and R_X86_64_CODE_6_GOTPCRELX will be added later. R_X86_64_CODE_5_GOTPCRELX is only for possible future instructions and whether adding it as placeholder is optional.

I mentioned R_X86_64_CODE_4_GOTPCRELX in the description but I prefer R_X86_64_REX2_GOTPCRELX personally. Binutils use the former name for possible future instructions w/ 2-byte prefix other than REX2 (see discussion https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU). AFAIKC, the probability is quite low for the general GPR32 design, i.e. REX2 is already 2-byte long, future prefix should be longer than 2 byte if we need GPR32 at the same time. And according to that design, probably we should rename R_X86_64_REX_GOTPCRELX to R_X86_64_CODE_3_GOTPCRELX.

If the reviewers prefer the CODE_4 name, I will update the PR.

For

	mov        name@GOTPCREL(%rip), %reg
	test       %reg, name@GOTPCREL(%rip)
	binop      name@GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions,
add

 R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX = 43

if the instruction starts at 4 bytes before the relocation offset.  It
similar to R_X86_64_GOTPCRELX.

Linker can treat R_X86_64_REX2_GOTPCRELX/R_X86_64_CODE_4_GOTPCRELX
as R_X86_64_GOTPCREL or convert the above instructions to

	lea	name(%rip), %reg
	mov	$name, %reg
	test	$name, %reg
	binop	$name, %reg

if the first byte of the instruction at the relocation `offset - 4` is
`0xd5` (namely, encoded w/ REX2 prefix) when possible.

Binutils patch: bminor/binutils-gdb@3d5a60d
Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation
@KanRobert
Copy link
Contributor Author

Ping @MaskRay ?

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry for the long delay. I had a 3-week long vacation.)

MC/assembler part looks good to me. I haven't checked the ELF part.

For these assembler/linker changes, we should split the lld/ELF part to a separate change.

@MaskRay
Copy link
Member

MaskRay commented Sep 24, 2024

And remove ",lld" from the subject when the linker part is separated.

@KanRobert KanRobert changed the title [X86,lld] Add relocation R_X86_64_REX2_GOTPCRELX [X86,MC] Add relocation R_X86_64_REX2_GOTPCRELX Sep 24, 2024
@KanRobert
Copy link
Contributor Author

Passed the validation of cpu2017 on intel sde w/ this patch + APX features.

@KanRobert KanRobert merged commit 3d34053 into llvm:main Sep 24, 2024
8 checks passed
KanRobert added a commit that referenced this pull request Sep 29, 2024
For

	mov        name@GOTPCREL(%rip), %reg
	test       %reg, name@GOTPCREL(%rip)
	binop      name@GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor
instructions, we added

 R_X86_64_REX2_GOTPCRELX = 43

in #106681.

Linker can treat R_X86_64_REX2_GOTPCRELX as R_X86_64_GOTPCREL or convert
the above instructions to

	lea	name(%rip), %reg
	mov	$name, %reg
	test	$name, %reg
	binop	$name, %reg

if the first byte of the instruction at the relocation `offset - 4` is
`0xd5` (namely, encoded w/ REX2 prefix) when possible.

Binutils patch:
bminor/binutils-gdb@3d5a60d
Binutils mailthread:
https://sourceware.org/pipermail/binutils/2023-December/131462.html
ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category lld:ELF lld llvm:binary-utilities mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants