
[LoongArch] Support R_LARCH_{ADD,SUB}_ULEB128 for .uleb128 and force relocs when sym is not in section #76433


Merged
merged 1 commit into llvm:main on Jan 9, 2024

Conversation

MQ-mengqing
Contributor

@MQ-mengqing MQ-mengqing commented Dec 27, 2023

1. Follow RISCV 1df5ea2 to support generating relocs for .uleb128 expressions which cannot be folded. Unlike RISCV, the located content on LoongArch should be zero. LoongArch fixes up the uleb128 value with the in-place addition and subtraction reloc types R_LARCH_{ADD,SUB}_ULEB128. The located content can affect the result, and R_LARCH_ADD_ULEB128 already carries enough information to represent the first symbol value, so the content needs to be set to zero.
2. Force relocs if the symbol is not in a section, so that relocs can be emitted for external symbols.

Fixes: #72960 (comment)
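
To make the located-content requirement concrete, here is a minimal sketch (illustrative only, not code from this patch) of how a linker resolves an R_LARCH_ADD_ULEB128/R_LARCH_SUB_ULEB128 pair; `applyAddSubULEB128` and its parameter names are hypothetical, while the LEB128 helpers mirror llvm/Support/LEB128.h. Because both relocations operate in place on whatever the assembler left in the field, a non-zero pre-encoded difference would be counted twice, which is why the field is emitted zero-filled.

```cpp
// Illustrative sketch only: applyAddSubULEB128 is a hypothetical helper,
// not a function in this patch. The LEB128 utilities are the ones from
// llvm/Support/LEB128.h.
#include "llvm/Support/LEB128.h"
#include <cstdint>

using namespace llvm;

// Loc points at the ULEB128 field emitted by the assembler and FieldSize is
// its on-disk size. MC leaves the field zero-filled so that the in-place
// arithmetic below yields exactly (SymA + AddendA) - (SymB + AddendB).
void applyAddSubULEB128(uint8_t *Loc, unsigned FieldSize, uint64_t SymA,
                        int64_t AddendA, uint64_t SymB, int64_t AddendB) {
  uint64_t V = decodeULEB128(Loc);  // 0 in the relocatable object
  V += SymA + AddendA;              // R_LARCH_ADD_ULEB128
  V -= SymB + AddendB;              // R_LARCH_SUB_ULEB128
  encodeULEB128(V, Loc, FieldSize); // re-encode, padded to the same width
}
```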

@llvmbot
Member

llvmbot commented Dec 27, 2023

@llvm/pr-subscribers-mc

Author: Jinyang He (MQ-mengqing)

Changes

1. Follow RISCV 1df5ea2 to support generating relocs for .uleb128 expressions which cannot be folded. Unlike RISCV, LoongArch needs padding zero as its contents.
2. Force relocs if the symbol is not in a section, so that relocs can be emitted for external symbols.

Fixes: #72960 (comment)


Full diff: https://github.com/llvm/llvm-project/pull/76433.diff

8 Files Affected:

  • (modified) llvm/include/llvm/MC/MCAsmBackend.h (+1-1)
  • (modified) llvm/lib/MC/MCAssembler.cpp (+4-1)
  • (modified) llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp (+44-14)
  • (modified) llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h (+3)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp (+1-1)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h (+2-2)
  • (added) llvm/test/MC/LoongArch/Relocations/leb128.s (+65)
  • (modified) llvm/test/MC/LoongArch/Relocations/relax-addsub.s (+52-21)
diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h
index 8931e8cab2fa18..58e670915a3cbb 100644
--- a/llvm/include/llvm/MC/MCAsmBackend.h
+++ b/llvm/include/llvm/MC/MCAsmBackend.h
@@ -199,7 +199,7 @@ class MCAsmBackend {
   // Defined by linker relaxation targets to possibly emit LEB128 relocations
   // and set Value at the relocated location.
   virtual bool relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout,
-                           int64_t &Value) const {
+                           int64_t &Value, bool &UseZeroPad) const {
     return false;
   }
 
diff --git a/llvm/lib/MC/MCAssembler.cpp b/llvm/lib/MC/MCAssembler.cpp
index def13044dfccc3..1fe9dfc1b6cded 100644
--- a/llvm/lib/MC/MCAssembler.cpp
+++ b/llvm/lib/MC/MCAssembler.cpp
@@ -1017,6 +1017,7 @@ bool MCAssembler::relaxLEB(MCAsmLayout &Layout, MCLEBFragment &LF) {
   const unsigned OldSize = static_cast<unsigned>(LF.getContents().size());
   unsigned PadTo = OldSize;
   int64_t Value;
+  bool UseZeroPad = false;
   SmallVectorImpl<char> &Data = LF.getContents();
   LF.getFixups().clear();
   // Use evaluateKnownAbsolute for Mach-O as a hack: .subsections_via_symbols
@@ -1026,7 +1027,7 @@ bool MCAssembler::relaxLEB(MCAsmLayout &Layout, MCLEBFragment &LF) {
                  ? LF.getValue().evaluateKnownAbsolute(Value, Layout)
                  : LF.getValue().evaluateAsAbsolute(Value, Layout);
   if (!Abs) {
-    if (!getBackend().relaxLEB128(LF, Layout, Value)) {
+    if (!getBackend().relaxLEB128(LF, Layout, Value, UseZeroPad)) {
       getContext().reportError(LF.getValue().getLoc(),
                                Twine(LF.isSigned() ? ".s" : ".u") +
                                    "leb128 expression is not absolute");
@@ -1034,6 +1035,8 @@ bool MCAssembler::relaxLEB(MCAsmLayout &Layout, MCLEBFragment &LF) {
     }
     uint8_t Tmp[10]; // maximum size: ceil(64/7)
     PadTo = std::max(PadTo, encodeULEB128(uint64_t(Value), Tmp));
+    if (UseZeroPad)
+      Value = 0;
   }
   Data.clear();
   raw_svector_ostream OSE(Data);
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
index 6d8ef1bf96cbab..6dbf24e7fd6379 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
@@ -91,6 +91,7 @@ static uint64_t adjustFixupValue(const MCFixup &Fixup, uint64_t Value,
   case FK_Data_2:
   case FK_Data_4:
   case FK_Data_8:
+  case FK_Data_leb128:
     return Value;
   case LoongArch::fixup_loongarch_b16: {
     if (!isInt<18>(Value))
@@ -173,6 +174,7 @@ bool LoongArchAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
   case FK_Data_2:
   case FK_Data_4:
   case FK_Data_8:
+  case FK_Data_leb128:
     return !Target.isAbsolute();
   }
 }
@@ -202,9 +204,27 @@ getRelocPairForSize(unsigned Size) {
     return std::make_pair(
         MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD64),
         MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB64));
+  case 128:
+    return std::make_pair(
+        MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD_ULEB128),
+        MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB_ULEB128));
   }
 }
 
+bool LoongArchAsmBackend::relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout,
+                                      int64_t &Value, bool &UseZeroPad) const {
+  const MCExpr &Expr = LF.getValue();
+
+  if (LF.isSigned() || !Expr.evaluateKnownAbsolute(Value, Layout))
+    return false;
+
+  UseZeroPad = true;
+  LF.getFixups().push_back(
+      MCFixup::create(0, &Expr, FK_Data_leb128, Expr.getLoc()));
+
+  return true;
+}
+
 bool LoongArchAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
                                        const MCSubtargetInfo *STI) const {
   // We mostly follow binutils' convention here: align to 4-byte boundary with a
@@ -226,21 +246,28 @@ bool LoongArchAsmBackend::handleAddSubRelocations(const MCAsmLayout &Layout,
                                                   uint64_t &FixedValue) const {
   std::pair<MCFixupKind, MCFixupKind> FK;
   uint64_t FixedValueA, FixedValueB;
-  const MCSection &SecA = Target.getSymA()->getSymbol().getSection();
-  const MCSection &SecB = Target.getSymB()->getSymbol().getSection();
-
-  // We need record relocation if SecA != SecB. Usually SecB is same as the
-  // section of Fixup, which will be record the relocation as PCRel. If SecB
-  // is not same as the section of Fixup, it will report error. Just return
-  // false and then this work can be finished by handleFixup.
-  if (&SecA != &SecB)
-    return false;
+  const MCSymbol &SA = Target.getSymA()->getSymbol();
+  const MCSymbol &SB = Target.getSymB()->getSymbol();
 
-  // In SecA == SecB case. If the linker relaxation is enabled, we need record
-  // the ADD, SUB relocations. Otherwise the FixedValue has already been
-  // calculated out in evaluateFixup, return true and avoid record relocations.
-  if (!STI.hasFeature(LoongArch::FeatureRelax))
-    return true;
+  bool force = !SA.isInSection() || !SB.isInSection();
+
+  if (!force) {
+    const MCSection &SecA = SA.getSection();
+    const MCSection &SecB = SB.getSection();
+
+    // We need record relocation if SecA != SecB. Usually SecB is same as the
+    // section of Fixup, which will be record the relocation as PCRel. If SecB
+    // is not same as the section of Fixup, it will report error. Just return
+    // false and then this work can be finished by handleFixup.
+    if (&SecA != &SecB)
+      return false;
+
+    // In SecA == SecB case. If the linker relaxation is enabled, we need record
+    // the ADD, SUB relocations. Otherwise the FixedValue has already been calc-
+    // ulated out in evaluateFixup, return true and avoid record relocations.
+    if (!STI.hasFeature(LoongArch::FeatureRelax))
+      return true;
+  }
 
   switch (Fixup.getKind()) {
   case llvm::FK_Data_1:
@@ -255,6 +282,9 @@ bool LoongArchAsmBackend::handleAddSubRelocations(const MCAsmLayout &Layout,
   case llvm::FK_Data_8:
     FK = getRelocPairForSize(64);
     break;
+  case llvm::FK_Data_leb128:
+    FK = getRelocPairForSize(128);
+    break;
   default:
     llvm_unreachable("unsupported fixup size");
   }
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
index fef0e84600a74c..adee22020b0b21 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
@@ -66,6 +66,9 @@ class LoongArchAsmBackend : public MCAsmBackend {
   void relaxInstruction(MCInst &Inst,
                         const MCSubtargetInfo &STI) const override {}
 
+  bool relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout, int64_t &Value,
+                   bool &UseZeroPad) const override;
+
   bool writeNopData(raw_ostream &OS, uint64_t Count,
                     const MCSubtargetInfo *STI) const override;
 
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
index 716fb67c582489..bf50b69dee1b0a 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
@@ -330,7 +330,7 @@ bool RISCVAsmBackend::relaxDwarfCFA(MCDwarfCallFrameFragment &DF,
 }
 
 bool RISCVAsmBackend::relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout,
-                                  int64_t &Value) const {
+                                  int64_t &Value, bool &UseZeroPad) const {
   if (LF.isSigned())
     return false;
   const MCExpr &Expr = LF.getValue();
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
index 2ad6534ac8bce3..24391ff0980208 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
@@ -100,8 +100,8 @@ class RISCVAsmBackend : public MCAsmBackend {
                           bool &WasRelaxed) const override;
   bool relaxDwarfCFA(MCDwarfCallFrameFragment &DF, MCAsmLayout &Layout,
                      bool &WasRelaxed) const override;
-  bool relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout,
-                   int64_t &Value) const override;
+  bool relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout, int64_t &Value,
+                   bool &UseZeroPad) const override;
 
   bool writeNopData(raw_ostream &OS, uint64_t Count,
                     const MCSubtargetInfo *STI) const override;
diff --git a/llvm/test/MC/LoongArch/Relocations/leb128.s b/llvm/test/MC/LoongArch/Relocations/leb128.s
new file mode 100644
index 00000000000000..f378482e0e0c31
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/leb128.s
@@ -0,0 +1,65 @@
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s -o %t
+# RUN: llvm-readobj -r -x .alloc_w %t | FileCheck --check-prefixes=CHECK,NORELAX %s
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o %t.relax
+# RUN: llvm-readobj -r -x .alloc_w %t.relax | FileCheck --check-prefixes=CHECK,RELAX %s
+
+# RUN: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax --defsym ERR=1 %s -o /dev/null 2>&1 | \
+# RUN:   FileCheck %s --check-prefix=ERR
+# RUN: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax --defsym ERR=1 %s -o /dev/null 2>&1 | \
+# RUN:   FileCheck %s --check-prefix=ERR
+
+# CHECK:      Relocations [
+# CHECK-NEXT:   .rela.alloc_w {
+# RELAX-NEXT:      0x0 R_LARCH_ADD_ULEB128 w1 0x0
+# RELAX-NEXT:      0x0 R_LARCH_SUB_ULEB128 w 0x0
+# RELAX-NEXT:      0x1 R_LARCH_ADD_ULEB128 w2 0x0
+# RELAX-NEXT:      0x1 R_LARCH_SUB_ULEB128 w1 0x0
+# CHECK-NEXT:      0x2 R_LARCH_PCALA_HI20 foo 0x0
+# RELAX-NEXT:      0x2 R_LARCH_RELAX - 0x0
+# CHECK-NEXT:      0x6 R_LARCH_PCALA_LO12 foo 0x0
+# RELAX-NEXT:      0x6 R_LARCH_RELAX - 0x0
+# RELAX-NEXT:      0xA R_LARCH_ADD_ULEB128 w2 0x0
+# RELAX-NEXT:      0xA R_LARCH_SUB_ULEB128 w1 0x0
+# RELAX-NEXT:      0xB R_LARCH_ADD_ULEB128 w2 0x78
+# RELAX-NEXT:      0xB R_LARCH_SUB_ULEB128 w1 0x0
+# RELAX-NEXT:      0xD R_LARCH_ADD_ULEB128 w1 0x0
+# RELAX-NEXT:      0xD R_LARCH_SUB_ULEB128 w2 0x0
+# CHECK-NEXT:   }
+# CHECK-NEXT: ]
+
+# CHECK:        Hex dump of section '.alloc_w':
+# NORELAX-NEXT: 0x00000000 00080c00 001a8c01 c0020880 01f8ffff
+# NORELAX-NEXT: 0x00000010 ffffffff ffff01
+# RELAX-NEXT:   0x00000000 00000c00 001a8c01 c0020080 00808080
+# RELAX-NEXT:   0x00000010 80808080 808000
+
+.section .alloc_w,"ax",@progbits; w:
+.uleb128 w1-w       # w1 is later defined in the same section
+.uleb128 w2-w1      # w1 and w2 are separated by a linker relaxable instruction
+w1:
+  la.pcrel $t0, foo
+w2:
+.uleb128 w2-w1      # 0x08
+.uleb128 w2-w1+120  # 0x0180
+.uleb128 -(w2-w1)   # 0x01fffffffffffffffff8
+
+.ifdef ERR
+# ERR: :[[#@LINE+1]]:16: error: .uleb128 expression is not absolute
+.uleb128 extern-w   # extern is undefined
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 w-extern
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 x-w        # x is later defined in another section
+
+.section .alloc_x,"aw",@progbits; x:
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 y-x
+.section .alloc_y,"aw",@progbits; y:
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 x-y
+
+# ERR: :[[#@LINE+1]]:10: error: .uleb128 expression is not absolute
+.uleb128 extern
+# ERR: :[[#@LINE+1]]:10: error: .uleb128 expression is not absolute
+.uleb128 y
+.endif
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
index c4454f5bb98d11..b69fc40013ea33 100644
--- a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
+++ b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
@@ -8,12 +8,23 @@
 # NORELAX-NEXT:      0x10 R_LARCH_PCALA_HI20 .text 0x0
 # NORELAX-NEXT:      0x14 R_LARCH_PCALA_LO12 .text 0x0
 # NORELAX-NEXT:    }
+# NORELAX-NEXT:    Section ({{.*}}) .rela.data {
+# NORELAX-NEXT:      0x30 R_LARCH_ADD8 foo 0x0
+# NORELAX-NEXT:      0x30 R_LARCH_SUB8 .text 0x10
+# NORELAX-NEXT:      0x31 R_LARCH_ADD16 foo 0x0
+# NORELAX-NEXT:      0x31 R_LARCH_SUB16 .text 0x10
+# NORELAX-NEXT:      0x33 R_LARCH_ADD32 foo 0x0
+# NORELAX-NEXT:      0x33 R_LARCH_SUB32 .text 0x10
+# NORELAX-NEXT:      0x37 R_LARCH_ADD64 foo 0x0
+# NORELAX-NEXT:      0x37 R_LARCH_SUB64 .text 0x10
+# NORELAX-NEXT:    }
 # NORELAX-NEXT:  ]
 
 # NORELAX:      Hex dump of section '.data':
-# NORELAX-NEXT: 0x00000000 04040004 00000004 00000000 0000000c
-# NORELAX-NEXT: 0x00000010 0c000c00 00000c00 00000000 00000808
-# NORELAX-NEXT: 0x00000020 00080000 00080000 00000000 00
+# NORELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000004
+# NORELAX-NEXT: 0x00000010 0c0c000c 0000000c 00000000 00000000
+# NORELAX-NEXT: 0x00000020 08080008 00000008 00000000 00000008
+# NORELAX-NEXT: 0x00000030 00000000 00000000 00000000 000000
 
 # RELAX:       Relocations [
 # RELAX-NEXT:    Section ({{.*}}) .rela.text {
@@ -23,29 +34,42 @@
 # RELAX-NEXT:      0x14 R_LARCH_RELAX - 0x0
 # RELAX-NEXT:    }
 # RELAX-NEXT:    Section ({{.*}}) .rela.data {
-# RELAX-NEXT:      0xF R_LARCH_ADD8 .L3 0x0
-# RELAX-NEXT:      0xF R_LARCH_SUB8 .L2 0x0
-# RELAX-NEXT:      0x10 R_LARCH_ADD16 .L3 0x0
-# RELAX-NEXT:      0x10 R_LARCH_SUB16 .L2 0x0
-# RELAX-NEXT:      0x12 R_LARCH_ADD32 .L3 0x0
-# RELAX-NEXT:      0x12 R_LARCH_SUB32 .L2 0x0
-# RELAX-NEXT:      0x16 R_LARCH_ADD64 .L3 0x0
-# RELAX-NEXT:      0x16 R_LARCH_SUB64 .L2 0x0
-# RELAX-NEXT:      0x1E R_LARCH_ADD8 .L4 0x0
-# RELAX-NEXT:      0x1E R_LARCH_SUB8 .L3 0x0
-# RELAX-NEXT:      0x1F R_LARCH_ADD16 .L4 0x0
-# RELAX-NEXT:      0x1F R_LARCH_SUB16 .L3 0x0
-# RELAX-NEXT:      0x21 R_LARCH_ADD32 .L4 0x0
-# RELAX-NEXT:      0x21 R_LARCH_SUB32 .L3 0x0
-# RELAX-NEXT:      0x25 R_LARCH_ADD64 .L4 0x0
-# RELAX-NEXT:      0x25 R_LARCH_SUB64 .L3 0x0
+# RELAX-NEXT:      0x10 R_LARCH_ADD8 .L3 0x0
+# RELAX-NEXT:      0x10 R_LARCH_SUB8 .L2 0x0
+# RELAX-NEXT:      0x11 R_LARCH_ADD16 .L3 0x0
+# RELAX-NEXT:      0x11 R_LARCH_SUB16 .L2 0x0
+# RELAX-NEXT:      0x13 R_LARCH_ADD32 .L3 0x0
+# RELAX-NEXT:      0x13 R_LARCH_SUB32 .L2 0x0
+# RELAX-NEXT:      0x17 R_LARCH_ADD64 .L3 0x0
+# RELAX-NEXT:      0x17 R_LARCH_SUB64 .L2 0x0
+# RELAX-NEXT:      0x1F R_LARCH_ADD_ULEB128 .L3 0x0
+# RELAX-NEXT:      0x1F R_LARCH_SUB_ULEB128 .L2 0x0
+# RELAX-NEXT:      0x20 R_LARCH_ADD8 .L4 0x0
+# RELAX-NEXT:      0x20 R_LARCH_SUB8 .L3 0x0
+# RELAX-NEXT:      0x21 R_LARCH_ADD16 .L4 0x0
+# RELAX-NEXT:      0x21 R_LARCH_SUB16 .L3 0x0
+# RELAX-NEXT:      0x23 R_LARCH_ADD32 .L4 0x0
+# RELAX-NEXT:      0x23 R_LARCH_SUB32 .L3 0x0
+# RELAX-NEXT:      0x27 R_LARCH_ADD64 .L4 0x0
+# RELAX-NEXT:      0x27 R_LARCH_SUB64 .L3 0x0
+# RELAX-NEXT:      0x2F R_LARCH_ADD_ULEB128 .L4 0x0
+# RELAX-NEXT:      0x2F R_LARCH_SUB_ULEB128 .L3 0x0
+# RELAX-NEXT:      0x30 R_LARCH_ADD8 foo 0x0
+# RELAX-NEXT:      0x30 R_LARCH_SUB8 .L3 0x0
+# RELAX-NEXT:      0x31 R_LARCH_ADD16 foo 0x0
+# RELAX-NEXT:      0x31 R_LARCH_SUB16 .L3 0x0
+# RELAX-NEXT:      0x33 R_LARCH_ADD32 foo 0x0
+# RELAX-NEXT:      0x33 R_LARCH_SUB32 .L3 0x0
+# RELAX-NEXT:      0x37 R_LARCH_ADD64 foo 0x0
+# RELAX-NEXT:      0x37 R_LARCH_SUB64 .L3 0x0
 # RELAX-NEXT:    }
 # RELAX-NEXT:  ]
 
 # RELAX:      Hex dump of section '.data':
-# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000000
+# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000004
 # RELAX-NEXT: 0x00000010 00000000 00000000 00000000 00000000
-# RELAX-NEXT: 0x00000020 00000000 00000000 00000000 00
+# RELAX-NEXT: 0x00000020 00000000 00000000 00000000 00000000
+# RELAX-NEXT: 0x00000030 00000000 00000000 00000000 000000
 
 .text
 .L1:
@@ -63,6 +87,7 @@
 .short .L2 - .L1
 .word  .L2 - .L1
 .dword .L2 - .L1
+.uleb128 .L2 - .L1
 ## With relaxation, emit relocs because of the .align making the diff variable.
 ## TODO Handle alignment directive. Why they emit relocs now? They returns
 ## without folding symbols offset in AttemptToFoldSymbolOffsetDifference().
@@ -70,7 +95,13 @@
 .short .L3 - .L2
 .word  .L3 - .L2
 .dword .L3 - .L2
+.uleb128 .L3 - .L2
 .byte  .L4 - .L3
 .short .L4 - .L3
 .word  .L4 - .L3
 .dword .L4 - .L3
+.uleb128 .L4 - .L3
+.byte  foo - .L3
+.short foo - .L3
+.word  foo - .L3
+.dword foo - .L3

@llvmbot
Member

llvmbot commented Dec 27, 2023

@llvm/pr-subscribers-backend-risc-v


@llvmbot
Member

llvmbot commented Dec 27, 2023

@llvm/pr-subscribers-backend-loongarch


@nathanchance
Member

I can confirm that this change resolves the issue I reported at #72960 (comment) and the resulting kernel boots in QEMU. I am not really qualified to comment on this patch beyond that though.

@MaskRay
Member

MaskRay commented Jan 1, 2024

Unlike RISCV, LoongArch needs padding zero as its contents.

The located content is zero.

Add a description why it needs to be zero.

Contributor

@SixWeining SixWeining left a comment

LGTM. Thanks.

@xen0n
Contributor

xen0n commented Jan 4, 2024

Hmm, I didn't see the reason for the zero-padding behavior as requested by @MaskRay in the commit 5f771fe?

@SixWeining
Contributor

Hmm, I didn't see the reason for the zero-padding behavior as requested by @MaskRay in the commit 5f771fe?

The commit message has addressed that.

@xen0n
Contributor

xen0n commented Jan 4, 2024

Hmm, I didn't see the reason for the zero-padding behavior as requested by @MaskRay in the commit 5f771fe?

The commit message has addressed that.

Ah, it's only explaining that zero-padding should be done on LoongArch. But I'm not sure if @MaskRay was asking for the reason behind the difference, i.e. the design considerations behind the decision.

@MQ-mengqing
Contributor Author

Ah, it's only explaining that zero-padding should be done on LoongArch. But I'm not sure if @MaskRay was asking for the reason behind the difference, i.e. the design considerations behind the decision.

I just followed the psABI in this PR. My idea is that {ADD,SUB}_ULEB128 follows the other ADD and SUB relocation types. There is no order between ADD and SUB, while SET needs to be executed before SUB. However, in the object file the initial value is more helpful for printing debug information; otherwise we need to relocate and fix it up first. For example, resolveLoongArch fixes up ADD and SUB so that the debug information of an object file can be displayed normally. Both approaches have their own advantages, I think.
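
A small sketch of the ordering point above (my illustration, treating the located field as a plain integer; neither function is real linker code):

```cpp
#include <cstdint>

// In-place ADD/SUB style (LoongArch): both relocations read and update the
// field, so applying them in either order gives the same result.
uint64_t applyAddSub(uint64_t Field, uint64_t SymA, uint64_t SymB) {
  Field += SymA; // R_LARCH_ADD_ULEB128
  Field -= SymB; // R_LARCH_SUB_ULEB128
  return Field;  // == SymA - SymB when Field started at zero
}

// SET/SUB style (RISC-V): SET overwrites the field and discards whatever was
// there, so it must be applied before the matching SUB.
uint64_t applySetSub(uint64_t Field, uint64_t SymA, uint64_t SymB) {
  Field = SymA;  // R_RISCV_SET_ULEB128
  Field -= SymB; // R_RISCV_SUB_ULEB128
  return Field;
}
```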

…relocs when sym is not in section

1, Follow RISCV 1df5ea2 to support generates relocs for .uleb128 which
can not be folded. Unlike RISCV, the located content of LoongArch should
be zero. LoongArch fixup uleb128 value by in-place addition and
subtraction reloc types named R_LARCH_{ADD,SUB}_ULEB128. The located
content can affect the result and R_LARCH_ADD_ULEB128 has enough info
to represent the first symbol value, so it needs to be set to zero.
2, Force relocs if sym is not in section so that it can emit relocs
for external symbol.

Fixes: llvm#72960 (comment)
@MaskRay
Member

MaskRay commented Jan 6, 2024

Ah, it's only explaining that zero-padding should be done on LoongArch. But I'm not sure if @MaskRay was asking for the reason behind the difference, i.e. the design considerations behind the decision.

I just followed the psABI in this PR. My idea is that {ADD,SUB}_ULEB128 follows the other ADD and SUB relocation types. There is no order between ADD and SUB, while SET needs to be executed before SUB. However, in the object file the initial value is more helpful for printing debug information; otherwise we need to relocate and fix it up first. For example, resolveLoongArch fixes up ADD and SUB so that the debug information of an object file can be displayed normally. Both approaches have their own advantages, I think.

My recollection from the RISC-V side discussion: SET_ULEB128/SUB_ULEB128 should be used consecutively and other forms (e.g. SET_ULEB128 without a SUB_ULEB128) are not useful.
.uleb128 A (A evaluates to a single symbol) is wasteful when used without SUB_ULEB128 (10 bytes for an unconstrained uint64_t) and there isn't a use case for it.

I wonder whether LoongArch can rename R_LARCH_ADD_ULEB128 to R_LARCH_SET_ULEB128 and define the symbol R_LARCH_ADD_ULEB128 as an alias if compatibility is needed.
If the GNU assembler ensures that .uleb128 A-B creates zero padding, there is no functional difference in switching to the SET semantics.
Assemblers and linkers can enjoy simplification from having R_LARCH_SET_ULEB128 defined the same as R_RISCV_SET_ULEB128.

There is no order between ADD and SUB, while SET needs to be executed before SUB.

This is true, though we haven't found an example where this flexibility helps.
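
For concreteness, a small sketch (my example, not from the thread) of the size math behind the "10 bytes for an unconstrained uint64_t" remark: ULEB128 carries 7 payload bits per byte, so a full 64-bit value needs ceil(64/7) = 10 bytes, which matches the `uint8_t Tmp[10]` buffer in MCAssembler::relaxLEB in the diff above.

```cpp
#include <cstdint>
#include <cstdio>

// Bytes needed to ULEB128-encode a value: 7 payload bits per byte.
static unsigned ulebSize(uint64_t V) {
  unsigned N = 0;
  do {
    V >>= 7;
    ++N;
  } while (V != 0);
  return N;
}

int main() {
  printf("%u\n", ulebSize(0x7f));       // 1
  printf("%u\n", ulebSize(0x80));       // 2
  printf("%u\n", ulebSize(UINT64_MAX)); // 10 == ceil(64/7)
}
```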

@MQ-mengqing
Contributor Author

Assemblers and linkers can enjoy simplification from having R_LARCH_SET_ULEB128 defined the same as R_RISCV_SET_ULEB128.

If LoongArch does not guarantee that ADD and SUB appear in pairs, I think LoongArch can enjoy this simplification by creating only the ADD relocation types. Furthermore, for .byte, .2byte, .4byte, .8byte and .uleb128 we could just create ADD{8,16,32,64,ULEB128} at the assembly stage, with the GNU assembler ensuring that those directives create zero padding.

@SixWeining SixWeining merged commit b57159c into llvm:main Jan 9, 2024
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…relocs when sym is not in section (llvm#76433)

1, Follow RISCV 1df5ea2 to support generates relocs for .uleb128 which
can not be folded. Unlike RISCV, the located content of LoongArch should
be zero. LoongArch fixup uleb128 value by in-place addition and
subtraction reloc types named R_LARCH_{ADD,SUB}_ULEB128. The located
content can affect the result and R_LARCH_ADD_ULEB128 has enough info to
represent the first symbol value, so it needs to be set to zero.
2, Force relocs if sym is not in section so that it can emit relocs for
external symbol.

Fixes:
llvm#72960 (comment)
leecheechen pushed a commit to leecheechen/llvm-project that referenced this pull request Jun 9, 2025
…}_ULEB128 for .uleb128 directives

This patch is originally from three upstream commits:
1, R_LARCH_{ADD,SUB}_ULEB128 are originally landed from b57159c(llvm#76433).
2, R_RISCV_{SET,SUB}_ULEB128 are originally supported from 1df5ea2. Among it, we change
the default behaviour of `-riscv-uleb128-reloc` to not produce uleb128 reloc, in order
to avoid any other side-effects due to the updated implementation of `MCAssembler::relaxLEB()`
function. And at the same time, we ensure that this patch can't introduce new default traits
(such as the generation for uleb128 reloc) on RISCV in this version.
3, Fix invalid-sleb.s in original commit d7398a3.

Change-Id: Ie687b7d8483c76cf647141162641db1a9d819a04