Skip to content

[RISCV] Xqcilb: remove RISCVMCExpr::VK_QC_E_JUMP_PLT and drop @plt parsing #135507

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Apr 12, 2025

Follow-up to the just landed #135044 . Remove @plt parsing (only
needed by legacy call foo@plt). MCParser's @ parsing is problematic.
Supporting target variations like (foo+2@plt foo@plt+2 (foo+2)@plt)
involves messy hacks. We should refrain from adding new @ uses.

Remove unneeded RISCVMCExpr::VK_QC_E_JUMP_PLT (should only be used
when an instruction might have multiple reasonable relocations
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers).


GCC's initial initial RISC-V port made a mistake by having both call foo (non-PIC) and call foo@plt (PIC), likely misled by x86/SystemZ.
It was determined that the @plt was not needed. Since R_RISCV_CALL had
questionable undefined weak semantics in GNU ld (which has been removed
then), we kept R_RISCV_CALL_PLT and deprecated R_RISCV_CALL.

For RISC-V instructions, we only keep @ in call/jump for backward
compatibility and discourage it for all other instructions.

(
There is disagreement about whether PLT in JUMP_PLT is useful or
misleading.
MaskRay's opnion: For new branch relocations with procedure call
semantics, use _CALL and avoid _PLT in the relocation name.
_PLT should only be used in data directives (e.g. R_RISCV_PLT32) to
indicate that the address of a function is not significant.
)

Created using spr 1.3.5-bogner
@llvmbot llvmbot added backend:RISC-V mc Machine (object) code labels Apr 12, 2025
@MaskRay MaskRay requested review from lenary, svs-quic and topperc April 12, 2025 23:08
@llvmbot
Copy link
Member

llvmbot commented Apr 12, 2025

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-risc-v

Author: Fangrui Song (MaskRay)

Changes

Follow-up to the just landed #135044 . Remove unneeded
RISCVMCExpr::VK_QC_E_JUMP_PLT (should only be used when
an instruction might have multiple reasonable relocations
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers).
Remove @<!-- -->plt parsing only needed for legacy call foo@<!-- -->plt.


GCC's initial initial RISC-V port made a mistake by having both call foo (non-PIC) and call foo@<!-- -->plt (PIC), likely misled by x86/SystemZ.
It was determined that the @<!-- -->plt was not needed. Since R_RISCV_CALL had
questionable undefined weak semantics in GNU ld (which has been removed
then), we kept R_RISCV_CALL_PLT and deprecated R_RISCV_CALL.

For new branch relocations, it's recommended to avoid _PLT when the
fixup/relocation contains CALL/JUMP in the name, aligning with most
modern architectures. _PLT should only be used in data directives
(e.g. R_RISCV_PLT32) to indicate that the address of a function is not
significant.

For RISC-V instructions, we only keep @ in call/jump for backward
compatibility and discourage it for all other instructions.


Full diff: https://github.com/llvm/llvm-project/pull/135507.diff

10 Files Affected:

  • (modified) llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp (+2-26)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp (+2-2)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp (+1-1)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h (+1-1)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp (+2-3)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.cpp (+1-4)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.h (-1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td (+1-1)
  • (modified) llvm/test/MC/RISCV/xqcilb-invalid.s (+2-2)
  • (modified) llvm/test/MC/RISCV/xqcilb-relocations.s (+6-16)
diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 55f1a90b2a01a..952587171ffce 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -591,16 +591,7 @@ struct RISCVOperand final : public MCParsedAsmOperand {
            (VK == RISCVMCExpr::VK_CALL || VK == RISCVMCExpr::VK_CALL_PLT);
   }
 
-  bool isPseudoQCJumpSymbol() const {
-    int64_t Imm;
-    // Must be of 'immediate' type but not a constant.
-    if (!isImm() || evaluateConstantImm(getImm(), Imm))
-      return false;
-
-    RISCVMCExpr::Specifier VK = RISCVMCExpr::VK_None;
-    return RISCVAsmParser::classifySymbolRef(getImm(), VK) &&
-           VK == RISCVMCExpr::VK_QC_E_JUMP_PLT;
-  }
+  bool isPseudoQCJumpSymbol() const { return isBareSymbol(); }
 
   bool isPseudoJumpSymbol() const {
     int64_t Imm;
@@ -2157,25 +2148,10 @@ ParseStatus RISCVAsmParser::parsePseudoQCJumpSymbol(OperandVector &Operands) {
 
   std::string Identifier(getTok().getIdentifier());
   SMLoc E = getTok().getEndLoc();
-
-  if (getLexer().peekTok().is(AsmToken::At)) {
-    Lex();
-    Lex();
-    SMLoc PLTLoc = getLoc();
-    StringRef PLT;
-    E = getTok().getEndLoc();
-    if (getParser().parseIdentifier(PLT) || PLT != "plt")
-      return Error(PLTLoc,
-                   "'@plt' is the only valid operand for this instruction");
-  } else {
-    Lex();
-  }
-
-  RISCVMCExpr::Specifier Kind = RISCVMCExpr::VK_QC_E_JUMP_PLT;
+  Lex();
 
   MCSymbol *Sym = getContext().getOrCreateSymbol(Identifier);
   Res = MCSymbolRefExpr::create(Sym, getContext());
-  Res = RISCVMCExpr::create(Res, Kind, getContext());
   Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));
   return ParseStatus::Success;
 }
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
index 49c8c6957aa34..3fa52ddc344e7 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
@@ -95,7 +95,7 @@ RISCVAsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
       {"fixup_riscv_qc_e_branch", 0, 48, MCFixupKindInfo::FKF_IsPCRel},
       {"fixup_riscv_qc_e_32", 16, 32, 0},
       {"fixup_riscv_qc_abs20_u", 12, 20, 0},
-      {"fixup_riscv_qc_e_jump_plt", 0, 48, MCFixupKindInfo::FKF_IsPCRel},
+      {"fixup_riscv_qc_e_jump", 0, 48, MCFixupKindInfo::FKF_IsPCRel},
   };
   static_assert((std::size(Infos)) == RISCV::NumTargetFixupKinds,
                 "Not all fixup kinds added to Infos array");
@@ -576,7 +576,7 @@ static uint64_t adjustFixupValue(const MCFixup &Fixup, uint64_t Value,
     Value = (Bit19 << 31) | (Bit14_0 << 16) | (Bit18_15 << 12);
     return Value;
   }
-  case RISCV::fixup_riscv_qc_e_jump_plt: {
+  case RISCV::fixup_riscv_qc_e_jump: {
     if (!isInt<32>(Value))
       Ctx.reportError(Fixup.getLoc(), "fixup value out of range");
     if (Value & 0x1)
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
index 59dc79b57b456..1b994d120ae74 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
@@ -119,7 +119,7 @@ unsigned RISCVELFObjectWriter::getRelocType(MCContext &Ctx,
       return ELF::R_RISCV_CALL_PLT;
     case RISCV::fixup_riscv_qc_e_branch:
       return ELF::R_RISCV_QC_E_BRANCH;
-    case RISCV::fixup_riscv_qc_e_jump_plt:
+    case RISCV::fixup_riscv_qc_e_jump:
       return ELF::R_RISCV_QC_E_JUMP_PLT;
     }
   }
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
index 596c4eb5fffaa..a7ed21bf76908 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
@@ -85,7 +85,7 @@ enum Fixups {
   // 20-bit fixup for symbol references in the 32-bit qc.li instruction
   fixup_riscv_qc_abs20_u,
   // 32-bit fixup for symbol references in the 48-bit qc.j/qc.jal instructions
-  fixup_riscv_qc_e_jump_plt,
+  fixup_riscv_qc_e_jump,
 
   // Used as a sentinel, must be the last
   fixup_riscv_invalid,
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
index f324907d49fd9..15e405a430d87 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
@@ -685,9 +685,6 @@ uint64_t RISCVMCCodeEmitter::getImmOpValue(const MCInst &MI, unsigned OpNo,
     case RISCVMCExpr::VK_QC_ABS20:
       FixupKind = RISCV::fixup_riscv_qc_abs20_u;
       break;
-    case RISCVMCExpr::VK_QC_E_JUMP_PLT:
-      FixupKind = RISCV::fixup_riscv_qc_e_jump_plt;
-      break;
     }
   } else if (Kind == MCExpr::SymbolRef || Kind == MCExpr::Binary) {
     // FIXME: Sub kind binary exprs have chance of underflow.
@@ -705,6 +702,8 @@ uint64_t RISCVMCCodeEmitter::getImmOpValue(const MCInst &MI, unsigned OpNo,
       FixupKind = RISCV::fixup_riscv_qc_e_branch;
     } else if (MIFrm == RISCVII::InstFormatQC_EAI) {
       FixupKind = RISCV::fixup_riscv_qc_e_32;
+    } else if (MIFrm == RISCVII::InstFormatQC_EJ) {
+      FixupKind = RISCV::fixup_riscv_qc_e_jump;
     }
   }
 
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.cpp
index 99f72620f97ed..d6650e156c8b3 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.cpp
@@ -34,8 +34,7 @@ const RISCVMCExpr *RISCVMCExpr::create(const MCExpr *Expr, Specifier S,
 
 void RISCVMCExpr::printImpl(raw_ostream &OS, const MCAsmInfo *MAI) const {
   Specifier S = getSpecifier();
-  bool HasVariant = ((S != VK_None) && (S != VK_CALL) && (S != VK_CALL_PLT) &&
-                     (S != VK_QC_E_JUMP_PLT));
+  bool HasVariant = ((S != VK_None) && (S != VK_CALL) && (S != VK_CALL_PLT));
 
   if (HasVariant)
     OS << '%' << getSpecifierName(S) << '(';
@@ -168,8 +167,6 @@ StringRef RISCVMCExpr::getSpecifierName(Specifier S) {
     return "pltpcrel";
   case VK_QC_ABS20:
     return "qc.abs20";
-  case VK_QC_E_JUMP_PLT:
-    return "qc_e_jump_plt";
   }
   llvm_unreachable("Invalid ELF symbol kind");
 }
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.h b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.h
index d60879d34dc17..e0aa7ff244521 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.h
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.h
@@ -44,7 +44,6 @@ class RISCVMCExpr : public MCTargetExpr {
     VK_TLSDESC_ADD_LO,
     VK_TLSDESC_CALL,
     VK_QC_ABS20,
-    VK_QC_E_JUMP_PLT
   };
 
 private:
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
index 4ac17c8283866..b899c18785538 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
@@ -158,7 +158,7 @@ def PseudoQCJumpSymbol : AsmOperandClass {
   let RenderMethod = "addImmOperands";
   let DiagnosticType = "InvalidPseudoQCJumpSymbol";
   let DiagnosticString = "operand must be a valid jump target";
-  let ParserMethod = "parsePseudoQCJumpSymbol";
+  let ParserMethod = "parseBareSymbol";
 }
 
 def pseudo_qc_jump_symbol : Operand<XLenVT> {
diff --git a/llvm/test/MC/RISCV/xqcilb-invalid.s b/llvm/test/MC/RISCV/xqcilb-invalid.s
index 10d456c8ac0aa..652c3f6163430 100644
--- a/llvm/test/MC/RISCV/xqcilb-invalid.s
+++ b/llvm/test/MC/RISCV/xqcilb-invalid.s
@@ -23,5 +23,5 @@ qc.e.jal 2147483649
 # CHECK-MINUS: :[[@LINE+1]]:1: error: instruction requires the following: 'Xqcilb' (Qualcomm uC Long Branch Extension)
 qc.e.jal 2147483640
 
-# CHECK: :[[@LINE+1]]:12: error: '@plt' is the only valid operand for this instruction
-qc.e.j foo@rlt
+# CHECK: :[[@LINE+1]]:11: error: unexpected token
+qc.e.j foo@plt
diff --git a/llvm/test/MC/RISCV/xqcilb-relocations.s b/llvm/test/MC/RISCV/xqcilb-relocations.s
index a475cde3f6bfd..3e6410615156a 100644
--- a/llvm/test/MC/RISCV/xqcilb-relocations.s
+++ b/llvm/test/MC/RISCV/xqcilb-relocations.s
@@ -13,42 +13,32 @@
 qc.e.j foo
 # RELOC: R_RISCV_CUSTOM195 foo 0x0
 # INSTR: qc.e.j foo
-# FIXUP: fixup A - offset: 0, value: foo, kind: fixup_riscv_qc_e_jump_plt
-
-qc.e.j foo@plt
-# RELOC: R_RISCV_CUSTOM195 foo 0x0
-# INSTR: qc.e.j foo
-# FIXUP: fixup A - offset: 0, value: foo, kind: fixup_riscv_qc_e_jump_plt
-
-qc.e.jal foo@plt
-# RELOC: R_RISCV_CUSTOM195 foo 0x0
-# INSTR: qc.e.jal foo
-# FIXUP: fixup A - offset: 0, value: foo, kind: fixup_riscv_qc_e_jump_plt
+# FIXUP: fixup A - offset: 0, value: foo, kind: fixup_riscv_qc_e_jump
 
 qc.e.jal foo
 # RELOC: R_RISCV_CUSTOM195 foo 0x0
 # INSTR: qc.e.jal foo
-# FIXUP: fixup A - offset: 0, value: foo, kind: fixup_riscv_qc_e_jump_plt
+# FIXUP: fixup A - offset: 0, value: foo, kind: fixup_riscv_qc_e_jump
 
 # Check that a label in a different section is handled similar to an undefined symbol
 qc.e.j .bar
 # RELOC: R_RISCV_CUSTOM195 .bar 0x0
 # INSTR: qc.e.j .bar
-# FIXUP: fixup A - offset: 0, value: .bar, kind: fixup_riscv_qc_e_jump_plt
+# FIXUP: fixup A - offset: 0, value: .bar, kind: fixup_riscv_qc_e_jump
 
 qc.e.jal .bar
 # RELOC: R_RISCV_CUSTOM195 .bar 0x0
 # INSTR: qc.e.jal .bar
-# FIXUP: fixup A - offset: 0, value: .bar, kind: fixup_riscv_qc_e_jump_plt
+# FIXUP: fixup A - offset: 0, value: .bar, kind: fixup_riscv_qc_e_jump
 
 # Check that jumps to a defined symbol are handled correctly
 qc.e.j .L1
 # INSTR:qc.e.j .L1
-# FIXUP: fixup A - offset: 0, value: .L1, kind: fixup_riscv_qc_e_jump_plt
+# FIXUP: fixup A - offset: 0, value: .L1, kind: fixup_riscv_qc_e_jump
 
 qc.e.jal .L1
 # INSTR:qc.e.jal .L1
-# FIXUP: fixup A - offset: 0, value: .L1, kind: fixup_riscv_qc_e_jump_plt
+# FIXUP: fixup A - offset: 0, value: .L1, kind: fixup_riscv_qc_e_jump
 
 .L1:
   ret

@lenary
Copy link
Member

lenary commented Apr 13, 2025

I've seen this, I find it valuable feedback, but want a bit more time to process the info in it.

@lenary
Copy link
Member

lenary commented Apr 13, 2025

Thanks for bearing with me, I was travelling and ABIs and Jet Lag are not a great mix.

I sort of see where you're coming from here, but I do actually want to push back a little.

(As with the ABI, I'm dropping the R_RISCV_ prefix when referring to relocations in this comment because it is annoying to type).

Originally, there were two relocations for call sequences, so one of them needed a specifier, which was chosen to be @plt for the CALL_PLT relocation. This is the last point the situation is straightforward.

Then you pointed out the issues with the CALL relocation, so it was deprecated, and we then ended up with two syntaxes to get the CALL_PLT relocation: the one without a specifier, and the one with the (originally needed, now redundant) @plt specifier.

I would argue that in this circumstance, the @plt specifier still does have some value! Writing call foo@plt means a reader will not expect this instruction to jump directly to foo, they should expect some linker generated code to "get in the way" (unless the linker can optimise it out).

I aimed with qc.e.jal and qc.e.j to avoid falling into the same trap around relocations, so I have only designed a _PLT relocation. By your argument, only one relocation applying to the instruction means no specifier is needed. However, I do think that the ability for users to write qc.e.jal foo@plt has some value, so they can remind themselves when reading their code that they shouldn't expect a pc-relative jump directly to foo, instead there is something a little more complex happening (unless the linker can optimise it out).

I know that in Arm/AArch64 land, the ABI has to be fairly clear about which relocations are allowed to use a (range extension) veneer/thunk, and which aren't, because they clobber some caller-saved registers. We don't have range-extension veneers/thunks in the same way for RISC-V, but right now I think that CALL_PLT is the only standard relocation that has "permission" to use a PLT entry, which in my mind is similar enough to a veneer/thunk because it clobbers caller-save registers. Using the _PLT suffix on the name of the QC_E_JUMP_PLT relocation was an indicator of the fact that rule also applied to these relocations. I don't think that the linker is allowed to introduce a PLT entry for e.g. a JAL relocation, because that might be applied to a jump not at a function-call boundary, when the registers clobbered by a PLT entry are still live (Linker relaxation makes this a bit more complex, because JAL relocation could point at the address of a PLT, but only after relaxation when the before-relaxation sequence was relocated with CALL_PLT, which is the thing that had the permission to introduce the PLT entry).

I don't really mind what the fixup is called, though I'd like it to echo the relocation name, as I think that makes maintenance/understanding much easier.

@MaskRay
Copy link
Member Author

MaskRay commented Apr 13, 2025

Thanks for bearing with me, I was travelling and ABIs and Jet Lag are not a great mix.

I sort of see where you're coming from here, but I do actually want to push back a little.

(As with the ABI, I'm dropping the R_RISCV_ prefix when referring to relocations in this comment because it is annoying to type).

Originally, there were two relocations for call sequences, so one of them needed a specifier, which was chosen to be @plt for the CALL_PLT relocation. This is the last point the situation is straightforward.

Then you pointed out the issues with the CALL relocation, so it was deprecated, and we then ended up with two syntaxes to get the CALL_PLT relocation: the one without a specifier, and the one with the (originally needed, now redundant) @plt specifier.

I would argue that in this circumstance, the @plt specifier still does have some value! Writing call foo@plt means a reader will not expect this instruction to jump directly to foo, they should expect some linker generated code to "get in the way" (unless the linker can optimise it out).

No... The @plt specifier is misleading; a PLT is not assured. The symbol may be hidden or defined or -Bsymbolic, and this violates RISC-V's %specifier(expr) convention.

MCParser's @ parsing is problematic. Supporting target variations like (foo+2@plt foo@plt+2 `(foo+2)@plt) involves messy hacks.

I aimed with qc.e.jal and qc.e.j to avoid falling into the same trap around relocations, so I have only designed a _PLT relocation. By your argument, only one relocation applying to the instruction means no specifier is needed. However, I do think that the ability for users to write qc.e.jal foo@plt has some value, so they can remind themselves when reading their code that they shouldn't expect a pc-relative jump directly to foo, instead there is something a little more complex happening (unless the linker can optimise it out).

I know that in Arm/AArch64 land, the ABI has to be fairly clear about which relocations are allowed to use a (range extension) veneer/thunk, and which aren't, because they clobber some caller-saved registers. We don't have range-extension veneers/thunks in the same way for RISC-V, but right now I think that CALL_PLT is the only standard relocation that has "permission" to use a PLT entry, which in my mind is similar enough to a veneer/thunk because it clobbers caller-save registers. Using the _PLT suffix on the name of the QC_E_JUMP_PLT relocation was an indicator of the fact that rule also applied to these relocations. I don't think that the linker is allowed to introduce a PLT entry for e.g. a JAL relocation, because that might be applied to a jump not at a function-call boundary, when the registers clobbered by a PLT entry are still live (Linker relaxation makes this a bit more complex, because JAL relocation could point at the address of a PLT, but only after relaxation when the before-relaxation sequence was relocated with CALL_PLT, which is the thing that had the permission to introduce the PLT entry).

I don't really mind what the fixup is called, though I'd like it to echo the relocation name, as I think that makes maintenance/understanding much easier.

I get your point, but two syntaxes for a single relocation are misleading.

The only public doc I can find is https://github.com/quic/riscv-elf-psabi-quic-extensions/blob/main/src/quic-extensions.adoc , which specifies that this instruction is for procedure calls.
@plt's value is minimal compared to the confusion and deviation from modern ABI procedure call conventions (no relocation specifier).

For instance, PPC32 used three redundant relocation types, R_PPC_LOCAL24PC, R_PPC_REL24, and R_PPC_PLTREL24, all with identical semantics.
Recognizing the issue, PPC64 unified them into a single type, R_PPC64_REL24.

That said, if you believe that qc.e.jal might be extended for non-procedure-call purposes and a specifier is important as a new one might be introduced,
I suggest that changing @plt to something like %plt(expr) and dropping the support for a bare symbol without a relocation specifier.

@svs-quic
Copy link
Contributor

If we are to go ahead with this change, you might want to rename the the relocation in llvm/include/llvm/BinaryFormat/ELFRelocs/RISCV_nonstandard.def as well.

@lenary
Copy link
Member

lenary commented Apr 14, 2025

It looks like we're going to keep disagreeing about whether the @plt is useful or misleading, but if there are weird parsing artefacts when there's an offset, then I'm happy to drop support for it for Xqcilb.

I will update my docs to reflect that the @plt is not needed.

I think the fixup should not be renamed. It is aligned with the relocation name, and I think the correspondence is useful and valid. I haven't seen a clear reason to rename the relocation from this discussion, and continue to find the fact the relocation ends in _PLT to be useful.

I think that we no longer need PseudoQCJumpSymbol, pseudo_qc_jump_symbol, PseudoQC_E_J, and PseudoQC_E_JAL, all of which were just for the custom symbol parsing. If we accept a bare symbol, I think we just update simm32_lsb0 to instead be bare_simm32_lsb0 in QCIRVInst48EJ, and have it accept bare symbols.

@MaskRay MaskRay changed the title [RISCV] Xqcilb: rename JUMP_PLT fixup and drop @plt parsing [RISCV] Xqcilb: remove RISCVMCExpr::VK_QC_E_JUMP_PLT and drop @plt parsing Apr 15, 2025
@MaskRay
Copy link
Member Author

MaskRay commented Apr 15, 2025

It looks like we're going to keep disagreeing about whether the @plt is useful or misleading, but if there are weird parsing artefacts when there's an offset, then I'm happy to drop support for it for Xqcilb.

Ah, that's just how it goes. My primary concern is the overuse of @ than the actual relocation type name...
I've updated the description with the best available information, noting our discrepancies.

I will update my docs to reflect that the @plt is not needed.

Thanks! Is there a doc about the instruction encoding? Searching RISC-V Xqcilb online doesn't give me much information. The most relevant one is https://github.com/quic/riscv-elf-psabi-quic-extensions , the psABI extension, not an ISA manual.

I think the fixup should not be renamed. It is aligned with the relocation name, and I think the correspondence is useful and valid. I haven't seen a clear reason to rename the relocation from this discussion, and continue to find the fact the relocation ends in _PLT to be useful.

Restored the fixup name.

Personally I'd lean towards _CALL instead of _JUMP for a relocation with procedure call semantics.
But it's your extension, so ultimately, it's your call:)

(The infix "JUMP" does not convey clear meaning in AArch32: some short-range R_ARM_THM_JUMP* relocations lack range extension thunks/veneers, while R_ARM_THM_JUMP24 does have them.)

I think that we no longer need PseudoQCJumpSymbol, pseudo_qc_jump_symbol, PseudoQC_E_J, and PseudoQC_E_JAL, all of which were just for the custom symbol parsing. If we accept a bare symbol, I think we just update simm32_lsb0 to instead be bare_simm32_lsb0 in QCIRVInst48EJ, and have it accept bare symbols.

Thanks for the guide! It's been helpful in getting me reacquainted with those TableGen AsmParser methods. I think I understand it now.

@lenary
Copy link
Member

lenary commented Apr 15, 2025

The encodings are in the Xqci releases here (there's a pdf uploaded to each release, but this repo contains all Xqc* architecture descriptions): https://github.com/quic/riscv-unified-db/releases

Copy link
Member

@lenary lenary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I am considering renaming the relocation but I will do it in a follow-up if I do because it needs coordinating over this and the ELD repo.

@MaskRay MaskRay merged commit c8121b9 into main Apr 15, 2025
10 of 11 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/riscv-xqcilb-rename-jump_plt-fixup-and-drop-plt-parsing branch April 15, 2025 16:10
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 15, 2025
…rop `@plt` parsing

Follow-up to the just landed #135044 . Remove `@plt` parsing (only
needed by legacy `call foo@plt`). MCParser's `@` parsing is problematic.
Supporting target variations like (`foo+2@plt foo@plt+2 (foo+2)@plt`)
involves messy hacks. We should refrain from adding new `@` uses.

Remove unneeded `RISCVMCExpr::VK_QC_E_JUMP_PLT` (should only be used
when an instruction might have multiple reasonable relocations
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers).

---

GCC's initial initial RISC-V port made a mistake by having both `call
foo` (non-PIC) and `call foo@plt` (PIC), likely misled by x86/SystemZ.
It was determined that the `@plt` was not needed. Since R_RISCV_CALL had
questionable undefined weak semantics in GNU ld (which has been removed
then), we kept R_RISCV_CALL_PLT and deprecated R_RISCV_CALL.

For RISC-V instructions, we only keep `@` in call/jump for backward
compatibility and discourage it for all other instructions.

(
There is disagreement about whether `PLT` in `JUMP_PLT` is useful or
misleading.
MaskRay's opnion: For new branch relocations with procedure call
semantics, use `_CALL` and avoid `_PLT` in the relocation name.
`_PLT` should only be used in data directives (e.g. R_RISCV_PLT32) to
indicate that the address of a function is not significant.
)

Pull Request: llvm/llvm-project#135507
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…parsing

Follow-up to the just landed llvm#135044 . Remove `@plt` parsing (only
needed by legacy `call foo@plt`). MCParser's `@` parsing is problematic.
Supporting target variations like (`foo+2@plt foo@plt+2 (foo+2)@plt`)
involves messy hacks. We should refrain from adding new `@` uses.

Remove unneeded `RISCVMCExpr::VK_QC_E_JUMP_PLT` (should only be used
when an instruction might have multiple reasonable relocations
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers).

---

GCC's initial initial RISC-V port made a mistake by having both `call
foo` (non-PIC) and `call foo@plt` (PIC), likely misled by x86/SystemZ.
It was determined that the `@plt` was not needed. Since R_RISCV_CALL had
questionable undefined weak semantics in GNU ld (which has been removed
then), we kept R_RISCV_CALL_PLT and deprecated R_RISCV_CALL.

For RISC-V instructions, we only keep `@` in call/jump for backward
compatibility and discourage it for all other instructions.

(
There is disagreement about whether `PLT` in `JUMP_PLT` is useful or
misleading.
MaskRay's opnion: For new branch relocations with procedure call
semantics, use `_CALL` and avoid `_PLT` in the relocation name.
`_PLT` should only be used in data directives (e.g. R_RISCV_PLT32) to
indicate that the address of a function is not significant.
)

Pull Request: llvm#135507
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants