Skip to content

[LLD][ELF][ARM] Fix resolution of R_ARM_THM_JUMP8 and R_ARM_THM_JUMP11 for big endian #126933

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 17, 2025

Conversation

vhscampos
Copy link
Member

@vhscampos vhscampos commented Feb 12, 2025

These relocations apply to 16-bit Thumb instructions, so reading 16 bits rather than 32 bits ensures the correct bits are masked and written back. This fixes the incorrect masking and aligns the relocation logic with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not align with the instruction size of 16 bits, but the masking incidentally made it all work nonetheless. However, this was the case only in little endian.

In big endian mode, the read 32-bit word had to have its bytes reversed. With this byte reordering, the masking would be applied to the wrong bits, hence causing the incorrect encoding to be produced as a result of the relocation resolution.

The added test checks the result for both little and big endian modes.

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
For the sake of deduplication of test files, the single test file makes
use of the preprocessor and `split-file` to fit all the required logic
into it.
@llvmbot
Copy link
Member

llvmbot commented Feb 12, 2025

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: Victor Campos (vhscampos)

Changes

These relocations apply to 16-bit Thumb instructions, so reading 16 bits rather than 32 bits ensures the correct bits are masked and written back. This fixes the incorrect masking and aligns the relocation logic with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not align with the instruction size of 16 bits, but the masking incidentally made it all work nonetheless. However, this was the case only in little endian.

In big endian mode, the read 32-bit word had to have its bytes reversed. With this byte reordering, the masking would be applied to the wrong bits, hence causing the incorrect encoding to be produced as a result of the relocation resolution.

The added test checks the result for both little and big endian modes. For the sake of deduplication of test files, the single test file makes use of the preprocessor and split-file to fit all the required logic into it.


Full diff: https://github.com/llvm/llvm-project/pull/126933.diff

2 Files Affected:

  • (modified) lld/ELF/Arch/ARM.cpp (+2-2)
  • (added) lld/test/ELF/arm-thumb-jump8-11.test (+108)
diff --git a/lld/ELF/Arch/ARM.cpp b/lld/ELF/Arch/ARM.cpp
index 7d2953ddf64f0..e667fdc0633c5 100644
--- a/lld/ELF/Arch/ARM.cpp
+++ b/lld/ELF/Arch/ARM.cpp
@@ -663,12 +663,12 @@ void ARM::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
   case R_ARM_THM_JUMP8:
     // We do a 9 bit check because val is right-shifted by 1 bit.
     checkInt(ctx, loc, val, 9, rel);
-    write16(ctx, loc, (read32(ctx, loc) & 0xff00) | ((val >> 1) & 0x00ff));
+    write16(ctx, loc, (read16(ctx, loc) & 0xff00) | ((val >> 1) & 0x00ff));
     break;
   case R_ARM_THM_JUMP11:
     // We do a 12 bit check because val is right-shifted by 1 bit.
     checkInt(ctx, loc, val, 12, rel);
-    write16(ctx, loc, (read32(ctx, loc) & 0xf800) | ((val >> 1) & 0x07ff));
+    write16(ctx, loc, (read16(ctx, loc) & 0xf800) | ((val >> 1) & 0x07ff));
     break;
   case R_ARM_THM_JUMP19:
     // Encoding T3: Val = S:J2:J1:imm6:imm11:0
diff --git a/lld/test/ELF/arm-thumb-jump8-11.test b/lld/test/ELF/arm-thumb-jump8-11.test
new file mode 100644
index 0000000000000..0c9ef689be24d
--- /dev/null
+++ b/lld/test/ELF/arm-thumb-jump8-11.test
@@ -0,0 +1,108 @@
+# RUN: split-file %s %t
+# RUN: clang -x c -E -P -DLITTLEENDIAN %t/a.yaml -o %t/a.yaml
+# RUN: clang -x c -E -P -DLITTLEENDIAN %t/b.yaml -o %t/b.yaml
+# RUN: yaml2obj %t/a.yaml -o %t/a.o
+# RUN: yaml2obj %t/b.yaml -o %t/b.o
+# RUN: ld.lld %t/a.o %t/b.o -o %t/linked.o
+# RUN: llvm-objdump -d %t/linked.o | FileCheck %s --check-prefixes=CHECK,CHECK-LE
+
+# RUN: split-file %s %t
+# RUN: clang -x c -E -P -DBIGENDIAN %t/a.yaml -o %t/a.yaml
+# RUN: clang -x c -E -P -DBIGENDIAN %t/b.yaml -o %t/b.yaml
+# RUN: yaml2obj %t/a.yaml -o %t/a.o
+# RUN: yaml2obj %t/b.yaml -o %t/b.o
+# RUN: ld.lld %t/a.o %t/b.o -o %t/linked.o
+# RUN: llvm-objdump -d %t/linked.o | FileCheck %s --check-prefixes=CHECK,CHECK-BE
+
+# CHECK-LE:    linked.o:	file format elf32-littlearm
+# CHECK-BE:    linked.o:	file format elf32-bigarm
+# CHECK-EMPTY:
+# CHECK-NEXT:  Disassembly of section .text:
+# CHECK-EMPTY:
+# CHECK-NEXT:  [[#%x,_START:]] <_start>:
+# CHECK-NEXT:      [[#_START]]: e000         	b	0x[[FOO:[0-9a-f]+]] <foo>      @ imm = #0x0
+# CHECK-NEXT:    [[#_START+2]]: d0ff         	beq	0x[[FOO]] <foo>         @ imm = #-0x2
+# CHECK-EMPTY:
+# CHECK-NEXT:    [[#_START+4]] <foo>:
+# CHECK-NEXT:    [[#_START+4]]: 4770         	bx	lr
+
+#--- a.yaml
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS32
+#if defined BIGENDIAN
+  Data:            ELFDATA2MSB
+#elif defined LITTLEENDIAN
+  Data:            ELFDATA2LSB
+#endif
+  Type:            ET_REL
+  Machine:         EM_ARM
+Sections:
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+#if defined BIGENDIAN
+    Content:         E7FED0FE
+#elif defined LITTLEENDIAN
+    Content:         FEE7FED0
+#endif
+  - Name:            .rel.text
+    Type:            SHT_REL
+    Info:            .text
+    Relocations:
+      - Symbol:          foo
+        Type:            R_ARM_THM_JUMP11
+      - Offset:          0x2
+        Symbol:          foo
+        Type:            R_ARM_THM_JUMP8
+Symbols:
+  - Name:            .text
+    Type:            STT_SECTION
+    Section:         .text
+  - Name:            '$t'
+    Section:         .text
+  - Name:            _start
+    Type:            STT_FUNC
+    Section:         .text
+    Binding:         STB_GLOBAL
+    Value:           0x1
+    Size:            0x4
+  - Name:            foo
+    Binding:         STB_GLOBAL
+...
+
+#--- b.yaml
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS32
+#if defined BIGENDIAN
+  Data:            ELFDATA2MSB
+#elif defined LITTLEENDIAN
+  Data:            ELFDATA2LSB
+#endif
+  Type:            ET_REL
+  Machine:         EM_ARM
+Sections:
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+#ifdef BIGENDIAN
+    Content:         '4770'
+#elif defined LITTLEENDIAN
+    Content:         '7047'
+#endif
+Symbols:
+  - Name:            .text
+    Type:            STT_SECTION
+    Section:         .text
+  - Name:            '$t'
+    Section:         .text
+  - Name:            foo
+    Type:            STT_FUNC
+    Section:         .text
+    Binding:         STB_GLOBAL
+    Value:           0x1
+    Size:            0x2
+...

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look good to me.

If it is possible to use assembler in the test, it will be worth doing so as they are usually easier to read.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending addition of REQUIRES: arm in the test, this looks good to me. Please leave some time before merging to see if MaskRay has any additional comments.

Copy link
Member

@lenary lenary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vhscampos vhscampos requested a review from MaskRay February 13, 2025 17:21
@@ -0,0 +1,32 @@
# REQUIRES: arm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have arm-thumb-narrow-branch-check.s .

Another candidate name is arm-thumb-narrow-branch.s

@MaskRay
Copy link
Member

MaskRay commented Feb 13, 2025

Since this is correct for little-endian, perhaps rename the title to "Fix R_ARM_THM_JUMP8 and R_ARM_THM_JUMP11 for big-endian".

@vhscampos vhscampos changed the title [LLD][ELF][ARM] Fix resolution of R_ARM_THM_JUMP8 and R_ARM_THM_JUMP11 [LLD][ELF][ARM] Fix resolution of R_ARM_THM_JUMP8 and R_ARM_THM_JUMP11 for big endian Feb 17, 2025
@vhscampos vhscampos merged commit 501c77d into llvm:main Feb 17, 2025
8 checks passed
@vhscampos vhscampos deleted the thumb1-relocation-bitwidth branch February 17, 2025 10:10
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
simpal01 added a commit to simpal01/arm-toolchain that referenced this pull request Mar 4, 2025
We need to incorporate the following upstreamed patch into the 20.x
branch. So applying this as patch file.

[LLD][ELF][ARM] Fix resolution of R_ARM_THM_JUMP8 and R_ARM_THM_JUMP11
for big endian - cherry picked from llvm/llvm-project#126933
simpal01 added a commit to arm/arm-toolchain that referenced this pull request Mar 4, 2025
)

We need to incorporate the following upstreamed patch into the 20.x
branch. So applying this as patch file.

[LLD][ELF][ARM] Fix resolution of R_ARM_THM_JUMP8 and R_ARM_THM_JUMP11
for big endian - cherry picked from
llvm/llvm-project#126933
SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request Mar 25, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request Apr 2, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request Apr 17, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request Apr 30, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request May 15, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request May 29, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request Jun 13, 2025
…1 for big endian (llvm#126933)

These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.

Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in little
endian.

In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.

The added test checks the result for both little and big endian modes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants