Skip to content

[llvm-objdump] --adjust-vma: Call getInstruction with adjusted address #140471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented May 18, 2025

llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).

Specify an adjust address to fix SystemZInstPrinter output.

The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.

Created using spr 1.3.5-bogner
@llvmbot
Copy link
Member

llvmbot commented May 18, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: Fangrui Song (MaskRay)

Changes

llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).

Specify an adjust address to fix SystemZInstPrinter output.

The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.


Full diff: https://github.com/llvm/llvm-project/pull/140471.diff

2 Files Affected:

  • (added) llvm/test/tools/llvm-objdump/ELF/SystemZ/adjust-vma.test (+96)
  • (modified) llvm/tools/llvm-objdump/llvm-objdump.cpp (+1-1)
diff --git a/llvm/test/tools/llvm-objdump/ELF/SystemZ/adjust-vma.test b/llvm/test/tools/llvm-objdump/ELF/SystemZ/adjust-vma.test
new file mode 100644
index 0000000000000..9cc93fb1af8d8
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/ELF/SystemZ/adjust-vma.test
@@ -0,0 +1,96 @@
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: yaml2obj a.yaml -o out
+# RUN: llvm-objdump -td --adjust-vma=0x200000 --no-show-raw-insn out | FileCheck %s --match-full-lines
+
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 0000000001200104 l       .text  0000000000000000 f1
+# CHECK-NEXT: 0000000001200106 l       .text  0000000000000000 f2
+# CHECK-NEXT: 0000000000000800 l       *ABS*  0000000000000000 abs
+
+# CHECK:      00000000012000f8 <_start>:
+# CHECK-NEXT:  12000f8: brasl   %r14, 0x1200104
+# CHECK-NEXT:  12000fe: brasl   %r14, 0x1200106
+# CHECK-EMPTY:
+# CHECK-NEXT: 0000000001200104 <f1>:
+# CHECK-NEXT:  1200104: br      %r14
+# CHECK-EMPTY:
+# CHECK-NEXT: 0000000001200106 <f2>:
+# CHECK-NEXT:  1200106: br      %r14
+
+#--- a.s
+.globl _start
+_start:
+  brasl %r14, f1
+  brasl %r14, f2
+
+f1:
+  br %r14
+f2:
+  br %r14
+
+abs = 0x800
+#--- gen
+LLD_IN_TEST=1 clang --target=s390x-linux -no-pie -nostdlib -Wl,--no-rosegment,-zseparate-code,-znorelro,-znognustack -fuse-ld=lld a.s -o a
+obj2yaml a
+#--- a.yaml
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2MSB
+  Type:            ET_EXEC
+  Machine:         EM_S390
+  Entry:           0x10000F8
+ProgramHeaders:
+  - Type:            PT_PHDR
+    Flags:           [ PF_R ]
+    VAddr:           0x1000040
+    Align:           0x8
+    Offset:          0x40
+  - Type:            PT_INTERP
+    Flags:           [ PF_R ]
+    FirstSec:        .interp
+    LastSec:         .interp
+    VAddr:           0x10000E8
+    Offset:          0xE8
+  - Type:            PT_LOAD
+    Flags:           [ PF_X, PF_R ]
+    FirstSec:        .interp
+    LastSec:         .text
+    VAddr:           0x1000000
+    Align:           0x1000
+    Offset:          0x0
+Sections:
+  - Name:            .interp
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x10000E8
+    AddressAlign:    0x1
+    Content:         2F6C69622F6C6436342E736F2E3100
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x10000F8
+    AddressAlign:    0x4
+    Content:         C0E500000006C0E50000000407FE07FE
+  - Name:            .comment
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_MERGE, SHF_STRINGS ]
+    AddressAlign:    0x1
+    EntSize:         0x1
+    Offset:          0x1000
+    Content:         4C696E6B65723A204C4C442032312E302E3000
+Symbols:
+  - Name:            f1
+    Section:         .text
+    Value:           0x1000104
+  - Name:            f2
+    Section:         .text
+    Value:           0x1000106
+  - Name:            abs
+    Index:           SHN_ABS
+    Value:           0x800
+  - Name:            _start
+    Section:         .text
+    Binding:         STB_GLOBAL
+    Value:           0x10000F8
+...
diff --git a/llvm/tools/llvm-objdump/llvm-objdump.cpp b/llvm/tools/llvm-objdump/llvm-objdump.cpp
index 7a778da2d3a49..99bb41f5b95a7 100644
--- a/llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ b/llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -2323,7 +2323,7 @@ disassembleObject(ObjectFile &Obj, const ObjectFile &DbgObj,
           // provided
           MCInst Inst;
           ArrayRef<uint8_t> ThisBytes = Bytes.slice(Index);
-          uint64_t ThisAddr = SectionAddr + Index;
+          uint64_t ThisAddr = SectionAddr + Index + VMAAdjustment;
           bool Disassembled = DT->DisAsm->getInstruction(
               Inst, Size, ThisBytes, ThisAddr, CommentStream);
           if (Size == 0)

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me, but needs a SystemZ person to sign off.

@uweigand
Copy link
Member

Hi @MaskRay , I'm also confused why SystemZ is supposedly doing anything special here. The only use of the Address argument to getInstruction in the SystemZ implementation is to pass it on to MCDecoder::tryAddingSymbolicOperand, which looks to be exactly the same that most other targets do as well.

Also, there seem to be several other calls to getInstruction in llvm-objdump.cpp - don't they also need to pass the adjusted address`

Created using spr 1.3.5-bogner
@MaskRay
Copy link
Member Author

MaskRay commented May 19, 2025

Hi @MaskRay , I'm also confused why SystemZ is supposedly doing anything special here. The only use of the Address argument to getInstruction in the SystemZ implementation is to pass it on to MCDecoder::tryAddingSymbolicOperand, which looks to be exactly the same that most other targets do as well.

Branch target addresses are dependent on the current instruction address. It seems that we have two mechanisms.
At this point perhaps most targets use InstPrinter, e.g. AArch64InstPrinter::printAlignedLabel(const MCInst *MI, uint64_t Address, ...
I have added the TableGen support and migrates some targets around https://reviews.llvm.org/D77853

Actually an older interface for Disassembler had existed before my changes, used by SystemZ.
It seems that most targets have ignored uint64_t Address in their Disassembler.cpp files,
llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp

I think handling the branch target display in InstPrinter makes more sense, as a decoded instruction
ideally should not change its internal representation due to the getInstruction-time address.
Perhaps the Disassembler uint64_t Address argument can be removed from most functions.


Also, there seem to be several other calls to getInstruction in llvm-objdump.cpp - don't they also need to pass the adjusted address`

Handling --adjust-vma= with another option is like a whac-a-mole game.... I think we will just make change when we see a new issue. One call is for --symbolize-operands, where the address doesn't seem to matter. Another call is used by AMDGPU, which I do not yet to touch in this PR.

In the future we should probably just adjust the variable SectionAddr.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but should wait for a SystemZ approval too.

@uweigand
Copy link
Member

Branch target addresses are dependent on the current instruction address. It seems that we have two mechanisms.
At this point perhaps most targets use InstPrinter, e.g. AArch64InstPrinter::printAlignedLabel(const MCInst *MI, uint64_t Address, ...

It is indeed true that SystemZ does not yet support PrintBranchImmAsAddress, and we certainly should do so. However, I don't see how this replaces the tryAddSymbolicOperand method in getInstruction - rather, this seems to be an orthogonal change.

The way I see it is that there are two considerations when printing a branch target label:

  • can we find a symbolic representation (symbol name + offset) for the target
  • if not, should we display an absolute address or a displacement

The first is done via tryAddSymbolicOperand in the disassembler, and the second is done by the InstPrinter. Both will need the correct instruction address, adjusted if necessary.

Either way, I think this PR is certainly correct. I will work on supporting PrintBranchImmAsAddress in SystemZ.

@MaskRay MaskRay merged commit 95e4db8 into main May 20, 2025
11 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/llvm-objdump-adjust-vma-call-getinstruction-with-adjusted-address branch May 20, 2025 15:54
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 20, 2025
…sted address

llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).

Specify an adjust address to fix SystemZInstPrinter output.

The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.

Pull Request: llvm/llvm-project#140471
@Kewen12
Copy link
Contributor

Kewen12 commented May 20, 2025

Hi there, this PR failed test and breaks our buildbots. Could you please take a look? Thanks!

Failed test: llvm/test/tools/llvm-objdump/ELF/SystemZ/adjust-vma.test:12:10: error: CHECK: expected string not found in input

buildbot: https://lab.llvm.org/buildbot/#/builders/10/builds/5709

MaskRay added a commit that referenced this pull request May 20, 2025
@MaskRay
Copy link
Member Author

MaskRay commented May 20, 2025

Hi there, this PR failed test and breaks our buildbots. Could you please take a look? Thanks!

Failed test: llvm/test/tools/llvm-objdump/ELF/SystemZ/adjust-vma.test:12:10: error: CHECK: expected string not found in input

buildbot: lab.llvm.org/buildbot#/builders/10/builds/5709

Sorry. I forgot to add /llvm/test/tools/llvm-objdump/ELF/SystemZ/lit.local.cfg. Added in a1e314d

@Kewen12
Copy link
Contributor

Kewen12 commented May 20, 2025

Hi there, this PR failed test and breaks our buildbots. Could you please take a look? Thanks!

Failed test: llvm/test/tools/llvm-objdump/ELF/SystemZ/adjust-vma.test:12:10: error: CHECK: expected string not found in input

buildbot: lab.llvm.org/buildbot#/builders/10/builds/5709

Sorry. I forgot to add /llvm/test/tools/llvm-objdump/ELF/SystemZ/lit.local.cfg. Added in a1e314d

Thanks for the quick fix!

kostasalv pushed a commit to kostasalv/llvm-project that referenced this pull request May 21, 2025
llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).

Specify an adjust address to fix SystemZInstPrinter output.

The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.

Pull Request: llvm#140471
kostasalv pushed a commit to kostasalv/llvm-project that referenced this pull request May 21, 2025
uweigand added a commit to uweigand/llvm-project that referenced this pull request May 22, 2025
As noticed in llvm#140471,
the SystemZ target currently implements disassembly of PC-relative
target addresses differently from other back-ends.  This patch
brings SystemZ in line with other targets.

Specifically, this patch changes the relevant MCInst instructions
to carry a PC-relative displacement instead of an absolute target
address in their immediate fields.  When printing the instruction,
this displacement will either be shown as is (e.g. for llvm-mc),
or else translated into an absolute address at print time (e.g.
for llvm-objdump).

The existing llvm-mc based tests using PC-relative operands no
longer work and have to be rewritten, but printing displacements
makes those tests easier to maintain anyway.
@uweigand
Copy link
Member

It is indeed true that SystemZ does not yet support PrintBranchImmAsAddress, and we certainly should do so.

This is now #141064

uweigand added a commit that referenced this pull request May 22, 2025
As noticed in #140471, the
SystemZ target currently implements disassembly of PC-relative target
addresses differently from other back-ends. This patch brings SystemZ in
line with other targets.

Specifically, this patch changes the relevant MCInst instructions to
carry a PC-relative displacement instead of an absolute target address
in their immediate fields. When printing the instruction, this
displacement will either be shown as is (e.g. for llvm-mc), or else
translated into an absolute address at print time (e.g. for
llvm-objdump).

The existing llvm-mc based tests using PC-relative operands no longer
work and have to be rewritten, but printing displacements makes those
tests easier to maintain anyway.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 22, 2025
…(#141064)

As noticed in llvm/llvm-project#140471, the
SystemZ target currently implements disassembly of PC-relative target
addresses differently from other back-ends. This patch brings SystemZ in
line with other targets.

Specifically, this patch changes the relevant MCInst instructions to
carry a PC-relative displacement instead of an absolute target address
in their immediate fields. When printing the instruction, this
displacement will either be shown as is (e.g. for llvm-mc), or else
translated into an absolute address at print time (e.g. for
llvm-objdump).

The existing llvm-mc based tests using PC-relative operands no longer
work and have to be rewritten, but printing displacements makes those
tests easier to maintain anyway.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).

Specify an adjust address to fix SystemZInstPrinter output.

The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.

Pull Request: llvm#140471
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
As noticed in llvm#140471, the
SystemZ target currently implements disassembly of PC-relative target
addresses differently from other back-ends. This patch brings SystemZ in
line with other targets.

Specifically, this patch changes the relevant MCInst instructions to
carry a PC-relative displacement instead of an absolute target address
in their immediate fields. When printing the instruction, this
displacement will either be shown as is (e.g. for llvm-mc), or else
translated into an absolute address at print time (e.g. for
llvm-objdump).

The existing llvm-mc based tests using PC-relative operands no longer
work and have to be rewritten, but printing displacements makes those
tests easier to maintain anyway.
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).

Specify an adjust address to fix SystemZInstPrinter output.

The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.

Pull Request: llvm#140471
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
As noticed in llvm#140471, the
SystemZ target currently implements disassembly of PC-relative target
addresses differently from other back-ends. This patch brings SystemZ in
line with other targets.

Specifically, this patch changes the relevant MCInst instructions to
carry a PC-relative displacement instead of an absolute target address
in their immediate fields. When printing the instruction, this
displacement will either be shown as is (e.g. for llvm-mc), or else
translated into an absolute address at print time (e.g. for
llvm-objdump).

The existing llvm-mc based tests using PC-relative operands no longer
work and have to be rewritten, but printing displacements makes those
tests easier to maintain anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants