[llvm-objdump][macho] Add support for ObjC relative method lists #85477

Merged: 1 commit, Mar 16, 2024

@alx32 alx32 commented Mar 15, 2024

For Mach-O, ld64 supports the -objc_relative_method_lists flag, which changes the format in which method lists are generated. The format uses delta encoding instead of the original direct-pointer encoding.
This change adds support to llvm-objdump and llvm-otool for decoding and dumping method lists in the delta format. Previously, if a binary in this format was passed to the tooling, it would output invalid information, because it tried to parse the delta lists as pointer lists.
After this change, the tooling outputs correct information when it encounters a binary in this format.
The output format is the closest feasible match to Xcode 15.1's otool output. Tests are included for both 32-bit and 64-bit binaries.

The code style was matched as closely as possible to the existing implementation of parsing non-delta method lists.

[image: diff between llvm-objdump and Xcode 15.1 otool output]

Note: This is a retry of PR #84250.
On the original PR, the armv7 and armv8 builds were failing because the absolute offsets differed.

alx32 commented Mar 15, 2024

This is a retry of PR #84250, after the build bots showed that absolute offsets differ on armv7/armv8 builds of objdump. The absolute offsets in the test are now matched with regexes.

llvmbot commented Mar 15, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: None (alx32)

Full diff: https://github.com/llvm/llvm-project/pull/85477.diff

4 Files Affected:

  • (added) llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64.dylib ()
  • (added) llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64_32.dylib ()
  • (added) llvm/test/tools/llvm-objdump/MachO/AArch64/macho-relative-method-lists.test (+86)
  • (modified) llvm/tools/llvm-objdump/MachODump.cpp (+110-2)
diff --git a/llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64.dylib b/llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64.dylib
new file mode 100755
index 00000000000000..051e28f33d7494
Binary files /dev/null and b/llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64.dylib differ
diff --git a/llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64_32.dylib b/llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64_32.dylib
new file mode 100755
index 00000000000000..d3a339057abc34
Binary files /dev/null and b/llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/rel-method-lists-arm64_32.dylib differ
diff --git a/llvm/test/tools/llvm-objdump/MachO/AArch64/macho-relative-method-lists.test b/llvm/test/tools/llvm-objdump/MachO/AArch64/macho-relative-method-lists.test
new file mode 100644
index 00000000000000..b1b96a41a32939
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/MachO/AArch64/macho-relative-method-lists.test
@@ -0,0 +1,86 @@
+RUN: llvm-objdump --macho --objc-meta-data    %p/Inputs/rel-method-lists-arm64_32.dylib | FileCheck %s --check-prefix=CHK32
+RUN: llvm-otool -ov                           %p/Inputs/rel-method-lists-arm64_32.dylib | FileCheck %s --check-prefix=CHK32
+
+RUN: llvm-objdump --macho --objc-meta-data    %p/Inputs/rel-method-lists-arm64.dylib    | FileCheck %s --check-prefix=CHK64
+RUN: llvm-otool -ov                           %p/Inputs/rel-method-lists-arm64.dylib    | FileCheck %s --check-prefix=CHK64
+
+CHK32:                 baseMethods 0x660 (struct method_list_t *)
+CHK32-NEXT:                 entsize 12 (relative)
+CHK32-NEXT:                   count 3
+CHK32-NEXT:                    name 0x144 (0x{{[0-9a-f]*}}) instance_method_00
+CHK32-NEXT:                   types 0x91 (0x{{[0-9a-f]*}}) v8@0:4
+CHK32-NEXT:                     imp 0xffffff18 (0x{{[0-9a-f]*}}) -[MyClass instance_method_00]
+CHK32-NEXT:                    name 0x13c (0x{{[0-9a-f]*}}) instance_method_01
+CHK32-NEXT:                   types 0x85 (0x{{[0-9a-f]*}}) v8@0:4
+CHK32-NEXT:                     imp 0xffffff28 (0x{{[0-9a-f]*}}) -[MyClass instance_method_01]
+CHK32-NEXT:                    name 0x134 (0x{{[0-9a-f]*}}) instance_method_02
+CHK32-NEXT:                   types 0x79 (0x{{[0-9a-f]*}}) v8@0:4
+CHK32-NEXT:                     imp 0xffffff38 (0x{{[0-9a-f]*}}) -[MyClass instance_method_02]
+
+CHK32:                 baseMethods 0x630 (struct method_list_t *)
+CHK32-NEXT:                 entsize 12 (relative)
+CHK32-NEXT:                   count 3
+CHK32-NEXT:                    name 0x180 (0x{{[0-9a-f]*}}) class_method_00
+CHK32-NEXT:                   types 0xc1 (0x{{[0-9a-f]*}}) v8@0:4
+CHK32-NEXT:                     imp 0xffffff9c (0x{{[0-9a-f]*}}) +[MyClass class_method_00]
+CHK32-NEXT:                    name 0x178 (0x{{[0-9a-f]*}}) class_method_01
+CHK32-NEXT:                   types 0xb5 (0x{{[0-9a-f]*}}) v8@0:4
+CHK32-NEXT:                     imp 0xffffffac (0x{{[0-9a-f]*}}) +[MyClass class_method_01]
+CHK32-NEXT:                    name 0x170 (0x{{[0-9a-f]*}}) class_method_02
+CHK32-NEXT:                   types 0xa9 (0x{{[0-9a-f]*}}) v8@0:4
+CHK32-NEXT:                     imp 0xffffffbc (0x{{[0-9a-f]*}}) +[MyClass class_method_02]
+
+CHK64:                  baseMethods 0x6e0 (struct method_list_t *)
+CHK64-NEXT:                  entsize 12 (relative)
+CHK64-NEXT:                    count 3
+CHK64-NEXT:                     name 0x188 (0x{{[0-9a-f]*}}) instance_method_00
+CHK64-NEXT:                    types 0x91 (0x{{[0-9a-f]*}}) v16@0:8
+CHK64-NEXT:                      imp 0xffffffa8 (0x{{[0-9a-f]*}}) -[MyClass instance_method_00]
+CHK64-NEXT:                     name 0x184 (0x{{[0-9a-f]*}}) instance_method_01
+CHK64-NEXT:                    types 0x85 (0x{{[0-9a-f]*}}) v16@0:8
+CHK64-NEXT:                      imp 0xffffffa0 (0x{{[0-9a-f]*}}) -[MyClass instance_method_01]
+CHK64-NEXT:                     name 0x180 (0x{{[0-9a-f]*}}) instance_method_02
+CHK64-NEXT:                    types 0x79 (0x{{[0-9a-f]*}}) v16@0:8
+CHK64-NEXT:                      imp 0xffffff98 (0x{{[0-9a-f]*}}) -[MyClass instance_method_02]
+
+CHK64:                  baseMethods 0x6b0 (struct method_list_t *)
+CHK64-NEXT:                  entsize 12 (relative)
+CHK64-NEXT:                    count 3
+CHK64-NEXT:                     name 0x1d0 (0x{{[0-9a-f]*}}) class_method_00
+CHK64-NEXT:                    types 0xc1 (0x{{[0-9a-f]*}}) v16@0:8
+CHK64-NEXT:                      imp 0xffffffe4 (0x{{[0-9a-f]*}}) +[MyClass class_method_00]
+CHK64-NEXT:                     name 0x1cc (0x{{[0-9a-f]*}}) class_method_01
+CHK64-NEXT:                    types 0xb5 (0x{{[0-9a-f]*}}) v16@0:8
+CHK64-NEXT:                      imp 0xffffffdc (0x{{[0-9a-f]*}}) +[MyClass class_method_01]
+CHK64-NEXT:                     name 0x1c8 (0x{{[0-9a-f]*}}) class_method_02
+CHK64-NEXT:                    types 0xa9 (0x{{[0-9a-f]*}}) v16@0:8
+CHK64-NEXT:                      imp 0xffffffd4 (0x{{[0-9a-f]*}}) +[MyClass class_method_02]
+
+######## Generate rel-method-lists-arm64.dylib ########
+// clang -c main.mm -o main.o -target arm64-apple-macos -arch arm64
+// ld64.ld64 -dylib -demangle -dynamic main.o -o rel-method-lists-arm64.dylib -syslibroot MacOSX14.2.sdk -segalign 0x10 -objc_relative_method_lists
+
+######## Generate rel-method-lists-arm64_32.dylib ########
+// clang -c main.mm -o main.o -target arm64_32-apple-watchos -arch arm64_32
+// ld64.ld64 -dylib -demangle -dynamic main.o -o rel-method-lists-arm64_32.dylib -syslibroot WatchOS.sdk -segalign 0x10 -objc_relative_method_lists
+
+// ~~~~~~~~~~~~~~~~~~~~~~~~~ main.mm ~~~~~~~~~~~~~~~~~~~~~~~~~
+__attribute__((objc_root_class))
+@interface MyClass
+- (void)instance_method_00;
+- (void)instance_method_01;
+- (void)instance_method_02;
++ (void)class_method_00;
++ (void)class_method_01;
++ (void)class_method_02;
+@end
+@implementation MyClass
+- (void)instance_method_00 {}
+- (void)instance_method_01 {}
+- (void)instance_method_02 {}
++ (void)class_method_00 {}
++ (void)class_method_01 {}
++ (void)class_method_02 {}
+@end
+void *_objc_empty_cache;
+void *_objc_empty_vtable;
diff --git a/llvm/tools/llvm-objdump/MachODump.cpp b/llvm/tools/llvm-objdump/MachODump.cpp
index 0e6935c0ac5895..1b0e5ba279d06b 100644
--- a/llvm/tools/llvm-objdump/MachODump.cpp
+++ b/llvm/tools/llvm-objdump/MachODump.cpp
@@ -3661,6 +3661,10 @@ struct class_ro32_t {
 #define RO_ROOT (1 << 1)
 #define RO_HAS_CXX_STRUCTORS (1 << 2)
 
+/* Values for method_list{64,32}_t->entsize */
+#define ML_HAS_RELATIVE_PTRS (1 << 31)
+#define ML_ENTSIZE_MASK 0xFFFF
+
 struct method_list64_t {
   uint32_t entsize;
   uint32_t count;
@@ -3685,6 +3689,12 @@ struct method32_t {
   uint32_t imp;   /* IMP (32-bit pointer) */
 };
 
+struct method_relative_t {
+  int32_t name;  /* SEL (32-bit relative) */
+  int32_t types; /* const char * (32-bit relative) */
+  int32_t imp;   /* IMP (32-bit relative) */
+};
+
 struct protocol_list64_t {
   uint64_t count; /* uintptr_t (a 64-bit value) */
   /* struct protocol64_t * list[0];  These pointers follow inline */
@@ -3986,6 +3996,12 @@ inline void swapStruct(struct method32_t &m) {
   sys::swapByteOrder(m.imp);
 }
 
+inline void swapStruct(struct method_relative_t &m) {
+  sys::swapByteOrder(m.name);
+  sys::swapByteOrder(m.types);
+  sys::swapByteOrder(m.imp);
+}
+
 inline void swapStruct(struct protocol_list64_t &pl) {
   sys::swapByteOrder(pl.count);
 }
@@ -4440,6 +4456,84 @@ static void print_layout_map32(uint32_t p, struct DisassembleInfo *info) {
   print_layout_map(layout_map, left);
 }
 
+static void print_relative_method_list(uint32_t structSizeAndFlags,
+                                       uint32_t structCount, uint64_t p,
+                                       struct DisassembleInfo *info,
+                                       const char *indent,
+                                       uint32_t pointerBits) {
+  struct method_relative_t m;
+  const char *r, *name;
+  uint32_t offset, xoffset, left, i;
+  SectionRef S, xS;
+
+  assert(((structSizeAndFlags & ML_HAS_RELATIVE_PTRS) != 0) &&
+         "expected structSizeAndFlags to have ML_HAS_RELATIVE_PTRS flag");
+
+  outs() << indent << "\t\t   entsize "
+         << (structSizeAndFlags & ML_ENTSIZE_MASK) << " (relative) \n";
+  outs() << indent << "\t\t     count " << structCount << "\n";
+
+  for (i = 0; i < structCount; i++) {
+    r = get_pointer_64(p, offset, left, S, info);
+    memset(&m, '\0', sizeof(struct method_relative_t));
+    if (left < sizeof(struct method_relative_t)) {
+      memcpy(&m, r, left);
+      outs() << indent << "   (method_t extends past the end of the section)\n";
+    } else
+      memcpy(&m, r, sizeof(struct method_relative_t));
+    if (info->O->isLittleEndian() != sys::IsLittleEndianHost)
+      swapStruct(m);
+
+    outs() << indent << "\t\t      name " << format("0x%" PRIx32, m.name);
+    uint64_t relNameRefVA = p + offsetof(struct method_relative_t, name);
+    uint64_t absNameRefVA = relNameRefVA + m.name;
+    outs() << " (" << format("0x%" PRIx32, absNameRefVA) << ")";
+
+    // since this is a relative list, absNameRefVA is the address of the
+    // __objc_selrefs entry, so a pointer, not the actual name
+    const char *nameRefPtr =
+        get_pointer_64(absNameRefVA, xoffset, left, xS, info);
+    if (nameRefPtr) {
+      uint32_t pointerSize = pointerBits / CHAR_BIT;
+      if (left < pointerSize)
+        outs() << indent << " (nameRefPtr extends past the end of the section)";
+      else {
+        if (pointerSize == 64) {
+          name = get_pointer_64(*reinterpret_cast<const uint64_t *>(nameRefPtr),
+                                xoffset, left, xS, info);
+        } else {
+          name = get_pointer_32(*reinterpret_cast<const uint32_t *>(nameRefPtr),
+                                xoffset, left, xS, info);
+        }
+        if (name != nullptr)
+          outs() << format(" %.*s", left, name);
+      }
+    }
+    outs() << "\n";
+
+    outs() << indent << "\t\t     types " << format("0x%" PRIx32, m.types);
+    uint64_t relTypesVA = p + offsetof(struct method_relative_t, types);
+    uint64_t absTypesVA = relTypesVA + m.types;
+    outs() << " (" << format("0x%" PRIx32, absTypesVA) << ")";
+    name = get_pointer_32(absTypesVA, xoffset, left, xS, info);
+    if (name != nullptr)
+      outs() << format(" %.*s", left, name);
+    outs() << "\n";
+
+    outs() << indent << "\t\t       imp " << format("0x%" PRIx32, m.imp);
+    uint64_t relImpVA = p + offsetof(struct method_relative_t, imp);
+    uint64_t absImpVA = relImpVA + m.imp;
+    outs() << " (" << format("0x%" PRIx32, absImpVA) << ")";
+    name = GuessSymbolName(absImpVA, info->AddrMap);
+    if (name != nullptr)
+      outs() << " " << name;
+    outs() << "\n";
+
+    p += sizeof(struct method_relative_t);
+    offset += sizeof(struct method_relative_t);
+  }
+}
+
 static void print_method_list64_t(uint64_t p, struct DisassembleInfo *info,
                                   const char *indent) {
   struct method_list64_t ml;
@@ -4461,10 +4555,17 @@ static void print_method_list64_t(uint64_t p, struct DisassembleInfo *info,
     memcpy(&ml, r, sizeof(struct method_list64_t));
   if (info->O->isLittleEndian() != sys::IsLittleEndianHost)
     swapStruct(ml);
+  p += sizeof(struct method_list64_t);
+
+  if ((ml.entsize & ML_HAS_RELATIVE_PTRS) != 0) {
+    print_relative_method_list(ml.entsize, ml.count, p, info, indent,
+                               /*pointerBits=*/64);
+    return;
+  }
+
   outs() << indent << "\t\t   entsize " << ml.entsize << "\n";
   outs() << indent << "\t\t     count " << ml.count << "\n";
 
-  p += sizeof(struct method_list64_t);
   offset += sizeof(struct method_list64_t);
   for (i = 0; i < ml.count; i++) {
     r = get_pointer_64(p, offset, left, S, info);
@@ -4552,10 +4653,17 @@ static void print_method_list32_t(uint64_t p, struct DisassembleInfo *info,
     memcpy(&ml, r, sizeof(struct method_list32_t));
   if (info->O->isLittleEndian() != sys::IsLittleEndianHost)
     swapStruct(ml);
+  p += sizeof(struct method_list32_t);
+
+  if ((ml.entsize & ML_HAS_RELATIVE_PTRS) != 0) {
+    print_relative_method_list(ml.entsize, ml.count, p, info, indent,
+                               /*pointerBits=*/32);
+    return;
+  }
+
   outs() << indent << "\t\t   entsize " << ml.entsize << "\n";
   outs() << indent << "\t\t     count " << ml.count << "\n";
 
-  p += sizeof(struct method_list32_t);
   offset += sizeof(struct method_list32_t);
   for (i = 0; i < ml.count; i++) {
     r = get_pointer_32(p, offset, left, S, info);

@kyulee-com kyulee-com self-requested a review March 15, 2024 23:12
@kyulee-com kyulee-com merged commit 9d5edfd into llvm:main Mar 16, 2024
@amy-kwan

I believe this is causing the following test case failure on the clang-ppc64be-linux-multistage bot. Could you please take a look?

alx32 commented Mar 19, 2024

Hi @amy-kwan, thanks for flagging this.
This looks like the obvious fix: #85778
Do you know how I could test this on ppc64be before merging?

@amy-kwan

@alx32, thank you for the fix! I've actually applied it locally to a ppc64be machine myself and that patch resolves the issue.
