Skip to content

[Object][Wasm] Use file offset for section addresses in linked wasm files #80529

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 7, 2024

Conversation

dschuff
Copy link
Member

@dschuff dschuff commented Feb 3, 2024

Wasm has no unified virtual memory space as other object formats and architectures do, so previously WasmObjectFile reported 0 for all section addresses, and until 428cf71 used section offsets for function symbols. Now we use file offsets for function symbols, and this change switches section addresses to do the same (in linked files). The main result of this is that objdump now reports VMAs in section listings, and also uses file offets rather than section offsets when disassembling linked binaries (matching the behavior of other disassemblers and stack traces produced by browwsers). To make this work, this PR also updates objdump's generation of synthetics fallback symbols to match lib/Object and also correctly plumbs symbol types for regular and dummy symbols through to the backend to avoid needing special knowledge of address 0.

This also paves the way for generating symbols from name sections rather than symbol tables or imports (see #76107) by allowing the disassembler's synthetic fallback symbols match the name-section generated symbols (in a followup PR).

…iles

Wasm has no unified virtual memory space as other object formats and
architectures do, so previously WasmObjectFile reported 0 for all
section addresses, and until 428cf71 used section offsets for
function symbols. Now we use file offsets for function symbols, and
this change switches section addresses to do the same (in linked
files). The main result of this is that objdump now reports VMAs in
section listings, and also uses file offets rather than section
offsets when disassembling linked binaries (matching the behavior of
other disassemblers and stack traces produced by browwsers). To make
this work, this PR also updates objdump's generation of synthetics
fallback symbols to match lib/Object and also correctly plumbs symbol
types for regular and dummy symbols through to the backend to avoid
needing special knowledge of address 0.

This also paves the way for generating symbols from name sections
rather than symbol tables or imports by allowing (see llvm#76107) the
disassembler's synthetic fallback symbols match the name-section
generated symbols (in a followup PR).
@llvmbot
Copy link
Member

llvmbot commented Feb 3, 2024

@llvm/pr-subscribers-llvm-binary-utilities
@llvm/pr-subscribers-lld-wasm

@llvm/pr-subscribers-lld

Author: Derek Schuff (dschuff)

Changes

Wasm has no unified virtual memory space as other object formats and architectures do, so previously WasmObjectFile reported 0 for all section addresses, and until 428cf71 used section offsets for function symbols. Now we use file offsets for function symbols, and this change switches section addresses to do the same (in linked files). The main result of this is that objdump now reports VMAs in section listings, and also uses file offets rather than section offsets when disassembling linked binaries (matching the behavior of other disassemblers and stack traces produced by browwsers). To make this work, this PR also updates objdump's generation of synthetics fallback symbols to match lib/Object and also correctly plumbs symbol types for regular and dummy symbols through to the backend to avoid needing special knowledge of address 0.

This also paves the way for generating symbols from name sections rather than symbol tables or imports by allowing (see #76107) the disassembler's synthetic fallback symbols match the name-section generated symbols (in a followup PR).


Full diff: https://github.com/llvm/llvm-project/pull/80529.diff

9 Files Affected:

  • (modified) lld/test/wasm/build-id.test (+6-6)
  • (modified) lld/test/wasm/merge-string-debug.s (+7-7)
  • (modified) lld/test/wasm/startstop.ll (+6-6)
  • (modified) llvm/lib/Object/WasmObjectFile.cpp (+8-1)
  • (modified) llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp (+2-1)
  • (modified) llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test (+6-6)
  • (modified) llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test (+6-6)
  • (modified) llvm/test/tools/llvm-objdump/wasm/no-codesec.test (+3-3)
  • (modified) llvm/tools/llvm-objdump/llvm-objdump.cpp (+13-3)
diff --git a/lld/test/wasm/build-id.test b/lld/test/wasm/build-id.test
index b767d8de4a423..dd6e2237108f3 100644
--- a/lld/test/wasm/build-id.test
+++ b/lld/test/wasm/build-id.test
@@ -43,18 +43,18 @@ foo:
 
 
 # DEFAULT:      Contents of section build_id:
-# DEFAULT-NEXT: 0000 10299168 1e3c845a 3c8f80ae 2f16cc22  .).h.<.Z<.../.."
-# DEFAULT-NEXT: 0010 2d
+# DEFAULT-NEXT: 0079 10299168 1e3c845a 3c8f80ae 2f16cc22  .).h.<.Z<.../.."
+# DEFAULT-NEXT: 0089 2d
 
 # SHA1:      Contents of section build_id:
-# SHA1-NEXT: 0000 145abdda 387a9bc4 e3aed3c3 3319cd37  .Z..8z......3..7
-# SHA1-NEXT: 0010 0212237c e4                          ..#|.
+# SHA1-NEXT: 0079 145abdda 387a9bc4 e3aed3c3 3319cd37  .Z..8z......3..7
+# SHA1-NEXT: 0089 0212237c e4                          ..#|.
 
 # UUID:      Contents of section build_id:
-# UUID-NEXT: 0000 10
+# UUID-NEXT: 0079 10
 
 # HEX:      Contents of section build_id:
-# HEX-NEXT:  0000 04123456 78                          ..4Vx
+# HEX-NEXT:  0079 04123456 78                          ..4Vx
 
 
 # NONE-NOT: Contents of section build_id:
diff --git a/lld/test/wasm/merge-string-debug.s b/lld/test/wasm/merge-string-debug.s
index 73e31d5361f41..0075a834e0647 100644
--- a/lld/test/wasm/merge-string-debug.s
+++ b/lld/test/wasm/merge-string-debug.s
@@ -29,13 +29,13 @@
 
 # CHECK: Hex dump of section '.debug_str':
 
-# CHECK-O0: 0x00000000 636c616e 67207665 7273696f 6e203133 clang version 13
-# CHECK-O0: 0x00000010 2e302e30 00666f6f 62617200 636c616e .0.0.foobar.clan
-# CHECK-O0: 0x00000020 67207665 7273696f 6e203133 2e302e30 g version 13.0.0
-# CHECK-O0: 0x00000030 00626172 00666f6f 00                .bar.foo.
+# CHECK-O0: 0x00000025 636c616e 67207665 7273696f 6e203133 clang version 13
+# CHECK-O0: 0x00000035 2e302e30 00666f6f 62617200 636c616e .0.0.foobar.clan
+# CHECK-O0: 0x00000045 67207665 7273696f 6e203133 2e302e30 g version 13.0.0
+# CHECK-O0: 0x00000055 00626172 00666f6f 00                .bar.foo.
 
-# CHECK-O1: 0x00000000 666f6f62 61720066 6f6f0063 6c616e67 foobar.foo.clang
-# CHECK-O1: 0x00000010 20766572 73696f6e 2031332e 302e3000  version 13.0.0.
+# CHECK-O1: 0x00000025 666f6f62 61720066 6f6f0063 6c616e67 foobar.foo.clang
+# CHECK-O1: 0x00000035 20766572 73696f6e 2031332e 302e3000  version 13.0.0.
 
 # CHECK-OFFSETS: Hex dump of section '.debug_str_offsets':
-# CHECK-OFFSETS: 0x00000000 00000000 00000000 00000000          ............
+# CHECK-OFFSETS: 0x0000007e 00000000 00000000 00000000          ............
diff --git a/lld/test/wasm/startstop.ll b/lld/test/wasm/startstop.ll
index 4341d371f8e65..e7a5c80f91f28 100644
--- a/lld/test/wasm/startstop.ll
+++ b/lld/test/wasm/startstop.ll
@@ -34,12 +34,12 @@ entry:
 ; CHECK-NEXT:           Value:           1024
 ; CHECK-NEXT:         Content:         03000000040000002A0000002B000000
 
-;       ASM: 00000001 <get_start>:
+;       ASM: 0000006e <get_start>:
 ; ASM-EMPTY:
-;  ASM-NEXT:        3:     i32.const 1024
-;  ASM-NEXT:        9:     end
+;  ASM-NEXT:        70:     i32.const 1024
+;  ASM-NEXT:        76:     end
 
-;       ASM: 0000000a <get_end>:
+;       ASM: 00000077 <get_end>:
 ; ASM-EMPTY:
-;  ASM-NEXT:        c:     i32.const 1040
-;  ASM-NEXT:       12:     end
+;  ASM-NEXT:        79:     i32.const 1040
+;  ASM-NEXT:       7f:     end
diff --git a/llvm/lib/Object/WasmObjectFile.cpp b/llvm/lib/Object/WasmObjectFile.cpp
index 4778562779bb7..0fec053e25749 100644
--- a/llvm/lib/Object/WasmObjectFile.cpp
+++ b/llvm/lib/Object/WasmObjectFile.cpp
@@ -1906,7 +1906,14 @@ Expected<StringRef> WasmObjectFile::getSectionName(DataRefImpl Sec) const {
   return wasm::sectionTypeToString(S.Type);
 }
 
-uint64_t WasmObjectFile::getSectionAddress(DataRefImpl Sec) const { return 0; }
+uint64_t WasmObjectFile::getSectionAddress(DataRefImpl Sec) const {
+    // For object files, use 0 for section addresses, and section offsets for
+    // symbol addresses. For linked files, use file offsets.
+    // See also getSymbolAddress.
+      return isRelocatableObject() || isSharedObject()
+                              ? 0
+                              :  Sections[Sec.d.a].Offset;
+   }
 
 uint64_t WasmObjectFile::getSectionIndex(DataRefImpl Sec) const {
   return Sec.d.a;
diff --git a/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp b/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
index ed7757be66158..861f9f8b5b1ba 100644
--- a/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
+++ b/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
@@ -16,6 +16,7 @@
 
 #include "MCTargetDesc/WebAssemblyMCTypeUtilities.h"
 #include "TargetInfo/WebAssemblyTargetInfo.h"
+#include "llvm/BinaryFormat/Wasm.h"
 #include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCDecoderOps.h"
 #include "llvm/MC/MCDisassembler/MCDisassembler.h"
@@ -127,7 +128,7 @@ WebAssemblyDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
                                        uint64_t Address,
                                        raw_ostream &CStream) const {
   Size = 0;
-  if (Address == 0) {
+  if (Symbol.Type == wasm::WASM_SYMBOL_TYPE_SECTION) {
     // Start of a code section: we're parsing only the function count.
     int64_t FunctionCount;
     if (!nextLEB(FunctionCount, Bytes, Size, false))
diff --git a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test
index 646891700b866..0c7a438f2752e 100644
--- a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test
+++ b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test
@@ -36,14 +36,14 @@ Sections:
 
 # CHECK:       Disassembly of section CODE:
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000000 <CODE>:
+# CHECK-NEXT:  00000026 <CODE>:
 # CHECK-NEXT:          # 2 functions in section.
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000001 <f>:
+# CHECK-NEXT:  00000027 <f>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         3: 0b           	end
+# CHECK-NEXT:         29: 0b       	end
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000004 <g>:
+# CHECK-NEXT:  0000002a <g>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         6: 20 00        	local.get	0
-# CHECK-NEXT:         8: 0b           	end
+# CHECK-NEXT:         2c: 20 00    	local.get	0
+# CHECK-NEXT:         2e: 0b       	end
diff --git a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test
index 633991eecfbf3..58643cac90147 100644
--- a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test
+++ b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test
@@ -29,14 +29,14 @@ Sections:
 
 # CHECK:       Disassembly of section CODE:
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000000 <CODE>:
+# CHECK-NEXT:  00000026 <CODE>:
 # CHECK-NEXT:          # 2 functions in section.
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000001 <>:
+# CHECK-NEXT:  00000027 <>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         3: 0b           	end
+# CHECK-NEXT:         29: 0b       	end
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000004 <>:
+# CHECK-NEXT:  0000002a <>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         6: 20 00        	local.get	0
-# CHECK-NEXT:         8: 0b           	end
+# CHECK-NEXT:         2c: 20 00    	local.get	0
+# CHECK-NEXT:         2e: 0b       	end
diff --git a/llvm/test/tools/llvm-objdump/wasm/no-codesec.test b/llvm/test/tools/llvm-objdump/wasm/no-codesec.test
index e66a8ddd8a066..03fd1495066ce 100644
--- a/llvm/test/tools/llvm-objdump/wasm/no-codesec.test
+++ b/llvm/test/tools/llvm-objdump/wasm/no-codesec.test
@@ -4,9 +4,9 @@
 
 # CHECK: Sections:
 # CHECK-NEXT: Idx Name          Size     VMA      Type
-# CHECK-NEXT:   0 TYPE          00000004 00000000 
-# CHECK-NEXT:   1 FUNCTION      00000002 00000000 
-# CHECK-NEXT:   2 name          00000008 00000000
+# CHECK-NEXT:   0 TYPE          00000004 0000000e
+# CHECK-NEXT:   1 FUNCTION      00000002 00000018
+# CHECK-NEXT:   2 name          00000008 00000020
   
 --- !WASM
 FileHeader:
diff --git a/llvm/tools/llvm-objdump/llvm-objdump.cpp b/llvm/tools/llvm-objdump/llvm-objdump.cpp
index b4467ec163c34..de52ebcc72fde 100644
--- a/llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ b/llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -29,6 +29,7 @@
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringSet.h"
 #include "llvm/ADT/Twine.h"
+#include "llvm/BinaryFormat/Wasm.h"
 #include "llvm/DebugInfo/BTF/BTFParser.h"
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
 #include "llvm/DebugInfo/Symbolize/SymbolizableModule.h"
@@ -1149,7 +1150,11 @@ addMissingWasmCodeSymbols(const WasmObjectFile &Obj,
     SymbolAddresses.insert(Sym.Addr);
 
   for (const wasm::WasmFunction &Function : Obj.functions()) {
-    uint64_t Address = Function.CodeSectionOffset;
+    // This adjustment mirrors the one in WasmObjectFile::getSymbolAddress.
+    uint32_t Adjustment = Obj.isRelocatableObject() || Obj.isSharedObject()
+                              ? 0
+                              : Section->getAddress();
+    uint64_t Address = Function.CodeSectionOffset + Adjustment;
     // Only add fallback symbols for functions not already present in the symbol
     // table.
     if (SymbolAddresses.count(Address))
@@ -1354,6 +1359,10 @@ SymbolInfoTy objdump::createSymbolInfo(const ObjectFile &Obj,
     const SymbolRef::Type SymType = unwrapOrError(Symbol.getType(), FileName);
     return SymbolInfoTy(Addr, Name, SymType, /*IsMappingSymbol=*/false,
                         /*IsXCOFF=*/true);
+  } else if (Obj.isWasm()) {
+    uint8_t SymType =
+        cast<WasmObjectFile>(&Obj)->getWasmSymbol(Symbol).Info.Kind;
+    return SymbolInfoTy(Addr, Name, SymType, false);
   } else {
     uint8_t Type =
         Obj.isELF() ? getElfSymbolType(Obj, Symbol) : (uint8_t)ELF::STT_NOTYPE;
@@ -1366,8 +1375,9 @@ static SymbolInfoTy createDummySymbolInfo(const ObjectFile &Obj,
                                           uint8_t Type) {
   if (Obj.isXCOFF() && (SymbolDescription || TracebackTable))
     return SymbolInfoTy(std::nullopt, Addr, Name, std::nullopt, false);
-  else
-    return SymbolInfoTy(Addr, Name, Type);
+  if (Obj.isWasm())
+    return SymbolInfoTy(Addr, Name, wasm::WASM_SYMBOL_TYPE_SECTION);
+  return SymbolInfoTy(Addr, Name, Type);
 }
 
 static void collectBBAddrMapLabels(

@llvmbot
Copy link
Member

llvmbot commented Feb 3, 2024

@llvm/pr-subscribers-backend-webassembly

Author: Derek Schuff (dschuff)

Changes

Wasm has no unified virtual memory space as other object formats and architectures do, so previously WasmObjectFile reported 0 for all section addresses, and until 428cf71 used section offsets for function symbols. Now we use file offsets for function symbols, and this change switches section addresses to do the same (in linked files). The main result of this is that objdump now reports VMAs in section listings, and also uses file offets rather than section offsets when disassembling linked binaries (matching the behavior of other disassemblers and stack traces produced by browwsers). To make this work, this PR also updates objdump's generation of synthetics fallback symbols to match lib/Object and also correctly plumbs symbol types for regular and dummy symbols through to the backend to avoid needing special knowledge of address 0.

This also paves the way for generating symbols from name sections rather than symbol tables or imports by allowing (see #76107) the disassembler's synthetic fallback symbols match the name-section generated symbols (in a followup PR).


Full diff: https://github.com/llvm/llvm-project/pull/80529.diff

9 Files Affected:

  • (modified) lld/test/wasm/build-id.test (+6-6)
  • (modified) lld/test/wasm/merge-string-debug.s (+7-7)
  • (modified) lld/test/wasm/startstop.ll (+6-6)
  • (modified) llvm/lib/Object/WasmObjectFile.cpp (+8-1)
  • (modified) llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp (+2-1)
  • (modified) llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test (+6-6)
  • (modified) llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test (+6-6)
  • (modified) llvm/test/tools/llvm-objdump/wasm/no-codesec.test (+3-3)
  • (modified) llvm/tools/llvm-objdump/llvm-objdump.cpp (+13-3)
diff --git a/lld/test/wasm/build-id.test b/lld/test/wasm/build-id.test
index b767d8de4a423..dd6e2237108f3 100644
--- a/lld/test/wasm/build-id.test
+++ b/lld/test/wasm/build-id.test
@@ -43,18 +43,18 @@ foo:
 
 
 # DEFAULT:      Contents of section build_id:
-# DEFAULT-NEXT: 0000 10299168 1e3c845a 3c8f80ae 2f16cc22  .).h.<.Z<.../.."
-# DEFAULT-NEXT: 0010 2d
+# DEFAULT-NEXT: 0079 10299168 1e3c845a 3c8f80ae 2f16cc22  .).h.<.Z<.../.."
+# DEFAULT-NEXT: 0089 2d
 
 # SHA1:      Contents of section build_id:
-# SHA1-NEXT: 0000 145abdda 387a9bc4 e3aed3c3 3319cd37  .Z..8z......3..7
-# SHA1-NEXT: 0010 0212237c e4                          ..#|.
+# SHA1-NEXT: 0079 145abdda 387a9bc4 e3aed3c3 3319cd37  .Z..8z......3..7
+# SHA1-NEXT: 0089 0212237c e4                          ..#|.
 
 # UUID:      Contents of section build_id:
-# UUID-NEXT: 0000 10
+# UUID-NEXT: 0079 10
 
 # HEX:      Contents of section build_id:
-# HEX-NEXT:  0000 04123456 78                          ..4Vx
+# HEX-NEXT:  0079 04123456 78                          ..4Vx
 
 
 # NONE-NOT: Contents of section build_id:
diff --git a/lld/test/wasm/merge-string-debug.s b/lld/test/wasm/merge-string-debug.s
index 73e31d5361f41..0075a834e0647 100644
--- a/lld/test/wasm/merge-string-debug.s
+++ b/lld/test/wasm/merge-string-debug.s
@@ -29,13 +29,13 @@
 
 # CHECK: Hex dump of section '.debug_str':
 
-# CHECK-O0: 0x00000000 636c616e 67207665 7273696f 6e203133 clang version 13
-# CHECK-O0: 0x00000010 2e302e30 00666f6f 62617200 636c616e .0.0.foobar.clan
-# CHECK-O0: 0x00000020 67207665 7273696f 6e203133 2e302e30 g version 13.0.0
-# CHECK-O0: 0x00000030 00626172 00666f6f 00                .bar.foo.
+# CHECK-O0: 0x00000025 636c616e 67207665 7273696f 6e203133 clang version 13
+# CHECK-O0: 0x00000035 2e302e30 00666f6f 62617200 636c616e .0.0.foobar.clan
+# CHECK-O0: 0x00000045 67207665 7273696f 6e203133 2e302e30 g version 13.0.0
+# CHECK-O0: 0x00000055 00626172 00666f6f 00                .bar.foo.
 
-# CHECK-O1: 0x00000000 666f6f62 61720066 6f6f0063 6c616e67 foobar.foo.clang
-# CHECK-O1: 0x00000010 20766572 73696f6e 2031332e 302e3000  version 13.0.0.
+# CHECK-O1: 0x00000025 666f6f62 61720066 6f6f0063 6c616e67 foobar.foo.clang
+# CHECK-O1: 0x00000035 20766572 73696f6e 2031332e 302e3000  version 13.0.0.
 
 # CHECK-OFFSETS: Hex dump of section '.debug_str_offsets':
-# CHECK-OFFSETS: 0x00000000 00000000 00000000 00000000          ............
+# CHECK-OFFSETS: 0x0000007e 00000000 00000000 00000000          ............
diff --git a/lld/test/wasm/startstop.ll b/lld/test/wasm/startstop.ll
index 4341d371f8e65..e7a5c80f91f28 100644
--- a/lld/test/wasm/startstop.ll
+++ b/lld/test/wasm/startstop.ll
@@ -34,12 +34,12 @@ entry:
 ; CHECK-NEXT:           Value:           1024
 ; CHECK-NEXT:         Content:         03000000040000002A0000002B000000
 
-;       ASM: 00000001 <get_start>:
+;       ASM: 0000006e <get_start>:
 ; ASM-EMPTY:
-;  ASM-NEXT:        3:     i32.const 1024
-;  ASM-NEXT:        9:     end
+;  ASM-NEXT:        70:     i32.const 1024
+;  ASM-NEXT:        76:     end
 
-;       ASM: 0000000a <get_end>:
+;       ASM: 00000077 <get_end>:
 ; ASM-EMPTY:
-;  ASM-NEXT:        c:     i32.const 1040
-;  ASM-NEXT:       12:     end
+;  ASM-NEXT:        79:     i32.const 1040
+;  ASM-NEXT:       7f:     end
diff --git a/llvm/lib/Object/WasmObjectFile.cpp b/llvm/lib/Object/WasmObjectFile.cpp
index 4778562779bb7..0fec053e25749 100644
--- a/llvm/lib/Object/WasmObjectFile.cpp
+++ b/llvm/lib/Object/WasmObjectFile.cpp
@@ -1906,7 +1906,14 @@ Expected<StringRef> WasmObjectFile::getSectionName(DataRefImpl Sec) const {
   return wasm::sectionTypeToString(S.Type);
 }
 
-uint64_t WasmObjectFile::getSectionAddress(DataRefImpl Sec) const { return 0; }
+uint64_t WasmObjectFile::getSectionAddress(DataRefImpl Sec) const {
+    // For object files, use 0 for section addresses, and section offsets for
+    // symbol addresses. For linked files, use file offsets.
+    // See also getSymbolAddress.
+      return isRelocatableObject() || isSharedObject()
+                              ? 0
+                              :  Sections[Sec.d.a].Offset;
+   }
 
 uint64_t WasmObjectFile::getSectionIndex(DataRefImpl Sec) const {
   return Sec.d.a;
diff --git a/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp b/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
index ed7757be66158..861f9f8b5b1ba 100644
--- a/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
+++ b/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
@@ -16,6 +16,7 @@
 
 #include "MCTargetDesc/WebAssemblyMCTypeUtilities.h"
 #include "TargetInfo/WebAssemblyTargetInfo.h"
+#include "llvm/BinaryFormat/Wasm.h"
 #include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCDecoderOps.h"
 #include "llvm/MC/MCDisassembler/MCDisassembler.h"
@@ -127,7 +128,7 @@ WebAssemblyDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
                                        uint64_t Address,
                                        raw_ostream &CStream) const {
   Size = 0;
-  if (Address == 0) {
+  if (Symbol.Type == wasm::WASM_SYMBOL_TYPE_SECTION) {
     // Start of a code section: we're parsing only the function count.
     int64_t FunctionCount;
     if (!nextLEB(FunctionCount, Bytes, Size, false))
diff --git a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test
index 646891700b866..0c7a438f2752e 100644
--- a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test
+++ b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols-debugnames.test
@@ -36,14 +36,14 @@ Sections:
 
 # CHECK:       Disassembly of section CODE:
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000000 <CODE>:
+# CHECK-NEXT:  00000026 <CODE>:
 # CHECK-NEXT:          # 2 functions in section.
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000001 <f>:
+# CHECK-NEXT:  00000027 <f>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         3: 0b           	end
+# CHECK-NEXT:         29: 0b       	end
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000004 <g>:
+# CHECK-NEXT:  0000002a <g>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         6: 20 00        	local.get	0
-# CHECK-NEXT:         8: 0b           	end
+# CHECK-NEXT:         2c: 20 00    	local.get	0
+# CHECK-NEXT:         2e: 0b       	end
diff --git a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test
index 633991eecfbf3..58643cac90147 100644
--- a/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test
+++ b/llvm/test/tools/llvm-objdump/wasm/executable-without-symbols.test
@@ -29,14 +29,14 @@ Sections:
 
 # CHECK:       Disassembly of section CODE:
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000000 <CODE>:
+# CHECK-NEXT:  00000026 <CODE>:
 # CHECK-NEXT:          # 2 functions in section.
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000001 <>:
+# CHECK-NEXT:  00000027 <>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         3: 0b           	end
+# CHECK-NEXT:         29: 0b       	end
 # CHECK-EMPTY:
-# CHECK-NEXT:  00000004 <>:
+# CHECK-NEXT:  0000002a <>:
 # CHECK-EMPTY:
-# CHECK-NEXT:         6: 20 00        	local.get	0
-# CHECK-NEXT:         8: 0b           	end
+# CHECK-NEXT:         2c: 20 00    	local.get	0
+# CHECK-NEXT:         2e: 0b       	end
diff --git a/llvm/test/tools/llvm-objdump/wasm/no-codesec.test b/llvm/test/tools/llvm-objdump/wasm/no-codesec.test
index e66a8ddd8a066..03fd1495066ce 100644
--- a/llvm/test/tools/llvm-objdump/wasm/no-codesec.test
+++ b/llvm/test/tools/llvm-objdump/wasm/no-codesec.test
@@ -4,9 +4,9 @@
 
 # CHECK: Sections:
 # CHECK-NEXT: Idx Name          Size     VMA      Type
-# CHECK-NEXT:   0 TYPE          00000004 00000000 
-# CHECK-NEXT:   1 FUNCTION      00000002 00000000 
-# CHECK-NEXT:   2 name          00000008 00000000
+# CHECK-NEXT:   0 TYPE          00000004 0000000e
+# CHECK-NEXT:   1 FUNCTION      00000002 00000018
+# CHECK-NEXT:   2 name          00000008 00000020
   
 --- !WASM
 FileHeader:
diff --git a/llvm/tools/llvm-objdump/llvm-objdump.cpp b/llvm/tools/llvm-objdump/llvm-objdump.cpp
index b4467ec163c34..de52ebcc72fde 100644
--- a/llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ b/llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -29,6 +29,7 @@
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringSet.h"
 #include "llvm/ADT/Twine.h"
+#include "llvm/BinaryFormat/Wasm.h"
 #include "llvm/DebugInfo/BTF/BTFParser.h"
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
 #include "llvm/DebugInfo/Symbolize/SymbolizableModule.h"
@@ -1149,7 +1150,11 @@ addMissingWasmCodeSymbols(const WasmObjectFile &Obj,
     SymbolAddresses.insert(Sym.Addr);
 
   for (const wasm::WasmFunction &Function : Obj.functions()) {
-    uint64_t Address = Function.CodeSectionOffset;
+    // This adjustment mirrors the one in WasmObjectFile::getSymbolAddress.
+    uint32_t Adjustment = Obj.isRelocatableObject() || Obj.isSharedObject()
+                              ? 0
+                              : Section->getAddress();
+    uint64_t Address = Function.CodeSectionOffset + Adjustment;
     // Only add fallback symbols for functions not already present in the symbol
     // table.
     if (SymbolAddresses.count(Address))
@@ -1354,6 +1359,10 @@ SymbolInfoTy objdump::createSymbolInfo(const ObjectFile &Obj,
     const SymbolRef::Type SymType = unwrapOrError(Symbol.getType(), FileName);
     return SymbolInfoTy(Addr, Name, SymType, /*IsMappingSymbol=*/false,
                         /*IsXCOFF=*/true);
+  } else if (Obj.isWasm()) {
+    uint8_t SymType =
+        cast<WasmObjectFile>(&Obj)->getWasmSymbol(Symbol).Info.Kind;
+    return SymbolInfoTy(Addr, Name, SymType, false);
   } else {
     uint8_t Type =
         Obj.isELF() ? getElfSymbolType(Obj, Symbol) : (uint8_t)ELF::STT_NOTYPE;
@@ -1366,8 +1375,9 @@ static SymbolInfoTy createDummySymbolInfo(const ObjectFile &Obj,
                                           uint8_t Type) {
   if (Obj.isXCOFF() && (SymbolDescription || TracebackTable))
     return SymbolInfoTy(std::nullopt, Addr, Name, std::nullopt, false);
-  else
-    return SymbolInfoTy(Addr, Name, Type);
+  if (Obj.isWasm())
+    return SymbolInfoTy(Addr, Name, wasm::WASM_SYMBOL_TYPE_SECTION);
+  return SymbolInfoTy(Addr, Name, Type);
 }
 
 static void collectBBAddrMapLabels(

Copy link

github-actions bot commented Feb 3, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

else
return SymbolInfoTy(Addr, Name, Type);
if (Obj.isWasm())
return SymbolInfoTy(Addr, Name, wasm::WASM_SYMBOL_TYPE_SECTION);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a section symbol here where creating a dummy symbol?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't actually matter what the symbol kind is, it just has to be distinguishable from a regular function symbol in WebAssemblyDisassembler.cpp. Its address is the beginning of the section so it seemed reasonable to call it a section symbol, but we could come up with something else; perhaps a dummy symbol type or something else on the SymbolInfoTy object.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function takes Type argument.. can we not use that? Why ignore it? (Full disclosure, I don't know what this function is for).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this only has one callsite which just passes an ELF symbol type. so we could just as easily put this same logic there instead. I just decided to put it here because there was already a check for COFF. This file is full of this kind of confusion , where some places make ELF-specific assumptions and others have switches like this.

@dschuff dschuff merged commit 8b0f47b into llvm:main Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants