Skip to content

[lld][WebAssembly] Work around limited architecture detection for wasm64 shared libraries #98961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 16, 2024

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Jul 15, 2024

We don't currently have a great way to detect the architecture of shared object files under wasm. The currently method involves checking if the imported or exported memory is 64-bit. However some shared libraries don't use linear memory at all.

See #98778

…m64 shared libraries

We don't currently have a great way to detect the architecture of shared
object files under wasm.  The currently method involves checking if the
imported or exported memory is 64-bit.  However some shared libraries
don't use linear memory at all.

See llvm#98778
@sbc100 sbc100 requested a review from dschuff July 15, 2024 21:00
@sbc100 sbc100 requested a review from aheejin July 15, 2024 21:01
@llvmbot
Copy link
Member

llvmbot commented Jul 15, 2024

@llvm/pr-subscribers-lld

Author: Sam Clegg (sbc100)

Changes

We don't currently have a great way to detect the architecture of shared object files under wasm. The currently method involves checking if the imported or exported memory is 64-bit. However some shared libraries don't use linear memory at all.

See #98778


Full diff: https://github.com/llvm/llvm-project/pull/98961.diff

2 Files Affected:

  • (modified) lld/test/wasm/dylink.s (+10)
  • (modified) lld/wasm/InputFiles.cpp (+6-2)
diff --git a/lld/test/wasm/dylink.s b/lld/test/wasm/dylink.s
index 27e8c3ea7a7c6..ab604fc1adc18 100644
--- a/lld/test/wasm/dylink.s
+++ b/lld/test/wasm/dylink.s
@@ -6,6 +6,16 @@
 # RUN: wasm-ld --experimental-pic -pie -o %t.wasm %t.o %t.lib.so
 # RUN: obj2yaml %t.wasm | FileCheck %s
 
+# Same again for wasm64
+
+# RUN: llvm-mc -filetype=obj -triple=wasm64-unknown-emscripten -o %t.o %s
+# RUN: llvm-mc -filetype=obj -triple=wasm64-unknown-emscripten %p/Inputs/ret32.s -o %t.ret32.o
+# RUN: llvm-mc -filetype=obj -triple=wasm64-unknown-emscripten %p/Inputs/libsearch-dyn.s -o %t.dyn.o
+# RUN: wasm-ld --experimental-pic -mwasm64 -shared %t.ret32.o %t.dyn.o -o %t.lib.so
+# RUN: not wasm-ld --experimental-pic -mwasm64 -pie -o %t.wasm %t.o 2>&1 | FileCheck --check-prefix=ERROR %s
+# RUN: wasm-ld --experimental-pic -mwasm64 -pie -o %t.wasm %t.o %t.lib.so
+# RUN: obj2yaml %t.wasm | FileCheck %s
+
 # ERROR: error: {{.*}}: undefined symbol: ret32
 # ERROR: error: {{.*}}: undefined symbol: _bar
 .functype ret32 (f32) -> (i32)
diff --git a/lld/wasm/InputFiles.cpp b/lld/wasm/InputFiles.cpp
index ae557740a18ba..f3f0ef9a99497 100644
--- a/lld/wasm/InputFiles.cpp
+++ b/lld/wasm/InputFiles.cpp
@@ -408,6 +408,12 @@ ObjFile::ObjFile(MemoryBufferRef m, StringRef archiveName, bool lazy)
   this->lazy = lazy;
   this->archiveName = std::string(archiveName);
 
+  // Currently we only do this check for regular object file, and not for shared
+  // object files.  This is because architecture detection for shared objects is
+  // currently based on a heuristic, which is fallable:
+  // https://github.com/llvm/llvm-project/issues/98778
+  checkArch(wasmObj->getArch());
+
   // If this isn't part of an archive, it's eagerly linked, so mark it live.
   if (archiveName.empty())
     markLive();
@@ -456,8 +462,6 @@ WasmFileBase::WasmFileBase(Kind k, MemoryBufferRef m) : InputFile(k, m) {
 
   bin.release();
   wasmObj.reset(obj);
-
-  checkArch(obj->getArch());
 }
 
 void ObjFile::parse(bool ignoreComdats) {

@llvmbot
Copy link
Member

llvmbot commented Jul 15, 2024

@llvm/pr-subscribers-lld-wasm

Author: Sam Clegg (sbc100)

Changes

We don't currently have a great way to detect the architecture of shared object files under wasm. The currently method involves checking if the imported or exported memory is 64-bit. However some shared libraries don't use linear memory at all.

See #98778


Full diff: https://github.com/llvm/llvm-project/pull/98961.diff

2 Files Affected:

  • (modified) lld/test/wasm/dylink.s (+10)
  • (modified) lld/wasm/InputFiles.cpp (+6-2)
diff --git a/lld/test/wasm/dylink.s b/lld/test/wasm/dylink.s
index 27e8c3ea7a7c6..ab604fc1adc18 100644
--- a/lld/test/wasm/dylink.s
+++ b/lld/test/wasm/dylink.s
@@ -6,6 +6,16 @@
 # RUN: wasm-ld --experimental-pic -pie -o %t.wasm %t.o %t.lib.so
 # RUN: obj2yaml %t.wasm | FileCheck %s
 
+# Same again for wasm64
+
+# RUN: llvm-mc -filetype=obj -triple=wasm64-unknown-emscripten -o %t.o %s
+# RUN: llvm-mc -filetype=obj -triple=wasm64-unknown-emscripten %p/Inputs/ret32.s -o %t.ret32.o
+# RUN: llvm-mc -filetype=obj -triple=wasm64-unknown-emscripten %p/Inputs/libsearch-dyn.s -o %t.dyn.o
+# RUN: wasm-ld --experimental-pic -mwasm64 -shared %t.ret32.o %t.dyn.o -o %t.lib.so
+# RUN: not wasm-ld --experimental-pic -mwasm64 -pie -o %t.wasm %t.o 2>&1 | FileCheck --check-prefix=ERROR %s
+# RUN: wasm-ld --experimental-pic -mwasm64 -pie -o %t.wasm %t.o %t.lib.so
+# RUN: obj2yaml %t.wasm | FileCheck %s
+
 # ERROR: error: {{.*}}: undefined symbol: ret32
 # ERROR: error: {{.*}}: undefined symbol: _bar
 .functype ret32 (f32) -> (i32)
diff --git a/lld/wasm/InputFiles.cpp b/lld/wasm/InputFiles.cpp
index ae557740a18ba..f3f0ef9a99497 100644
--- a/lld/wasm/InputFiles.cpp
+++ b/lld/wasm/InputFiles.cpp
@@ -408,6 +408,12 @@ ObjFile::ObjFile(MemoryBufferRef m, StringRef archiveName, bool lazy)
   this->lazy = lazy;
   this->archiveName = std::string(archiveName);
 
+  // Currently we only do this check for regular object file, and not for shared
+  // object files.  This is because architecture detection for shared objects is
+  // currently based on a heuristic, which is fallable:
+  // https://github.com/llvm/llvm-project/issues/98778
+  checkArch(wasmObj->getArch());
+
   // If this isn't part of an archive, it's eagerly linked, so mark it live.
   if (archiveName.empty())
     markLive();
@@ -456,8 +462,6 @@ WasmFileBase::WasmFileBase(Kind k, MemoryBufferRef m) : InputFile(k, m) {
 
   bin.release();
   wasmObj.reset(obj);
-
-  checkArch(obj->getArch());
 }
 
 void ObjFile::parse(bool ignoreComdats) {

@sbc100
Copy link
Collaborator Author

sbc100 commented Jul 15, 2024

@kripken

@aheejin
Copy link
Member

aheejin commented Jul 16, 2024

So this makes wasm-ld not error out on shared objs? Then what if the arch (determined by the heuristic) is incorrect?

@sbc100
Copy link
Collaborator Author

sbc100 commented Jul 16, 2024

So this makes wasm-ld not error out on shared objs? Then what if the arch (determined by the heuristic) is incorrect?

In one of two things could happen:

  1. The linker would generate a signature check error (if there are any pointer-sized arguments or return values),
  2. If the wasm32 shared libraries is actually deployed at runtime then you might see a runtime failure. I'm not sure what kind of runtime failures that might be, but they would be the some ones that would occur if the wasm32 shared library was use at runtime and the wasm64 one was used a link time (we don't currently have any way to detect such mismatches when the module doesn't contains any refernces to wasm memory). Basically its undefined behaviour territory. You can already do the same thing today, for example, by compiling and linking shared object with different definitions of the same struct.

In case it wasn't clear this issue only effects shared library that have no wasm memory at (i.e. no load/store instructions at all), which would exceedingly rare.

I'm also working on a longer term fix to be able to detect the target architecture of module without any memory references.

Comment on lines +411 to +415
// Currently we only do this check for regular object file, and not for shared
// object files. This is because architecture detection for shared objects is
// currently based on a heuristic, which is fallable:
// https://github.com/llvm/llvm-project/issues/98778
checkArch(wasmObj->getArch());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this problem (that we cannot determine arch if there is no memory) only happen with shared objects? If so, why? If non-shared objects don't have a memory, how do we determine the arch?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, non-shared objects, produced by llvm always import the memory. We just get lucky that is the case.

In fact even shared object do, just not once they have been though wasm-opt, which will eliminate the memory completely if its unused.

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I understand correctly, this removes potential false errors from being generated, and if there is a true error, it will fail at runtime, which is less desirable probably than at the link time, but the status quo generates false errors even if there is no error, right?

@sbc100
Copy link
Collaborator Author

sbc100 commented Jul 16, 2024

So if I understand correctly, this removes potential false errors from being generated, and if there is a true error, it will fail at runtime, which is less desirable probably than at the link time, but the status quo generates false errors even if there is no error, right?

Correct, yes

@sbc100 sbc100 merged commit ad2ff17 into llvm:main Jul 16, 2024
10 checks passed
@sbc100 sbc100 deleted the wasm64_dylink branch July 16, 2024 20:14
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…m64 shared libraries (#98961)

Summary:
We don't currently have a great way to detect the architecture of shared
object files under wasm. The currently method involves checking if the
imported or exported memory is 64-bit. However some shared libraries
don't use linear memory at all.

See #98778

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251710
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants