Skip to content

[lld] handle re-exports for full framework paths #137989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 2, 2025

Conversation

rmaz
Copy link
Contributor

@rmaz rmaz commented Apr 30, 2025

Framework load paths can be either the top level framework name, or subpaths of the framework bundle pointing to specific framework binary versions. Extend the framework lookup logic to handle the latter case.

@llvmbot
Copy link
Member

llvmbot commented Apr 30, 2025

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-macho

Author: Richard Howell (rmaz)

Changes

Framework load paths can be either the top level framework name, or subpaths of the framework bundle pointing to specific framework binary versions. Extend the framework lookup logic to handle the latter case.


Full diff: https://github.com/llvm/llvm-project/pull/137989.diff

2 Files Affected:

  • (modified) lld/MachO/InputFiles.cpp (+24-5)
  • (added) lld/test/MachO/reexport-without-rpath.s (+119)
diff --git a/lld/MachO/InputFiles.cpp b/lld/MachO/InputFiles.cpp
index 83bc246a39f4f..a48dfb774de43 100644
--- a/lld/MachO/InputFiles.cpp
+++ b/lld/MachO/InputFiles.cpp
@@ -1566,6 +1566,25 @@ static DylibFile *loadDylib(StringRef path, DylibFile *umbrella) {
   return loadDylib(*mbref, umbrella);
 }
 
+static StringRef findFramework(StringRef path, StringRef frameworkName) {
+  // Framework names can be in multiple formats:
+  // - Foo.framework/Foo
+  // - Foo.framework/Versions/A/Foo
+  size_t start;
+  size_t end = path.size();
+  while (true) {
+    start = path.rfind('/', end);
+    if (start == StringRef::npos)
+      return StringRef();
+
+    StringRef component = path.substr(start + 1, end - start - 1);
+    if (component == frameworkName)
+      return path.substr(start + 1);
+
+    end = start;
+  }
+}
+
 // TBD files are parsed into a series of TAPI documents (InterfaceFiles), with
 // the first document storing child pointers to the rest of them. When we are
 // processing a given TBD file, we store that top-level document in
@@ -1581,13 +1600,13 @@ static DylibFile *findDylib(StringRef path, DylibFile *umbrella,
   // 1. Install name basename in -F / -L directories.
   {
     StringRef stem = path::stem(path);
-    SmallString<128> frameworkName;
-    path::append(frameworkName, path::Style::posix, stem + ".framework", stem);
-    bool isFramework = path.ends_with(frameworkName);
-    if (isFramework) {
+    SmallString<128> frameworkName(stem);
+    frameworkName += ".framework";
+    StringRef frameworkPath = findFramework(path, frameworkName);
+    if (!frameworkPath.empty()) {
       for (StringRef dir : config->frameworkSearchPaths) {
         SmallString<128> candidate = dir;
-        path::append(candidate, frameworkName);
+        path::append(candidate, frameworkPath);
         if (std::optional<StringRef> dylibPath =
                 resolveDylibPath(candidate.str()))
           return loadDylib(*dylibPath, umbrella);
diff --git a/lld/test/MachO/reexport-without-rpath.s b/lld/test/MachO/reexport-without-rpath.s
new file mode 100644
index 0000000000000..741c33e81630d
--- /dev/null
+++ b/lld/test/MachO/reexport-without-rpath.s
@@ -0,0 +1,119 @@
+# REQUIRES: aarch64, shell
+# RUN: rm -rf %t; split-file %s %t
+# RUN: ln -s Versions/A/Developer %t/Developer/Library/Frameworks/Developer.framework/
+# RUN: ln -s Versions/A/DeveloperCore %t/Developer/Library/PrivateFrameworks/DeveloperCore.framework/
+# RUN: llvm-mc -filetype obj -triple arm64-apple-macos11.0 %t/test.s -o %t/test.o
+# RUN: %lld -arch arm64 -platform_version macos 11.0 11.0 -o %t/test -framework Developer -F %t/Developer/Library/Frameworks -L %t/Developer/usr/lib %t/test.o
+# RUN: llvm-objdump --bind --no-show-raw-insn -d %t/test | FileCheck %s
+# CHECK:     Bind table:
+# CHECK-DAG: __DATA __data {{.*}} pointer 0 Developer         _funcPublic
+# CHECK-DAG: __DATA __data {{.*}} pointer 0 Developer         _funcCore
+# CHECK-DAG: __DATA __data {{.*}} pointer 0 libDeveloperSupport _funcSupport
+
+#--- Developer/Library/Frameworks/Developer.framework/Versions/A/Developer
+{
+  "tapi_tbd_version": 5,
+  "main_library": {
+    "target_info": [
+      {
+        "target": "arm64-macos"
+      }
+    ],
+    "install_names": [
+      {
+        "name": "@rpath/Developer.framework/Versions/A/Developer"
+      }
+    ],
+    "rpaths": [
+      {
+        "paths": [
+          "@loader_path/../../../../PrivateFrameworks/"
+        ]
+      }
+    ],
+    "reexported_libraries": [
+      {
+        "names": [
+          "@rpath/DeveloperCore.framework/Versions/A/DeveloperCore"
+        ]
+      }
+    ],
+    "exported_symbols": [
+      {
+        "text": {
+          "global": ["_funcPublic"]
+        }
+      }
+    ]
+  }
+}
+#--- Developer/Library/PrivateFrameworks/DeveloperCore.framework/Versions/A/DeveloperCore
+{
+  "tapi_tbd_version": 5,
+  "main_library": {
+    "target_info": [
+      {
+        "target": "arm64-macos"
+      }
+    ],
+    "install_names": [
+      {
+        "name": "@rpath/DeveloperCore.framework/Versions/A/DeveloperCore"
+      }
+    ],
+    "allowable_clients": [
+      {
+        "clients": ["Developer"]
+      }
+    ],
+    "exported_symbols": [
+      {
+        "text": {
+          "global": ["_funcCore"]
+        }
+      }
+    ]
+  }
+}
+#--- Developer/usr/lib/libDeveloperSupport.tbd
+{
+  "tapi_tbd_version": 5,
+  "main_library": {
+    "target_info": [
+      {
+        "target": "arm64-macos"
+      }
+    ],
+    "install_names": [
+      {
+        "name": "@rpath/libDeveloperSupport.dylib"
+      }
+    ],
+    "reexported_libraries": [
+      {
+        "names": [
+          "@rpath/Developer.framework/Versions/A/Developer"
+        ]
+      }
+    ],
+    "exported_symbols": [
+      {
+        "text": {
+          "global": ["_funcSupport"]
+        }
+      }
+    ]
+  }
+}
+#--- test.s
+.text
+.globl _main
+.linker_option "-lDeveloperSupport"
+
+_main:
+  ret
+
+.data
+  .quad _funcPublic
+  .quad _funcCore
+  .quad _funcSupport

@@ -1566,6 +1566,25 @@ static DylibFile *loadDylib(StringRef path, DylibFile *umbrella) {
return loadDylib(*mbref, umbrella);
}

static StringRef findFramework(StringRef path, StringRef frameworkName) {
Copy link
Contributor

@alx32 alx32 Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findFramework implies a filesystem-type operation - maybe we can find a better name ?

Am I understanding this correctly ?

// Given an install path (e.g., /Path/To/Foo.framework/Versions/A/Foo or
// /Path/To/Bar.framework/Bar) and a framework name (e.g., Foo.framework),
// searches the path components from right-to-left. Returns the substring
// of the install path starting from the first component that matches the
// framework name.
// For example:
//   path = "/Path/To/Foo.framework/Versions/A/Foo", frameworkName = "Foo.framework"
//   -> returns "Foo.framework/Versions/A/Foo"
//   path = "/Path/To/Bar.framework/Bar", frameworkName = "Bar.framework"
//   -> returns "Bar.framework/Bar"
// Returns an empty StringRef if the framework name is not found as a path
// component.
static StringRef getRelativeFrameworkPath(StringRef path,
                                          StringRef frameworkName) {
  size_t end = path.size();
  // Search backwards for the framework name component.
  while (true) {
    size_t slashPos = path.rfind('/', end);
    if (slashPos == StringRef::npos)
      return StringRef();

    // Extract the component after the slash.
    StringRef component = path.substr(slashPos + 1, end - (slashPos + 1));
    if (component == frameworkName)
      // Found the component, return the relative path starting from here.
      return path.substr(slashPos + 1);

    // Continue searching before the previously found slash.
    end = slashPos;
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary is accurate, yes, right to left search for "Blah.framework", return substring from that point. Name is a bit vague I agree, do you have a preferred alternative? findFrameworkSubpath?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getRelativeFrameworkPath / locateFrameworkComponent / extractFrameworkPath

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getRelativeFrameworkPath suggests you have an absolute path and want to relativize it
locateFrameworkComponent is unclear, what is a framework component, and does it return an index?
extractFrameworkPath is ok, but not clear that its going to be a subpath of the input path

@drodriguez thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think findFramework works because it is called from findDylib (which not only finds dylibs). This also should discard cases like embedded libraries (One.framework/Resources/Library) so being clear that we are looking for the framework and discarding other things should be clear in the function name. I like the extra comment, though.

@@ -1566,6 +1566,25 @@ static DylibFile *loadDylib(StringRef path, DylibFile *umbrella) {
return loadDylib(*mbref, umbrella);
}

static StringRef findFramework(StringRef path, StringRef frameworkName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think findFramework works because it is called from findDylib (which not only finds dylibs). This also should discard cases like embedded libraries (One.framework/Resources/Library) so being clear that we are looking for the framework and discarding other things should be clear in the function name. I like the extra comment, though.

Comment on lines 1575 to 1585
while (true) {
start = path.rfind('/', end);
if (start == StringRef::npos)
return StringRef();

StringRef component = path.substr(start + 1, end - start - 1);
if (component == frameworkName)
return path.substr(start + 1);

end = start;
}
Copy link
Contributor

@drodriguez drodriguez Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just do a rfind for the complete thing, maybe even avoiding the need of a function and discussions about function names.

start = path.rfind(frameworkName + "/");
if (start == StringRef::npos)
  return StrinfRef();
return path.substr(start);

(you might even move the calculation of frameworkName inside the function , since it is not used again in the caller, just pass in the stem).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why I didn't do that, think because I was using path iterators and then gave up on the approach.

One issue with this is it would incorrectly return a substring for Foo.framework if path was /Some/Path/OtherFoo.framework/..., but could be handled by checking the previous char. Either that or check if the string starts with the pattern or has /Foo.framework/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The disadvantage of doing this is that you construct a new string and allocate new memory, etc. I guess this is okay if this function isn't going to be used excessively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be the same, we have to construct the framework name needle string either way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, you need to find /FrameworkName.framework/ for the real thing, and then do substr(start + 1) again.

Not sure I can see a case of receiving a FrameworkName.framework/FrameworkName as a value, but it will be missed without extra checking. Maybe trying to find /FrameworkName.framework/ first, and then try with FrameworkName.framework/ at the start of the string if the first fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this some more and I can't think of a case where you could have Foo.framework as the start of a load command, it would either have to be an absolute path or @path/ prefixed, so checking with both prefix and suffix / should do it.

Framework load paths can be either the top level framework
name, or subpaths of the framework bundle pointing to specific
framework binary versions. Extend the framework lookup logic to
handle the latter case.
@rmaz rmaz force-pushed the lldframeworksymlink branch from 3ef79c9 to fc9b22b Compare April 30, 2025 21:43
Copy link
Contributor

@alx32 alx32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DataCorrupted DataCorrupted merged commit 58addfb into llvm:main May 2, 2025
11 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Framework load paths can be either the top level framework name, or
subpaths of the framework bundle pointing to specific framework binary
versions. Extend the framework lookup logic to handle the latter case.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Framework load paths can be either the top level framework name, or
subpaths of the framework bundle pointing to specific framework binary
versions. Extend the framework lookup logic to handle the latter case.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Framework load paths can be either the top level framework name, or
subpaths of the framework bundle pointing to specific framework binary
versions. Extend the framework lookup logic to handle the latter case.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Framework load paths can be either the top level framework name, or
subpaths of the framework bundle pointing to specific framework binary
versions. Extend the framework lookup logic to handle the latter case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants