Skip to content

Avoid importing dynamic libraries and frameworks are already in the t… #4077

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 24, 2022

Conversation

adrian-prantl
Copy link

@adrian-prantl adrian-prantl commented Mar 16, 2022

…arget.

The primary motivation behind this patch is to avoid the futile
attempts of importing static frameworks in LLDB (which by design
cannot work. While investiagting this I realized that the only
situation where LLDB should automatically dlopen libraries is when a
user expression conatins an import statement. All libraries that are
discovered as dependencies of Swift modules found in the debug info
must already be part of the target, so trying to import them is a
waste of time and I/O.

rdar://88974768

Big thanks to @PRESIDENT810 for reporting this bug and providing a testcase!

@adrian-prantl
Copy link
Author

@swift-ci test

@jimingham
Copy link

I was told way back in the day that swift modules with backing C libraries could load the backing libraries lazily. As I understood it, the idea was that this allowed you to use functionality just in the swift module w/o paying the cost of loading the library. If that is still a thing, then we have to load the library by hand before running expressions because we don't have a way to trigger the lazy loading. If that is not still a thing, then getting rid of the code that tries to load the backing libraries is great.

@jimingham
Copy link

We tend not to use std::pair because it's so opaque at the use site. Be nicer to use a little struct with descriptive member names.

I couldn't see why you went from the stack_wp to the process_sp? The stack was useful because it indicated which thread this work was being done for (usually the thread on which we were asked to run a swift expression). You need to know that so that if you did have to run "Process::LoadImage you do it on the right thread. Since dlopen takes locks, we can't always run it on just one thread, and we don't want the one that was supposedly running the expression that motivated this swift loading to get a chance to run anything but the dlopen function.

@adrian-prantl
Copy link
Author

I couldn't see why you went from the stack_wp to the process_sp?

The only reason we used the stack_frame_wp was to ask its process to get a SwiftLanguageRuntime.

https://github.com/apple/llvm-project/pull/4070/commits

2 similar comments
@adrian-prantl
Copy link
Author

I couldn't see why you went from the stack_wp to the process_sp?

The only reason we used the stack_frame_wp was to ask its process to get a SwiftLanguageRuntime.

https://github.com/apple/llvm-project/pull/4070/commits

@adrian-prantl
Copy link
Author

I couldn't see why you went from the stack_wp to the process_sp?

The only reason we used the stack_frame_wp was to ask its process to get a SwiftLanguageRuntime.

https://github.com/apple/llvm-project/pull/4070/commits

@adrian-prantl
Copy link
Author

I was told way back in the day that swift modules with backing C libraries could load the backing libraries lazily. As I understood it, the idea was that this allowed you to use functionality just in the swift module w/o paying the cost of loading the library. If that is still a thing, then we have to load the library by hand before running expressions because we don't have a way to trigger the lazy loading. If that is not still a thing, then getting rid of the code that tries to load the backing libraries is great.

This is very interesting. So for example, if a a Swift module that imports an ObjC framework, but only uses a type definition from it without calling any code, the ObjC framework would not show up in the load commands of the binary.
If we didn't auto-import these, that doesn't sound like it would be a regression compared to the ObjC or C++ debugging experience, where you also can only call functions that are available in the binary, but I'll give this some more thought. I'll also try to come up with a testcase for this scenario if we don't already have one.

Thanks a lot!

@jimingham
Copy link

I couldn't see why you went from the stack_wp to the process_sp?

The only reason we used the stack_frame_wp was to ask its process to get a SwiftLanguageRuntime.

https://github.com/apple/llvm-project/pull/4070/commits

Oh, right, we added that ExpressionExecutionThreadPusher so we could keep helper expressions on the same thread. That seems a confusing way to do it, not sure why we don't just pass ExecutionContext's around, but there must have been some complication that I'm not remembering.

@jimingham
Copy link

I was told way back in the day that swift modules with backing C libraries could load the backing libraries lazily. As I understood it, the idea was that this allowed you to use functionality just in the swift module w/o paying the cost of loading the library. If that is still a thing, then we have to load the library by hand before running expressions because we don't have a way to trigger the lazy loading. If that is not still a thing, then getting rid of the code that tries to load the backing libraries is great.

This is very interesting. So for example, if a a Swift module that imports an ObjC framework, but only uses a type definition from it without calling any code, the ObjC framework would not show up in the load commands of the binary. If we didn't auto-import these, that doesn't sound like it would be a regression compared to the ObjC or C++ debugging experience, where you also can only call functions that are available in the binary, but I'll give this some more thought. I'll also try to come up with a testcase for this scenario if we don't already have one.

The other example Jordan gave (IIRC, this was a couple of years ago) was if module A imported module B but was able to inline everything it was going to need from it, then it wouldn't directly link to the binary.

I'm not sure about the comparison to ObjC, however. There's no way for an ObjC program to use some part of a library without making the whole library's functions available in the debugging session. From the swift user's perspective, their code imported the Swift module and it isn't unreasonable to expect that at debug time they should have access to the module's functions. Plus, there's no really good way to explain to the user why functions from a module they imported aren't found, or what they can do to fix that situation.

Thanks a lot!

@kastiglione
Copy link

kastiglione commented Mar 16, 2022

if module A imported module B but was able to inline everything it was going to need from it, then it wouldn't directly link to the binary.

I wonder how often this happens in practice? If we don't have any known examples of this, it think we should consider removing support for this.

There's no way for an ObjC program to use some part of a library without making the whole library's functions available in the debugging session.

I can think of two:

  1. Only using header contents (inlineable code, inlineable data/constants, etc) – certainly a narrow case
  2. For static libraries/frameworks, the linker can load only objects that have symbols referenced. For object files that contain unreferenced code, they might not end up in the binary, making them uncallable.

Comment on lines 3328 to 3329
std::pair<swift::ModuleDecl *, bool>
SwiftASTContext::GetModule(const SourceModule &module, Status &error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one call site that reads the bool half of the returned pair, all other callers want the ModuleDecl. One option is to add cached as an out-parameter (bool *cached = nullptr). Of course out parameters are kinda warty, but it suits this case ok.

@jimingham
Copy link

if module A imported module B but was able to inline everything it was going to need from it, then it wouldn't directly link to the binary.

I wonder how often this happens in practice? If we don't have any known examples of this, it think we should consider removing support for this.

If this is a supported Swift feature, we either have to support it in the expression parser, or have some way to make it clear what is not going to work. The expression parser claims to work "as if the expression had been in the source code at the point where it stopped" and in source code mentioning a non-inline able function from the module would have triggered bringing in library.

We should ask the Swift language folks how much of a thing this actually is. If this was just something they were thinking about doing but didn't end up relying on, then I'm fine with dropping support for it in lldb.

However, I don't see how we can get rid of the loading code altogether so long as lldb needs to force the backing library to load when a user uses import Foo in a Swift expression. OTOH, I'm not sure why running the "import" in the expression doesn't make this happen down in swift already. Maybe the solution is to make that work?

There's no way for an ObjC program to use some part of a library without making the whole library's functions available in the debugging session.

I can think of two:

  1. Only using header contents (inlineable code, inlineable data/constants, etc) – certainly a narrow case
  2. For static libraries/frameworks, the linker can load only objects that have symbols referenced. For object files that contain unreferenced code, they might not end up in the binary, making them uncallable.

That's an interesting example, and certainly violates the "if the expression would have worked in source code it should work in the expression parser" rule. OTOH, unlike swift modules, .a files are not supposed to be coherent modules, they are bags of only accidentally related .o files and you have to understand them that way or you'll end up using them incorrectly.

@kastiglione
Copy link

dead code stripping (and LTO probably) are other cases where an expression that would work in code can end up not-callable at debug time.

@adrian-prantl
Copy link
Author

Just to be clear: this patch distinguishes between explicit imports (i.e.: expr import Foo) and implicit imports (importing all dependencies of the current frame's module). There is no behavior change for explicit imports.

@jimingham
Copy link

jimingham commented Mar 17, 2022

dead code stripping (and LTO probably) are other cases where an expression that would work in code can end up not-callable at debug time.

When you run the debugger on optimized code, you agree to give up a bunch of functionality in the expression parser. Our general bargain is that we will try very hard to make things in the expression parser work as if you were writing it in the code if you run on -O0 builds. But we are allowed to fail on higher optimization levels because the optimizer needs to be free to remove, inline, or otherwise make impossible to use in the debugger anything that makes the code go faster or be smaller.

@adrian-prantl
Copy link
Author

For static libraries/frameworks

Just want to make sure to point out that there is and was no way LLDB can import a static archive into a target.

@kastiglione
Copy link

kastiglione commented Mar 17, 2022

My comment about static libraries was to describe a case where an expression that would successfully compile in source, could still fail in expression evaluation.

@adrian-prantl
Copy link
Author

if module A imported module B but was able to inline everything it was going to need from it, then it wouldn't directly link to the binary.

I don't think this specific example works: Inlineable functions in a Swift module are serialized to SIL and stored in the Swift module. We already have access to the Swift module. Whether the backing dylib is available has effect on this.

I'm not sure about the comparison to ObjC, however. There's no way for an ObjC program to use some part of a library without making the whole library's functions available in the debugging session.

What you're describing is similar to importing an Objective-C header or Clang Module from a framework for some type definitions and maybe some inline function definitions, but not linking against the framework's dylib.

From the swift user's perspective, their code imported the Swift module and it isn't unreasonable to expect that at debug time they should have access to the module's functions. Plus, there's no really good way to explain to the user why functions from a module they imported aren't found, or what they can do to fix that situation.

If the modules correspond to frameworks in the system, I think I would agree. If they are modules that are part of the application I would not expect that. For example, when I'm debugging LLDB I don't expect to be able to call a function that wasn't compiled into the binary.

I'm going to think more about this.

@adrian-prantl
Copy link
Author

I've addressed all review feedback and I hid this new behavior behind a setting and added a new testcase that demonstrates the behavior that @jimingham pointed out.

@adrian-prantl adrian-prantl force-pushed the 88974768 branch 2 times, most recently from 91cb468 to 965b583 Compare March 18, 2022 02:02
@adrian-prantl
Copy link
Author

@swift-ci test

@adrian-prantl
Copy link
Author

@jimingham Would you mind taking another look at this version? Thanks for all your input so far!

SwiftASTContext::FindAndLoadModule(const FileSpec &module_spec,
Process &process, Status &error) {
VALID_OR_RETURN(nullptr);
if (import_dylib) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing something here. It looks like the only time we call into this function with import_dylib == true is in CacheUserImports (i.e. when the user says (lldb) expr import Foo. Everywhere else we pass in false. If that's right the only place where GetSwiftAutoImportFrameworks is consulted is for User Imports? But I thought we always had to import backing dylibs for User Imports, so that seems odd. And if I'm reading this correctly there's no way to force us to import backing dylibs for CU imports, which I thought was the point of the AutoImportFrameworks setting.

The if (cached) test also seems odd. If the module was not cached, we are going to try to import it anyway. But if it is cached, we're going to check the setting. Not sure what that logic is for. This at least deserves some comment explaining the algorithm, as it wasn't clear to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point! This made me find a bug in CacheImports, which didn't distinguish between implicit and explicit imports. Fixed now.

@adrian-prantl
Copy link
Author

@swift-ci test

@adrian-prantl
Copy link
Author

@swift-ci test

Copy link

@jimingham jimingham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes more sense. I made a couple of comment quibbles, but nothing significant.

Process &process, Status &error);
/// Call this after the search paths are set up, it will find the module given
/// by module, load the module into the AST context, and (if import_dylib is
/// set) also load any "LinkLibraries" that the module requires.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worthwhile to note here that the auto-import setting overrides the value of import_dylib.

@adrian-prantl
Copy link
Author

@swift-ci test

…arget.

The primary motivation behind this patch is to avoid the futile
attempts of importing static frameworks in LLDB (which by design
cannot work. While investiagting this I realized that the only
situation where LLDB should automatically dlopen libraries is when a
user expression conatins an import statement. All libraries that are
discovered as dependencies of Swift modules found in the debug info
must already be part of the target, so trying to import them is a
waste of time and I/O.

rdar://88974768
@adrian-prantl
Copy link
Author

@swift-ci test

@adrian-prantl adrian-prantl merged commit 493d831 into swiftlang:stable/20211026 Mar 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants